STAT 536: Genetic Statistics


1 STAT 536: Genetic Statistics Karin S. Dorman Department of Statistics Iowa State University August 22, 2006

2 What is population genetics? A quantitative field of biology, initiated by Fisher, Haldane, and Wright in the 1920s and 1930s, whose results are largely theoretical rather than observational or experimental. Primary object of study: frequencies and fitnesses of genotypes in natural populations. Primary goal of study: develop models for how these frequencies change over time in natural populations and, by sampling existing natural populations, assess the events in their history and the forces acting on them now. The role of statistics in population genetics is to (1) estimate model parameters and (2) test theoretical models given data.

3 Practical Applications The genome projects are producing vast amounts of genetic information, and it is the job of statisticians to find how that information correlates or associates with important diseases and traits. What genes are responsible for disease? How do genes interact with the environment to produce disease? What genes determine desirable traits in agricultural species? What is the best breeding strategy: selected, engineered, or designed?

4 Example questions of interest Did a population experience a drastic bottleneck (population shrinkage) in the past? For example, the buffalo of the western plains, or humans on emigration from Africa or during the Black Plague. Are two distinguishable populations mixing genetically? How much? For example, can tsetse flies cross the great natural obstacle of Lake Victoria? Why did sex evolve? What impact does inbreeding have on a population? For example, how much mutational damage have consanguineous marriages caused in human populations?

5 Hot Applications - African Eve "Mom's eggs execute Dad's mitochondria." Science News 157(1):5

6 Hot Applications - Florida Dentist From K. A. Crandall (1995) J. Virol. 69:

7 Plant Evidence The Medical Detectives: Explorations in Forensic Science series considered the following case: A murdered woman is found under a tree. A nearby pager is tracked to a suspect. Is he the murderer? Seed pods, not unlike those produced by the tree, are recovered from the suspect's truck. Could they have come from the tree near the murdered woman? RAPD markers were used to link the seed pods in the truck to the seed pods on the tree. How do we know that a match implies the seed pods in the truck came from that tree?

8 The Cell Consider claims like: Some people are more susceptible to obesity/diabetes/heart disease because of their family history.

9 Chromosomes Carry the genetic information of an individual. Found in the nucleus of the cell in the vast majority of multicellular organisms. One complete copy in almost every cell of the body. Composed of deoxyribonucleic acid (DNA) and proteins.

10 Chromosome Terminology The centromere is a structure near the middle of the chromosome. The telomeres are structures at the ends of the chromosome arms. The centromere and telomeres are involved in cell division.

11 DNA

12 DNA There are four bases: A, C, G, T. The bases are complementary, so A binds to T and C binds to G. A nucleotide is a base connected to one sugar and one phosphate. The carbons of the sugar are labeled 1' to 5'. The 5' and 3' carbons of adjacent sugars are connected in the linear strand via a phosphate group. Two DNA molecules differ only in the sequence of bases. Thus, we can summarize DNA by writing the linear sequence of bases going from 5' to 3' along one strand. The length of a DNA molecule is measured in base pairs (bp).
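The pairing rule on this slide can be sketched in a few lines of Python. This is an illustration only: the function name reverse_complement and the example sequence are assumptions, not part of the lecture. Given one strand written 5' to 3', it returns the partner strand, also written 5' to 3'.

```python
# Watson-Crick pairing from the slide: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    The two strands run in opposite directions, so the complement
    is read in reverse order.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


strand = "ATGCGT"  # hypothetical example sequence
print(reverse_complement(strand))  # ACGCAT
print(len(strand))                 # length of the duplex in base pairs: 6
```

Applying the function twice returns the original strand, which is one way to see that a single strand is enough to summarize the whole double-stranded molecule.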

13 Genes Genes in DNA are transcribed into RNA (a single-stranded linear molecule that uses ribose instead of deoxyribose sugar). RNA is translated into proteins, linear sequences of amino acids that fold into intricate 3-dimensional structures.
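A minimal sketch of the DNA to RNA to protein flow, under simplifying assumptions: the input is taken to be the coding strand (so transcription reduces to replacing T with U), and CODON_TABLE holds only the five standard-code codons needed for the made-up example, not the full 64-codon genetic code. The function names are hypothetical.

```python
# Partial codon table (standard genetic code); "*" marks a stop codon.
# Only the codons used in the example are listed.
CODON_TABLE = {"AUG": "M", "GCC": "A", "AAA": "K", "UGG": "W", "UAA": "*"}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching the coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCAAATGGTAA")  # hypothetical gene fragment
print(mrna)             # AUGGCCAAAUGGUAA
print(translate(mrna))  # MAKW
```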

14 Genetic Code

15 Meiosis Diploid organisms (such as ourselves) receive one copy of each chromosome from each parent. The two matched chromosomes are called homologous. A cell with a single set of chromosomes, one of each type, is termed haploid. During meiosis, the members of each homologous pair are distributed randomly to daughter cells, independently of the other pairs.
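Independent distribution of chromosomes can be illustrated with a short sketch. It assumes each gamete receives either the maternal (M) or the paternal (P) member of every homologous pair with equal probability, and it ignores recombination within chromosomes; the function names are hypothetical.

```python
import random
from itertools import product


def all_gametes(n_pairs: int) -> list[str]:
    """Enumerate every maternal/paternal combination for n_pairs pairs."""
    return ["".join(combo) for combo in product("MP", repeat=n_pairs)]


def random_gamete(n_pairs: int, rng: random.Random) -> str:
    """Draw one gamete: each pair contributes M or P independently."""
    return "".join(rng.choice("MP") for _ in range(n_pairs))


print(all_gametes(2))       # ['MM', 'MP', 'PM', 'PP']
print(len(all_gametes(3)))  # 8, that is 2**3
print(2 ** 23)              # 8388608 combinations for 23 human pairs

rng = random.Random(0)      # fixed seed so the draw is repeatable
print(random_gamete(23, rng))
```

Even without recombination, a single human parent can produce more than eight million chromosomally distinct gametes under these assumptions.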

16 Genetic Variability The linear sequence along the chromosome received from your father is not identical to the sequence along the homologous chromosome received from your mother. In general, unless you have an identical twin, you will not have the same linear sequence as anyone else. The resulting genetic variation between individuals is the basis of population genetics studies. Genetic variation is created by (1) abnormal pairing of chromosomes during meiosis, leading to rearrangement of chromosomal segments and resulting in insertions, deletions, inversions, translocations, or duplications; and (2) point mutations, where the wrong nucleotide is substituted during the copy process.

17 Terminology of Genetic Variability
locus: a location along a chromosome.
allele: any one of the alternative forms possible at a locus, often symbolized as A and a, for example.
genotype: the specific genetic makeup of an organism at a locus, for example AA, Aa, or aa.
phenotype: an observable feature of an organism that is a product of the genotype and the environment.
dominant allele: if present at least once, this allele results in the dominant phenotype. For example, AA and Aa have the same phenotype when A is dominant.
codominant alleles: if two alleles are codominant relative to each other, then the three genotypes AA, Aa, and aa are all distinguishable at the phenotype level.
recessive allele: a is called the recessive allele if A is dominant.
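To connect these terms to the frequencies that population genetics studies, here is a small sketch for one locus with two alleles, A and a. The genotype counts are invented for illustration, and the function names are hypothetical.

```python
def allele_frequency(n_AA: int, n_Aa: int, n_aa: int) -> float:
    """Frequency of allele A from genotype counts in a diploid sample.

    Each individual carries two alleles: AA contributes two copies
    of A, Aa contributes one, and aa contributes none.
    """
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    return (2 * n_AA + n_Aa) / total_alleles


def phenotype(genotype: str, codominant: bool = False) -> str:
    """Map a genotype to a phenotype label for one locus."""
    genotype = "".join(sorted(genotype))  # treat "aA" the same as "Aa"
    if codominant:
        return genotype                   # AA, Aa, aa all distinguishable
    return "dominant" if "A" in genotype else "recessive"


print(allele_frequency(30, 50, 20))      # 0.55
print(phenotype("Aa"))                   # dominant
print(phenotype("aa"))                   # recessive
print(phenotype("aA", codominant=True))  # Aa
```

When A is dominant, AA and Aa cannot be told apart by phenotype, so the counts needed by allele_frequency are directly observable only in the codominant case.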

18 Genetic Terminology Continued
homozygous: an individual is homozygous at a locus if both alleles are the same.
heterozygous: an individual is heterozygous at a locus if the alleles are different.
transition: mutation A<->G or C<->T.
transversion: mutation A,G <-> C,T.
nonsynonymous: mutation that changes the encoded amino acid.
synonymous: mutation that does not change the encoded amino acid.
polymorphic locus: one for which alternative alleles exist.
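The mutation classes above can be checked mechanically. The sketch below assumes DNA bases only and uses a three-codon excerpt of the standard genetic code (GCC and GCT encode alanine, GAC encodes aspartate); the function names and the excerpt table are illustrative choices, not part of the lecture.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Excerpt of the standard genetic code, enough for the example below.
CODON_EXCERPT = {"GCC": "Ala", "GCT": "Ala", "GAC": "Asp"}


def classify_substitution(ref: str, alt: str) -> str:
    """Transition: within purines or within pyrimidines; else transversion."""
    if ref == alt:
        raise ValueError("no substitution: bases are identical")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Synonymous if both codons encode the same amino acid."""
    same = CODON_EXCERPT[ref_codon] == CODON_EXCERPT[alt_codon]
    return "synonymous" if same else "nonsynonymous"


def zygosity(allele_1: str, allele_2: str) -> str:
    """Homozygous if the two alleles at a locus are the same."""
    return "homozygous" if allele_1 == allele_2 else "heterozygous"


print(classify_substitution("A", "G"))      # transition
print(classify_substitution("A", "C"))      # transversion
print(classify_codon_change("GCC", "GCT"))  # synonymous
print(classify_codon_change("GCC", "GAC"))  # nonsynonymous
print(zygosity("A", "a"))                   # heterozygous
```

In the example, GCC to GCT is a C to T transition that leaves the amino acid unchanged, while GCC to GAC is a C to A transversion that changes it.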
