INTRODUCTION TO GENETIC EPIDEMIOLOGY (Antwerp Series) Prof. Dr. Dr. K. Van Steen

Size: px
Start display at page:

Download "INTRODUCTION TO GENETIC EPIDEMIOLOGY (Antwerp Series) Prof. Dr. Dr. K. Van Steen"

Transcription

1 INTRODUCTION TO GENETIC EPIDEMIOLOGY (Antwerp Series) Prof. Dr. Dr. K. Van Steen

2 POPULATION BASED GENOME-WIDE ASSOCIATION STUDIES 1 Setting the pace 1.a A hype about GWA studies 1.b Genetic terminology revisited 1.c Genetic association studies 2 Study Designs 2.a Marker level 2.b Subject level 2.c Gender level

3 3 Preliminary analyses 3.a Quality Control 3.b Population genetics 3.c Linkage disequilibrium, haplotypes and SNP tagging 3.d Confounding: population stratification 4 Tests of association 4.a Single SNP 4.b Repeated single SNP tests: Multiple testing correction 4.c Replication 5 Interpretation and follow-up

4 1 Setting the pace 1.a A hype about GWA studies May he live in interesting times. Like it or not we live in interesting times. Robert Kennedy, June 7, 1966

5 How much (sequence) data are available? The complete genome sequence of humans and of many other species provides a new starting point for understanding our basic genetic makeup and how variations in our genetic instructions result in disease.

6

7

8

9

10

11

12 2008 third

13 The pace of the molecular dissection of human disease can be measured by looking at the catalog of human genes and genetic disorders identified so far in OMIM, which is updated daily ( (V. A. McKusick, Mendelian Inheritance in Man (Johns Hopkins Univ. Press, Baltimore, ed. 12, 1998))

14

15 What is OMIM? Online Mendelian Inheritance in Man (OMIM ) is a continuously updated catalog of human genes and genetic disorders and traits, with particular focus on the molecular relationship between genetic variation and phenotypic expression. It is thus considered to be a phenotypic companion to the Human Genome Project. OMIM is a continuation of Dr. Victor A. McKusick's Mendelian Inheritance in Man, which was published through 12 editions, the last in OMIM is currently biocurated at the McKusick-Nathans Institute of Genetic Medicine, The Johns Hopkins University School of Medicine. Frequently asked questions:

16

17

18

19

20 1.b Genetic terminology revisited What is genetic epidemiology? (Ziegler and Van Steen, Brazil 2010)

21 Where is the genetic information located? (Ziegler and Van Steen, Brazil 2010)

22 Where is the genetic information located? (Ziegler and Van Steen, Brazil 2010)

23 Where is the genetic information located? (Ziegler and Van Steen, Brazil 2010)

24 What is recombination? (Ziegler and Van Steen, Brazil 2010)

25 How much do individuals differ with respect to genetic information? (Ziegler and Van Steen, Brazil 2010)

26 How much do individuals differ with respect to genetic information? Genotype: The two alleles inherited at a specific locus. If the alleles are the same, the genotype is homozygous, if different, heterozygous. In genetic association studies, genotypes can be used for analysis as well as alleles or haplotypes. Haplotype: Linear arrangements of alleles on the same chromosome that have been inherited as a unit. A person has two haplotypes for any such series of loci, one inherited maternally and the other paternally. A haplotype may be characterized by a single allele unless a discrete chromosomal segment flanked by two alleles is meant.

27 Are haplotypes always better in association studies for disease? Analyses based on phased haplotype data rather than unphased genotypes may be quite powerful Test 1 vs. 2 for M1: Test 1 vs. 2 for M2: Test haplotype H1 vs. all others: D + d vs. d D + d vs. d D vs. d If the Disease Susceptibility Locus (DSL) is located at a marker, haplotype testing can be less powerful

28 How can individual differences be detected? (Ziegler and Van Steen, Brazil 2010)

29 What are microsatellite markers? (Ziegler and Van Steen, Brazil 2010)

30 What are single nucleotide polymorphisms? (Ziegler and Van Steen, Brazil 2010)

31 Why are SNPs preferred over STRs? (Ziegler and Van Steen, Brazil 2010)

32 Which genotyping methods are currently being used? (Ziegler and Van Steen, Brazil 2010)

33 Which genotyping methods are currently being used? (Ziegler and Van Steen, Brazil 2010)

34 1.c Genetic association studies What is a genome-wide association study? It refers to a method / methodology for interrogating all 10 million variable points across the human genome. Since variation is inherited in groups, or blocks, not all 10 million points have to be tested. Blocks are shorter though (so need for testing more points) the less closely people are related.

35

36

37 What is a genome-wide association study? Hence, a genome-wide association study is an approach that involves rapidly scanning markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular disease. Once new genetic associations are identified, researchers can use the information to develop better strategies to detect, treat and prevent the disease. ( The impact on medical care from genome-wide association studies could potentially be substantial. Such research is laying the groundwork for the era of personalized medicine, in which the current one size-fits-all approach to medical care will give way to more customized strategies.

38 What do we need to carry out a genome-wide association study? The tools include - computerized databases that contain the reference human genome sequence, - a map of human genetic variation and - a set of new technologies that can quickly and accurately analyze whole-genome samples for genetic variations that contribute to the onset of a disease. (

39 What do we need to carry out a genome-wide association study?

40 What do we need to carry out a genome-wide association study? To distinguish between true and chance effects, there are several routes to be taken: - Set tight standards for statistical significance - Only consider patterns of polymorphisms that could plausibly have been generated by causal genetic variants (use understanding of and insights into human genetic history or evolutionary processes such as recombination or mutation) - Adequately deal with distorting factors, including missing data and genotyping errors (quality control measures)

41 What is the flow of a genome-wide association study? The genome-wide association study is typically (but not solely!!!) based on a case control design in which single-nucleotide polymorphisms (SNPs) across the human genome are genotyped... (Panel A: small fragment)

42 What is the flow of a genome-wide association study? Panel B, the strength of association between each SNP and disease is calculated on the basis of the prevalence of each SNP in cases and controls. In this example, SNPs 1 and 2 on chromosome 9 are associated with disease, with P values of and 10 8, respectively (Manolio 2010)

43 What is the flow of a genome-wide association study? The plot in Panel C shows the P values for all genotyped SNPs that have survived a quality-control screen, with each chromosome shown in a different color. The results implicate a locus on chromosome 9, marked by SNPs 1 and 2, which are adjacent to each other (graph at right), and other neighboring SNPs. (Manolio 2010)

44 What is the flow of a genome-wide association study? (Ziegler 2009)

45 2 Study Designs What are the components of a study design for GWA studies? The design of a genetic association study may refer to - study scale: Genome-wide Genomic - marker design: Which markers are most informative? Microsatellites? SNPs? CNVs? Which platform is the most promising? - subject design

46 Does scale matter? candidate gene approach Can t see the forest for the trees vs genome-wide screening approach Can t see the trees for the forest

47 Does scale matter?

48 Which genetic markers to select? The Common Disease/Common Variant hypothesis (CDCV) Continuous distribution of genetic variants, shaped by mutation and selection (Ziegler and Van Steen, 2010)

49 Dichotomous Traits Quantitative Traits Observations: The higher the MAF (minor allele frequency), the higher the detection rate? The higher the MAF, the lower the penetrance?

50 Types of genetic diseases Monogenic diseases are those in which defects in a single gene produce disease. Often these disease are severe and appear early in life, e.g., cystic fibrosis. For the population as a whole, they are relatively rare. In a sense, these are pure genetic diseases: They do not require any environmental factors to elicit them. Although nutrition is not involved in the causation of monogenic diseases, these diseases can have implications for nutrition. They reveal the effects of particular proteins or enzymes that also are influenced by nutritional factors (

51 Oligogenic diseases are conditions produced by the combination of two, three, or four defective genes. Often a defect in one gene is not enough to elicit a full-blown disease; but when it occurs in the presence of other moderate defects, a disease becomes clinically manifest. It is the expectation of human geneticists that many chronic diseases can be explained by the combination of defects in a few (major) genes. A third category of genetic disorder is polygenic disease. According to the polygenic hypothesis, many mild defects in genes conspire to produce some chronic diseases. To date the full genetic basis of polygenic diseases has not been worked out; multiple interacting defects are highly complex!!! (

52 Complex diseases refer to conditions caused by many contributing factors. Such a disease is also called a multifactorial disease. - Some disorders, such as sickle cell anemia and cystic fibrosis, are caused by mutations in a single gene. - Common medical problems such as heart disease, diabetes, and obesity likely associated with the effects of multiple genes in combination with lifestyle and environmental factors, all of them possibly interacting. Challenge for many years to come

53 (Glazier et al 2002)

54 Which genetic markers to select? Linkage exists over a very broad region, entire chromosome can be done using data on only DNA markers Broad linkage regions imply studies must be followed up with more DNA markers in the region (Figure: courtesy of Ed Silverman) Must have family data with more than one affected subject E.g., microsatellites

55 Which genetic markers to select? Association exists over a narrow region; markers must be close to disease gene - The basic concept is linkage disequilibrium (LD) see later in this chapter Initially used for candidate genes or in linked regions Can use population-based (unrelated cases) or familybased design E.g., SNPs

56 Which DNA SNPs to select? Costs may play a role, but a balance is needed between costs and chip performance as well as coverage (e.g., exonic regions only?) Some of the fundamental principles of array technology (see future class)

57 Which DNA SNPs to select? (adapted from Manolio 2010)

58 How can technology bias be avoided? (Ziegler and Van Steen, Brazil 2010)

59 How can technology bias be avoided? (Ziegler and Van Steen, Brazil 2010)

60 How can technology bias be avoided? (Ziegler and Van Steen, Brazil 2010)

61 Next generation sequencing will overtake array technology? The competing hypothesis to the CDCV hypothesis is the Common Disease/Rare Variant (CDRV) hypothesis. It argues that multiple rare DNA sequence variations, each with relatively high penetrance, are the major contributors to genetic susceptibility to common diseases. Although some common variants that underlie complex diseases have been identified, and given the recent huge financial and scientific investment in GWA studies, there is no longer a great deal of evidence in support of the CDCV hypothesis and much of it is equivocal... Hence, nowadays, both CDCV and CDRV hypotheses have their place in current research efforts.

62 Next generation sequencing will overtake array technology?

63 Next generation sequencing will overtake array technology?

64 Crucial question: How to best capture disease predisposition? (Gut 2012)

65 Which study subjects to select? (Cordell and Clayton 2005)

66 Which study subjects to select? (Ziegler and Van Steen, 2010)

67 Which study subjects to select? (Ziegler and Van Steen, 2010)

68 Which study subjects to select? (Ziegler and Van Steen, 2010)

69 Which study subjects to select? (Ziegler and Van Steen, 2010)

70 Which study subjects to select?

71 Which study subjects to select? Rare versus common diseases (Lange and Laird 2006)

72 3 Preliminary analyses Is there a standard file format for GWA studies?

73 Is there a standard file format for GWA studies?

74 3.a Quality control Why is quality control important? BEFORE (false positives!!!!): (Ziegler and Van Steen 2010)

75 Why is quality control important? AFTER: (Ziegler and Van Steen 2010)

76 What is the standard quality control? Quality control on different levels: o Subject or sample level o SNP level o X-chromosomal SNP level

77 What are standard filters on the sample level? (Ziegler and Van Steen 2010)

78 What are standard filters on the SNP level? (Ziegler and Van Steen 2010)

79 Reading genotype calling cluster plots

80 What are standard filters on the gender level? (Ziegler and Van Steen 2010)

81 When genotypes are missing: Is there a power advantage in imputing?

82 Is there a power advantage in imputing?

83 Is there a power advantage in imputing?

84 Is there a power advantage in imputing? (Spencer et al 2009)

85 What are the Travemünde criteria? (Ziegler 2009)

86 What are the Travemünde criteria? (Ziegler 2009)

87 3.b Population genetics What does population genetics mean? The concept of frequency lies at the heart of population genetics:

88 What does population genetics mean? A gene by itself is a constant entity (but perhaps harbors mutation) Alternative forms of a gene (allele) can exist at certain frequencies in a population These frequencies can change (via genetic drift, selection, mutation) Resulting in adaptation and evolution There are two major facets: - Describing the pattern of genetic diversity - Investigating the processes that generate this diversity

89 What is it useful for? Breeding - Plant and animal breeding - Pesticide and herbicide resistance - Effects of release of GM organisms People - Forensic analysis (DNA fingerprinting) - Identification of genes for complex human traits - Genetic counseling - Study of human evolution and human origin Ecology and evolution - Inferring evolutionary processes - Preservation of endangered species

90 Theoretical population genetics General theoretical models predict evolution of gene frequencies and other things Highly dependent upon assumptions May or may not be realistic Mathematically satisfying h r ti iri ri t

91 Empirical population genetics Apply statistical models to real data to infer underlying processes Again, adequate sampling is necessary to achieve statistical power Empirical population genetics is often emphasized due to the enormous volumes of genetic data that is out there. h r ti iri ri t

92 Experimental population genetics Test hypotheses in population genetics using controlled experiments Design such that alternative outcomes possible, some of which can reject hypothesis Need controls, replicates and adequate sample size Usually restricted to model organisms - Drosophila, Neurospora, some crop plants h r ti iri ri t

93

94 How does evolution take place? Charles Darwin ( ) In the 19th century, a man called Charles Darwin, a biologist from England, set off on the ship HMS Beagle to investigate species in exotic places (e.g., Galapagos islands). After spending time on the islands, he soon developed a theory that would contradict the creation of man and imply that all species derived from common ancestors through a process called natural selection. Natural selection is considered to be the biggest factor resulting in the diversity of species and their genomes.

95 Darwin s principles One of the prime motives for all species is to reproduce and survive, passing on the genetic information of the species from generation to generation. When species do this they tend to produce more offspring than the environment can support. The lack of resources to nourish these individuals places pressure on the size of the species population, and the lack of resources means increased competition and as a consequence, some organisms will not survive. The organisms that die as a consequence of this competition were not totally random. Darwin found that those organisms more suited to their environment were more likely to survive. This resulted in the well-known phrase survival of the fittest, where the organisms most suited to their environment had more chance of survival if the species falls upon hard times. (see also (

96 Darwin s tree of life Those organisms who are better suited to their environment exhibit desirable characteristics, which is a consequence of their genome being more suitable to begin with. The Tree of Life image that appeared in Darwin s On the Origin of Species by Natural Selection, It was the book's only illustration

97 A group at the European Molecular Biology Laboratory (EMBL) in Heidelberg has developed a computational method that resolves many of the remaining open questions about evolution and has produced what is likely the most accurate tree of life ever:

98 What are the basic characteristics of evolution? Some terminology: A population refers to a group of interbreeding individuals of the same species sharing a common geographical area A species is a group of populations that have the potential to interbreed in nature and produce viable offspring Gene pool is the sum total of all the alleles within a population Evolution refers to the changes in gene (allele) frequencies in a population over time; Evolution takes place at the population, not species level. So populations, not species evolve

99 There are 4 processes of evolution Natural variation Mutation: refers to changes in nucleotide sequences of DNA. Mutations provide new alleles, and therefore are the ultimate source of variation Recombination: refers to reshuffling of the genetic material during meiosis Other variation types Natural selection Reproductive isolation

100 What is natural selection? Natural selection refers to differential reproduction. Organisms with more advantageous gene combinations secure more resources, allowing them to leave more progeny. It is a negative force, nature selects against, not for. Ultimately natural selection leads to adaptation - the accumulation of structural, physiological or behavioral traits that increase an organism's fitness.

101 There are 3 types of natural selection Stabilizing Directional Disruptive Stabilizing Selection maintains an already well adapted condition by eliminating any marked deviations from it. As long as the environment remains unchanged the fittest organisms will also remain unchanged. - Human birth weight averages about seven pounds. Very light or very heavy babies have lower chances of survival. - Stabilizing selection accounts for "living fossils" - organisms that have remained seemingly unchanged for millions of years.

102 Directional Selection favors one extreme form over others. Eventually it produces a change in the population. Directional selection occurs when an organism must adapt to changing conditions. - Industrial melanism: The peppered moths (Biston betularia) fly by night and rest during the day on lichen covered tree trunks where they are preyed upon by birds. Prior to the industrial revolution most of the moths were light colored and well camouflaged. A few dark (melanistic) were occasionally noted. During the industrial revolution soot began to blacken the trees and also cause the death of the lichens. The light colored moths were no longer camouflaged so their numbers decreased quite rapidly. With the blackening of the trees the numbers of dark moths rapidly increased. The frequency of the dark allele increased from less than 1% to over 98% in just 50 generations. Since the 1950's, attempts to reduce industrial pollution in Britain have resulted in an increase in numbers of light form. - Antibiotic resistance in bacteria several resistant strains exist due to overuse

103 Disruptive Selection occurs when two or more character states are favored. - African butterflies (Pseudacraea eurytus) range from orange to blue. Both the orange and blue forms mimic (look like) other foul tasting species (models) so they are rarely eaten. Natural selection eliminates the intermediate forms because they don't look like the models.

104 What is reproductive isolation? Reproductive isolation is the inability of a species to breed successfully with related species due to geographical, behavioral, physiological, or genetic barriers or differences Reproductive isolating mechanisms may restrict the gene flow: - Ecological isolation: The geographic ranges of two species overlap, but their ecological needs or breeding requirements differ enough to cause reproductive isolation - Temporal isolation: when two species whose ranges overlap have different periods of sexual activity or breeding seasons - Behavioral isolation: Species with complex courtship rituals (breeding calls, mating dances, etc.) usually exhibit a stereotyped "give-and-take" between male and female before actual mating takes place. - Mechanical isolation: Morphological differences prevent mating (examples: see

105 What is Hardy-Weinberg Equilibrium (HWE)? The concept HWE arises from the following problem: How do allele frequencies relate to genotype frequencies in a population? If we have genotype frequencies, we can easily compute allele frequencies: -

106 Example on cystic fibrosis

107 Example on cystic fibrosis (continued)

108 From allele frequencies to genotype frequencies Discrete generations No migration or selection No mutation No population stratification Equal initial genotype frequencies in the two sexes Infinite population size Random mating

109 From allele frequencies to genotype frequencies Early in the 20th century biologists believed that natural selection would eventually result in the dominant alleles driving out or eliminating the recessives. Therefore, over a period of time genetic variation would eventually be eliminated in a population At some point the geneticist Punnett was asked to explain the prevalence of blue eyes in humans despite the fact that it is recessive to brown. He couldn't do it so he asked a mathematician colleague named Hardy to explain it. A physician named Weinberg came up with a similar explanation, describing the genetics of non-evolving populations The Hardy-Weinberg law was born: The frequencies of alleles in a population will remain constant unless acted upon by outside agents or forces. A non-evolving population is said to be in Hardy-Weinberg equilibrium

110 What does E stand for in HWE? Equilibrium

111 What does E stand for in HWE?

112 What does E stand for in HWE?

113 What does E stand for in HWE?

114 Computations under HWE are simple Punnett s square

115 Can HWE be visualized? A useful way to look at frequencies in HWE is the following

116 What are signs of deviations from HWE? Decreased or increased HET

117 What are signs of deviations from HWE? Inbreeding

118 Inbreeding coefficient F for an individual X The probability of 2 alleles at a randomly chosen locus being IBD can be computed via the probabilities Hence, F X =4 1/16=1/4

119 Inbreeding coefficient F for an individual X The shortcut loop method determines for a loop (path through common ancestor) a contribution of (½) n to F, where n is the number of individuals in the loop, X excluded Hence, for more than one loop, determine each time (½) n, and sum the contributions to F In the example, the loops are: - ADXC: (½) 3 - BDXC: (½) 3 leading to F X = (½) 3 + (½) 3 = ¼

120 Change in genotype frequencies in response to inbreeding Inbreeding increases expression of recessive alleles: - Genotype frequencies for non-inbred: p 2, 2pq, q 2 - Genotype frequencies for inbred: p 2 +Fpq, 2pq-2Fpq, q 2 +Fpq For instance, if q=0.02, then

121 Change in genotype frequencies in response to inbreeding (continued) For instance, if p=q=0.5 Observe that the allele frequencies f(a)=q and f(a)=p do not change: - f(a) = (q 2 + pqf) + ½ (2pq-2pqF) = q 2 +pq=q(q+p)=q - f(a) = (p 2 + pqf) + ½ (2pq-2pqF) = p 2 +pq=p(p+q)=p

122 Are there other genetic similarity coefficients? F IT is the inbreeding coefficient of an individual (I) relative to the total (T) population; F IS is the inbreeding coefficient of an individual (I) relative to the subpopulation (S) and F ST is the effect of subpopulations (S) compared to the total population (T)

123 Population subdivision

124 Wright s F statistics

125 What are generalized Hardy-Weinberg deviations?

126 Generalized Hardy-Weinberg deviations - example

127 Generalized Hardy-Weinberg deviations - example

128 Generalized Hardy-Weinberg deviations example

129 How can F ST for subpopulations be estimated?

130 How can F ST for subpopulations be estimated?

131 How can F ST for subpopulations be estimated?

132 How can F ST for subpopulations be estimated?

133 How can F ST for subpopulations be estimated?

134 How can HWE be tested?

135 Chi-square goodness-of-fit test

136 Chi-square goodness-of-fit test

137 HWE exact test

138 HWE exact test

139 HWE exact test

140 How do both HWE test compare? Type I error

141 How do both HWE test compare? Type I error

142 How do both HWE test compare?

143 How to interpret HWE test results? A useful tool for interpreting the results of HWE and other tests on many SNPs is the log quantile quantile (QQ) p-value plot: - the negative logarithm of the i-th smallest p-value is plotted against log (i / (L + 1)), where L is the number of SNPs. The 0.3 (or 30%) quantile is the point at which 30% percent of the data fall below and 70% fall above that value. A 45-degree reference line is also plotted as visualization tool: - If the two sets come from a population with the same distribution, the points should fall approximately along this reference line. - The greater the departure from this reference line, the greater the evidence for the conclusion that the two data sets have come from populations with different distributions.

144 How to interpret HWE test results? (Balding 2006)

145 Genetic Association Studies 3.c Linkage disequilibrium, haplotypes and SNP tagging Distances among cities Boston New York Providence Philadelphia Baltimore Providence 59 New York Philadelphia Baltimore Washington

146 Distances among cities

147 Distances among SNPs If a causal polymorphism is not genotyped, we can still hope to detect its effects through Linkage Disequilibrium (LD) with polymorphisms that are typed (key principle behind doing genetic association analysis ). LD is a measure of co-segregation of alleles in a population: Two alleles at different loci that occur together on the same chromosome (or gamete) more often than would be predicted by random chance. In general, LD is taken to be a measure of allelic association. Among the measures that have been proposed for two-locus haplotype data, the two most important are D (Lewontin s D prime) and r 2 (the square correlation coefficient between the two loci under study).

148 Distances among SNPs (Christensen and Murray 2007)

149 Distances among SNPs (Samani et al 2007)) These plots can be generated using the free software Haploview, but also in R! We talk about LD blocks 1 and 2

150 What is linkage disequilibrium (LD)? - more formal definition

151 What does D represent as a measure of LD?

152 What does D represent as a measure of LD? Can similarly show that D Ab = -D AB and D ab = D AB LD is a property of two loci, not their alleles The magnitude of D does not depend on the choice of alleles, thus the magnitude of the coefficient is important, not the sign The range of values the linkage disequilibrium coefficient can take on, varies with allele frequencies

153 What does D represent as a measure of LD?

154 How can D be estimated?

155 How can D be estimated?

156 How to perform a test for LD with D?

157 What does D (Lewontin s D prime ) represent as a measure of LD?

158 How far does linkage disequilibrium extend? (Hecker et al 2003)

159 What does r 2 represent as a measure of LD?

160 How to perform a test for LD with r 2? r 2

161 How to choose between measures of LD? Sample size must be increased by a factor of 1/r 2 to detect an unmeasured variant, compared with the sample size for testing the variant itself. Hence, r 2 has a very nice and useful interpretation! (Jorgenson and Witte 2006) If two loci both have very rare alleles but the loci are not in high LD, it is possible for D to be 1 and r 2 to be small. D is problematic to interpret with rarer alleles.

162 Why does LD occur? Genetic drift: random events due to small/finite population size. E.g., consider a population of 1 million almond trees with a frequency of an allele r at 10%. If a severe ice storm wiped out half, leaving 500,000, it is very likely that the r allele would still be present in the population. However, suppose the initial population size of almond trees were 10 (with the same frequency of r at 10%). It is likely that the same ice storm could wipe the r allele out of the small population. Scenario 1: Intense natural selection or a disaster can cause a population bottleneck, a severe reduction in population size which reduces the diversity of a population. The survivors have very little genetic variability and little chance to adapt if the environment changes. E.g., by the 1890's the population of northern elephant seals was reduced to only 20

163 individuals by hunters. Even though the population has increased to over 30,000 there is no genetic variation in the 24 alleles sampled. A single allele has been fixed by genetic drift and the bottleneck effect. In contrast southern elephant seals have wide genetic variation since their numbers have never reduced by such hunting. Scenario 2: Bottleneck effect, combined with inbreeding, is an especially serious problem for many endangered species because great reductions in their numbers have reduced their genetic variability. This makes them especially vulnerable to changes in their environments and/or diseases. The Cheetah is a prime example.

164 Scenario 3: Sometimes a population bottleneck or migration event can cause a founder effect. A founder effect occurs when a few individuals unrepresentative of the gene pool start a new population. E.g., a recessive allele in homozygous condition causes Dwarfism. In Switzerland the condition occurs in 1 out of 1,000 individuals. Amongst the 12,000 Amish now living in Pennsylvania the condition occurs in 1 out of 14 individuals. All the Amish are descendants of 30 people who migrated from Switzerland in The 30 founder individuals carried a higher than normal percentage of genes for dwarfism. Mutation - if a new mutation occurs in a population then alleles at loci linked with the mutant locus will maintain linkage disequilibrium for many generations. LD lasts longer when linkage is greater (that is, the recombination fraction is much smaller than ½ - very close to 0).

165 Why does LD occur? (continued)

166 How to interpret LD data? The patterns of LD observed in natural populations are the result of a complex interplay between genetic factors and the population's demographic history (Pritchard, 2001). LD is usually a function of distance between the two loci. This is mainly because recombination acts to break down LD in successive generations (Hill, 1966). When a mutation first occurs it is in complete LD with the nearest marker (D' = 1.0). Given enough time and as a function of the distance between the mutation and the marker, LD tends to decay and in complete equilibrium reached D' = 0 value. Thus, it decreases at every generation of random mating unless some process is opposing to the approach to linkage 'equilibrium'.

167 How to interpret LD data? Therefore, the key concept in a (population-based) genetic association study is linkage disequilibrium.

168 How to interpret LD data? It gives the rational for performing genetic association studies Phenotype: The visible or measurable (expressed) characteristic of an organism Trait: Coded phenotype

169 How can one tag SNP serve as proxy for many? (adapted from Manolio 2010)

170 How can one tag SNP serve as proxy for many? (adapted from Manolio 2010)

171 Where is the true causal variant? (Duerr et al 2006)

172 3.d Confounding What is spurious association? Spurious association refers to false positive association results due to not having accounted for population substructure as a confounding factor in the analysis

173 What is spurious association? Typically, there are two characteristics present: - A difference in proportion of individual from two (or more) subpopulation in case and controls - Subpopulations have different allele frequencies at the locus.

174 What are typical methods to deal with population stratification? Methods to deal with spurious associations generated by population structure generally require a number (at least >100) of widely spaced null SNPs that have been genotyped in cases and controls in addition to the candidate SNPs. These methods large group into: o Genomic control methods o Structured association methdos o Principal component-based methods

175 What is genomic control? In Genomic Control (GC), a 1-df association test statistic is computed at each of the null SNPs, and a parameter λ is calculated as the empirical median divided by its expectation under the chi-squared 1-df distribution. Then the association test is applied at the candidate SNPs, and if λ > 1 the test statistics are divided by λ.

176 What is genomic control? The motivation for GC is that, as we expect few if any of the null SNPs to be associated with the phenotype, a value of λ > 1 is likely to be due to the effect of population stratification, and dividing by λ cancels this effect for the candidate SNPs. GC performs well under many scenarios, but can be conservative in extreme settings (and anti-conservative if insufficient null SNPs are used). There is an analogous procedure for a general (2 df) test; The method can also be applied to other testing approaches.

177 What is a structured association method? Structured association (SA) approaches are based on the idea of attributing the genomes of study individuals to hypothetical subpopulations, and testing for association that is conditional on this subpopulation allocation. Several clustering algorithms exist to estimate the number of subpopulations. These approaches (such as Bayesian clustering approaches) are computationally demanding, and because the notion of subpopulation is a theoretical construct that only imperfectly reflects reality, the question of the correct number of subpopulations can never be fully resolved.

178 What is principal components analysis? When many null markers are available, principal components analysis provides a fast and effective way to diagnose population structure. Principal components are linear combinations of the original variables (here SNPs) that optimized in such a way that as much of the variation in the data is retained.

179 In European data, the first 2 principal components nicely reflect the N-S and E-W axes! Y-axis: PC2 (6% of variance); X-axis: PC1 (26% of variance)

180

181 Does the same hold on a global (world) level? (Paschau 2007)

182 4 Tests of association What is the causal model underlying genetic association? (Ziegler and Van Steen 2010)

183 4.a Single SNP What are common association tests (dichotomous traits)? (Ziegler and Van Steen 2010)

184 What are common association tests (dichotomous traits)? (Ziegler and Van Steen 2010)

185 What are common association tests (dichotomous traits)? Penetrances for simple Mendelian inheritance patterns Trait T: coded phenotype Penetrance: P(T Genotype) Complete penetrance: P(T DD) = 1 (simplified definition)

186 What are common association tests (dichotomous traits)?

187 What are common association tests (dichotomous traits)? The Cochran-Armitage trend test measures a linear trend in proportions weighted by general measure of exposure dosage: variable x in regression model =#alleles Max test: computes maximum over standardized tests for different genetic models, providing a global test

188 Which test should be used in applications?

189 How are genetic effects measured? RR being

190 Which odds ratios (measures of effect) can we expect? (A and B) Histograms of susceptibility allele frequency and MAF, respectively, at confirmed susceptibility loci.. (Iles 2008)

191 Which odds ratios (measures of effect) can we expect? (C) Histogram of estimated ORs (estimate of genetic effect size) at confirmed susceptibility loci. (Iles 2008)

192 4.b Repeated single SNP tests The regression framework Regression analysis is used for explaining or modeling the relationship between a single variable Y, called the response, output or dependent variable, and one or more predictor, input, independent or explanatory variables, X 1,, X m. When m=1 it is called simple regression but when m > 1 it is called multiple regression. When there is more than one Y, then it is called multivariate regression The basic syntax for doing regression in R is lm(y~model) to fit linear models and glm() to fit generalized linear models (e.g. logistic regression models in the dichotomous trait setting before). Next slide: syntax!

193

194

195

196

197

198

199

200

201

202 Can screening for 1000nds of SNPs be performed automatically in R? GenAbel is designed for the efficient storage and handling of GWAS data with fast analysis tools for quality control, association with binary and quantitative traits, as well as tools for visualizing results. pbatr provides a GUI to the powerful PBAT software which performs family and population based family and population based studies. The software has been implemented to take advantage of parallel processing, which vastly reduces the computational time required for GWAS. SNPassoc provides another package for carrying out GWAS analysis. It offers descriptive statistics of the data (including patterns of missing data!) and tests for Hardy-Weinberg equilibrium. Single-point analyses with binary or quantitative traits are implemented via generalized linear models, and multiple SNPs can be analyzed for haplotypic associations or epistasis.

203 Is there one tool that fits it all? NO (

204 Other analytic methods Recursive Partitioning (CART; Breiman 1984, Foulkes 2005) Random Forests (Pavolov 1997) Combinatorial Partitioning (Nelson 2001) Multifactor-Dimensionality Reduction (Ritchie 2001) Permutation-Based Procedures (Trimming/Weighting; Hoh 2000) Multivariate Adaptive Regression Splines (Friedman 1991) Boosting (Schapire 1990) Support Vector Machines (Vapnik 2000) Neural Networks (Friedman & Tukey 1974, Friedman & Stuetzle 1981) Bayesian Pathway Modeling (Conti 2003, Cortessis & Thomas 2004) Clique-Finding (Mushlin 2006)

205 What is a multiple testing correction? Simultaneously test m null hypotheses, one for each SNP j H 0j : no association between SNP j and the trait Every statistical test comes with an inherent false positive, or type I error rate which is equal to the threshold set for statistical significance, generally However, this is just the error rate for one test. When more than one test is run, the overall type I error rate is much greater than 5%.

206 What is a multiple testing correction? Suppose 100 statistical tests are run when (1) there are no real effects and (2) these tests are independent, then the probability that no false positives occur in 100 tests is = So the probability that at least one false positive occurs is =0.994 or 99.4% There is not a single measure to quantify false positives (Hochberg et al 1987). Several multiple testing corrections have been developed and curtailed to a genome-wide association context, when deemed necessary: Bonferroni (highly conservative) [divide each single SNPbased p-value by the nr of tests before comparing to the nominal sign level 0.05] vs permutation-based (highly computational demanding) [keep the LD structure, but swap the trait labels among the subjects]

207 4.c Replication May J. Hirschhorn and D. Altshuler J Clin Endo Metab Am J Hum Genet July Am J Hum Genet July PLoS Biol Sept Nat Genet July 2006

208 What does replication mean? Replicating the genotype-phenotype association is the gold standard for proving an association is genuine Most loci underlying complex diseases will not be of large effect. It is unlikely that a single study will unequivocally establish an association without the need for replication SNPs most likely to replicate: - Showing modest to strong statistical significance - Having common minor allele frequency - Exhibiting modest to strong genetic effect size Note: Multi-stage design analysis results should not be seen as evidence for replication...

209 Guidelines for replication studies Replication studies should be of sufficient size to demonstrate the effect Replication studies should conducted in independent datasets Replication should involve the same phenotype Replication should be conducted in a similar population The same SNP should be tested The replicated signal should be in the same direction Joint analysis should lead to a lower p-value than the original report Well-designed negative studies are valuable check the NHGRI Catalog of GWA studies

210 What does validation mean? (Igl et al. 2009)

211 5 Interpretation: functional genomics Are there criteria for assessing the functional significance of a variant? (Rebbeck et al 2004) Criterion Nucleotide Sequence Evolutionary Conservation Population Genetics Experimental Exposures Strong Support V ri t disru ts fu ti k w tif Str g s rv ti r ss s i s, u tig f i y Str g d vi ti s fr t d fr qu i s C sist t vid i hu t rg t tissu V ri t ff ts r v t t b is i t rg t tissu Moderate Support S iss s h g, disru ts ut tiv fu ti tif s rv ti r ss s i s r u tig f i y Some deviations from expected frequencies Neutral Information - Evidence Against N -fu ti h g N t k w N s rv ti N t k w S vid N d t v i b Variant affects metabolism N d t v i b N d vi ti s fr t d fr qu i s N fu ti ff t Variant does not affect metabolism Epidemiology C sist t r d r r du ib rts R r rts with ut i ti N d t v i b N ss i ti

212 ( 2016)

213 (ENCODE: finding all functional elements in the human genome)

214 The more we find, the more we see, the more we come to learn. The more that we explore, the more we shall return. Sir Tim Rice, Aida, 2000

215 A selection of consulted references: Balding D A tutorial on statistical methods for population association studies. Nature Reviews Genetics, 7, Kruglyak L The road to genomewide association studies. Nature Reviews Genetics 9: 314- Li Three lectures on case-control genetic association analysis. Briefings in bioinformatics 9: Peltonen L and McKusick VA Dissecting human disease in the postgenomic era. Science 291, Rebbeck et al Assessing the function of genetic variants in candidate gene association studies 5: 589- Robinson Common Disease, Multiple Rare (and Distant) Variants. PLoS Biology 8(1): e URL: courses.washington.edu/b516 Wang et al Genome-wide association studies: theoretical and practical concerns. Nature Reviews Genetics 6: 109- Ziegler A and Van Steen K 2010: IBS short course on Genome-Wide Association Studies

INTRODUCTION TO GENETIC EPIDEMIOLOGY (1012GENEP1) Prof. Dr. Dr. K. Van Steen

INTRODUCTION TO GENETIC EPIDEMIOLOGY (1012GENEP1) Prof. Dr. Dr. K. Van Steen INTRODUCTION TO GENETIC EPIDEMIOLOGY (1012GENEP1) Prof. Dr. Dr. K. Van Steen GENOME-WIDE ASSOCIATION STUDIES 1 Setting the pace 1.a A hype about GWA studies 1.b Genetic terminology revisited 1.c Genetic

More information

B I O I N F O R M A T I C S

B I O I N F O R M A T I C S Bioinformatics LECTURE 3-16 B I O I N F O R M A T I C S Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Bioinformatics LECTURE

More information

Genetics and Bioinformatics

Genetics and Bioinformatics Genetics and Bioinformatics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Lecture 3: Genome-wide Association Studies 1 Setting

More information

Chapter 25 Population Genetics

Chapter 25 Population Genetics Chapter 25 Population Genetics Population Genetics -- the discipline within evolutionary biology that studies changes in allele frequencies. Population -- a group of individuals from the same species that

More information

Understanding genetic association studies. Peter Kamerman

Understanding genetic association studies. Peter Kamerman Understanding genetic association studies Peter Kamerman Outline CONCEPTS UNDERLYING GENETIC ASSOCIATION STUDIES Genetic concepts: - Underlying principals - Genetic variants - Linkage disequilibrium -

More information

How Populations Evolve. Chapter 15

How Populations Evolve. Chapter 15 How Populations Evolve Chapter 15 Populations Evolve Biological evolution does not change individuals It changes a population Traits in a population vary among individuals Evolution is change in frequency

More information

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 Topics Genetic variation Population structure Linkage disequilibrium Natural disease variants Genome Wide Association Studies Gene

More information

Evolution in a Genetic Context

Evolution in a Genetic Context Evolution in a Genetic Context What is evolution? Evolution is the process of change over time. In terms of genetics and evolution, our knowledge of DNA and phenotypic expression allow us to understand

More information

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool.

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population Genetic variation in a population increases the chance that some individuals

More information

Population Genetics. Chapter 16

Population Genetics. Chapter 16 Population Genetics Chapter 16 Populations and Gene Pools Evolution is the change of genetic composition of populations over time. Microevolution is change within species which can occur over dozens of

More information

5/2/ Genes and Variation. How Common Is Genetic Variation? Variation and Gene Pools

5/2/ Genes and Variation. How Common Is Genetic Variation? Variation and Gene Pools 16-1 Genes 16-1 and Variation Genes and Variation 1 of 24 How Common Is Genetic Variation? How Common Is Genetic Variation? Many genes have at least two forms, or alleles. All organisms have genetic variation

More information

Section KEY CONCEPT A population shares a common gene pool.

Section KEY CONCEPT A population shares a common gene pool. Section 11.1 KEY CONCEPT A population shares a common gene pool. Genetic variation in a population increases the chance that some individuals will survive. Why it s beneficial: Genetic variation leads

More information

Zoology Evolution and Gene Frequencies

Zoology Evolution and Gene Frequencies Zoology Evolution and Gene Frequencies I. any change in the frequency of alleles (and resulting phenotypes) in a population. A. Individuals show genetic variation, but express the genes they have inherited.

More information

Why do we need statistics to study genetics and evolution?

Why do we need statistics to study genetics and evolution? Why do we need statistics to study genetics and evolution? 1. Mapping traits to the genome [Linkage maps (incl. QTLs), LOD] 2. Quantifying genetic basis of complex traits [Concordance, heritability] 3.

More information

Population Genetics (Learning Objectives)

Population Genetics (Learning Objectives) Population Genetics (Learning Objectives) Recognize the quantitative nature of the study of population genetics and its connection to the study of genetics and its applications. Define the terms population,

More information

Population Genetics (Learning Objectives)

Population Genetics (Learning Objectives) Population Genetics (Learning Objectives) Recognize the quantitative nature of the study of population genetics and its connection to the study of genetics and its applications. Define the terms population,

More information

The Evolution of Populations

The Evolution of Populations Chapter 23 The Evolution of Populations PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations Topics How to track evolution allele frequencies Hardy Weinberg principle applications Requirements for genetic equilibrium Types of natural selection Population genetic polymorphism in populations, pp.

More information

Association studies (Linkage disequilibrium)

Association studies (Linkage disequilibrium) Positional cloning: statistical approaches to gene mapping, i.e. locating genes on the genome Linkage analysis Association studies (Linkage disequilibrium) Linkage analysis Uses a genetic marker map (a

More information

Quiz will begin at 10:00 am. Please Sign In

Quiz will begin at 10:00 am. Please Sign In Quiz will begin at 10:00 am Please Sign In You have 15 minutes to complete the quiz Put all your belongings away, including phones Put your name and date on the top of the page Circle your answer clearly

More information

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool.

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population! Genetic variation in a population increases the chance that some individuals

More information

CHAPTER 12 MECHANISMS OF EVOLUTION

CHAPTER 12 MECHANISMS OF EVOLUTION CHAPTER 12 MECHANISMS OF EVOLUTION 12.1 Genetic Variation DNA biological code for inheritable traits GENES units of DNA molecule in a chromosome LOCI location of specific gene on DNA molecules DIPLOID

More information

Questions we are addressing. Hardy-Weinberg Theorem

Questions we are addressing. Hardy-Weinberg Theorem Factors causing genotype frequency changes or evolutionary principles Selection = variation in fitness; heritable Mutation = change in DNA of genes Migration = movement of genes across populations Vectors

More information

Evolution of Populations (Ch. 17)

Evolution of Populations (Ch. 17) Evolution of Populations (Ch. 17) Doonesbury - Sunday February 8, 2004 Beak depth of Beak depth Where does Variation come from? Mutation Wet year random changes to DNA errors in gamete production Dry year

More information

Papers for 11 September

Papers for 11 September Papers for 11 September v Kreitman M (1983) Nucleotide polymorphism at the alcohol-dehydrogenase locus of Drosophila melanogaster. Nature 304, 412-417. v Hishimoto et al. (2010) Alcohol and aldehyde dehydrogenase

More information

THE EVOLUTION OF DARWIN S THEORY PT 1. Chapter 16-17

THE EVOLUTION OF DARWIN S THEORY PT 1. Chapter 16-17 THE EVOLUTION OF DARWIN S THEORY PT 1 Chapter 16-17 From Darwin to Today Darwin provided compelling evidence that species and populations change. What he didn t know (and neither did anyone else at the

More information

Population Genetics (Learning Objectives)

Population Genetics (Learning Objectives) Population Genetics (Learning Objectives) Define the terms population, species, allelic and genotypic frequencies, gene pool, and fixed allele, genetic drift, bottle-neck effect, founder effect. Explain

More information

Variation Chapter 9 10/6/2014. Some terms. Variation in phenotype can be due to genes AND environment: Is variation genetic, environmental, or both?

Variation Chapter 9 10/6/2014. Some terms. Variation in phenotype can be due to genes AND environment: Is variation genetic, environmental, or both? Frequency 10/6/2014 Variation Chapter 9 Some terms Genotype Allele form of a gene, distinguished by effect on phenotype Haplotype form of a gene, distinguished by DNA sequence Gene copy number of copies

More information

Population Genetics Modern Synthesis Theory The Hardy-Weinberg Theorem Assumptions of the H-W Theorem

Population Genetics Modern Synthesis Theory The Hardy-Weinberg Theorem Assumptions of the H-W Theorem Population Genetics A Population is: a group of same species organisms living in an area An allele is: one of a number of alternative forms of the same gene that may occur at a given site on a chromosome.

More information

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros DNA Collection Data Quality Control Suzanne M. Leal Baylor College of Medicine sleal@bcm.edu Copyrighted S.M. Leal 2016 Blood samples For unlimited supply of DNA Transformed cell lines Buccal Swabs Small

More information

The Evolution of Populations

The Evolution of Populations Chapter 23 The Evolution of Populations PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

Algorithms for Genetics: Introduction, and sources of variation

Algorithms for Genetics: Introduction, and sources of variation Algorithms for Genetics: Introduction, and sources of variation Scribe: David Dean Instructor: Vineet Bafna 1 Terms Genotype: the genetic makeup of an individual. For example, we may refer to an individual

More information

Computational Workflows for Genome-Wide Association Study: I

Computational Workflows for Genome-Wide Association Study: I Computational Workflows for Genome-Wide Association Study: I Department of Computer Science Brown University, Providence sorin@cs.brown.edu October 16, 2014 Outline 1 Outline 2 3 Monogenic Mendelian Diseases

More information

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011 EPIB 668 Genetic association studies Aurélie LABBE - Winter 2011 1 / 71 OUTLINE Linkage vs association Linkage disequilibrium Case control studies Family-based association 2 / 71 RECAP ON GENETIC VARIANTS

More information

The Evolution of Populations

The Evolution of Populations The Evolution of Populations What you need to know How and reproduction each produce genetic. The conditions for equilibrium. How to use the Hardy-Weinberg equation to calculate allelic and to test whether

More information

b. (3 points) The expected frequencies of each blood type in the deme if mating is random with respect to variation at this locus.

b. (3 points) The expected frequencies of each blood type in the deme if mating is random with respect to variation at this locus. NAME EXAM# 1 1. (15 points) Next to each unnumbered item in the left column place the number from the right column/bottom that best corresponds: 10 additive genetic variance 1) a hermaphroditic adult develops

More information

REVIEW 5: EVOLUTION UNIT. A. Top 10 If you learned anything from this unit, you should have learned:

REVIEW 5: EVOLUTION UNIT. A. Top 10 If you learned anything from this unit, you should have learned: Period Date REVIEW 5: EVOLUTION UNIT A. Top 10 If you learned anything from this unit, you should have learned: 1. Darwin s Principle of Natural Selection a. Variation individuals within a population possess

More information

Population and Community Dynamics. The Hardy-Weinberg Principle

Population and Community Dynamics. The Hardy-Weinberg Principle Population and Community Dynamics The Hardy-Weinberg Principle Key Terms Population: same species, same place, same time Gene: unit of heredity. Controls the expression of a trait. Can be passed to offspring.

More information

The Evolution of Populations

The Evolution of Populations LECTURE PRESENTATIONS For CAMPBELL BIOLOGY, NINTH EDITION Jane B. Reece, Lisa A. Urry, Michael L. Cain, Steven A. Wasserman, Peter V. Minorsky, Robert B. Jackson Chapter 23 The Evolution of Populations

More information

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section A: Population Genetics

CHAPTER 23 THE EVOLUTIONS OF POPULATIONS. Section A: Population Genetics CHAPTER 23 THE EVOLUTIONS OF POPULATIONS Section A: Population Genetics 1. The modern evolutionary synthesis integrated Darwinian selection and Mendelian inheritance 2. A population s gene pool is defined

More information

Population Genetics. If we closely examine the individuals of a population, there is almost always PHENOTYPIC

Population Genetics. If we closely examine the individuals of a population, there is almost always PHENOTYPIC 1 Population Genetics How Much Genetic Variation exists in Natural Populations? Phenotypic Variation If we closely examine the individuals of a population, there is almost always PHENOTYPIC VARIATION -

More information

Module 20: Population Genetics, Student Learning Guide

Module 20: Population Genetics, Student Learning Guide Name: Period: Date: Module 20: Population Genetics, Student Learning Guide Instructions: 1. Work in pairs (share a computer). 2. Make sure that you log in for the first quiz so that you get credit. 3.

More information

MICROEVOLUTION. On the Origin of Species WHAT IS A SPECIES? WHAT IS A POPULATION? Genetic variation: how do new forms arise?

MICROEVOLUTION. On the Origin of Species WHAT IS A SPECIES? WHAT IS A POPULATION? Genetic variation: how do new forms arise? MICROEVOLUTION On the Origin of Species WHAT IS A SPECIES? Individuals in one or more populations Potential to interbreed Produce fertile offspring WHAT IS A POPULATION? Group of interacting individuals

More information

By the end of this lecture you should be able to explain: Some of the principles underlying the statistical analysis of QTLs

By the end of this lecture you should be able to explain: Some of the principles underlying the statistical analysis of QTLs (3) QTL and GWAS methods By the end of this lecture you should be able to explain: Some of the principles underlying the statistical analysis of QTLs Under what conditions particular methods are suitable

More information

Module 20: Population Genetics, Student Learning Guide

Module 20: Population Genetics, Student Learning Guide Name: Period: Date: Module 20: Population Genetics, Student Learning Guide Instructions: 1. Work in pairs (share a computer). 2. Make sure that you log in for the first quiz so that you get credit. 3.

More information

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012 Lecture 23: Causes and Consequences of Linkage Disequilibrium November 16, 2012 Last Time Signatures of selection based on synonymous and nonsynonymous substitutions Multiple loci and independent segregation

More information

University of York Department of Biology B. Sc Stage 2 Degree Examinations

University of York Department of Biology B. Sc Stage 2 Degree Examinations Examination Candidate Number: Desk Number: University of York Department of Biology B. Sc Stage 2 Degree Examinations 2016-17 Evolutionary and Population Genetics Time allowed: 1 hour and 30 minutes Total

More information

PopGen1: Introduction to population genetics

PopGen1: Introduction to population genetics PopGen1: Introduction to population genetics Introduction MICROEVOLUTION is the term used to describe the dynamics of evolutionary change in populations and species over time. The discipline devoted to

More information

BIOLOGY 3201 UNIT 4 EVOLUTION CH MECHANISMS OF EVOLUTION

BIOLOGY 3201 UNIT 4 EVOLUTION CH MECHANISMS OF EVOLUTION BIOLOGY 3201 UNIT 4 EVOLUTION CH. 20 - MECHANISMS OF EVOLUTION POPULATION GENETICS AND HARDY WEINBERG PRINCIPLE Population genetics: this is a study of the genes in a population and how they may or may

More information

Population and Statistical Genetics including Hardy-Weinberg Equilibrium (HWE) and Genetic Drift

Population and Statistical Genetics including Hardy-Weinberg Equilibrium (HWE) and Genetic Drift Population and Statistical Genetics including Hardy-Weinberg Equilibrium (HWE) and Genetic Drift Heather J. Cordell Professor of Statistical Genetics Institute of Genetic Medicine Newcastle University,

More information

The Evolution of Populations

The Evolution of Populations Microevolution The Evolution of Populations C H A P T E R 2 3 Change in allele frequencies over generations Three mechanisms cause allele frequency change: Natural selection (leads to adaptation) Genetic

More information

Summary Genes and Variation Evolution as Genetic Change. Name Class Date

Summary Genes and Variation Evolution as Genetic Change. Name Class Date Chapter 16 Summary Evolution of Populations 16 1 Genes and Variation Darwin s original ideas can now be understood in genetic terms. Beginning with variation, we now know that traits are controlled by

More information

Edexcel (B) Biology A-level

Edexcel (B) Biology A-level Edexcel (B) Biology A-level Topic 8: Origins of Genetic Variation Notes Meiosis is reduction division. The main role of meiosis is production of haploid gametes as cells produced by meiosis have half the

More information

Chapter 16: How Populations Evolve

Chapter 16: How Populations Evolve Chapter 16: How Populations Evolve AP Curriculum Alignment Evolution is a change in the genetic makeup of a population over time, with natural selection its major driving mechanism. This is a major component

More information

-Is change in the allele frequencies of a population over generations -This is evolution on its smallest scale

-Is change in the allele frequencies of a population over generations -This is evolution on its smallest scale Remember: -Evolution is a change in species over time -Heritable variations exist within a population -These variations can result in differential reproductive success -Over generations this can result

More information

Chapter 23: The Evolution of Populations. 1. Populations & Gene Pools. Populations & Gene Pools 12/2/ Populations and Gene Pools

Chapter 23: The Evolution of Populations. 1. Populations & Gene Pools. Populations & Gene Pools 12/2/ Populations and Gene Pools Chapter 23: The Evolution of Populations 1. Populations and Gene Pools 2. Hardy-Weinberg Equilibrium 3. A Closer Look at Natural Selection 1. Populations & Gene Pools Chapter Reading pp. 481-484, 488-491

More information

Introduction Chapter 23 - EVOLUTION of

Introduction Chapter 23 - EVOLUTION of Introduction Chapter 23 - EVOLUTION of POPULATIONS The blue-footed booby has adaptations that make it suited to its environment. These include webbed feet, streamlined shape that minimizes friction when

More information

Population- group of individuals of the SAME species that live in the same area Species- a group of similar organisms that can breed and produce

Population- group of individuals of the SAME species that live in the same area Species- a group of similar organisms that can breed and produce Dr. Bertolotti Essential Question: Population- group of individuals of the SAME species that live in the same area Species- a group of similar organisms that can breed and produce FERTILE offspring Allele-

More information

POPULATION GENETICS: The study of the rules governing the maintenance and transmission of genetic variation in natural populations.

POPULATION GENETICS: The study of the rules governing the maintenance and transmission of genetic variation in natural populations. POPULATION GENETICS: The study of the rules governing the maintenance and transmission of genetic variation in natural populations. DARWINIAN EVOLUTION BY NATURAL SELECTION Many more individuals are born

More information

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work Genome 371, 1 March 2010, Lecture 13 Association Mapping Mendelian versus Complex Phenotypes How to Perform an Association Study Why Association Studies (Can) Work Introduction to LOD score analysis Common

More information

Study Guide A. Answer Key. The Evolution of Populations

Study Guide A. Answer Key. The Evolution of Populations The Evolution of Populations Answer Key SECTION 1. GENETIC VARIATION WITHIN POPULATIONS 1. b 2. d 3. gene pool 4. combinations of alleles 5. allele frequencies 6. ratio or percentage 7. mutation 8. recombination

More information

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics Genetic Variation and Genome- Wide Association Studies Keyan Salari, MD/PhD Candidate Department of Genetics How many of you did the readings before class? A. Yes, of course! B. Started, but didn t get

More information

Evolutionary Mechanisms

Evolutionary Mechanisms Evolutionary Mechanisms Tidbits One misconception is that organisms evolve, in the Darwinian sense, during their lifetimes Natural selection acts on individuals, but only populations evolve Genetic variations

More information

Average % If you want to complete quiz corrections for extra credit you must come after school Starting new topic today. Grab your clickers.

Average % If you want to complete quiz corrections for extra credit you must come after school Starting new topic today. Grab your clickers. Average 50.83% If you want to complete quiz corrections for extra credit you must come after school Starting new topic today. Grab your clickers. Evolution AP BIO Pacing Evolution Today Mutations Gene

More information

Linkage Disequilibrium

Linkage Disequilibrium Linkage Disequilibrium Why do we care about linkage disequilibrium? Determines the extent to which association mapping can be used in a species o Long distance LD Mapping at the tens of kilobase level

More information

Introduction to Population Genetics. Spezielle Statistik in der Biomedizin WS 2014/15

Introduction to Population Genetics. Spezielle Statistik in der Biomedizin WS 2014/15 Introduction to Population Genetics Spezielle Statistik in der Biomedizin WS 2014/15 What is population genetics? Describes the genetic structure and variation of populations. Causes Maintenance Changes

More information

LAB 12 Natural Selection INTRODUCTION

LAB 12 Natural Selection INTRODUCTION LAB 12 Natural Selection Objectives 1. Model evolution by natural selection. 2. Determine allele frequencies within a population. 3. Use the Hardy-Weinberg equation to calculate probability of each genotype

More information

CH. 22/23 WARM-UP. 1. List 5 different pieces of evidence for evolution.

CH. 22/23 WARM-UP. 1. List 5 different pieces of evidence for evolution. CH. 22/23 WARM-UP 1. List 5 different pieces of evidence for evolution. 2. (Review) What are the 3 ways that sexual reproduction produces genetic diversity? 3. What is 1 thing you are grateful for today?

More information

GENETICS - CLUTCH CH.21 POPULATION GENETICS.

GENETICS - CLUTCH CH.21 POPULATION GENETICS. !! www.clutchprep.com CONCEPT: HARDY-WEINBERG Hardy-Weinberg is a formula used to measure the frequencies of and genotypes in a population Allelic frequencies are the frequency of alleles in a population

More information

Section A: Population Genetics

Section A: Population Genetics CHAPTER 23 THE EVOLUTIONS OF POPULATIONS Section A: Population Genetics 1. The modern evolutionary synthesis integrated Darwinian selection and Mendelian inheritance 2. A population s gene pool is defined

More information

of heritable factor ). 1. The alternative versions of genes are called alleles. Chapter 9 Patterns of Inheritance

of heritable factor ). 1. The alternative versions of genes are called alleles. Chapter 9 Patterns of Inheritance Chapter 9 Biology and Society: Our Longest-Running Genetic Experiment: Dogs Patterns of Inheritance People have selected and mated dogs with preferred traits for more than 15,000 years. Over thousands

More information

17.1 What Is It That Evolves? Microevolution. Microevolution. Ch. 17 Microevolution. Genes. Population

17.1 What Is It That Evolves? Microevolution. Microevolution. Ch. 17 Microevolution. Genes. Population Ch. 17 Microevolution 17.1 What Is It That Evolves? Microevolution Population Defined as all the members of a single species living in a defined geographical area at a given time A sexually reproducing

More information

All the, including all the different alleles, that are present in a

All the, including all the different alleles, that are present in a Evolution as Genetic Change: chapter 16 Date name A group of individuals of the same species that interbreed. All the, including all the different alleles, that are present in a Relative Allele frequency

More information

Analysis of genome-wide genotype data

Analysis of genome-wide genotype data Analysis of genome-wide genotype data Acknowledgement: Several slides based on a lecture course given by Jonathan Marchini & Chris Spencer, Cape Town 2007 Introduction & definitions - Allele: A version

More information

A Primer of Ecological Genetics

A Primer of Ecological Genetics A Primer of Ecological Genetics Jeffrey K. Conner Michigan State University Daniel L. Hartl Harvard University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts U.S.A. Contents Preface xi Acronyms,

More information

Genome wide association studies. How do we know there is genetics involved in the disease susceptibility?

Genome wide association studies. How do we know there is genetics involved in the disease susceptibility? Outline Genome wide association studies Helga Westerlind, PhD About GWAS/Complex diseases How to GWAS Imputation What is a genome wide association study? Why are we doing them? How do we know there is

More information

SYLLABUS AND SAMPLE QUESTIONS FOR JRF IN BIOLOGICAL ANTHROPOLGY 2011

SYLLABUS AND SAMPLE QUESTIONS FOR JRF IN BIOLOGICAL ANTHROPOLGY 2011 SYLLABUS AND SAMPLE QUESTIONS FOR JRF IN BIOLOGICAL ANTHROPOLGY 2011 SYLLABUS 1. Introduction: Definition and scope; subdivisions of anthropology; application of genetics in anthropology. 2. Human evolution:

More information

Random Allelic Variation

Random Allelic Variation Random Allelic Variation AKA Genetic Drift Genetic Drift a non-adaptive mechanism of evolution (therefore, a theory of evolution) that sometimes operates simultaneously with others, such as natural selection

More information

Genome-Wide Association Studies (GWAS): Computational Them

Genome-Wide Association Studies (GWAS): Computational Them Genome-Wide Association Studies (GWAS): Computational Themes and Caveats October 14, 2014 Many issues in Genomewide Association Studies We show that even for the simplest analysis, there is little consensus

More information

Exam 1, Fall 2012 Grade Summary. Points: Mean 95.3 Median 93 Std. Dev 8.7 Max 116 Min 83 Percentage: Average Grade Distribution:

Exam 1, Fall 2012 Grade Summary. Points: Mean 95.3 Median 93 Std. Dev 8.7 Max 116 Min 83 Percentage: Average Grade Distribution: Exam 1, Fall 2012 Grade Summary Points: Mean 95.3 Median 93 Std. Dev 8.7 Max 116 Min 83 Percentage: Average 79.4 Grade Distribution: Name: BIOL 464/GEN 535 Population Genetics Fall 2012 Test # 1, 09/26/2012

More information

The Modern Synthesis. Terms and Concepts. Evolutionary Processes. I. Introduction: Where do we go from here? What do these things have in common?

The Modern Synthesis. Terms and Concepts. Evolutionary Processes. I. Introduction: Where do we go from here? What do these things have in common? Evolutionary Processes I. Introduction - The modern synthesis Reading: Chap. 25 II. No evolution: Hardy-Weinberg equilibrium A. Population genetics B. Assumptions of H-W III. Causes of microevolution (forces

More information

Constancy of allele frequencies: -HARDY WEINBERG EQUILIBRIUM. Changes in allele frequencies: - NATURAL SELECTION

Constancy of allele frequencies: -HARDY WEINBERG EQUILIBRIUM. Changes in allele frequencies: - NATURAL SELECTION THE ORGANIZATION OF GENETIC DIVERSITY Constancy of allele frequencies: -HARDY WEINBERG EQUILIBRIUM Changes in allele frequencies: - MUTATION and RECOMBINATION - GENETIC DRIFT and POPULATION STRUCTURE -

More information

Genetic Variation. Genetic Variation within Populations. Population Genetics. Darwin s Observations

Genetic Variation. Genetic Variation within Populations. Population Genetics. Darwin s Observations Genetic Variation within Populations Population Genetics Darwin s Observations Genetic Variation Underlying phenotypic variation is genetic variation. The potential for genetic variation in individuals

More information

General aspects of genome-wide association studies

General aspects of genome-wide association studies General aspects of genome-wide association studies Abstract number 20201 Session 04 Correctly reporting statistical genetics results in the genomic era Pekka Uimari University of Helsinki Dept. of Agricultural

More information

BST227 Introduction to Statistical Genetics. Lecture 3: Introduction to population genetics

BST227 Introduction to Statistical Genetics. Lecture 3: Introduction to population genetics BST227 Introduction to Statistical Genetics Lecture 3: Introduction to population genetics 1 Housekeeping HW1 due on Wednesday TA office hours today at 5:20 - FXB G11 What have we studied Background Structure

More information

UNIT 4: EVOLUTION Chapter 11: The Evolution of Populations

UNIT 4: EVOLUTION Chapter 11: The Evolution of Populations CORNELL NOTES Directions: You must create a minimum of 5 questions in this column per page (average). Use these to study your notes and prepare for tests and quizzes. Notes will be stamped after each assigned

More information

The Theory of Evolution

The Theory of Evolution The Theory of Evolution Mechanisms of Evolution Notes Pt. 4 Population Genetics & Evolution IMPORTANT TO REMEMBER: Populations, not individuals, evolve. Population = a group of individuals of the same

More information

Genome-wide association studies (GWAS) Part 1

Genome-wide association studies (GWAS) Part 1 Genome-wide association studies (GWAS) Part 1 Matti Pirinen FIMM, University of Helsinki 03.12.2013, Kumpula Campus FIMM - Institiute for Molecular Medicine Finland www.fimm.fi Published Genome-Wide Associations

More information

BST227 Introduction to Statistical Genetics. Lecture 3: Introduction to population genetics

BST227 Introduction to Statistical Genetics. Lecture 3: Introduction to population genetics BST227 Introduction to Statistical Genetics Lecture 3: Introduction to population genetics!1 Housekeeping HW1 will be posted on course website tonight 1st lab will be on Wednesday TA office hours have

More information

11.1. A population shares a common gene pool. The Evolution of Populations CHAPTER 11. Fill in the concept map below.

11.1. A population shares a common gene pool. The Evolution of Populations CHAPTER 11. Fill in the concept map below. SECTION 11.1 GENETIC VARIATION WITHIN POPULATIONS Study Guide KEY CONCEPT A population shares a common gene pool. VOCABULARY gene pool allele frequency MAIN IDEA: Genetic variation in a population increases

More information

1) (15 points) Next to each term in the left-hand column place the number from the right-hand column that best corresponds:

1) (15 points) Next to each term in the left-hand column place the number from the right-hand column that best corresponds: 1) (15 points) Next to each term in the left-hand column place the number from the right-hand column that best corresponds: natural selection 21 1) the component of phenotypic variance not explained by

More information

An introduction to genetics and molecular biology

An introduction to genetics and molecular biology An introduction to genetics and molecular biology Cavan Reilly September 5, 2017 Table of contents Introduction to biology Some molecular biology Gene expression Mendelian genetics Some more molecular

More information

COMPUTER SIMULATIONS AND PROBLEMS

COMPUTER SIMULATIONS AND PROBLEMS Exercise 1: Exploring Evolutionary Mechanisms with Theoretical Computer Simulations, and Calculation of Allele and Genotype Frequencies & Hardy-Weinberg Equilibrium Theory INTRODUCTION Evolution is defined

More information

Mathematical Population Genetics

Mathematical Population Genetics Mathematical Population Genetics (Hardy-Weinberg, Selection, Drift and Linkage Disequilibrium ) Chiara Sabatti, Human Genetics 5554B Gonda csabatti@mednet.ucla.edu Populations Which predictions are of

More information

Topic 2: Population Models and Genetics

Topic 2: Population Models and Genetics SCIE1000 Project, Semester 1 2011 Topic 2: Population Models and Genetics 1 Required background: 1.1 Science To complete this project you will require some background information on three topics: geometric

More information

Hardy Weinberg Equilibrium

Hardy Weinberg Equilibrium Gregor Mendel Hardy Weinberg Equilibrium Lectures 4-11: Mechanisms of Evolution (Microevolution) Hardy Weinberg Principle (Mendelian Inheritance) Genetic Drift Mutation Sex: Recombination and Random Mating

More information

thebiotutor.com A2 Biology Unit 5 Genetics

thebiotutor.com A2 Biology Unit 5 Genetics thebiotutor.com A2 Biology Unit 5 Genetics 1 Some important terms Using the example of tall (T) and short (t) pea plants, explain the meaning of the following terms: Gene Allele Phenotype Genotype Homozygous

More information

Bio 6 Natural Selection Lab

Bio 6 Natural Selection Lab Bio 6 Natural Selection Lab Overview In this laboratory you will demonstrate the process of evolution by natural selection by carrying out a predator/prey simulation. Through this exercise you will observe

More information

Population genetics. Population genetics provides a foundation for studying evolution How/Why?

Population genetics. Population genetics provides a foundation for studying evolution How/Why? Population genetics 1.Definition of microevolution 2.Conditions for Hardy-Weinberg equilibrium 3.Hardy-Weinberg equation where it comes from and what it means 4.The five conditions for equilibrium in more

More information

POPULATION GENETICS. Evolution Lectures 4

POPULATION GENETICS. Evolution Lectures 4 POPULATION GENETICS Evolution Lectures 4 POPULATION GENETICS The study of the rules governing the maintenance and transmission of genetic variation in natural populations. Population: A freely interbreeding

More information