Population Genetics II. Bio


1 Population Genetics II. Bio Don Conrad

2 Agenda
Population Genetic Inference
Mutation
Selection
Recombination

3 The Coalescent Process
[Figure: genealogy with sequence labels ACTT, ACGT, ACGT, ACTT, ACTT, AGTT and mutation labels T, G, C, G]
Backward-in-time process
Discovered by JFC Kingman, F. Tajima, R. R. Hudson c
DNA sequence diversity is shaped by genealogical history
Genealogies are unobserved but can be estimated
Conceptual framework for population genetic inference: mutation, recombination, demographic history

4 2-sample coalescent
N = population size of diploid individuals
n = sample size of haploid chromosomes
MRCA = most recent common ancestor
$T_2$ = coalescence time for 2 chromosomes
[Figure: sequence1 and sequence2 joined at their MRCA, $T_2$ generations back]

5 2-sample coalescent
Probability that the time of the MRCA is t generations ago:
$P(T_2 = t) = \left(1 - \frac{1}{2N}\right)^{t-1} \left(\frac{1}{2N}\right)$
If we blur our eyes a bit (as N gets very large) this becomes
$P(T_2 = t) \approx \left(\frac{1}{2N}\right) e^{-\left(\frac{1}{2N}\right) t}$
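A minimal sketch comparing the exact geometric probability with its large-N exponential approximation. The function names and the value of N are hypothetical, chosen only for illustration.

```python
import math


def p_coalesce_exact(t, N):
    """Geometric probability that two lineages first coalesce exactly t generations back."""
    p = 1.0 / (2 * N)
    return (1.0 - p) ** (t - 1) * p


def p_coalesce_approx(t, N):
    """Exponential approximation to the same probability, valid when N is large."""
    rate = 1.0 / (2 * N)
    return rate * math.exp(-rate * t)


if __name__ == "__main__":
    N = 10_000  # hypothetical diploid population size
    for t in (1, 1_000, 20_000, 100_000):
        print(t, p_coalesce_exact(t, N), p_coalesce_approx(t, N))
```

For N in the thousands the two columns should agree to several significant digits, which is the sense in which the slide "blurs its eyes".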

6 2-sample coalescent
$E(T_2) = 2N$
$\mathrm{Var}(T_2) = (2N)^2$
If we rescale time as $t' = t/2N$ and call that coalescent time, then $E(T_2) = 1$

7 n-coalescent
There are $\binom{n}{2} = \frac{n(n-1)}{2}$ possible pairs, each coalescing at rate 1/2N
$P(T_n = t) = \left(\frac{\binom{n}{2}}{2N}\right) e^{-\left(\frac{\binom{n}{2}}{2N}\right) t}$
$E(T_n) = \frac{2N}{\binom{n}{2}}$

8 n-coalescent
Mean elapsed time (in units of 2N generations):
$T_2$: 1
$T_3$: 1/3
$T_4$: 1/6
$T_5$: 1/10 = 2/(5*4)
E(TMRCA for n chromosomes) = $E(T_2) + E(T_3) + E(T_4) + \dots + E(T_n) = 2(1 - 1/n)$ coalescent units
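The sum above can be checked with exact fractions. This is a small sketch with hypothetical function names; it only restates $E(T_k) = 1/\binom{k}{2}$ in coalescent units.

```python
from fractions import Fraction


def expected_interval(k):
    """E(T_k) in coalescent units (multiples of 2N generations): 1 / C(k, 2)."""
    return Fraction(2, k * (k - 1))


def expected_tmrca(n):
    """Expected time to the MRCA of n chromosomes: the sum of E(T_k) for k = 2 .. n."""
    return sum((expected_interval(k) for k in range(2, n + 1)), Fraction(0))


if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        print(n, expected_tmrca(n), 2 * (1 - Fraction(1, n)))
```

Because the total approaches 2 as n grows while $E(T_2)$ alone is 1, about half of the expected tree height is spent waiting for the last two lineages to coalesce.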

9 $N_e$ and coalescence times in humans and other animals
In humans, it is known that appropriate values for $N_e$ are surprisingly small. This approximation is called the effective population size:
$N_e \approx$ 10,000 in Europe
$N_e \approx$ 9,500 in East Asia
$N_e <$ 25,000 for all human populations, highest in Africa
Simon Myers

10 $N_e$ and coalescence times in humans and other animals
The mean coalescence time for two lineages is just $E(T_2) = 1$ in units of $2N_e$ generations, so if we have G = 22 years per generation, the average ancestry depth for 2 human chromosomes is $1 \times 2N_e \times G$ years:
(20,000-50,000) × 22 = 440,000-1,100,000 years
$N_e$ varies widely across species (Charlesworth, Nature Reviews Genetics 2009):
25,000,000 for E. coli
2,000,000 for fruit fly D. melanogaster
<100 for salamanders (Funk et al. 1999)
Simon Myers
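The arithmetic on this slide can be reproduced with a short sketch. The function name is hypothetical; G = 22 years per generation comes from the slide, and the range of $2N_e$ from 20,000 to 50,000 corresponds to $N_e$ from 10,000 to 25,000.

```python
def ancestry_depth_years(n_e, years_per_generation=22):
    """Mean pairwise coalescence time in years: E(T_2) = 1 unit of 2*N_e generations."""
    return 2 * n_e * years_per_generation


if __name__ == "__main__":
    for n_e in (10_000, 25_000):
        print(n_e, ancestry_depth_years(n_e))
```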

11 Adding mutations For neutral models, can separately model the genealogical process (the tree) and the mutation process (genetic types) -Infinite sites mutation model

12 What is the expected number of mutations between 2 chromosomes?
μ = mutation rate per bp per generation
$E(t) = 2N$
$E(\pi \mid t) = 2t\mu$
$E(\pi) = 2\mu E(t) = 4N\mu$
$t \sim \mathrm{Expo}(2N)$ (exponential with mean 2N)
$\pi \sim \mathrm{Pois}(2t\mu)$
4Nμ comes up repeatedly in population genetics, often referred to as theta: $\theta = 4N\mu$

13 Number of segregating sites in a sample of size n
The total time in the tree for a sample of n chromosomes is (for n = 4)
$L = 4t_4 + 3t_3 + 2t_2$
or in general:
$L = \sum_{i=2}^{n} i\, t_i$
$E(L) = \sum_{i=2}^{n} i\, E(t_i) = \sum_{i=2}^{n} i \left(\frac{4N}{i(i-1)}\right) = 4N \sum_{i=1}^{n-1} \frac{1}{i}$
$E(S) = \mu E(L) = 4N\mu \sum_{i=1}^{n-1} \frac{1}{i} = \theta \sum_{i=1}^{n-1} \frac{1}{i}$
Watterson estimator for the population-scaled mutation rate:
$\hat{\theta}_W = \frac{S}{\sum_{i=1}^{n-1} \frac{1}{i}}$
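As a rough sketch of the Watterson estimator, assuming haplotypes are given as equal-length strings with no missing data. The function names and the toy sample are illustrative, not taken from the slides.

```python
def harmonic_sum(n):
    """Sum of 1/i for i = 1 .. n-1, the denominator of the Watterson estimator."""
    return sum(1.0 / i for i in range(1, n))


def count_segregating_sites(haplotypes):
    """Count alignment columns that carry more than one allele."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)


def watterson_theta(haplotypes):
    """Estimate theta for the whole region as S divided by the harmonic sum."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    return count_segregating_sites(haplotypes) / harmonic_sum(n)


if __name__ == "__main__":
    sample = ["ACGT", "ACGT", "ACTT", "AGTT"]  # toy data, four chromosomes
    print(count_segregating_sites(sample), watterson_theta(sample))
```

The estimate here is for the whole aligned region; dividing by the number of sites would give a per-bp value.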

14 Estimators of Theta
The Watterson estimator is just one approach; it uses a summary of the data:
$\hat{\theta}_W = \frac{S}{\sum_{i=1}^{n-1} \frac{1}{i}}$
There is also the Tajima estimator, the average number of pairwise differences:
$\hat{\theta}_T = \frac{n}{n-1} \sum_{i=1}^{S} 2 p_i (1 - p_i)$
$p_i$ = allele frequency of mutation i
S = number of polymorphic sites
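A small sketch showing that the allele-frequency form of the Tajima estimator matches the average number of pairwise differences. It assumes biallelic sites coded 0 (ancestral) and 1 (derived); the function names and toy data are hypothetical.

```python
from itertools import combinations


def tajima_theta_from_frequencies(haplotypes):
    """n/(n-1) times the sum over sites of 2*p*(1-p); monomorphic sites add zero."""
    n = len(haplotypes)
    total = 0.0
    for column in zip(*haplotypes):
        p = sum(column) / n  # derived allele frequency at this site
        total += 2.0 * p * (1.0 - p)
    return n / (n - 1) * total


def mean_pairwise_differences(haplotypes):
    """Average number of differing sites over all unordered pairs of chromosomes."""
    pairs = list(combinations(haplotypes, 2))
    diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return diffs / len(pairs)


if __name__ == "__main__":
    sample = [[0, 0], [0, 0], [0, 1], [1, 1]]  # toy data: 4 chromosomes, 2 sites
    print(tajima_theta_from_frequencies(sample), mean_pairwise_differences(sample))
```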

15 The site frequency spectrum
Under the standard neutral model, the site frequency spectrum is a beta distribution with parameters $\theta/2$, $\theta/2$
Expected proportion of sites with count x is $\theta/x$
[Figure: plot of theoretical SFS for 1 Mb; x-axis: derived allele count; y-axis: number of sites]

16 The sfs under neutrality and selection

17 The SFS of genic variants
splice-disrupting: 621
stop-gain: 1,654
non-synonymous: 84,358
synonymous: 61,155
Daniel MacArthur, Suganti Balasubramaniam

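18 Sample-based estimators of θ using the SFS
Here $\xi_i$ is the number of sites at which the derived allele is carried by i of the n sampled chromosomes.
Estimator; Sensitivity; Source
$\theta_W = \frac{1}{\sum_{i=1}^{n-1} 1/i} \sum_{i=1}^{n-1} \xi_i$; low; Watterson (1975)
$\theta_\pi = \binom{n}{2}^{-1} \sum_{i=1}^{n-1} i(n-i)\, \xi_i$; intermediate; Tajima (1989)
$\theta_{\xi_e} = \xi_e = \xi_1$; singleton; Fu and Li (1993)
$\theta_H = \binom{n}{2}^{-1} \sum_{i=1}^{n-1} i^2\, \xi_i$; high; Fay and Wu (2000)
$\theta_L = \frac{1}{n-1} \sum_{i=1}^{n-1} i\, \xi_i$; high; Zeng et al. (2006)
Sensitivity = the frequency of observed polymorphisms that makes estimates using a given estimator large relative to the others.
Eckert, UCDavis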
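A sketch of the five estimators as functions of the unfolded SFS. The function names and dictionary keys are hypothetical. The list `xi` is assumed to hold $\xi_1, \dots, \xi_{n-1}$ in order, and `neutral_sfs` uses the neutral expectation $\xi_i = \theta/i$, with θ scaled to the whole region.

```python
from math import comb


def sfs_estimators(xi):
    """Return theta estimates from xi, where xi[i-1] is the number of sites with derived count i."""
    n = len(xi) + 1
    counts = range(1, n)
    a_n = sum(1.0 / i for i in counts)
    pairs = comb(n, 2)
    return {
        "theta_W": sum(xi) / a_n,
        "theta_pi": sum(i * (n - i) * x for i, x in zip(counts, xi)) / pairs,
        "theta_xi_e": float(xi[0]),
        "theta_H": sum(i * i * x for i, x in zip(counts, xi)) / pairs,
        "theta_L": sum(i * x for i, x in zip(counts, xi)) / (n - 1),
    }


def neutral_sfs(theta, n):
    """Expected unfolded SFS under the standard neutral model: theta / i for i = 1 .. n-1."""
    return [theta / i for i in range(1, n)]


if __name__ == "__main__":
    print(sfs_estimators(neutral_sfs(5.0, 10)))    # all estimators recover theta
    print(sfs_estimators([0] * 8 + [3]))           # only high-frequency derived alleles
```

Under the neutral expectation every estimator returns the same θ; the second call shows how an excess of high-frequency derived alleles inflates $\theta_H$ and $\theta_L$ relative to $\theta_W$, which is what the sensitivity column describes.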

19 Tajima's D
$D_T = \frac{\theta_\pi - \theta_W}{C}$
$D_T = 0$: neutral evolution
$D_T > 0$: balancing selection, more intermediate variants
$D_T < 0$: positive selection
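The slide leaves the normalizing constant C unspecified. The sketch below assumes the standard choice from Tajima (1989), the square root of the estimated variance of $\theta_\pi - \theta_W$; the function name and argument names are hypothetical.

```python
import math


def tajimas_d(theta_pi, num_segregating_sites, n):
    """Tajima's D, using the variance constants of Tajima (1989) as the denominator C."""
    S = num_segregating_sites
    if n < 4:
        # For n = 2 and n = 3 the variance term is zero (up to rounding).
        raise ValueError("need at least four chromosomes")
    if S < 1:
        raise ValueError("need at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    C = math.sqrt(e1 * S + e2 * S * (S - 1))
    return (theta_pi - S / a1) / C  # S / a1 is the Watterson estimate


if __name__ == "__main__":
    print(tajimas_d(7.0 / 6.0, 2, 4))   # more intermediate-frequency variation
    print(tajimas_d(1.0, 5, 10))        # five singletons in ten chromosomes
```

In the second call all five variants are singletons, so $\theta_\pi$ falls below $\theta_W$ and D is negative, the direction the slide associates with positive selection.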

20 Bamshad & Wooding 2001

21 LCT D Nielsen, et al 2007

22 A typical population genomics study design for detecting positive selection. Akey J M Genome Res. 2009;19: by Cold Spring Harbor Laboratory Press

23 Are humans still experiencing adaptive evolution?

24 iHS(x), a homozygosity-based statistic
- function of test allele frequency
- summarizes extent of haplotype homozygosity of derived-allele chromosomes compared to ancestral
- built-in control for local recombination rate
PLoS Biology 2006

25

26 Conflicting evidence of population-specific selection (A) Genic SNPs are more likely than non-genic SNPs to have extreme allele frequency differences Coop, et al PLoS 2009

27 Conflicting evidence of population-specific selection (B) Maximum allele frequency difference is well explained by average genetic distance between populations ($F_{ST}$) Coop, et al PLoS 2009

28 Polygenic model of adaptation Pritchard, et al Curr Biology, 2010

29 Linkage Disequilibrium
Nonrandom associations between alleles
Compare to HWE: under HWE, gametes unite at random, so Pr(A,a) = Pr(A) × Pr(a), where A and a are alleles at the same locus
LD statistics measure to what extent Pr(A,B) differs from Pr(A) × Pr(B) when A and B are alleles at different loci
Example applications: mapping genes and measuring recombination

30 Linkage Disequilibrium
Two loci with alleles A/a and B/b
The 4 allele frequencies: $P_A$, $P_a$, $P_B$, $P_b$
The 4 haplotype frequencies: $P_{AB}$, $P_{Ab}$, $P_{aB}$, $P_{ab}$
Define $D = P_{AB} - P_A P_B = P_{AB} P_{ab} - P_{Ab} P_{aB}$
If $D \neq 0$ then we have LD
$r^2 = \frac{D^2}{P_A P_a P_B P_b}$
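A minimal sketch of these definitions, assuming the four haplotype frequencies sum to 1 and both loci are polymorphic. The function name is hypothetical.

```python
def ld_stats(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, r_squared) from the four two-locus haplotype frequencies."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab  # allele frequency of A
    p_B = p_AB + p_aB  # allele frequency of B
    D = p_AB - p_A * p_B
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return D, D * D / denom


if __name__ == "__main__":
    print(ld_stats(0.5, 0.0, 0.0, 0.5))      # only AB and ab haplotypes present
    print(ld_stats(0.25, 0.25, 0.25, 0.25))  # alleles at the two loci are independent
```

The first call returns D = 0.25 and r² = 1 (complete association); the second returns D = 0 and r² = 0.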

31 Where does LD come from? Final Aligned Data Set:

32 LD-based Case-Control Association Study
- Locate disease locus
- Unlikely to be among our genotyped markers
  -> Detect indirectly using available markers
Cases (affected)
A C A----G---X----T---C---A
T C A----G---X----C---C---A
A G G----G---X----C---C---A
A C A----G---X----T---C---A
T C A----G---X----T---C---A
T C A----T---X----T---A---A
Controls (unaffected)
A C A----T---X----T---A---A
A G G----G---X----C---C---A
A G G----T---X----C---A---A
A C A----G---X----T---C---A
T C A----T---X----T---A---A
T C G----T---X----A---A---A

33 Where does array content come from?
Publicly funded genomics projects: Human Genome Project, Phase I HapMap Project, CNV Project, Phase II HapMap, 1000 Genomes Project
[Figure axis: number of SNPs on an Affymetrix chip (10k, 500k, 1M)]

34 The International HapMap Project
- HapMap project ( ) cost $120 million USD
- Measured variation at 4 million SNPs in 4 populations (90 Europeans, 90 Nigerians, 45 Chinese, 45 Japanese)
Results:
- Over 2.8 M SNPs with > 5% allele frequency
- 80% of variants within each population can be captured with 30% of the most informative SNPs, the tag SNPs
- Nigerians require the most tag SNPs, followed by Europeans and then Asians

35 HapMap tag SNPs are useful for other populations
Conrad, et al. (2006) genotyped 3000 SNPs in 52 populations across the globe (the Human Genome Diversity Panel or HGDP)
C+J: Asian, CEU: European, YRI: Nigerian

36 The serial bottleneck model

37 The recombination rate
- Can vary hugely along a sequence
- Determines association between loci in the population
- Is hard to measure directly, because recombination occurs on average only ~1 in 100,000,000 meioses between any pair of successive nucleotides in the genome
- Can be measured indirectly, by parametric analysis of variation data
- Researchers in Oxford, and elsewhere, have developed such parametric approaches (Li and Stephens, 2003; Ptak et al. 2005; Hudson 2001; McVean 2002; McVean et al. 2004)
Now, we are considering $\rho = 4Nr$
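As a rough worked example of the scale of ρ, take r = 10^-8 per bp per meiosis (the "1 in 100,000,000" figure above) and assume N = 10,000, the order of magnitude quoted earlier for European $N_e$. The choice of N is an assumption made only for illustration.

```latex
\rho = 4 N r = 4 \times 10^{4} \times 10^{-8} = 4 \times 10^{-4} \text{ per bp} \approx 0.4 \text{ per kb}
```

On this scale a region needs to span several kilobases before the population-scaled recombination rate across it reaches order 1.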

38 Marginal Trees
[Figure: marginal trees at Position A, Position B and Position C; vertical axis: time; horizontal axis: physical position]

39 Calculated a composite likelihood for a sample of haplotypes:
$L(4Nr)$ is built from pairwise terms $L(n_i, n_j, n_{ij} \mid 4N r_{ij})$, combined over all pairs of sites i, j
- sum over pairs
- lookup table
- RJMCMC

40

41 Nature Genet 2008

42 Science 2009

43 Nat Genet 2010

Analysis of genome-wide genotype data

Analysis of genome-wide genotype data Analysis of genome-wide genotype data Acknowledgement: Several slides based on a lecture course given by Jonathan Marchini & Chris Spencer, Cape Town 2007 Introduction & definitions - Allele: A version

More information

Recombination, and haplotype structure

Recombination, and haplotype structure 2 The starting point We have a genome s worth of data on genetic variation Recombination, and haplotype structure Simon Myers, Gil McVean Department of Statistics, Oxford We wish to understand why the

More information

Detecting selection on nucleotide polymorphisms

Detecting selection on nucleotide polymorphisms Detecting selection on nucleotide polymorphisms Introduction At this point, we ve refined the neutral theory quite a bit. Our understanding of how molecules evolve now recognizes that some substitutions

More information

Haplotypes, linkage disequilibrium, and the HapMap

Haplotypes, linkage disequilibrium, and the HapMap Haplotypes, linkage disequilibrium, and the HapMap Jeffrey Barrett Boulder, 2009 LD & HapMap Boulder, 2009 1 / 29 Outline 1 Haplotypes 2 Linkage disequilibrium 3 HapMap 4 Tag SNPs LD & HapMap Boulder,

More information

Molecular Evolution. COMP Fall 2010 Luay Nakhleh, Rice University

Molecular Evolution. COMP Fall 2010 Luay Nakhleh, Rice University Molecular Evolution COMP 571 - Fall 2010 Luay Nakhleh, Rice University Outline (1) The neutral theory (2) Measures of divergence and polymorphism (3) DNA sequence divergence and the molecular clock (4)

More information

Conifer Translational Genomics Network Coordinated Agricultural Project

Conifer Translational Genomics Network Coordinated Agricultural Project Conifer Translational Genomics Network Coordinated Agricultural Project Genomics in Tree Breeding and Forest Ecosystem Management ----- Module 7 Measuring, Organizing, and Interpreting Marker Variation

More information

Signatures of a population bottleneck can be localised along a recombining chromosome

Signatures of a population bottleneck can be localised along a recombining chromosome Signatures of a population bottleneck can be localised along a recombining chromosome Céline Becquet, Peter Andolfatto Bioinformatics and Modelling, INSA of Lyon Institute for Cell, Animal and Population

More information

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012 Lecture 23: Causes and Consequences of Linkage Disequilibrium November 16, 2012 Last Time Signatures of selection based on synonymous and nonsynonymous substitutions Multiple loci and independent segregation

More information

Crash-course in genomics

Crash-course in genomics Crash-course in genomics Molecular biology : How does the genome code for function? Genetics: How is the genome passed on from parent to child? Genetic variation: How does the genome change when it is

More information

Supplementary Material online Population genomics in Bacteria: A case study of Staphylococcus aureus

Supplementary Material online Population genomics in Bacteria: A case study of Staphylococcus aureus Supplementary Material online Population genomics in acteria: case study of Staphylococcus aureus Shohei Takuno, Tomoyuki Kado, Ryuichi P. Sugino, Luay Nakhleh & Hideki Innan Contents Estimating recombination

More information

An Introduction to Population Genetics

An Introduction to Population Genetics An Introduction to Population Genetics THEORY AND APPLICATIONS f 2 A (1 ) E 1 D [ ] = + 2M ES [ ] fa fa = 1 sf a Rasmus Nielsen Montgomery Slatkin Sinauer Associates, Inc. Publishers Sunderland, Massachusetts

More information

Summary. Introduction

Summary. Introduction doi: 10.1111/j.1469-1809.2006.00305.x Variation of Estimates of SNP and Haplotype Diversity and Linkage Disequilibrium in Samples from the Same Population Due to Experimental and Evolutionary Sample Size

More information

Understanding genetic association studies. Peter Kamerman

Understanding genetic association studies. Peter Kamerman Understanding genetic association studies Peter Kamerman Outline CONCEPTS UNDERLYING GENETIC ASSOCIATION STUDIES Genetic concepts: - Underlying principals - Genetic variants - Linkage disequilibrium -

More information

Nature Genetics: doi: /ng.3254

Nature Genetics: doi: /ng.3254 Supplementary Figure 1 Comparing the inferred histories of the stairway plot and the PSMC method using simulated samples based on five models. (a) PSMC sim-1 model. (b) PSMC sim-2 model. (c) PSMC sim-3

More information

Association studies (Linkage disequilibrium)

Association studies (Linkage disequilibrium) Positional cloning: statistical approaches to gene mapping, i.e. locating genes on the genome Linkage analysis Association studies (Linkage disequilibrium) Linkage analysis Uses a genetic marker map (a

More information

Detecting ancient admixture using DNA sequence data

Detecting ancient admixture using DNA sequence data Detecting ancient admixture using DNA sequence data October 10, 2008 Jeff Wall Institute for Human Genetics UCSF Background Origin of genus Homo 2 2.5 Mya Out of Africa (part I)?? 1.6 1.8 Mya Further spread

More information

CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes

CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes Coalescence Scribe: Alex Wells 2/18/16 Whenever you observe two sequences that are similar, there is actually a single individual

More information

A brief introduction to population genetics

A brief introduction to population genetics A brief introduction to population genetics Population genetics Definition studies distributions & changes of allele frequencies in populations over time effects considered: natural selection, genetic

More information

Algorithms for Genetics: Introduction, and sources of variation

Algorithms for Genetics: Introduction, and sources of variation Algorithms for Genetics: Introduction, and sources of variation Scribe: David Dean Instructor: Vineet Bafna 1 Terms Genotype: the genetic makeup of an individual. For example, we may refer to an individual

More information

On the Power to Detect SNP/Phenotype Association in Candidate Quantitative Trait Loci Genomic Regions: A Simulation Study

On the Power to Detect SNP/Phenotype Association in Candidate Quantitative Trait Loci Genomic Regions: A Simulation Study On the Power to Detect SNP/Phenotype Association in Candidate Quantitative Trait Loci Genomic Regions: A Simulation Study J.M. Comeron, M. Kreitman, F.M. De La Vega Pacific Symposium on Biocomputing 8:478-489(23)

More information

(c) Suppose we add to our analysis another locus with j alleles. How many haplotypes are possible between the two sites?

(c) Suppose we add to our analysis another locus with j alleles. How many haplotypes are possible between the two sites? OEB 242 Midterm Review Practice Problems (1) Loci, Alleles, Genotypes, Haplotypes (a) Define each of these terms. (b) We used the expression!!, which is equal to!!!!!!!! and represents sampling without

More information

I See Dead People: Gene Mapping Via Ancestral Inference

I See Dead People: Gene Mapping Via Ancestral Inference I See Dead People: Gene Mapping Via Ancestral Inference Paul Marjoram, 1 Lada Markovtsova 2 and Simon Tavaré 1,2,3 1 Department of Preventive Medicine, University of Southern California, 1540 Alcazar Street,

More information

It s not a fundamental force like mutation, selection, and drift.

It s not a fundamental force like mutation, selection, and drift. What is Genetic Draft? It s not a fundamental force like mutation, selection, and drift. It s an effect of mutation at a selected locus, that reduces variation at nearby (linked) loci, thereby reducing

More information

Linkage Disequilibrium. Adele Crane & Angela Taravella

Linkage Disequilibrium. Adele Crane & Angela Taravella Linkage Disequilibrium Adele Crane & Angela Taravella Overview Introduction to linkage disequilibrium (LD) Measuring LD Genetic & demographic factors shaping LD Model predictions and expected LD decay

More information

Simple inheritance. Defective Gene. Disease

Simple inheritance. Defective Gene. Disease Simple inheritance Defective Gene Disease Cystic Fibrosis Hemochromatosis Achondroplasia Huntington s Disease Neurofibromatosis Fanconi Anemia Muscular Dystrophy Ataxia Telangiectasia Werner Syndrome Complex

More information

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1 Human SNP haplotypes Statistics 246, Spring 2002 Week 15, Lecture 1 Human single nucleotide polymorphisms The majority of human sequence variation is due to substitutions that have occurred once in the

More information

Supplementary Note: Detecting population structure in rare variant data

Supplementary Note: Detecting population structure in rare variant data Supplementary Note: Detecting population structure in rare variant data Inferring ancestry from genetic data is a common problem in both population and medical genetic studies, and many methods exist to

More information

The Whole Genome TagSNP Selection and Transferability Among HapMap Populations. Reedik Magi, Lauris Kaplinski, and Maido Remm

The Whole Genome TagSNP Selection and Transferability Among HapMap Populations. Reedik Magi, Lauris Kaplinski, and Maido Remm The Whole Genome TagSNP Selection and Transferability Among HapMap Populations Reedik Magi, Lauris Kaplinski, and Maido Remm Pacific Symposium on Biocomputing 11:535-543(2006) THE WHOLE GENOME TAGSNP SELECTION

More information

Lecture 19: Hitchhiking and selective sweeps. Bruce Walsh lecture notes Synbreed course version 8 July 2013

Lecture 19: Hitchhiking and selective sweeps. Bruce Walsh lecture notes Synbreed course version 8 July 2013 Lecture 19: Hitchhiking and selective sweeps Bruce Walsh lecture notes Synbreed course version 8 July 2013 1 Hitchhiking When an allele is linked to a site under selection, its dynamics are considerably

More information

Quantitative Genetics for Using Genetic Diversity

Quantitative Genetics for Using Genetic Diversity Footprints of Diversity in the Agricultural Landscape: Understanding and Creating Spatial Patterns of Diversity Quantitative Genetics for Using Genetic Diversity Bruce Walsh Depts of Ecology & Evol. Biology,

More information

CMSC423: Bioinformatic Algorithms, Databases and Tools. Some Genetics

CMSC423: Bioinformatic Algorithms, Databases and Tools. Some Genetics CMSC423: Bioinformatic Algorithms, Databases and Tools Some Genetics CMSC423 Fall 2009 2 Chapter 13 Reading assignment CMSC423 Fall 2009 3 Gene association studies Goal: identify genes/markers associated

More information

Park /12. Yudin /19. Li /26. Song /9

Park /12. Yudin /19. Li /26. Song /9 Each student is responsible for (1) preparing the slides and (2) leading the discussion (from problems) related to his/her assigned sections. For uniformity, we will use a single Powerpoint template throughout.

More information

Neutrality Test. Neutrality tests allow us to: Challenges in neutrality tests. differences. data. - Identify causes of species-specific phenotype

Neutrality Test. Neutrality tests allow us to: Challenges in neutrality tests. differences. data. - Identify causes of species-specific phenotype Neutrality Test First suggested by Kimura (1968) and King and Jukes (1969) Shift to using neutrality as a null hypothesis in positive selection and selection sweep tests Positive selection is when a new

More information

Genotype Prediction with SVMs

Genotype Prediction with SVMs Genotype Prediction with SVMs Nicholas Johnson December 12, 2008 1 Summary A tuned SVM appears competitive with the FastPhase HMM (Stephens and Scheet, 2006), which is the current state of the art in genotype

More information

SNP Selection. Outline of Tutorial. Why Do We Need tagsnps? Concepts of tagsnps. LD and haplotype definitions. Haplotype blocks and definitions

SNP Selection. Outline of Tutorial. Why Do We Need tagsnps? Concepts of tagsnps. LD and haplotype definitions. Haplotype blocks and definitions SNP Selection Outline of Tutorial Concepts of tagsnps University of Louisville Center for Genetics and Molecular Medicine January 10, 2008 Dana Crawford, PhD Vanderbilt University Center for Human Genetics

More information

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 Topics Genetic variation Population structure Linkage disequilibrium Natural disease variants Genome Wide Association Studies Gene

More information

Supplementary Methods 2. Supplementary Table 1: Bottleneck modeling estimates 5

Supplementary Methods 2. Supplementary Table 1: Bottleneck modeling estimates 5 Supplementary Information Accelerated genetic drift on chromosome X during the human dispersal out of Africa Keinan A, Mullikin JC, Patterson N, and Reich D Supplementary Methods 2 Supplementary Table

More information

Population Genetics Sequence Diversity Molecular Evolution. Physiology Quantitative Traits Human Diseases

Population Genetics Sequence Diversity Molecular Evolution. Physiology Quantitative Traits Human Diseases Population Genetics Sequence Diversity Molecular Evolution Physiology Quantitative Traits Human Diseases Bioinformatics problems in medicine related to physiology and quantitative traits Note: Genetics

More information

Questions we are addressing. Hardy-Weinberg Theorem

Questions we are addressing. Hardy-Weinberg Theorem Factors causing genotype frequency changes or evolutionary principles Selection = variation in fitness; heritable Mutation = change in DNA of genes Migration = movement of genes across populations Vectors

More information

Midterm 1 Results. Midterm 1 Akey/ Fields Median Number of Students. Exam Score

Midterm 1 Results. Midterm 1 Akey/ Fields Median Number of Students. Exam Score Midterm 1 Results 10 Midterm 1 Akey/ Fields Median - 69 8 Number of Students 6 4 2 0 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 101 Exam Score Quick review of where we left off Parental type: the

More information

Genome-wide association studies (GWAS) Part 1

Genome-wide association studies (GWAS) Part 1 Genome-wide association studies (GWAS) Part 1 Matti Pirinen FIMM, University of Helsinki 03.12.2013, Kumpula Campus FIMM - Institiute for Molecular Medicine Finland www.fimm.fi Published Genome-Wide Associations

More information

N e =20,000 N e =150,000

N e =20,000 N e =150,000 Evolution: For Review Only Page 68 of 80 Standard T=100,000 r=0.3 cm/mb r=0.6 cm/mb p=0.1 p=0.3 N e =20,000 N e =150,000 Figure S1: Distribution of the length of ancestral segment according to our approximation

More information

More Introduction to Positive Selection

More Introduction to Positive Selection More Introduction to Positive Selection Ryan Hernandez Tim O Connor ryan.hernandez@ucsf.edu 1 Genome-wide scans The EHH approach does not lend itself to a genomewide scan. Voight, et al. (2006) create

More information

Inference about Recombination from Haplotype Data: Lower. Bounds and Recombination Hotspots

Inference about Recombination from Haplotype Data: Lower. Bounds and Recombination Hotspots Inference about Recombination from Haplotype Data: Lower Bounds and Recombination Hotspots Vineet Bafna Department of Computer Science and Engineering University of California at San Diego, La Jolla, CA

More information

What is genetic variation?

What is genetic variation? enetic Variation Applied Computational enomics, Lecture 05 https://github.com/quinlan-lab/applied-computational-genomics Aaron Quinlan Departments of Human enetics and Biomedical Informatics USTAR Center

More information

Computational Population Genomics

Computational Population Genomics Computational Population Genomics Rasmus Nielsen Departments of Integrative Biology and Statistics, UC Berkeley Department of Biology, University of Copenhagen Rai N, Chaubey G, Tamang R, Pathak AK, et

More information

Introduction to Population Genetics. Spezielle Statistik in der Biomedizin WS 2014/15

Introduction to Population Genetics. Spezielle Statistik in der Biomedizin WS 2014/15 Introduction to Population Genetics Spezielle Statistik in der Biomedizin WS 2014/15 What is population genetics? Describes the genetic structure and variation of populations. Causes Maintenance Changes

More information

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations Topics How to track evolution allele frequencies Hardy Weinberg principle applications Requirements for genetic equilibrium Types of natural selection Population genetic polymorphism in populations, pp.

More information

Measurement of Molecular Genetic Variation. Forces Creating Genetic Variation. Mutation: Nucleotide Substitutions

Measurement of Molecular Genetic Variation. Forces Creating Genetic Variation. Mutation: Nucleotide Substitutions Measurement of Molecular Genetic Variation Genetic Variation Is The Necessary Prerequisite For All Evolution And For Studying All The Major Problem Areas In Molecular Evolution. How We Score And Measure

More information

Balancing and disruptive selection The HKA test

Balancing and disruptive selection The HKA test Natural selection The time-scale of evolution Deleterious mutations Mutation selection balance Mutation load Selection that promotes variation Balancing and disruptive selection The HKA test Adaptation

More information

Familial Breast Cancer

Familial Breast Cancer Familial Breast Cancer SEARCHING THE GENES Samuel J. Haryono 1 Issues in HSBOC Spectrum of mutation testing in familial breast cancer Variant of BRCA vs mutation of BRCA Clinical guideline and management

More information

Themes. Homo erectus. Jin and Su, Nature Reviews Genetics (2000)

Themes. Homo erectus. Jin and Su, Nature Reviews Genetics (2000) HC70A & SAS70A Winter 2009 Genetic Engineering in Medicine, Agriculture, and Law Tracking Human Ancestry Professor John Novembre Themes Global patterns of human genetic diversity Tracing our ancient ancestry

More information

Bioinformatic Analysis of SNP Data for Genetic Association Studies EPI573

Bioinformatic Analysis of SNP Data for Genetic Association Studies EPI573 Bioinformatic Analysis of SNP Data for Genetic Association Studies EPI573 Mark J. Rieder Department of Genome Sciences mrieder@u.washington washington.edu Epidemiology Studies Cohort Outcome Model to fit/explain

More information

Model based inference of mutation rates and selection strengths in humans and influenza. Daniel Wegmann University of Fribourg

Model based inference of mutation rates and selection strengths in humans and influenza. Daniel Wegmann University of Fribourg Model based inference of mutation rates and selection strengths in humans and influenza Daniel Wegmann University of Fribourg Influenza rapidly evolved resistance against novel drugs Weinstock & Zuccotti

More information

Population Genetic Approaches. to Detect Natural Selection in. Drosophila melanogaster

Population Genetic Approaches. to Detect Natural Selection in. Drosophila melanogaster Population Genetic Approaches to Detect Natural Selection in Drosophila melanogaster Dissertation der Fakultät für Biologie der Ludwig-Maximilians-Universität München vorgelegt von Sascha Glinka aus Heilbronn

More information

The Human Genome Project has always been something of a misnomer, implying the existence of a single human genome

The Human Genome Project has always been something of a misnomer, implying the existence of a single human genome The Human Genome Project has always been something of a misnomer, implying the existence of a single human genome Of course, every person on the planet with the exception of identical twins has a unique

More information

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics S G ection ON tatistical enetics Design and Analysis of Genetic Association Studies Hemant K Tiwari, Ph.D. Professor & Head Section on Statistical Genetics Department of Biostatistics School of Public

More information

Evaluation of Genome wide SNP Haplotype Blocks for Human Identification Applications

Evaluation of Genome wide SNP Haplotype Blocks for Human Identification Applications Ranajit Chakraborty, Ph.D. Evaluation of Genome wide SNP Haplotype Blocks for Human Identification Applications Overview Some brief remarks about SNPs Haploblock structure of SNPs in the human genome Criteria

More information

Evolutionary Genetics: Part 1 Polymorphism in DNA

Evolutionary Genetics: Part 1 Polymorphism in DNA Evolutionary Genetics: Part 1 Polymorphism in DNA S. chilense S. peruvianum Winter Semester 2012-2013 Prof Aurélien Tellier FG Populationsgenetik Color code Color code: Red = Important result or definition

More information

Genome Scanning by Composite Likelihood Prof. Andrew Collins

Genome Scanning by Composite Likelihood Prof. Andrew Collins Andrew Collins and Newton Morton University of Southampton Frequency by effect Frequency Effect 2 Classes of causal alleles Allelic Usual Penetrance Linkage Association class frequency analysis Maj or

More information
