Themes. Homo erectus. Jin and Su, Nature Reviews Genetics (2000)

Similar documents
Detecting ancient admixture using DNA sequence data

Supplementary Note: Detecting population structure in rare variant data

How about the genes? Biology or Genes? DNA Structure. DNA Structure DNA. Proteins. Life functions are regulated by proteins:

Computational Workflows for Genome-Wide Association Study: I

Linkage Disequilibrium. Adele Crane & Angela Taravella

Genome-wide association studies (GWAS) Part 1

ISAG FAO Genetic Diversity

Midterm 1 Results. Midterm 1 Akey/ Fields Median Number of Students. Exam Score

Answers to additional linkage problems.

Chapter 23: The Evolution of Populations. 1. Populations & Gene Pools. Populations & Gene Pools 12/2/ Populations and Gene Pools

Supplementary Figures

can be found from OMIM (Online Mendelian Inheritance in Man),

FORENSIC GENETICS. DNA in the cell FORENSIC GENETICS PERSONAL IDENTIFICATION KINSHIP ANALYSIS FORENSIC GENETICS. Sources of biological evidence

Genetic variation, genetic drift (summary of topics)

Concepts: What are RFLPs and how do they act like genetic marker loci?

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

Measurement of Molecular Genetic Variation. Forces Creating Genetic Variation. Mutation: Nucleotide Substitutions

Structural variation. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona

BTRY 7210: Topics in Quantitative Genomics and Genetics

Learning about human population history from ancient and modern genomes

Tracing Your Matrilineal Ancestry: Mitochondrial DNA PCR and Sequencing

By the end of this lecture you should be able to explain: Some of the principles underlying the statistical analysis of QTLs

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1

GENETIC CHARACTERIZATION OF LIVESTOCK POPULATIONS AND ITS USE IN CONSERVATION DECISION-MAKING

POPULATION GENETICS. Evolution Lectures 4

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool.

An introduction to genetics and molecular biology

Mutations during meiosis and germ line division lead to genetic variation between individuals

A Human Migration Map Based on Mitochondrial DNA (mtdna) Haplogroups

Genome-Wide Association Studies (GWAS): Computational Them

4.1. Genetics as a Tool in Anthropology

Association Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010

Quality Control Assessment in Genotyping Console

Chp 10 Patterns of Inheritance

Evaluation of Genome wide SNP Haplotype Blocks for Human Identification Applications

Genetics Sperm Meiotic cell division Egg Chromosome Segments of DNA Code DNA for traits Code for a trait Gene

Human Population Genetic Structure and Diversity Inferred from Polymorphic L1 (LINE-1) and Alu Insertions

Gen e e n t e i t c c V a V ri r abi b li l ty Biolo l gy g Lec e tur u e e 9 : 9 Gen e et e ic I n I her e itan a ce

Haplotype Identification through DNA Sequencing Processes

Wolf population genetics in Europe

SNPs - GWAS - eqtls. Sebastian Schmeier

The Evolution of Populations

Extraction, Sequencing and Analysis of Human mtdna Taken from a U.S. Population

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics

Beyond Mendel s Laws of Inheritance

Lecture 8: Sequencing and SNP. Sept 15, 2006

Runs of Homozygosity Analysis Tutorial

Genetics of SNP Markers

LINKAGE AND CHROMOSOME MAPPING IN EUKARYOTES

Random Allelic Variation

Inferring Ethnicity from Mitochondrial DNA Sequence

Applications and Uses. (adapted from Roche RealTime PCR Application Manual)

Mendel & Inheritance. SC.912.L.16.1 Use Mendel s laws of segregation and independent assortment to analyze patterns of inheritance.

Advanced Plant Technology Program Vocabulary

BIOSTATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH. Genetic Variation & CANCERS

Population Genetics (Learning Objectives)

MAPPING GENES TO TRAITS IN DOGS USING SNPs

Exploring Mendelian Genetics 11-3

From AP investigative Laboratory Manual 1

Zebrafish and Skin Color Reference Data

Inconsistencies in Neanderthal Genomic DNA Sequences

Quality Measures for CytoChip Microarrays

The Modern Synthesis. Causes of microevolution. The Modern Synthesis. Microevolution. Genetic Drift. Genetic drift example

Using genetic markers in analyses of natural populations

Sequence Variations. Baxevanis and Ouellette, Chapter 7 - Sequence Polymorphisms. NCBI SNP Primer:

Beyond Mendel s Laws of Inheritance

Overview One of the promises of studies of human genetic variation is to learn about human history and also to learn about natural selection.

APPLICATION INFORMATION

Characterization of Allele-Specific Copy Number in Tumor Genomes

Study Guide A. Answer Key. The Evolution of Populations

3. human genomics clone genes associated with genetic disorders. 4. many projects generate ordered clones that cover genome

POPULATION GENETICS Winter 2005 Lecture 18 Quantitative genetics and QTL mapping

Chapter 14: Mendel and the Gene Idea

Genetics Lecture 16 Forensics

SNP calling and VCF format

"Genetics in geographically structured populations: defining, estimating and interpreting FST."

Conifer Translational Genomics Network Coordinated Agricultural Project

Data Mining and Applications in Genomics

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB

Genetic Variation. Genetic Variation within Populations. Population Genetics. Darwin s Observations

Detecting selection on nucleotide polymorphisms

7-1. Read this exercise before you come to the laboratory. Review the lecture notes from October 15 (Hardy-Weinberg Equilibrium)

Observing Patterns in Inherited Traits. Chapter 11

LAB. WALRUSES AND WHALES AND SEALS, OH MY!

Exploring Mendelian Genetics

H3A - Genome-Wide Association testing SOP

Human linkage analysis. fundamental concepts

Population and Community Dynamics. The Hardy-Weinberg Principle

XXXXXXXX. 40 Scientific American, October 2010 Photograph/Illustration by Artist Name Scientific American

Introduction to Bioinformatics

Book chapter appears in:

Concepts of Genetics Ninth Edition Klug, Cummings, Spencer, Palladino

POPULATION GENETICS: The study of the rules governing the maintenance and transmission of genetic variation in natural populations.

Dr. Mallery Biology Workshop Fall Semester CELL REPRODUCTION and MENDELIAN GENETICS

Linking Genetic Variation to Important Phenotypes

3I03 - Eukaryotic Genetics Repetitive DNA

Name Date Class. In the space at the left, write the letter of the term or phrase that best completes each statement or answers each question.

Transcription:

HC70A & SAS70A Winter 2009 Genetic Engineering in Medicine, Agriculture, and Law Tracking Human Ancestry Professor John Novembre Themes Global patterns of human genetic diversity Tracing our ancient ancestry Clines versus clusters debate Within-continent patterns Personalized genomic ancestry inference What really is ancestry? Admixture and Chromosome painting Natural selection and patterns of human genetic diversity Salivary Amylase Eye color (OCA2) A puzzle of human ancestry The fossil record in East Asia contains a 60,000 year gap Did archaic humans outside-of-africa die out, only to be replaced later on or is this just an incomplete fossil record? Homo erectus Jin and Su, Nature Reviews Genetics (2000) Competing models of human origins Which of these is the best supported hypothesis for origins of humans around the globe? (A) Recent Out-of-Africa origin (B) Recent Out-of-Africa origin with minor contribution from ancestral archaic humans (C) Shared multi-regional evolution (D) Independent multiple origins Jin and Su, Nature Reviews Genetics (2000) Human genome diversity panel All human mitochondrial DNA sequences have a common ancestor Eve 120-220k years ago Tree shape is consistent with Outof-Africa origin A global-scale sample of human genetic diversity Non-African sequences have a common ancestor at 25-75k years ago Mitochondrial Eve existed in the recent past Not a perfect sampling, but the best studied yet

Review: Single nucleotide polymorphisms (SNPs) and haplotypes DNA Chips Can Detect SNP Genotypes (Or Haplotypes) Across An Individual s Genome This Can Then Be Correlated With Diseases &/or Geographical Associations Oceania Americas Population tree based on similarity of allele frequencies at ~650,000 SNPs Population tree provides further support for Out-ofAfrica origins of humans Global patterns of haplotype diversity East Asia Decay of haplotype heterozygosity is consistent with a serial bottlenecks during Out-of-Africa expansion Central Asia Europe North Africa / Mid-East Sub-Saharan Africa Li et al, Science (2008) Distance via waypoints to: Addis Ababa, Rift Valley, Ethiopia (Putative origin of early modern humans) Most Genetic Diversity Originated in the Founder Populations to Modern Humans Examples of paths from East Africa to each of the HGDP populations Decay of diversity is consistent with Outof-Africa expansion Interpolated map of genetic diversity

Summary: Human origins At a global scale, genome-wide diversity patterns broadly consistent with a single, recent origin of modern humans in Africa Continental-scale clusters of human variation? Results of an unsupervised clustering algorithm on microsatellite diversity Further support available from recent Y-chromosome and autosomal gene TMRCA dates Note: A very small number of loci show very ancient TMRCA dates (eg 2 million years old) Open question: Are these evidence for rare, ancient contributions from archaic humans? Genetic clusters approximate continental-scale regions Rosenberg et al (2002) Science Continental-scale clusters of human variation? Or does differentiation increase smoothly with geographic distance ( clines )? Results of unsupervised clustering algorithm on ~650,000 SNPs Clusters are now more detailed Europe, Middle East, and Central South Asia distinguishable Li et al (2008) Science Clines vs clusters debate Perhaps clusters are due to impact of geographic barriers? Sahara Tibetan Plateau Bering Strait Malay archipelago Or perhaps clusters are an artifact due to gaps in sampling? Either way: Variation is mostly found within rather than between groups Within populations Among population / within groups Among groups Li et al (2008) Science

Fine-scale analysis within a continent: Europe as a case study Americas East Asia Oceania North Africa Mid-East Central Asia Sub-Saharan Africa Europe Admixture proportions assuming K unknown populations K=2 K=3 The closer you look, the more clusters you see K=4 K=5 K=6 Image from EU Panorama 27 Europeans 1400 individuals from 37 unique populations in Europe Analysis using 197,000 SNP loci Jakobsson et al (2008) Nature Two-dimensional summary of European genetic diversity Principal Components Analsysis (PCA)=Reduction of dimensions Pop 1 0 = AA 1 = AB 2 = BB PC1 PC2-008 010-005 005 008-002 004-003 -005 007 008-002 -004 006-002 005 008-002 -004 003 015 010 005 PC2 Individuals Admixed Individuals 0055 0042-0052 0010 0052 0043-0058 0020-0030 -0050 0052 0043-0058 -004 0020-0031 -0053 0014 0533-0044 000-005 -010-006 -004-002 000 002 004 006 PC1 PCA is often a useful tool to visualize patterns of population structure PC1 DK DK SI SI SI PC2 SK SK SK SI Genome-wide patterns of variation mirror the geography of Europe DK DK SI SI SI SK SK PC2 DK Pop 2 PC1 Single Nucleotide Polymorphisms Differentiation found at small spatial-scale Novembre et al (2008) Nature Subtle differentiation within Switzerland

Differentiation within Finland Per locus information content extremely low Minor allele frequency 00 02 04 06 25 random SNPs At any given SNP, there is little variation among populations PCA methods pool weak signals across many, many loci to reveal differentiation NW W C SW S Region of Europe Across loci mean FST=0004 Summary: Clines versus Clusters Applications: Ancestry Inference in Personalized Genomics At a global scale, support for clusters and clines can be observed Unresolved: Do geographic barriers lead to global-scale cluster results or is it uneven sampling? With large numbers of SNP markers, patterns of differentiation are detectable even at small-scales within continents (although often more clinal than clustered ) Note: Variation is still predominately within vs between groups Future directions: With whole genomes - we may detect even more subtle patterns of differentiation What do we mean by ancestry? What do we mean by ancestry? My mtdna test comes back and says I have a Native American mitochondrial haplotype Does this mean: (a) I am is likely to be 100% Native American (b) My mother s side is likely to be 100% Native American (c) My mother s mother s mother s mother s mother s mother s mother was likely Native American (d) Not enough information to tell If we have mixed ancestry, our chromosomes are a patchwork of DNA with different ancestries But by chance, some ancestors are lost and leave no trace in our genomes at all (Example: Purple ancestor) Ancestry is relative to time: implicit interest in the geographic origins of our relatively recent ancestors mtdna and Y- chromsome: Trace only one of your ancestors at any given point back in time

Pushing the limits: Ancestry inference within continents Sensitivity to reference populations: Asian ancestry might actually be Native American Painting is the result of statistical inference, but often difficult to communicate the uncertainty Leave-one-out crossvalidation performance: For countries with n>15: 50%: < 240 km 90%: < 630 km Overall: 50% : < 540 km 90% : < 840 km PCA of HGDP populations 400 km Customer with mixed East Asian / European ancestry show in green Customer appears to be Central Asian Caveat: With PCA, mixed ancestry leads to an intermediate positioning DK G G G G G G G G GG F G G G G G G F F G G G F G G G F F G F F F G F G G G G G F F G G F G F G F F G G G SI F G F F F G F F F G I F F F G F G G G G G F F G F F G G F G G F F G G G G G G F F F F F F G F F F F F F G G G F F F F F F G F F F F F F F G F F G F F G F G F G F F G F F G F G G F F F F F G F F G G F F F F G F G F F G F F G F F F SI F F F G F G F F F F F F F F F F F F F F F G F F F F F F F G F F I U F F F F YA I I I G I I I I I I I SK I Regression-based approach using PC scores PCA and mixed ancestry Performance: Spatial prediction Note: 23andme FAQ warns customer of this issue Summary: Personal ancestry inference What s your opinion: Immense potential, but numerous challenges: Obtaining the appropriate reference samples Communicating to customer: What we really mean by ancestry Statistical uncertainty in the inferences Limitations of particular analyses Danger of customers forming notions of group membership / race? Personal ancestry inference is: (a) A waste of money (b) A harmless hobby if that s what you re in to (c) Fascinating - sign me up (d) The first steps towards genetic elitism Customer Customer

Question: Salivary amylase copy number variation Are humans still evolving? (a) Yes (b) No Japanese individual 10 + 4 = 14 copies AMY1 copy number variation as seen by fluorescent in-situ hybridization (SH) Biaka individual 3 + 3 = 6 copies Copy-number variation is highly variable among human populations 1 + 1 = 2 copies Chimpanzee Diet and Variation in salivary amylase copy number The impact of natural selection on population differences High-starch diet populations have higher average copy number for than low-starch diet populations SNPs in genic regions and especially nonsynonymous SNPs are more often extremely differentiated than non-genic SNPs Positive selection in one population but not another can lead to differentiation at SNPs in the region of the gene under selection The most differentiated SNPs between human groups are likely to be in regions that have undergone recent adaptive evolution Bariero et al (2008) Nature Genetics Differentiation at a random region of Chromosome 15 Categories of genes that appear to have been under selection judging by extreme population differentiation What we see as surface indicators of race (skin, hair, eyecolor) are some of the most unique regions of the genome SNP: rs16971844 Ancestral Allele: A* Derived Allele: G 60 30 0 270 300 30 0 30 0 30 60 90 120 150 Bariero et al (2008) Nature Genetics

Differentiation at around the OCA2 gene OCA2 came up in Lecture 5 Was it a gene that is known to affect: (a) Type II diabetes susceptibility (b) Eye-color (c) Intelligence (d) Cystic-fibrosis Differentiation at SNP rs1448484 in the OCA2 gene SNP: rs1448484 Ancestral Allele: G Derived Allele: A 60 30 0 270 300 30 0 30 0 30 60 90 120 150 SNPs in Human P Protein Gene Lead To Different Eye Colors Differentiation at around the OCA2 gene Human Eye Color TE: OCA2 also affects skin color and hair color Differentiation at SNP rs7496968 (known to affect eye-color) in the OCA2 gene G allele carried on haplotype known to predispose individuals to brown eyes Conclusions: Humans have been recently, and continue to be, evolving Patterns of genetic variation in humans point towards recent common ancestry in Africa With modern large-scale data sets, we can identify the personal ancestry of individuals to a fine spatial scale Many of the most differentiated regions of the genome seem to be the results of selection related to: Novel diets External morphology in response to different climates Immune system / disease evasion Beyond these few differentiated regions (some of which are relevant to medicine), most variation is found globally (and most of the genome doesn t even vary) Intelligent views about our common genetic heritage and diversity will be crucial in our post-genomic world