Lecture 2: Population Structure. Advanced Topics in Computational Genomics


1 Lecture 2: Population Structure. Advanced Topics in Computational Genomics

2 What is population structure? Population structure: a set of individuals characterized by some measure of genetic distinction. A population is usually characterized by a distinct distribution over genotypes. [Figure: example genotype distributions in Population 1 and Population 2.]

3 1000 Genomes Project

4 Motivation. Reconstructing individual ancestry: the Genographic Project (https://genographic.nationalgeographic.com/genographic/index.html). Studying human migration: Out of Africa; multi-regional hypothesis. Study of various traits: lactose intolerance (origins in Europe?); infer from migration studies and from mutation studies in populations.

5 [Figure: time points 200,000, 50,000, 30,000, and 10,000 years ago. Source: https://genographic.nationalgeographic.com/genographic/index.html]

6 Overview. Background: Hardy-Weinberg equilibrium; genetic drift; Wright's F_ST. Inferring population structure from genotype data: Structure (Falush et al., 2003); matrix factorization/dimensionality reduction methods (Engelhardt & Stephens, 2010).

7 Hardy-Weinberg Equilibrium. Under random mating, both allele and genotype frequencies in a population remain constant over generations. Assumptions of the standard random-mating model: diploid organism; sexual reproduction; nonoverlapping generations; random mating; large population size; equal allele frequencies in the sexes; no migration, mutation, or selection. Chi-square test for Hardy-Weinberg equilibrium.

8 Hardy-Weinberg Equilibrium. D, H, R: genotype frequencies for AA, Aa, aa, respectively. p, q: allele frequencies of A and a.

9 Hardy-Weinberg Equilibrium. The genotype and allele frequencies of the offspring.
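
The slides name D, H, R, p, q and a chi-square test, but the formulas did not survive transcription. The following is a minimal sketch, assuming the textbook relations p = D + H/2, q = R + H/2, offspring genotype frequencies (p^2, 2pq, q^2), and a chi-square test with one degree of freedom. The function names are illustrative, and the test assumes both alleles are present in the sample.

```python
import math

def allele_freqs(n_AA, n_Aa, n_aa):
    """Return (p, q) from genotype counts: p = D + H/2, q = R + H/2."""
    n = n_AA + n_Aa + n_aa
    D, H, R = n_AA / n, n_Aa / n, n_aa / n
    return D + H / 2, R + H / 2

def hwe_expected(p, q):
    """Genotype frequencies (AA, Aa, aa) after one round of random mating."""
    return p * p, 2 * p * q, q * q

def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic and p-value (1 degree of freedom) for HWE.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_AA + n_Aa + n_aa
    p, q = allele_freqs(n_AA, n_Aa, n_aa)
    observed = (n_AA, n_Aa, n_aa)
    expected = [n * f for f in hwe_expected(p, q)]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

Applying `allele_freqs` to the frequencies returned by `hwe_expected(p, q)` gives back p and q, which is the constancy over generations stated on slide 7.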

10 Genetic Drift. The change in allele frequencies in a population due to random sampling. A neutral process, unlike natural selection, but genetic drift can still eliminate an allele from a given population. The effect of genetic drift is larger in a small population.
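
Drift as random sampling can be illustrated with a Wright-Fisher simulation. This is a sketch under assumptions that are not in the slides: a diploid population of constant size, one biallelic locus, and each generation drawn by binomial sampling of 2N allele copies. The function names are hypothetical.

```python
import random

def wright_fisher(p0, n_individuals, n_generations, seed=0):
    """Simulate the frequency of one allele under pure drift."""
    rng = random.Random(seed)
    n_copies = 2 * n_individuals  # diploid: two allele copies per individual
    p = p0
    trajectory = [p]
    for _ in range(n_generations):
        # Each copy in the next generation is drawn independently with
        # probability p (binomial sampling of the gene pool).
        count = sum(rng.random() < p for _ in range(n_copies))
        p = count / n_copies
        trajectory.append(p)
    return trajectory

def fraction_fixed_or_lost(p0, n_individuals, n_generations, n_runs=200):
    """Fraction of independent runs in which the allele was fixed or lost."""
    done = 0
    for seed in range(n_runs):
        final = wright_fisher(p0, n_individuals, n_generations, seed)[-1]
        done += final in (0.0, 1.0)
    return done / n_runs
```

Comparing `fraction_fixed_or_lost` for a small and a large `n_individuals` over the same number of generations shows the last point of the slide: alleles are eliminated or fixed much sooner in the small population.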

11 Population Divergence: Wright's F_ST. A statistic used to quantify the extent of divergence among multiple populations relative to the overall genetic diversity. Summarizes the average deviation of a collection of populations away from the mean. $F_{ST} = \mathrm{Var}(p_k) / (\bar{p}(1 - \bar{p}))$, where $\bar{p}$ is the overall frequency of an allele across all subpopulations and $p_k$ is the allele frequency within population k.
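
A minimal sketch of the F_ST formula for a single biallelic locus. It assumes equally weighted subpopulations, so the overall frequency is the plain mean of the per-population frequencies and Var(p_k) is the population variance (divided by K); the slides do not state a weighting scheme. The function name `fst` is illustrative.

```python
def fst(pop_freqs):
    """Wright's F_ST = Var(p_k) / (p_bar * (1 - p_bar)) for one locus.

    pop_freqs: allele frequency p_k in each subpopulation.
    Assumes equal weights and 0 < p_bar < 1.
    """
    k = len(pop_freqs)
    p_bar = sum(pop_freqs) / k
    var = sum((p - p_bar) ** 2 for p in pop_freqs) / k
    return var / (p_bar * (1 - p_bar))
```

Identical frequencies in every subpopulation give 0 (no divergence), and subpopulations fixed for different alleles give 1.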

12 Scenarios of How Populations Evolve

13 Methods for Learning Population Structure from Genetic Markers. Low-dimensional projection: PCA-based methods (Patterson et al., PLoS Genetics 2006). Clustering: distance-based (Bowcock et al., Nature 1994); model-based: STRUCTURE (Pritchard et al., Genetics 2000), mstruct (Shringarpure & Xing, Genetics 2008).

14 Probabilistic Models for Population Structure. Mixture model: clusters individuals into K populations. Admixture model: the genotypes of each individual are an admixture of multiple ancestral populations; assumes alleles are in linkage equilibrium. Linkage model: models recombination and the correlation in alleles across the chromosome. F model: models correlation in alleles in ancestry.

15 Mixture Model. K populations. z^(i): population of origin of individual i. For each of the K populations, p_klj: the frequency of allele j at locus l in population k.
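
To make the notation concrete, here is a sketch of how z^(i) could be scored given known allele frequencies p_klj. It assumes independent loci, Hardy-Weinberg equilibrium within each population, and a uniform prior over populations, and it picks the single most likely population rather than sampling z^(i) as an MCMC fit would. The names are hypothetical.

```python
import math

def log_likelihood(genotype, freqs_k):
    """Log-probability of a diploid genotype under population k.

    genotype: list of (allele_a, allele_b) index pairs, one per locus.
    freqs_k[l][j]: frequency p_klj of allele j at locus l in population k.
    Constant factors that do not depend on k are dropped.
    """
    return sum(math.log(freqs_k[l][a]) + math.log(freqs_k[l][b])
               for l, (a, b) in enumerate(genotype))

def assign_population(genotype, freqs):
    """Return the index k that maximizes the likelihood of the genotype."""
    scores = [log_likelihood(genotype, freqs_k) for freqs_k in freqs]
    return max(range(len(freqs)), key=lambda k: scores[k])
```

All frequencies must be strictly positive here, since the sketch takes logarithms without smoothing.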

16 Admixture Model. Relaxes the mixture model's assumption of one ancestral population per individual: individuals can have ancestors in multiple different populations. q_k^(i): proportion of individual i's genome derived from population k. Alleles at different loci can come from different populations.
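
The generative reading of the admixture model can be sketched as follows, assuming each allele copy independently draws its population of origin from q^(i) and then its allele from that population's frequencies at the locus. The data layout and the function name `sample_individual` are illustrative, not taken from the slides.

```python
import random

def sample_individual(q, freqs, seed=0):
    """Sample one diploid individual from the admixture model.

    q[k]: admixture proportion of population k for this individual.
    freqs[k][l][j]: frequency of allele j at locus l in population k.
    Returns (genotype, labels): per-locus allele pairs and the
    population of origin of each allele copy.
    """
    rng = random.Random(seed)
    n_loci = len(freqs[0])
    genotype, labels = [], []
    for l in range(n_loci):
        copies, origins = [], []
        for _ in range(2):  # two allele copies per locus (diploid)
            k = rng.choices(range(len(q)), weights=q)[0]
            alleles = range(len(freqs[k][l]))
            j = rng.choices(alleles, weights=freqs[k][l])[0]
            origins.append(k)
            copies.append(j)
        genotype.append(tuple(copies))
        labels.append(tuple(origins))
    return genotype, labels
```

Setting q to a one-hot vector recovers the mixture model, in which every allele copy of the individual comes from a single population.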

17 Structure Model. Hypothesis: modern populations are created by an intermixing of ancestral populations. An individual's genome contains contributions from one or more ancestral populations. The contributions of populations can differ between individuals. Other assumptions: Hardy-Weinberg equilibrium; no linkage disequilibrium; markers are i.i.d. (independent and identically distributed).

18 Linkage Model. Starting from the admixture model, replace the assumption that the ancestry labels z_il for individual i and locus l are independent with the assumption that adjacent z_il are correlated. Use a Poisson process to model the correlation between neighboring alleles. d_l: distance between locus l and locus l+1. r: recombination rate.

19 Linkage Model. As the recombination rate r goes to infinity, all loci become independent and the linkage model becomes the admixture model. The recombination rate r can be viewed as being related to the number of generations since admixture occurred. Use an MCMC algorithm to fit the unknown parameters.
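
The transition rule itself is not in the transcript. The sketch below assumes the usual Poisson-process form: with probability exp(-d_l r) there is no breakpoint between locus l and l+1 and the ancestry label is kept; otherwise a new label is drawn from the individual's admixture proportions q. Treat that form, and the name `transition_probs`, as assumptions.

```python
import math

def transition_probs(k_prev, q, d, r):
    """P(z_{l+1} = k | z_l = k_prev) for every population k.

    q: admixture proportions of the individual (sums to 1).
    d: distance between locus l and locus l+1.
    r: recombination rate.
    """
    stay = math.exp(-d * r)  # probability of no breakpoint between the loci
    probs = [(1 - stay) * q_k for q_k in q]
    probs[k_prev] += stay
    return probs
```

With r = 0 the label never changes along the chromosome, and as d * r grows the probabilities approach q regardless of `k_prev`, which is the admixture-model limit described on the slide.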

20 F Model. Introduces correlations in allele frequencies among ancestral populations. p_Al: allele frequencies in the ancestral population, modeled with a symmetric Dirichlet distribution. Subpopulations of the ancestral population go through genetic drift at different rates F_k. Individuals are admixtures of those K populations, which went through genetic drift from the common ancestral population.

21 F Model. Relationship between F_k and F_ST. Designed to distinguish between closely related populations with similar allele frequencies.

22 Scenarios of How Populations Evolve

23 Unknown Parameters To Be Estimated. q_i: the admixture proportions of individual i. p_k: allele frequencies of population k. z_i: population label for each locus of individual i. r: recombination rate. F_k: estimate of population divergence from the ancestral population.

24 Population Structure from Ancestry Proportion of Each Individual. How to display population structure? [Figure: ancestral proportion of each individual, grouped by region: Africa, Europe, Mid-East, Cent./S. Asia, East Asia, Oceania.] Source: Genetic Structure of Human Populations (Rosenberg et al., Science 2002).

25 Population of Origin Assignments of a Single Individual. [Figure panels: true origin; estimated origin (phased data); estimated origin (unphased data).]

26 Admixture vs Divergence

27 Posterior Distribution of Recombination Rate. [Figure panels: using the original dataset; after permuting the genotype loci.]

28 Distinguishing Between Two Closely Related Populations

29 Three Sources of Linkage Disequilibrium. Mixture LD: due to variation in ancestry across individuals, which induces correlation among markers at different loci; modeled by the admixture model. Admixture LD: due to unbroken chunks of DNA derived from an ancestral population; modeled by the linkage model. Background LD: due to LD within populations; decays at a smaller scale.

30 Low-dimensional Projections. Genetic data is very large: the number of markers may range from a few hundred to hundreds of thousands, so each individual is described by a high-dimensional vector of marker configurations. A low-dimensional projection allows easy visualization. Technique used: factor analysis. Many statistical methods exist (ICA, PCA, NMF, etc.). Principal Components Analysis (next slide) allows projection of individuals into a low-dimensional space, usually 2 dimensions to allow visualization.

31 Principal Component Analysis. The most common form of factor analysis. The new variables/dimensions: are linear combinations of the original ones; are uncorrelated with one another (orthogonal in the original dimension space); capture as much of the original variance in the data as possible; are called principal components. Demo at http://

32 What are the new axes? [Figure: axes PC 1 and PC 2 drawn over Original Variable A and Original Variable B.] Orthogonal directions of greatest variance in the data. Projections along PC 1 discriminate the data most along any one axis.

33 Principal Components. The first principal component is the direction of greatest variability (covariance) in the data. The second is the next orthogonal (uncorrelated) direction of greatest variability: first remove all the variability along the first component, then find the next direction of greatest variability. And so on.

34 Dimensionality Reduction. Components of lesser significance can be ignored. You do lose some information, but if the eigenvalues are small, you don't lose much. With n dimensions in the original data: calculate n eigenvectors and eigenvalues; choose only the first p eigenvectors, based on their eigenvalues; the final data set then has only p dimensions.
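
The procedure on the last two slides (find the direction of greatest variance, remove it, repeat, keep the first p) can be sketched with power iteration and deflation on the sample covariance matrix. This is an illustrative pure-Python sketch, not the method of any particular package; real analyses would normally use an eigendecomposition or SVD routine. The fixed starting vector is an assumption and would fail if it were exactly orthogonal to the leading eigenvector.

```python
import math

def covariance(data):
    """Return (sample covariance matrix, mean-centered data)."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return cov, centered

def top_eigenpair(mat, n_iter=500):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    d = len(mat)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(mat[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    value = sum(v[a] * mat[a][b] * v[b] for a in range(d) for b in range(d))
    return value, v

def pca(data, p):
    """Keep the first p principal components of the data."""
    cov, centered = covariance(data)
    d = len(cov)
    eigenvalues, components = [], []
    for _ in range(p):
        value, v = top_eigenpair(cov)
        eigenvalues.append(value)
        components.append(v)
        # Deflate: remove the variability along the component just found.
        cov = [[cov[a][b] - value * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    # Project each centered sample onto the p retained components.
    scores = [[sum(r[j] * v[j] for j in range(d)) for v in components]
              for r in centered]
    return eigenvalues, components, scores
```

The `scores` rows are the p-dimensional coordinates of each individual; with p = 2 they are the points one would plot.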

35 PCA Analysis (Cavalli-Sforza, 1978). Plot of the geographical distribution of 3 PCs (intensity proportional to the value of each component). First: blue. Second: green. Third: red.

36 Matrix Factorization and Population Structure. Matrix factorization for learning population structure: genotype data (N x P matrix; N: number of samples, P: number of genotypes) = individuals' ancestry proportions (N x K matrix; K: number of subpopulations) x subpopulation allele frequencies (K x P matrix).
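
A small numeric sketch of the factorization shapes. All values are made up for illustration, and the doubling step assumes genotypes coded as 0/1/2 copies of an allele, which the slide does not specify.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x p) matrix given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# N = 3 individuals, K = 2 subpopulations, P = 4 markers (illustrative values).
ancestry = [          # N x K: each row sums to 1
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
pop_allele_freqs = [  # K x P: allele frequency per subpopulation and marker
    [0.9, 0.1, 0.8, 0.2],
    [0.1, 0.7, 0.2, 0.6],
]

# N x P: allele frequency implied for each individual at each marker.
individual_freqs = matmul(ancestry, pop_allele_freqs)

# Assumption: with 0/1/2 genotype coding, the expected genotype is twice
# the individual-specific allele frequency.
expected_genotypes = [[2 * f for f in row] for row in individual_freqs]
```

The first two individuals belong entirely to one subpopulation, so their rows equal the corresponding rows of `pop_allele_freqs`; the third is an even admixture and gets the average of the two.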

37 Unifying Framework of Matrix Factorization. Admixture: based on probability models; rows of Λ and columns of F should sum to 1. Works well if the individuals are admixtures of discretely separated populations. PCA: based on eigendecomposition; columns of Λ are orthogonal, rows of F are orthonormal. Works well for the case of isolation-by-distance (continuous variation of populations among individuals). Sparse factor model: sparsity via an automatic relevance determination prior.

38 Discrete/Admixed Populations. [Figure: Loadings 1, 2, and 3 for SFA, PCA, and Admixture.]

39 Isolation-by-Distance Models

40 Clustered Populations in 1d Habitat. [Figure: results for SFA, Admixture, and PCA, with panels assuming two populations and assuming five populations.]

41 Analysis of European Genotype Data. [Figure panels: PCA, SFAm, Admixture.]

42 Comparison of Different Methods. PCA. Advantages: statistical tests for significance of results (Patterson et al. 2006); easy visualization. Disadvantages: no intuition about underlying processes. Model-based clustering. Advantages: generative process that explicitly models admixture; clustering is probabilistic, so it is possible to assign a confidence level to clusters. Disadvantages: computationally more demanding; based on assumptions of evolutionary models (Structure has no models of mutation or recombination; mutation was added in mstruct; recombination was added in the extension by Falush et al.).


More information

MICROEVOLUTION. On the Origin of Species WHAT IS A SPECIES? WHAT IS A POPULATION? Genetic variation: how do new forms arise?

MICROEVOLUTION. On the Origin of Species WHAT IS A SPECIES? WHAT IS A POPULATION? Genetic variation: how do new forms arise? MICROEVOLUTION On the Origin of Species WHAT IS A SPECIES? Individuals in one or more populations Potential to interbreed Produce fertile offspring WHAT IS A POPULATION? Group of interacting individuals

More information

A genome wide association study of metabolic traits in human urine

A genome wide association study of metabolic traits in human urine Supplementary material for A genome wide association study of metabolic traits in human urine Suhre et al. CONTENTS SUPPLEMENTARY FIGURES Supplementary Figure 1: Regional association plots surrounding

More information

Genetics and Bioinformatics

Genetics and Bioinformatics Genetics and Bioinformatics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be Lecture 3: Genome-wide Association Studies 1 Setting

More information

Evolutionary Mechanisms

Evolutionary Mechanisms Evolutionary Mechanisms Tidbits One misconception is that organisms evolve, in the Darwinian sense, during their lifetimes Natural selection acts on individuals, but only populations evolve Genetic variations

More information

Computational Genomics

Computational Genomics Computational Genomics 10-810/02 810/02-710, Spring 2009 Quantitative Trait Locus (QTL) Mapping Eric Xing Lecture 23, April 13, 2009 Reading: DTW book, Chap 13 Eric Xing @ CMU, 2005-2009 1 Phenotypical

More information

A Primer of Ecological Genetics

A Primer of Ecological Genetics A Primer of Ecological Genetics Jeffrey K. Conner Michigan State University Daniel L. Hartl Harvard University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts U.S.A. Contents Preface xi Acronyms,

More information

Statistical Tests for Admixture Mapping with Case-Control and Cases-Only Data

Statistical Tests for Admixture Mapping with Case-Control and Cases-Only Data Am. J. Hum. Genet. 75:771 789, 2004 Statistical Tests for Admixture Mapping with Case-Control and Cases-Only Data Giovanni Montana and Jonathan K. Pritchard Department of Human Genetics, University of

More information

Introduction Chapter 23 - EVOLUTION of

Introduction Chapter 23 - EVOLUTION of Introduction Chapter 23 - EVOLUTION of POPULATIONS The blue-footed booby has adaptations that make it suited to its environment. These include webbed feet, streamlined shape that minimizes friction when

More information

POPULATION GENETICS Winter 2005 Lecture 18 Quantitative genetics and QTL mapping

POPULATION GENETICS Winter 2005 Lecture 18 Quantitative genetics and QTL mapping POPULATION GENETICS Winter 2005 Lecture 18 Quantitative genetics and QTL mapping - from Darwin's time onward, it has been widely recognized that natural populations harbor a considerably degree of genetic

More information

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros DNA Collection Data Quality Control Suzanne M. Leal Baylor College of Medicine sleal@bcm.edu Copyrighted S.M. Leal 2016 Blood samples For unlimited supply of DNA Transformed cell lines Buccal Swabs Small

More information

Decoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics

Decoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics Decoding Chromatin States with Epigenome Data 02-715 Advanced Topics in Computa8onal Genomics HMMs for Decoding Chromatin States Epigene8c modifica8ons of the genome have been associated with Establishing

More information

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool.

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population! Genetic variation in a population increases the chance that some individuals

More information

Lecture 6: GWAS in Samples with Structure. Summer Institute in Statistical Genetics 2015

Lecture 6: GWAS in Samples with Structure. Summer Institute in Statistical Genetics 2015 Lecture 6: GWAS in Samples with Structure Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2015 1 / 25 Introduction Genetic association studies are widely used for the identification

More information

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics S G ection ON tatistical enetics Design and Analysis of Genetic Association Studies Hemant K Tiwari, Ph.D. Professor & Head Section on Statistical Genetics Department of Biostatistics School of Public

More information

LACTASE PERSISTENCE: EVIDENCE FOR SELECTION

LACTASE PERSISTENCE: EVIDENCE FOR SELECTION LACTASE PERSISTENCE: EVIDENCE FOR SELECTION OVERVIEW This activity focuses on the genetic changes that gave rise to lactase persistence an example of recent human evolution. Students explore the evidence

More information

Exam 1, Fall 2012 Grade Summary. Points: Mean 95.3 Median 93 Std. Dev 8.7 Max 116 Min 83 Percentage: Average Grade Distribution:

Exam 1, Fall 2012 Grade Summary. Points: Mean 95.3 Median 93 Std. Dev 8.7 Max 116 Min 83 Percentage: Average Grade Distribution: Exam 1, Fall 2012 Grade Summary Points: Mean 95.3 Median 93 Std. Dev 8.7 Max 116 Min 83 Percentage: Average 79.4 Grade Distribution: Name: BIOL 464/GEN 535 Population Genetics Fall 2012 Test # 1, 09/26/2012

More information

Population Genetics. Ben Hecht CRITFC Genetics Training December 11, 2013

Population Genetics.   Ben Hecht CRITFC Genetics Training December 11, 2013 Population Genetics http://darwin.eeb.uconn.edu/simulations/drift.html Ben Hecht CRITFC Genetics Training December 11, 2013 1 Population Genetics The study of how populations change genetically over time

More information

Analysis of genome-wide genotype data

Analysis of genome-wide genotype data Analysis of genome-wide genotype data Acknowledgement: Several slides based on a lecture course given by Jonathan Marchini & Chris Spencer, Cape Town 2007 Introduction & definitions - Allele: A version

More information

QTL Mapping Using Multiple Markers Simultaneously

QTL Mapping Using Multiple Markers Simultaneously SCI-PUBLICATIONS Author Manuscript American Journal of Agricultural and Biological Science (3): 195-01, 007 ISSN 1557-4989 007 Science Publications QTL Mapping Using Multiple Markers Simultaneously D.

More information

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool.

11.1 Genetic Variation Within Population. KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population KEY CONCEPT A population shares a common gene pool. 11.1 Genetic Variation Within Population Genetic variation in a population increases the chance that some individuals

More information

Structure, Measurement & Analysis of Genetic Variation

Structure, Measurement & Analysis of Genetic Variation Structure, Measurement & Analysis of Genetic Variation Sven Cichon, PhD Professor of Medical Genetics, Director, Division of Medcial Genetics, University of Basel Institute of Neuroscience and Medicine

More information

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture20: Haplotype testing and Minimum GWAS analysis steps Jason Mezey jgm45@cornell.edu April 17, 2017 (T) 8:40-9:55 Announcements Project

More information

Multi-SNP Models for Fine-Mapping Studies: Application to an. Kallikrein Region and Prostate Cancer

Multi-SNP Models for Fine-Mapping Studies: Application to an. Kallikrein Region and Prostate Cancer Multi-SNP Models for Fine-Mapping Studies: Application to an association study of the Kallikrein Region and Prostate Cancer November 11, 2014 Contents Background 1 Background 2 3 4 5 6 Study Motivation

More information

The Evolution of Populations

The Evolution of Populations Chapter 23 The Evolution of Populations PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

Section KEY CONCEPT A population shares a common gene pool.

Section KEY CONCEPT A population shares a common gene pool. Section 11.1 KEY CONCEPT A population shares a common gene pool. Genetic variation in a population increases the chance that some individuals will survive. Why it s beneficial: Genetic variation leads

More information

The Evolution of Populations

The Evolution of Populations Microevolution The Evolution of Populations C H A P T E R 2 3 Change in allele frequencies over generations Three mechanisms cause allele frequency change: Natural selection (leads to adaptation) Genetic

More information

The Evolution of Populations

The Evolution of Populations Chapter 23 The Evolution of Populations PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

Lecture #3 1/23/02 Dr. Kopeny Model of polygenic inheritance based on three genes

Lecture #3 1/23/02 Dr. Kopeny Model of polygenic inheritance based on three genes Lecture #3 1/23/02 Dr. Kopeny Model of polygenic inheritance based on three genes Reference; page 230 in textbook 13 Genotype; The genetic constitution governing a heritable trait of an organism Phenotype:

More information

Genome-Wide Association Studies. Ryan Collins, Gerissa Fowler, Sean Gamberg, Josselyn Hudasek & Victoria Mackey

Genome-Wide Association Studies. Ryan Collins, Gerissa Fowler, Sean Gamberg, Josselyn Hudasek & Victoria Mackey Genome-Wide Association Studies Ryan Collins, Gerissa Fowler, Sean Gamberg, Josselyn Hudasek & Victoria Mackey Introduction The next big advancement in the field of genetics after the Human Genome Project

More information

Genetic Drift Lecture outline. 1. Founder effect 2. Genetic drift consequences 3. Population bottlenecks 4. Effective Population size

Genetic Drift Lecture outline. 1. Founder effect 2. Genetic drift consequences 3. Population bottlenecks 4. Effective Population size Genetic Drift Lecture outline. Founder effect 2. Genetic drift consequences 3. Population bottlenecks 4. Effective Population size Odd populations Deer at Seneca Army Depot Cheetah Silvereyes (Zosterops

More information

Spectrum: joint bayesian inference of population structure and recombination events

Spectrum: joint bayesian inference of population structure and recombination events BIOINFORMATICS Vol. 23 ISMB/ECCB 2007, pages i479 i489 doi:0.093/bioinformatics/btm7 Spectrum: joint bayesian inference of population structure and recombination events Kyung-Ah Sohn and Eric P. Xing*

More information

Genetic Variation, Biological Pathways, and Networks. Sarah Pendergrass Center for Systems Genomics

Genetic Variation, Biological Pathways, and Networks. Sarah Pendergrass Center for Systems Genomics Genetic Variation, Biological Pathways, and Networks Sarah Pendergrass Center for Systems Genomics Outline What are networks and why are they important in biology? Biological Pathways Why do we care about

More information

CS 680: Assembly and Analysis of Sequencing Data. Fall 2012 August 21st, 2012

CS 680: Assembly and Analysis of Sequencing Data. Fall 2012 August 21st, 2012 CS 680: Assembly and Analysis of Sequencing Data Fall 2012 August 21st, 2012 Logis@cs of the Course Logis@cs About the Course Instructor: Chris@na Boucher email: cboucher@cs.colostate.edu Office: CSB 464

More information