Linkage Disequilibrium. Adele Crane & Angela Taravella



Overview
- Introduction to linkage disequilibrium (LD)
- Measuring LD
- Genetic & demographic factors shaping LD
- Model predictions and expected LD decay
- Patterns of LD in human populations
- GWAS and fine-scale mapping

What is Linkage Disequilibrium (LD)?
- Linkage disequilibrium: non-independence of alleles at different sites (Pritchard & Przeworski, 2001)
- LD exists due to shared ancestry
- In the absence of recombination, diversity arises through mutation
- Ardlie et al. (2002)

Example: Allele Frequencies at Biallelic Loci
Locus 1 (alleles A and a) and Locus 2 (alleles B and b) are studied for LD.

Allele            | A        | a        | B        | b
Allele frequency  | $p_A$    | $p_a$    | $p_B$    | $p_b$

Gamete            | AB       | Ab       | aB       | ab
Gamete frequency  | $p_{AB}$ | $p_{Ab}$ | $p_{aB}$ | $p_{ab}$

Example: Allele Frequencies at Biallelic Loci
Expected gamete frequencies when the loci are in linkage equilibrium (the loci are independent):
- $p_{AB} = p_A p_B$
- $p_{Ab} = p_A p_b$
- $p_{aB} = p_a p_B$
- $p_{ab} = p_a p_b$
How do we quantify the difference between expected and observed frequencies?

LD Measurements: $D$ & $D'$
Linkage disequilibrium coefficient $D$:
- $D = p_{AB} - p_A p_B$
- $D = 0$ in linkage equilibrium
- $D \neq 0$ in linkage disequilibrium
- The +/- sign of $D$ depends on how the alleles are labeled
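The definition of $D$ can be sketched in a few lines of Python. The gamete frequencies below are hypothetical values chosen for illustration, and the function name `ld_coefficient` is an assumption, not something from the slides.

```python
def ld_coefficient(p_xy, p_x, p_y):
    """Return D = p_xy - p_x * p_y for one chosen allele at each locus."""
    return p_xy - p_x * p_y

# Hypothetical gamete (haplotype) frequencies; they must sum to 1.
gametes = {"AB": 0.5, "Ab": 0.1, "aB": 0.1, "ab": 0.3}

# Allele frequencies are the marginal sums of the gamete frequencies.
p_A = gametes["AB"] + gametes["Ab"]
p_a = 1.0 - p_A
p_B = gametes["AB"] + gametes["aB"]

# D measured with alleles A and B, then again after relabeling (a instead of A).
D_AB = ld_coefficient(gametes["AB"], p_A, p_B)
D_aB = ld_coefficient(gametes["aB"], p_a, p_B)
print(round(D_AB, 4), round(D_aB, 4))
```

With these made-up frequencies the two values have the same magnitude (about 0.14) and opposite signs, which illustrates why the sign of $D$ depends on allele labeling.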

LD Measurements: $D$ & $D'$
Normalized coefficient $D'$ is a better measurement:
- $D$ depends on allele frequencies
- $D' = |D| / D_{max}$, where $D_{max}$ is the maximum possible value of $|D|$ given the allele frequencies
- $D' = 1$ if the alleles have not been separated by recombination during the history of the sample analyzed (complete linkage disequilibrium)
- $D' < 1$ if LD has been disrupted
- Weakness: $D'$ values can be inflated by small samples or low minor-allele frequencies
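A minimal sketch of the normalization, assuming the usual bounds on $D$ for two biallelic loci: $D_{max} = \min(p_A p_b, p_a p_B)$ when $D > 0$ and $\min(p_A p_B, p_a p_b)$ when $D < 0$. The function name and the example frequencies are hypothetical.

```python
def d_prime(p_AB, p_A, p_B):
    """Return D' = |D| / D_max for two biallelic loci."""
    p_a, p_b = 1.0 - p_A, 1.0 - p_B
    D = p_AB - p_A * p_B
    # The largest |D| allowed by the allele frequencies depends on the sign of D.
    if D >= 0:
        d_max = min(p_A * p_b, p_a * p_B)
    else:
        d_max = min(p_A * p_B, p_a * p_b)
    if d_max == 0:
        # A locus is monomorphic, so D' is undefined; this sketch returns 0.
        return 0.0
    return abs(D) / d_max

# Partial LD: all four gametes are present.
partial = d_prime(p_AB=0.5, p_A=0.6, p_B=0.6)

# Complete LD: gamete Ab is absent (p_AB equals p_A), so D' reaches 1.
complete = d_prime(p_AB=0.2, p_A=0.2, p_B=0.5)
print(round(partial, 4), round(complete, 4))
```

In the second call only three of the four possible gametes exist, which is the situation the slide describes as complete linkage disequilibrium.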

LD Measurements: $r^2$ ($\Delta^2$)
- $r^2 = 1$ if the alleles have not been separated by recombination and have the same allele frequency (perfect linkage disequilibrium)
- $r^2$ is less inflated by small sample sizes
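The slide text does not spell out the formula, so the sketch below assumes the standard definition $r^2 = D^2 / (p_A p_a p_B p_b)$. The function name and input frequencies are hypothetical.

```python
def r_squared(p_AB, p_A, p_B):
    """Return r^2 = D^2 / (p_A * p_a * p_B * p_b) for two biallelic loci."""
    D = p_AB - p_A * p_B
    denom = p_A * (1.0 - p_A) * p_B * (1.0 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return D * D / denom

# Complete but not perfect LD: only three gametes exist (D' = 1),
# yet the allele frequencies differ (0.2 vs 0.5), so r^2 stays well below 1.
unequal = r_squared(p_AB=0.2, p_A=0.2, p_B=0.5)

# Perfect LD: only gametes AB and ab exist, so both loci share one frequency.
perfect = r_squared(p_AB=0.3, p_A=0.3, p_B=0.3)
print(round(unequal, 4), round(perfect, 4))
```

The first case gives $r^2 = 0.25$ even though no recombinant gamete is present, which is why $r^2 = 1$ additionally requires matching allele frequencies.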

Comparison of $D'$ and $r^2$
- Figure (Pritchard & Przeworski, 2001): simulated decay of $D'$ and $r^2$ as a function of genetic distance (cM) under constant population size and random mating
- $D'$ and $r^2$ behave differently: a high value of $D'$ does not necessarily imply a high value of $r^2$
- There is more random variation in $D'$ values

LD parameter $\rho$
- $r^2$ is inversely related to $\rho = 4 N_e c$
  - $N_e$: effective population size
  - $c$: recombination rate (varies over time and across regions)
  - $\rho$ is a scaled recombination rate
- Expected $r^2$: $E(r^2) \approx 1 / (1 + \rho)$
- Large $N_e$ (large $\rho$): $E(r^2) \approx 1 / \rho$
- LD increases as $\rho$ decreases
- Advantages:
  - Can compare LD observed in studies using different marker spacing or types of data (SNP vs. microsatellite data)
  - Provides an estimate of the recombination rate per generation
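To get a feel for the numbers, here is a rough sketch of $\rho = 4 N_e c$ and the approximation $E(r^2) \approx 1/(1+\rho)$. The values are assumptions for illustration only: $N_e = 10^4$ and a uniform recombination rate of $10^{-8}$ per base pair per generation (about 1 cM/Mb). The function names are hypothetical.

```python
def scaled_recombination_rate(n_e, c):
    """rho = 4 * N_e * c, with c the recombination rate between the two loci."""
    return 4.0 * n_e * c

def expected_r2(rho):
    """Approximate expectation E(r^2) ~ 1 / (1 + rho)."""
    return 1.0 / (1.0 + rho)

N_E = 10_000        # assumed effective population size
RATE_PER_BP = 1e-8  # assumed recombination rate per bp per generation

for distance_bp in (1_000, 10_000, 100_000):
    c = RATE_PER_BP * distance_bp
    rho = scaled_recombination_rate(N_E, c)
    # For large rho, 1 / (1 + rho) approaches 1 / rho.
    print(distance_bp, round(rho, 2), round(expected_r2(rho), 4))
```

Under these assumptions, expected $r^2$ falls from roughly 0.7 at 1 kb to about 0.02 at 100 kb. Real recombination rates vary along the genome, so this is only an order-of-magnitude guide.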

Genetic Factors Shaping LD
- Recombination
  - Hotspots: high recombination breaks down haplotype blocks, giving low LD
  - Comparisons of LD in different parts of the genome may not be informative unless local recombination rates are known
  - Myers hotspot motif
- Mutation
  - Introduces diversity into haplotype blocks (especially in non-recombining regions)
- Inversions
  - Suppress recombination
  - Strong LD can develop
- Gene conversion
  - Can affect short-scale LD
  - LD may be broken up by gene conversion

Selection Shaping LD
- Hitchhiking effect
  - A haplotype near a favored variant is swept to high frequency or fixation
- Background selection
  - Loss of diversity at a neutral locus due to negative selection against linked deleterious alleles
- Epistatic selection
  - Epistasis: interaction between genes (e.g., suppression of phenotypic expression)
  - Needs to be strong to maintain allelic association over long distances

Demographic Factors Shaping LD
- Inbreeding
  - Inbreeding: mating between related individuals
  - Decreased diversity levels can increase LD
  - Minor effect in humans
- Bottlenecks
  - A temporary reduction in population size can increase LD
  - Long-term bottlenecks can lead to a sharp reduction in $N_e$ and thus higher LD
  - Populations outside of Africa have higher LD
- Admixture
  - LD between unlinked sites is seen at the time of admixture
  - Long-range LD increases with recent admixture of populations with different allele frequencies
  - Rapid decay: this LD breaks down with random mating
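The admixture points can be illustrated with a small sketch. It assumes two source populations that are each in linkage equilibrium, mixed in proportions $m$ and $1 - m$, followed by random mating, under which $D$ shrinks by a factor $(1 - c)$ per generation. The function names and all frequencies are hypothetical.

```python
def admixture_ld(m, p_A1, p_B1, p_A2, p_B2):
    """D in a mixture of two populations that are each in linkage equilibrium."""
    # Gamete AB frequency in the mixture (within each source, p_AB = p_A * p_B).
    p_AB = m * p_A1 * p_B1 + (1.0 - m) * p_A2 * p_B2
    p_A = m * p_A1 + (1.0 - m) * p_A2
    p_B = m * p_B1 + (1.0 - m) * p_B2
    return p_AB - p_A * p_B

def ld_after(d0, c, generations):
    """D after t generations of random mating: D_t = D_0 * (1 - c)^t."""
    return d0 * (1.0 - c) ** generations

# Equal mixing of two populations with very different allele frequencies.
d0 = admixture_ld(m=0.5, p_A1=0.9, p_B1=0.8, p_A2=0.1, p_B2=0.2)

# Unlinked loci recombine freely (c = 0.5); tightly linked loci decay slowly.
unlinked = ld_after(d0, c=0.5, generations=5)
linked = ld_after(d0, c=0.01, generations=5)
print(round(d0, 4), round(unlinked, 5), round(linked, 4))
```

Even unlinked loci show LD immediately after mixing, but it halves every generation, whereas LD between tightly linked sites persists much longer.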

Demographic Models and Expected LD Decay
Figure panels (Pritchard & Przeworski, 2001): r hat plotted against genetic distance (cM).
- Standard model: panmictic population of constant size ($N_e = 10^4$). Considerable variability is expected.
- Kruglyak model: exponential population growth, from $10^4$ to $5 \times 10^9$. Low LD between loci is expected under this model because of the large $N_e$.
- 1-island sample: all individuals are drawn from the same subpopulation.
- 2-island sample: individuals are drawn equally from both subpopulations. Population structure tends to increase levels of LD.

Demographic Models and Expected LD Decay: Different Growth Models
- Neutral model (solid line)
- Population growth leads to a reduction in LD, but the effect is not as great as with the Kruglyak model
- Kruglyak model (long dashed line): an expanding population gives a dramatic reduction in LD
- LD decay can be used to make inferences about human demographic history
- Pritchard & Przeworski, 2001

Patterns of LD in Human Populations
- LD in global populations
  - LD increases outside of Africa
  - Bottlenecks
- LD in African populations
  - Southern African origin for modern humans
- Figure: LD decay averaged across populations within each of six geographic regions
- Figure: the highest correlation coefficient (in blue) indicates the best fit with a potential geographic origin

GWAS and LD
- Genome-wide association study (GWAS): testing cases and controls to identify variants potentially associated with a disease trait
- A marker locus with low $r^2$ to the disease susceptibility mutation has little power to detect the association
- Want a marker locus linked to the disease susceptibility mutation
- Need a marker density that gives a high probability of strong LD between at least one marker locus and the disease susceptibility mutation
- Issues with finding causative SNPs (gene localization)
  - Long-range LD is problematic
  - Human populations vary in LD and recombination
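One way to make the link between $r^2$ and power concrete is the common rule of thumb that testing a marker instead of the causal variant requires roughly $1 / r^2$ times as many samples for the same power. The sketch below applies that rule. The rule is an approximation that is not stated on the slide, and the function name and sample sizes are hypothetical.

```python
import math

def sample_size_at_marker(n_at_causal, r2):
    """Approximate sample size needed at a marker, using n / r^2."""
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must be in the interval (0, 1]")
    return math.ceil(n_at_causal / r2)

# Hypothetical study: 1000 samples would suffice if the causal SNP were typed.
for r2 in (1.0, 0.5, 0.25):
    print(r2, sample_size_at_marker(1000, r2))
```

Under this approximation, a marker with $r^2 = 0.25$ to the susceptibility mutation needs about four times the sample, which is why dense marker panels with at least one strongly correlated marker are preferred.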

References
- Ardlie, K., Kruglyak, L., & Seielstad, M. "Patterns of linkage disequilibrium in the human genome." Nature Reviews Genetics 3 (2002): 299-309. doi:10.1038/nrg777
- Henn, B. M., et al. "Hunter-gatherer genomic diversity suggests a southern African origin for modern humans." Proceedings of the National Academy of Sciences 108.13 (2011): 5154-5162.
- International HapMap Consortium. "A haplotype map of the human genome." Nature 437 (2005): 1299-1320. doi:10.1038/nature04226
- Jallow, M., et al. "Genome-wide and fine-resolution association analysis of malaria in West Africa." Nature Genetics 41.6 (2009): 657-665.
- Jobling, M., Hurles, M., & Tyler-Smith, C. Human Evolutionary Genetics: Origins, Peoples & Disease. Garland Science, 2013.
- Pritchard, J. K., & Przeworski, M. "Linkage disequilibrium in humans: models and data." The American Journal of Human Genetics 69.1 (2001): 1-14.