Population differentiation analysis of 54,734 European Americans reveals independent evolution of ADH1B gene in Europe and East Asia


1 Population differentiation analysis of 54,734 European Americans reveals independent evolution of ADH1B gene in Europe and East Asia. Kevin Galinsky, Harvard T. H. Chan School of Public Health. American Society of Human Genetics, October 10, 2015.

2 Population differentiation can detect recent selection. (Sabeti et al., Science)

3 Example: High-altitude EPAS1 gene in Tibetan populations. (Yi et al., Science)

4 Selected loci generate false-positive signals in GWAS. Diagram labels: Population Stratification, Phenotype, Natural Selection, Population Differentiation, Ancestry Informative Markers. (Price et al., PLoS Genet; Tian et al., PLoS Genet)

5 Outline: Overview of method; Simulations; Signals of selection in GERA data

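6 Population differentiation examines allele frequency differences. Diagram: an ancestral population with allele frequency $p$ splits into Pop 1 ($p_1$) and Pop 2 ($p_2$), separated by $F_{ST}$. $E[p_1 - p_2] = 0$; $\mathrm{Var}(p_1 - p_2) = 2F_{ST}\,p(1-p)$; $p_1 - p_2 \sim N\left(0,\ 2F_{ST}\,p(1-p)\right)$. (Weir & Hill 2002, Annu Rev Genet; Bhatia et al., Genome Res)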

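7 Population differentiation examines allele frequency differences. Diagram: Sample 1 and Sample 2 are drawn from Pop 1 and Pop 2, giving sample frequencies $\hat p_1$ and $\hat p_2$. $\hat p_i \sim N\left(p_i,\ \frac{p_i(1-p_i)}{2N_i}\right)$; $p_1 - p_2 \sim N\left(0,\ 2F_{ST}\,p(1-p)\right)$.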

8 Population differentiation examines allele frequency differences. $D = \hat p_1 - \hat p_2$; $D \sim N\left(0,\ p(1-p)\left(2F_{ST} + \frac{1}{2N_1} + \frac{1}{2N_2}\right)\right)$. Smaller $F_{ST}$ = more power; greater sample size = more power. (Bhatia et al., AJHG)
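As a rough illustration of the statistic on this slide, the sketch below computes $D$, its variance under drift plus sampling noise, and a 1-d.o.f. chi-square p-value for a single SNP. The function name and the choice to approximate the ancestral frequency $p$ by a sample-size-weighted mean are assumptions made for this example, not details taken from the talk.

```python
import math

def differentiation_test(p1_hat, p2_hat, fst, n1, n2):
    """Return (D, Var(D), chi-square statistic, p-value) for one SNP.

    Hypothetical helper: p is approximated by the sample-size-weighted
    mean of the two sample frequencies (an assumption of this sketch).
    """
    d = p1_hat - p2_hat
    p = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    # Var(D) = p(1-p) * (2*F_ST + 1/(2*N1) + 1/(2*N2))
    var_d = p * (1 - p) * (2 * fst + 1 / (2 * n1) + 1 / (2 * n2))
    chi2 = d * d / var_d
    # Survival function of chi-square with 1 d.o.f.: erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return d, var_d, chi2, p_value
```

For a fixed $D$, lowering `fst` or raising `n1` and `n2` shrinks the variance and so increases the statistic, which is the sense in which smaller $F_{ST}$ and larger samples give more power.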

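9 Extending selection statistic to fractional ancestry. Diagram: a single sample with ancestry from Pop 1 ($p_1$) and Pop 2 ($p_2$). $\alpha$: ancestry; $x$: genotypes. $\hat p_1 = \frac{\alpha^T x}{2\,\alpha^T \mathbf{1}}$, $\hat p_2 = \frac{(1-\alpha)^T x}{2\,(1-\alpha)^T \mathbf{1}}$. $\hat p_1$ and $\hat p_2$ still follow a normal distribution, so $D$ still follows a normal distribution, but its variance is unknown.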
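The ancestry-weighted frequency estimates can be sketched in a few lines. This is a minimal illustration for one SNP, assuming `x` holds genotype counts (0, 1, or 2) and `alpha` holds each individual's fractional ancestry from Pop 1; the function name is hypothetical.

```python
def ancestry_weighted_frequencies(x, alpha):
    """Estimate (p1_hat, p2_hat) for one SNP from fractional ancestry.

    x     : list of genotype counts in {0, 1, 2}, one per individual
    alpha : list of Pop 1 ancestry fractions in [0, 1], one per individual
    """
    # p1_hat = alpha^T x / (2 * alpha^T 1)
    p1_hat = sum(a * g for a, g in zip(alpha, x)) / (2 * sum(alpha))
    # p2_hat = (1 - alpha)^T x / (2 * (1 - alpha)^T 1)
    p2_hat = sum((1 - a) * g for a, g in zip(alpha, x)) / (
        2 * sum(1 - a for a in alpha)
    )
    return p1_hat, p2_hat
```

When every `alpha` value is 0 or 1, this reduces to the ordinary per-population sample frequencies from the discrete two-population case.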

10 Approach extends to PCs in the absence of population information. $X$: genotypes; $Y$: normalized genotypes, $y_i = \frac{x_i - 2p_i}{\sqrt{2p_i(1-p_i)}}$. Eigenvectors $v_1, \dots, v_K$ correspond to ancestry: $\alpha_k = \beta_{0k} + \beta_{1k} v_k$; $D_{ik} \propto y_i v_k$. (Galinsky et al., bioRxiv; under revision, AJHG)

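11 Normalizing the SNP loadings with eigenvalues. Two problems: $\mathrm{Var}(D)$ is unknown, and $D = C\,yv$ with $C$ unknown. PCA is SVD: $Y = U\Sigma V^T$, with $U$ and $V$ orthonormal and $\Sigma$ diagonal, so $U = YV\Sigma^{-1}$. $\mathrm{Var}(U_{ij}) = M^{-1}$, hence $\frac{M}{\Sigma_k^2}\,(y_i v_k)^2 \sim \chi^2_1$.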
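The normalization on this slide can be sketched with NumPy's SVD. This is an illustrative toy, not the FastPCA implementation: it assumes a small dense genotype matrix with SNPs in rows, no monomorphic SNPs, and allele frequencies estimated from the same samples. The function name is hypothetical.

```python
import math
import numpy as np

def pc_selection_statistics(x, k):
    """Chi-square (1 d.o.f.) statistics and p-values for M SNPs on the top k PCs.

    x : (M SNPs x N samples) array of genotype counts in {0, 1, 2};
        assumes no SNP is monomorphic (otherwise the normalization divides by 0).
    """
    x = np.asarray(x, dtype=float)
    m = x.shape[0]
    p = x.mean(axis=1, keepdims=True) / 2        # per-SNP allele frequency
    y = (x - 2 * p) / np.sqrt(2 * p * (1 - p))   # normalized genotypes Y
    u, s, vt = np.linalg.svd(y, full_matrices=False)
    # U = Y V Sigma^{-1}, so M * U_ik^2 equals (M / Sigma_k^2) * (y_i . v_k)^2
    stats = m * u[:, :k] ** 2
    # chi-square(1) survival function: erfc(sqrt(s / 2))
    pvals = np.vectorize(lambda t: math.erfc(math.sqrt(t / 2)))(stats)
    return stats, pvals
```

Because each column of $U$ has unit norm, the statistics for a given PC sum to $M$, so their mean is 1, matching the mean of a $\chi^2_1$ variable.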

12 Selection statistic produces genome-wide significant signals. Method produces a p-value. Correct for $M$ SNPs tested and $K$ PCs examined.

13 Outline: Overview of method; Simulations; Signals of selection in GERA data

14 Simulation scheme. Diagram: an ancestral population ($p$) splits with $F_{ST}$ into Pop 1 ($p_1$) and Pop 2 ($p_2$); Sample 1 ($X_1$) and Sample 2 ($X_2$) feed into PCs. Differentiated SNPs: $D = p_1 - p_2$. Remaining diagram labels: $Z_1, Z_2$; Selection.

15 Selection statistic under null is well calibrated

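16 Power: smaller $F_{ST}$ = more power. $D \sim N\left(0,\ p(1-p)\left(2F_{ST} + \frac{1}{2N_1} + \frac{1}{2N_2}\right)\right)$. [Figure: one curve per $F_{ST}$ value; x-axis: population differentiation ($D = p_1 - p_2$)]

17 Power: larger sample size = more power. $D \sim N\left(0,\ p(1-p)\left(2F_{ST} + \frac{1}{2N_1} + \frac{1}{2N_2}\right)\right)$. [Figure: fixed $F_{ST}$, one curve per sample size $N$ = 50k, 20k, 10k, 5k, 2k, 1k; x-axis: population differentiation ($D = p_1 - p_2$)]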
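The two power slides can be mimicked analytically. The sketch below assumes a simple alternative in which selection adds a fixed shift `delta` to the mean of $D$ while the variance stays at its null value; that model, the function name, and the default significance level are assumptions of this example rather than the simulation scheme used in the talk.

```python
from statistics import NormalDist

def power_to_detect(delta, p, fst, n1, n2, alpha=5e-8):
    """Two-sided power to detect a mean shift `delta` in D (assumed model)."""
    # Null standard deviation of D from the slide's variance formula
    sd = (p * (1 - p) * (2 * fst + 1 / (2 * n1) + 1 / (2 * n2))) ** 0.5
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = delta / sd                    # standardized effect
    return std.cdf(shift - z_crit) + std.cdf(-shift - z_crit)
```

Under this model, power rises as `fst` falls or as `n1` and `n2` grow, since both shrink the null standard deviation that the shift is measured against.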

18 Outline: Overview of method; Simulations; Signals of selection in GERA data

19 Detecting natural selection in a real dataset. Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. Looked at samples with European ancestry: 670,176 SNPs, 62,318 samples. (Banda et al., Genetics)

20 Data processing steps. Original: 670k SNPs, 62k samples. QC filtering removes related samples, samples with <90% European ancestry, SNPs with missing data, and low-MAF SNPs, leaving 609k SNPs and 55k samples (used for selection). LD filtering (HWE filter, LD filter) leaves 162k SNPs for PCA, giving PCs for 55k samples. FastPCA: linear time algorithm, 57 minutes CPU time, 2.6 GB RAM.

21 PC1 selection signals: LCT, ADH1B, HLA, OCA2

22 PC2 selection signals: IGFBP3, IGH, OCA2

23 PC3 selection signals: LCT, IRF4, HLA, OCA2

24 PC4 selection signals: IRF4, HLA, Chromosome 8 inversion. (Tian et al., PLoS Genet)

25 Removing significant regions removes LD signals. PC1: LCT, ADH1B, HLA, OCA2. PC2: IGFBP3, IGH, OCA2. PC3: LCT, IRF4, HLA, OCA2. PC4: IRF4, HLA (no chromosome 8 inversion). (Tian et al., PLoS Genet)

26 Detect regions with novel selection. Table columns: Locus, Chr, Region (Mb), Best Hit, PC, p-value. Loci: LCT, ADH1B, IRF4, HLA, IGFBP3, Chr8 Inversion, IGH, OCA2. Genome-wide significant after testing 609k SNPs × 4 PCs = 2.4M hypotheses. p-value threshold =
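The multiple-testing correction described here (M SNPs times K PCs) amounts to a Bonferroni-style division. The sketch below assumes a family-wise error rate of 0.05 and uses the rounded count of 609k SNPs from the slide; both are assumptions of this example, and the result is not a restatement of the slide's own threshold.

```python
def bonferroni_threshold(n_snps, n_pcs, alpha=0.05):
    """Per-test p-value threshold when testing n_snps SNPs on n_pcs PCs."""
    return alpha / (n_snps * n_pcs)

# Rounded figures from the slide: 609k SNPs, 4 PCs (alpha = 0.05 is assumed)
n_hypotheses = 609_000 * 4
threshold = bonferroni_threshold(609_000, 4)
print(f"{n_hypotheses:,} hypotheses, threshold = {threshold:.2e}")
```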

27 ADH1B: Known phenotype, under selection in East Asia. Alcohol dehydrogenase. rs is C/T SNP and Arg47His. ADH1B*47His variant more active. Oxidizes ethanol to acetaldehyde. Protective against alcoholism. Known to be under selection in East Asia. Detected novel signal of selection in Europeans. (Han et al., AJHG; Dick et al., Alcohol Clin Exp Res)

28 ADH1B: Independent selection event in Europeans. Selection in East Asia detected using EHH methods. H7 haplotype detected; contains variant at regulatory SNP rs. Variant not found in Europeans. (Han et al., AJHG; Yi et al., Ann Hum Genet)

Haplotype | rs | rs | Asian (CHB, CHS, JPT) | European (CEU, FIN, GBR, IBS, TSI) | African (ASW, LWK, YRI)
H5 | G | T | 0.36% | 1.12% | 0%
H5b | G | T | 0.18% | 0.56% | 0%
H6 | G | T | 12.1% | 0.28% | 0%
H7 | A | T | 60.9% | 0% | 0%
Other | G | C | 28.0% | 91.7% | 100%

29 Novel selection signal at IGFBP3: Gene with cancer and BP phenotypes. Insulin-like growth factor-binding protein 3. IGF transport protein in blood; mediates effects of IGF. IGF associated with: breast cancer; implicated in other cancers; blood pressure phenotypes. (Al-Zahrani et al., Hum. Mol. Genet. 2006; Jogie-Brahim et al., Endocr Rev. 2009; Ganesh et al., AJHG 2014; Zhu et al., AJHG)

30 Novel selection signal IGH: Locus associated with MS. Immunoglobulin heavy locus. Contains variants for heavy chains of antibodies. Associated with multiple sclerosis, thought to be an autoimmune disorder. (Buck et al., Ann. Neurol)

31 Conclusions. Presented method to detect natural selection: without a priori ancestry; genome-wide significant signals. Novel selection in GERA: ADH1B in Europeans, IGFBP3, IGH.

32 Acknowledgements: Alkes Price, Gaurav Bhatia, Po-Ru Loh, Sayan Mukherjee, Google, Stoyan Georgiev, Nick Patterson. Galinsky et al., bioRxiv (under revision, AJHG)


More information

Whole-Genome Genetic Data Simulation Based on Mutation-Drift Equilibrium Model

Whole-Genome Genetic Data Simulation Based on Mutation-Drift Equilibrium Model 2012 4th International Conference on Computer Modeling and Simulation (ICCMS 2012) IPCSIT vol.22 (2012) (2012) IACSIT Press, Singapore Whole-Genome Genetic Data Simulation Based on Mutation-Drift Equilibrium

More information

A Bayesian Graphical Model for Genome-wide Association Studies (GWAS)

A Bayesian Graphical Model for Genome-wide Association Studies (GWAS) A Bayesian Graphical Model for Genome-wide Association Studies (GWAS) Laurent Briollais 1,2, Adrian Dobra 3, Jinnan Liu 1, Hilmi Ozcelik 1 and Hélène Massam 4 Prosserman Centre for Health Research, Samuel

More information

Effects of cis and trans Genetic Ancestry on Gene Expression in African Americans

Effects of cis and trans Genetic Ancestry on Gene Expression in African Americans Effects of cis and trans Genetic Ancestry on Gene Expression in African Americans Alkes L. Price 1,2 *, Nick Patterson 3, Dustin C. Hancks 4, Simon Myers 5, David Reich 3,6, Vivian G. Cheung 4,7,8,9, Richard

More information

Introduction to statistics for Genome- Wide Association Studies (GWAS) Day 2 Section 8

Introduction to statistics for Genome- Wide Association Studies (GWAS) Day 2 Section 8 Introduction to statistics for Genome- Wide Association Studies (GWAS) 1 Outline Background on GWAS Presentation of GenABEL Data checking with GenABEL Data analysis with GenABEL Display of results 2 R

More information

Illumina s GWAS Roadmap: next-generation genotyping studies in the post-1kgp era

Illumina s GWAS Roadmap: next-generation genotyping studies in the post-1kgp era Illumina s GWAS Roadmap: next-generation genotyping studies in the post-1kgp era Anthony Green Sr. Genotyping Sales Specialist North America 2010 Illumina, Inc. All rights reserved. Illumina, illuminadx,

More information

Lees J.A., Vehkala M. et al., 2016 In Review

Lees J.A., Vehkala M. et al., 2016 In Review Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes Lees J.A., Vehkala M. et al., 2016 In Review Journal Club Triinu Kõressaar 16.03.2016 Introduction Bacterial

More information

KNN-MDR: a learning approach for improving interactions mapping performances in genome wide association studies

KNN-MDR: a learning approach for improving interactions mapping performances in genome wide association studies Abo Alchamlat and Farnir BMC Bioinformatics (2017) 18:184 DOI 10.1186/s12859-017-1599-7 METHODOLOGY ARTICLE Open Access KNN-MDR: a learning approach for improving interactions mapping performances in genome

More information

On detecting incomplete soft or hard selective sweeps using haplotype structure

On detecting incomplete soft or hard selective sweeps using haplotype structure MBE Advance Access published February 18, 2014 On detecting incomplete soft or hard selective sweeps using haplotype structure Anna Ferrer-Admetlla, 1,2,3,4, Mason Liang, 1, Thorfinn Korneliussen, 5 and

More information

SUPPLEMENTAL MATERIAL

SUPPLEMENTAL MATERIAL SUPPLEMENTAL MATERIAL Supplementary Table 1: RT-qPCR primer sequences. Sequences are shown from 5 to 3 direction; all primers are designed using mouse genome as reference. 36B4-F; TGAAGCAAAGGAAGAGTCGGAGGA

More information

A A T A C G T G A G T G A G G C C T T A A

A A T A C G T G A G T G A G G C C T T A A Living in a Genomic World: Unraveling the Complexity of GWAS Studies by Rivka L. Glaser, Department of Biology, Stevenson University, Stevenson, MD Erin L. Zimmer, Department of Biology, Lewis University,

More information

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1 Human SNP haplotypes Statistics 246, Spring 2002 Week 15, Lecture 1 Human single nucleotide polymorphisms The majority of human sequence variation is due to substitutions that have occurred once in the

More information

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014 Single Nucleotide Variant Analysis H3ABioNet May 14, 2014 Outline What are SNPs and SNVs? How do we identify them? How do we call them? SAMTools GATK VCF File Format Let s call variants! Single Nucleotide

More information

Statistical Tests for Admixture Mapping with Case-Control and Cases-Only Data

Statistical Tests for Admixture Mapping with Case-Control and Cases-Only Data Am. J. Hum. Genet. 75:771 789, 2004 Statistical Tests for Admixture Mapping with Case-Control and Cases-Only Data Giovanni Montana and Jonathan K. Pritchard Department of Human Genetics, University of

More information

An introduction to genetics and molecular biology

An introduction to genetics and molecular biology An introduction to genetics and molecular biology Cavan Reilly September 5, 2017 Table of contents Introduction to biology Some molecular biology Gene expression Mendelian genetics Some more molecular

More information

MAPPING BY ADMIXTURE LINKAGE DISEQUILIBRIUM: ADVANCES, LIMITATIONS AND GUIDELINES

MAPPING BY ADMIXTURE LINKAGE DISEQUILIBRIUM: ADVANCES, LIMITATIONS AND GUIDELINES Nature Reviews Genetics AOP, published online 12 July 25; doi:1.138/nrg1657 REVIEWS MAPPING BY ADMIXTURE LINKAGE DISEQUILIBRIUM: ADVANCES, LIMITATIONS AND GUIDELINES Michael W. Smith* and Stephen J. O

More information

Genetics and Psychiatric Disorders Lecture 1: Introduction

Genetics and Psychiatric Disorders Lecture 1: Introduction Genetics and Psychiatric Disorders Lecture 1: Introduction Amanda J. Myers LABORATORY OF FUNCTIONAL NEUROGENOMICS All slides available @: http://labs.med.miami.edu/myers Click on courses First two links

More information

Identifying Selected Regions from Heterozygosity and Divergence Using a Light-Coverage Genomic Dataset from Two Human Populations

Identifying Selected Regions from Heterozygosity and Divergence Using a Light-Coverage Genomic Dataset from Two Human Populations Nova Southeastern University NSUWorks Biology Faculty Articles Department of Biological Sciences 3-5-2008 Identifying Selected Regions from Heterozygosity and Divergence Using a Light-Coverage Genomic

More information

Strengthening genomic studies of cardiometabolic diseases in Africa the AWI-Gen experience

Strengthening genomic studies of cardiometabolic diseases in Africa the AWI-Gen experience Strengthening genomic studies of cardiometabolic diseases in Africa the AWI-Gen experience Human Heredity and Health in Africa ASHG - October 2015 Inaugural H3Africa meeting Addis Ababa - October 2012

More information

Gene-centric Genomewide Association Study via Entropy

Gene-centric Genomewide Association Study via Entropy Gene-centric Genomewide Association Study via Entropy Yuehua Cui 1,, Guolian Kang 1, Kelian Sun 2, Minping Qian 2, Roberto Romero 3, and Wenjiang Fu 2 1 Department of Statistics and Probability, 2 Department

More information