Update on the Genomics Data in the Health and Retirement Study. Sharon Kardia, Jennifer A. Smith. University of Michigan, April 2013


1 Update on the Genomics Data in the Health and Retirement Study. Sharon Kardia, Jennifer A. Smith. University of Michigan, April 2013

2 Genetic variation in SNPs (Single Nucleotide Polymorphisms)
ATTGCAATCCGTGG...ATCGAGCCA.TACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCA.TACGATTGCAAGCCG
ATTGCAAGCCGTGG...ATCTAGCCA.TACGATTGCAAGCCG
ATTGCAATCCGTGG...ATCGAGCCA.TACGATTGCACGCCG
ATTGCAAGCCGTGG...ATCTAGCCA.TACGATTGCAAGCCG

3 Genotypes are called with varying uncertainty [Figure axes: Intensity of Allele A, Intensity of Allele G]

4 Two easy ways of dealing with uncertain genotypes
1. Genotype calling: choose the most likely genotype and continue as if it is true (p11 = 10%, p12 = 20%, p22 = 70% => G = 2).
2. Mean genotype (dosage): use the weighted average genotype (p11 = 10%, p12 = 20%, p22 = 70% => G = 1.6).
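The two options on this slide can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and it assumes the three probabilities correspond to genotypes coded 0, 1, and 2 copies of the counted allele, which is the coding that reproduces the slide's values of G = 2 and G = 1.6.

```python
def best_guess(probs):
    """Genotype calling: return the genotype code (0, 1, 2) with the highest probability."""
    return max(range(len(probs)), key=lambda g: probs[g])


def dosage(probs):
    """Mean genotype: probability-weighted average of the genotype codes."""
    return sum(g * p for g, p in enumerate(probs))


probs = [0.10, 0.20, 0.70]  # p11, p12, p22 from the slide
call = best_guess(probs)
dose = dosage(probs)
print(call, round(dose, 2))
```

The dosage keeps the uncertainty that the hard call discards, which is why it is often preferred as the predictor in association models.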

5 After Data Cleaning: Imputation
HapMap Consortium reference (completed): used in first wave of GWAS; imputed ~2.4 million SNPs; only completed on whites and blacks; not posted.
1000 Genomes reference (completed): used in second wave of GWAS; imputed ~22 million SNPs; posted to dbGaP in August.

6 Different Genotyping Platforms measure different SNPs

7 Linkage Disequilibrium (LD) is the correlation among mutations across SNPs. Markers close together on chromosomes are often transmitted together, creating a correlation between the mutations. LD also arises when populations mix (admixture).

8 Basic Concepts
Parent 1 (haplotypes A B and a b) x Parent 2 (haplotypes A B and a b)
Haplotypes shown for the offspring: A B, A B, a b, a b, or A b, a B, A B, A B, a b, A B, A B, a b, a B, A B, A b, A b, etc.
High LD -> No recombination (r² = 1); SNP1 tags SNP2.
Low LD -> Recombination; many possibilities.

9 Key Terms
LD (linkage disequilibrium): For a pair of SNP alleles, it's a measure of deviation from random association. Measured by D, r².
Phased haplotypes: Estimated distribution of SNP alleles in a genomic region on a chromosome. We have a pair of each chromosome, so we have pairs of haplotypes.
Tag SNPs: Minimum set of SNPs needed to identify a haplotype. High LD (e.g. r² > 0.8) indicates two SNPs are nearly redundant, so each one acts as a tag for the other.
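As a small worked illustration of the two LD measures named here, the sketch below computes D and r² from a list of phased two-SNP haplotypes, using the standard definitions D = p_AB - p_A * p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The function name and the toy haplotypes are hypothetical, and both SNPs are assumed to be polymorphic.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, alleles coded 0/1.
    Assumes both SNPs are polymorphic, otherwise r2 is undefined.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n                 # frequency of allele 1 at SNP1
    p_b = sum(h[1] for h in haplotypes) / n                 # frequency of allele 1 at SNP2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n    # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                    # deviation from random association
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Perfect LD: allele at SNP1 always predicts the allele at SNP2.
perfect = [(1, 1)] * 5 + [(0, 0)] * 5
# No LD: all four haplotypes equally common.
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]

d_perfect, r2_perfect = ld_stats(perfect)
d_indep, r2_indep = ld_stats(independent)
print(d_perfect, r2_perfect, d_indep, r2_indep)
```

In the first toy sample each SNP is a perfect tag for the other; in the second, knowing one SNP says nothing about the other.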

10 HapMap Project
Phase 1: 269 samples (4 panels); genotyping centers: HapMap International Consortium; unique QC+ SNPs: 1.1 M; reference: Nature (2005) 437:p1299.
Phase 2*: 270 samples (4 panels); genotyping centers: Perlegen; unique QC+ SNPs: 3.8 M (phase I+II); reference: Nature (2007) 449:p851.
Phase 3: 1,115 samples (11 panels); genotyping centers: Broad & Sanger; unique QC+ SNPs: 1.6 M (Affy 6.0 & Illumina 1M); reference: Draft Rel. 1 (May 2008).
*This is the version that most genetics consortia are using.

11 Haplotype Blocks are mapped across the genome (for each ethnicity)

12 Figure 2. A schematic of SNP types as defined in the IMPUTE2 imputation algorithm. Each individual is represented by a unique color in the horizontal bar(s), and alternate alleles at each SNP are represented as A and B. Section (A) represents phased reference haplotypes, where two samples (4 phased chromosomes) are shown. Section (B) represents three study samples with SNP genotype calls, as would be observed in a GWAS array experiment. Section (C) identifies the SNP type of each position shown. Type 2 SNPs have data in both the reference and the study samples: positions 1, 4, 6, 8, and 11. Type 0 SNPs have data in the reference but not in the study samples: positions 3, 5, 9, 10, and 12. Thus, data at type 2 SNPs (imputation basis) are used to impute type 0 SNPs (imputation target) in the study samples. Type 3 SNPs are those in study samples but not in the reference; ultimately, these SNPs are extraneous to the imputation, which is why they are shown in white text. This figure is based on IMPUTE2 background documentation (see Web Resources).
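The type 0 / type 2 / type 3 classification in this caption amounts to set operations on the reference and study SNP positions. The sketch below reproduces the positions listed in the caption; the function name is hypothetical, and positions 2 and 7 are assumed to be the type 3 SNPs only because they are the positions from 1 to 12 that the caption does not list.

```python
def classify_snps(reference_positions, study_positions):
    """Split SNP positions into IMPUTE2-style types from the two position sets."""
    ref, study = set(reference_positions), set(study_positions)
    return {
        "type2": sorted(ref & study),   # in both: imputation basis
        "type0": sorted(ref - study),   # reference only: imputation target
        "type3": sorted(study - ref),   # study only: ignored by the imputation
    }


reference = [1, 3, 4, 5, 6, 8, 9, 10, 11, 12]
study = [1, 2, 4, 6, 7, 8, 11]  # 2 and 7 are assumed type 3 positions
snp_types = classify_snps(reference, study)
print(snp_types)
```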

13 Observed genotypes Study Sample HapMap Gonçalo Abecasis

14 Identify match among reference Gonçalo Abecasis

15 Phase chromosomes, impute missing genotypes Gonçalo Abecasis

16

17

18 One more tricky piece: Haplotypes cross over (recombine)

19 Hidden Markov Model
Hidden state $S_m$: the pair of contributing reference haplotypes at marker m.
Data $G_m$: observed genotypes at marker m.
Goal: infer $S_m$.

20 Algorithm
Update one individual at a time: construct haplotypes that match observed genotypes, from a pool of reference haplotypes.
Forward: calculate, cumulatively until the last marker, forward probabilities for observed genotypes and haplotype affiliation state $S_m$.
Backward: sample haplotype affiliation states (i.e., construct mosaic haplotypes) probabilistically according to forward probabilities and transition probabilities.
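The forward and backward steps above can be sketched on a deliberately simplified model. This is not the MaCH or IMPUTE2 implementation: it assumes a single (haploid) observed chromosome copying from one reference haplotype at a time, rather than the pair of haplotypes in the slide's hidden state, and the `switch` and `err` parameters, function names, and toy data are all hypothetical.

```python
import random


def forward(ref, obs, switch=0.05, err=0.01):
    """Forward probabilities for a haploid copying model.

    ref: K reference haplotypes, each a list of 0/1 alleles over M markers.
    obs: observed alleles over the same markers; None marks an untyped marker.
    """
    k_ref = len(ref)

    def emit(k, m):
        if obs[m] is None:          # untyped marker carries no information
            return 1.0
        return 1.0 - err if ref[k][m] == obs[m] else err

    fwd = [[emit(k, 0) / k_ref for k in range(k_ref)]]
    for m in range(1, len(obs)):
        prev = fwd[-1]
        total = sum(prev)
        # Either keep copying the same haplotype, or switch to a random one.
        fwd.append([
            ((1.0 - switch) * prev[k] + switch * total / k_ref) * emit(k, m)
            for k in range(k_ref)
        ])
    return fwd


def sample_path(fwd, rng, switch=0.05):
    """Sample a mosaic of reference haplotypes, last marker first."""
    k_ref = len(fwd[0])
    states = list(range(k_ref))
    path = [rng.choices(states, weights=fwd[-1])[0]]
    for m in range(len(fwd) - 2, -1, -1):
        nxt = path[-1]
        weights = [
            fwd[m][k] * ((1.0 - switch if k == nxt else 0.0) + switch / k_ref)
            for k in states
        ]
        path.append(rng.choices(states, weights=weights)[0])
    path.reverse()
    return path


def impute(ref, obs, seed=0, switch=0.05, err=0.01):
    """Fill untyped markers with the allele of the sampled reference haplotype."""
    rng = random.Random(seed)
    fwd = forward(ref, obs, switch, err)
    path = sample_path(fwd, rng, switch)
    return [ref[path[m]][m] if a is None else a for m, a in enumerate(obs)]


ref = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]
obs = [1, 1, None, 1, 1]   # marker 3 is untyped in the study sample
imputed = impute(ref, obs)
print(imputed)
```

Because the path is sampled rather than maximized, repeating the draw many times and averaging the filled-in alleles gives a dosage-like estimate for each untyped marker.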

21 1000 Genomes Imputation Strategy
Impute entire sample. Pre-phase (estimate haplotypes). Impute SNPs with > 4 copies of minor allele in any of the following HapMap racial/ethnic groups: African (AFR), Admixed American (AMR), East Asian (ASN), European (EUR).

22 The 1000 Genomes Reference sample
Full population name (abbreviation): number of samples
African ancestry (total 246): African Ancestry in Southwest US (ASW): 61; Luhya in Webuye, Kenya (LWK): 97; Yoruba in Ibadan, Nigeria (YRI): 88.
American ancestry (total 181): Colombian in Medellin, Colombia (CLM): 60; Mexican Ancestry in Los Angeles, CA (MXL): 66; Puerto Rican in Puerto Rico (PUR): 55.
Asian ancestry (total 286): Han Chinese in Beijing, China (CHB): 97; Han Chinese South, China (CHS): 100; Japanese in Tokyo, Japan (JPT): 89.
European ancestry (total 379): Utah residents (CEPH) with Northern and Western European ancestry (CEU): 85; Toscani in Italia (TSI): 98; British in England and Scotland (GBR): 89; Finnish in Finland (FIN): 93; Iberian populations in Spain (IBS): 14.
An overview of the 1,092 samples in the 1000 Genomes Project worldwide reference panel (phase I integrated variant set v3, March 2012), which was used to impute all study participants. Each population was assigned to one of four continental groupings: African (AFR), American (AMR), Asian (ASN), and European (EUR). All haplotypes in the phased reference panel are for unrelated, founder individuals only. This table is based on reference panel data downloaded from IMPUTE2 and the sample summary provided by the Project (see Web resources).

23 1000 Genomes Imputation
Columns: Chromosome; Study SNPs; Imputation basis; Imputation output. [Per-chromosome rows for chromosomes 1-22 and X are garbled in this transcription.]
Totals: Study SNPs 2,195,306; Imputation basis 2,065,320; Imputation output 21,632,048.
Study SNPs: study SNPs passing pre-imputation filters (IMPUTE2 SNP types 2 and 3). Imputation basis: study SNPs passing pre-imputation filters and overlapping with the reference panel (type 2). Imputation output: the sum of imputation basis (type 2) and imputation target (type 0) SNPs. Type 0 SNPs have been restricted to those with at least 4 copies of the minor allele in AFR, AMR, ASN, or EUR reference samples.

24 Quality control: Prediction of Known Genotypes
Columns: minor allele frequency (in study samples); number of SNPs; mean (median) overall concordance; mean (median) empirical dosage r².
MAF < 0.1: 1,059,[...] SNPs; concordance [...] (0.998); dosage r² [...] (0.919).
MAF ≥ 0.1: 1,005,[...] SNPs; concordance [...] (0.994); dosage r² [...] (0.994).
Quality metrics for all masked SNPs, dichotomized into groups of MAF < 0.1 vs. MAF ≥ 0.1. The second column shows the number of SNPs in each MAF group. Mean and median values are presented for overall genotype concordance and empirical dosage r² (in IMPUTE2 metrics files, labeled as concord_type0 and r2_type0, respectively). No info threshold has been applied here, such that all masked and imputed SNPs in each MAF category are included in these averages.
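To make the two masked-SNP metrics concrete, the sketch below computes them for one SNP from the known genotypes and the imputed genotype probabilities. It is an illustration under stated assumptions, not IMPUTE2's code: concordance is taken as the fraction of samples whose most likely imputed genotype equals the known genotype, and dosage r² as the squared Pearson correlation between known genotype and imputed dosage. Function names and data are hypothetical.

```python
def concordance(true_genotypes, genotype_probs):
    """Fraction of samples whose best-guess imputed genotype matches the known one."""
    hits = 0
    for g, probs in zip(true_genotypes, genotype_probs):
        best = max(range(3), key=lambda k: probs[k])
        hits += best == g
    return hits / len(true_genotypes)


def dosage_r2(true_genotypes, genotype_probs):
    """Squared correlation between known genotypes and imputed dosages."""
    dosages = [p[1] + 2 * p[2] for p in genotype_probs]
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(dosages) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(true_genotypes, dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    return cov * cov / (var_t * var_d)


# Four samples at one masked SNP: known genotype and imputed P(0), P(1), P(2).
true_genotypes = [0, 1, 2, 2]
genotype_probs = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.1, 0.9],
    [0.1, 0.6, 0.3],   # best guess is 1, but the known genotype is 2
]
conc = concordance(true_genotypes, genotype_probs)
r2 = dosage_r2(true_genotypes, genotype_probs)
print(conc, round(r2, 3))
```

Concordance tends to look high at rare SNPs simply because most genotypes are the common homozygote, which is one reason the table reports dosage r² alongside it.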

25 Figure 3. Summaries of quality metrics at all imputed SNPs. Panel A shows the distribution of the info quality metric, with a dashed line indicating a potential 0.3 threshold value. Panel B is the distribution of certainty, the average certainty of best guess genotypes. Panel C summarizes the relationship between the info score and MAF. The secondary axis indicates the count of SNPs in each MAF bin (0.01 intervals).

26 Figure 4. A comparison of imputation quality metrics by chromosome for all imputed SNPs, info in panel A and certainty in panel B. Outlier values are not displayed in these box plots. On the x axis, 23 denotes the X chromosome.

27 Figure 5. Quality metrics for all masked SNPs, grouped into MAF bins at 0.01 intervals. Panel (A) shows the number of SNPs per MAF bin and, on the secondary y axis, the fraction of SNPs in the bin passing an info filter threshold of 0.8. Panel (B) plots the average empirical dosage r² metric per MAF bin, both before and after filtering on the info score (black and gray data series, respectively). Similarly, panel (C) is the concordance between the observed and the most likely imputed genotype at masked SNPs within each MAF bin, with and without the info filter.

28 Apo E Imputation

29 Description (columns: Whites, African Americans, Hispanics)
Gene: APOE.
SNP: rs429358, rs7412.
Is the SNP genotyped on the Omni2.5? rs: No; rs7412: Yes.
Call rate on Omni2.5 (from CIDR): rs: N/A; rs7412: 0% from CIDR, 90% after re-clustering.
Imputation reference database (build and website): 1000 Genomes reference panels generated at University of Michigan (Interim Phase I, data freeze, haplotypes) http:// PhaseI Interim.html
Imputation reference panel: Whites: the EUR reference panel (87 CEU + 98 TSI + 89 GBR + 93 FIN + 14 IBS). African Americans: the EUR + AFR (88 YRI + 97 LWK + 61 ASW) reference panels. Hispanics: the EUR + AMR (60 CLM + 66 MXL + 55 PUR) reference panels.

30 Description (columns: Whites, African Americans, Hispanics)
Pre-imputation quality control: exclude SNPs with MAF < 0.01 or HWE < [...] (all three groups).
Region and number of SNPs in imputation: [...] SNPs upstream and 1000 SNPs downstream on chromosome 19 (all three groups).
Imputation program (including settings): Whites: MaCH, one step, 25 rounds. African Americans: MaCH, one step, 50 rounds. Hispanics: MaCH, one step, 50 rounds.
Number of people included in imputation: [...]
Imputation quality (Rsq): rs [...], rs [...] (values for each group not preserved in this transcription).

31 Description (columns: Whites, African Americans, Hispanics)
Validation dataset: ADAMS (all three groups).
Number of participants in validation dataset: [...]
Concordance rate (APOE genotype): Whites 113/117 (96.6%); African Americans 25/28 (89.3%); Hispanics 15/15 (100%).
Quality control procedures: set genotype to missing if posterior probability < 0.90. Whites: N=2 for rs7412, N=4 for rs429358. African Americans: N=1 for rs7412, N=2 for rs429358. Hispanics: N=1 for rs7412, N=0 for rs429358.
Concordance rate after quality control procedures (APOE genotype): Whites 111/114 (97.4%); African Americans 23/25 (92.0%); Hispanics 14/14 (100%).

32 Algorithm to get APOE genotype from the best guess genotype of SNP rs7412 and rs
rs7412 best guess genotype, rs best guess genotype -> APOE genotype
T/T, T/T -> e2/e2
C/T, T/T -> e2/e3
C/T, C/T -> e2/e4
C/C, T/T -> e3/e3
C/C, C/T -> e3/e4
C/C, C/C -> e4/e4
Genotype Frequencies of APOE: Apo E imputation in HRS, N (%)
e2/e2: Whites (N=8652) 48 (0.56); AA (N=1519) 20 (1.32); HRS_all (N=12367) 78 (0.63)
e2/e3: Whites 1131 (13.07); AA 226 (14.88); HRS_all 1555 (12.57)
e2/e4: Whites 199 (2.30); AA 73 (4.81); HRS_all 301 (2.43)
e3/e3: Whites 5197 (60.07); AA 704 (46.35); HRS_all 7415 (59.96)
e3/e4: Whites 1890 (21.84); AA 434 (28.57); HRS_all 2751 (22.24)
e4/e4: Whites 187 (2.16); AA 62 (4.08); HRS_all 267 (2.16)
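The genotype-to-APOE mapping on this slide is a plain lookup, sketched below. The function and table names are hypothetical. The second SNP's identifier is truncated on this slide; the parameter is named rs429358 on the assumption that it is the other APOE SNP named on the earlier APOE slides. Genotype pairs that the slide does not list return None here rather than a guess.

```python
# Keys are (rs7412 best guess genotype, second-SNP best guess genotype).
APOE_TABLE = {
    ("T/T", "T/T"): "e2/e2",
    ("C/T", "T/T"): "e2/e3",
    ("C/T", "C/T"): "e2/e4",
    ("C/C", "T/T"): "e3/e3",
    ("C/C", "C/T"): "e3/e4",
    ("C/C", "C/C"): "e4/e4",
}


def apoe_genotype(rs7412, rs429358):
    """Return the APOE genotype for a pair of best guess SNP genotypes.

    Returns None for combinations that are not in the slide's table.
    """
    return APOE_TABLE.get((rs7412, rs429358))


print(apoe_genotype("C/C", "C/T"))
print(apoe_genotype("T/T", "C/C"))
```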

33 Principal Components and Population Stratification

34 Population Stratification
Problem: Diseases and mutation frequencies are confounded by race.
Example: Imagine that hypertension has a prevalence of 50% in blacks and 25% in whites. If we do a genetic analysis in the HRS (20 million simple chi-square tests), the vast majority of the hits will be for the black/white differences and not hypertension.
Solution: Estimate and adjust for genetic variability using principal components.

35 Analysis of Genotypes only: Principal Component Analysis reveals SNP-vectors explaining the largest variation in the data
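A toy version of this idea, assuming nothing beyond the Python standard library: center each SNP, build an individual-by-individual similarity matrix, and extract its leading eigenvector by power iteration. The function name and the six-person genotype matrix are hypothetical, and real analyses use dedicated software on many LD-pruned SNPs rather than this hand-rolled routine.

```python
def first_pc(genotypes, n_iter=100):
    """Leading principal component scores for individuals (rows of genotypes)."""
    n_ind, n_snp = len(genotypes), len(genotypes[0])
    # Center each SNP column on its mean allele count.
    means = [sum(row[i] for row in genotypes) / n_ind for i in range(n_snp)]
    x = [[row[i] - means[i] for i in range(n_snp)] for row in genotypes]
    # Similarity between every pair of individuals across SNPs.
    sim = [[sum(a * b for a, b in zip(x[j], x[l])) for l in range(n_ind)]
           for j in range(n_ind)]
    # Power iteration; the start vector must not be orthogonal to the answer.
    v = [1.0] + [0.0] * (n_ind - 1)
    for _ in range(n_iter):
        w = [sum(sim[j][l] * v[l] for l in range(n_ind)) for j in range(n_ind)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v


# Rows are individuals, columns are SNPs (0/1/2 allele counts).
genotypes = [
    [0, 0, 1, 0],   # first three rows: one ancestry group
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [2, 2, 1, 2],   # last three rows: another ancestry group
    [2, 1, 2, 2],
    [1, 2, 2, 2],
]
pc1 = first_pc(genotypes)
print([round(s, 3) for s in pc1])
```

The first component takes opposite signs in the two groups, so including it as a covariate absorbs the group difference in allele frequencies instead of letting it masquerade as a trait association.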

36 PCA of POPRES cohort

37 The HRS Participants + HapMap participants
Figure 1. Principal component analysis of 12,507 unique study participants and 1,230 HapMap controls, using a set of 96,134 autosomal SNPs pruned for both long and short range linkage disequilibrium. For study samples, color coding is according to self-identified race while symbol denotes ethnicity (Hispanic or non-Hispanic). HapMap samples are color coded by membership in 1 of 11 Phase 3 populations: ASW: African ancestry in Southwest USA; CEU: Utah residents with Northern and Western European ancestry from the CEPH collection; CHB: Han Chinese in Beijing, China; CHD: Chinese in Metropolitan Denver, Colorado; GIH: Gujarati Indians in Houston, Texas; JPT: Japanese in Tokyo, Japan; LWK: Luhya in Webuye, Kenya; MEX: Mexican ancestry in Los Angeles, California; MKK: Maasai in Kinyawa, Kenya; TSI: Tuscan in Italy; and YRI: Yoruba in Ibadan, Nigeria. The percent variance explained by each of these first two components is noted on the axis labels. (Also Figure 11 from the genotype QC report.)

38 What have we done so far?

39 Completed GWAS Analysis on HRS
Traits and number of GWAS (29 x 2 = 58 total):
Gait velocity (with and without osteoporosis, NHW and AA): 4
Longevity (autosomal plus X chromosome): 1
Hand grip strength (main effects and interaction with sex): 2
Disability in Aging (sex stratified): 2
Educational Attainment (quantitative and dichotomous, sex stratified): 4
Subjective well-being (3 well-being outcomes): 3
Fertility (sex stratified): 2
Optimism/Pessimism (sex stratified, sex adjusted, interaction with sex): 5
Personality Traits (six traits): 6
GWAS Replication (selected SNPs or complex modeling): Body mass index and SNPs in the PCSK1 gene (NHW and AA); Systolic and diastolic blood pressure (NHW and AA); Hypertension (NHW and AA); Longevity and telomere SNPs; CESD subscales (Depression).

40 GWAS of Personality Traits
N = 8,113 European Americans. Big Five personality traits: Agreeableness, Extraversion, Conscientiousness, Neuroticism, Openness.

41 GWAS of Personality Traits
Modeling: linear model with age and sex included as covariates.
$\text{Personality Trait} = \alpha + \beta_1(\text{age}) + \beta_2(\text{sex}) + \beta_3(\text{SNP dosage})$
Analysis Program: PLINK
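For readers who want to see the model fitted at a single SNP, here is a minimal ordinary least squares sketch using only the standard library. It is not PLINK's implementation; the helper names and the six-person data set are hypothetical, and the trait values are generated without noise from known coefficients so the fit can be checked by eye.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(covariates, y):
    """Least squares fit of y on an intercept plus the covariate columns."""
    x = [[1.0] + list(row) for row in covariates]
    k = len(x[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[a] * r[b] for r in x) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(x, y)) for a in range(k)]
    return solve(xtx, xty)


# Columns: age, sex (0/1), SNP dosage (0 to 2).
people = [
    (50, 0, 0.0), (60, 1, 1.0), (70, 0, 2.0),
    (55, 1, 0.2), (65, 0, 1.6), (75, 1, 1.0),
]
# Noise-free trait: alpha = 2.0, beta1 = 0.01, beta2 = -0.3, beta3 = 0.5.
trait = [2.0 + 0.01 * age - 0.3 * sex + 0.5 * dose for age, sex, dose in people]

alpha, beta_age, beta_sex, beta_snp = fit_linear(people, trait)
print(round(alpha, 4), round(beta_age, 4), round(beta_sex, 4), round(beta_snp, 4))
```

In a GWAS this fit is repeated for every SNP, and the estimate of interest is the dosage coefficient together with its p value.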

42 QQ Plots for Neuroticism and Openness [Two panels: Neuroticism and Openness; axes: observed log(p value) versus expected log(p value)]

43

44 Norms for Genetic Consortia
Large cohorts (>50,000) for discovery. Equally large replication samples. Modest harmonization of traits. Shared analysis plans. Tons of conference calls. Meta-analysis of results followed by bioinformatics. Writing groups. 1-4 years to publication.

45 Resources for Investigators

46 User-friendly Resources: dbGaP; Candidate gene list; Results database

47 dbGaP Website. Genetic Data Released April 03, users have been approved to download the data

48 Summary of Projects through dbGaP
GWAS of longevity, cholesterol levels, cognitive function, liver disease, leukemia, lupus, economic outcomes. Gene-environment interactions on cognitive function, mood disorders, body mass. Mendelian Randomization to estimate effects of genetic, social, and physical risk factors on long-term health. Control population for GWAS and other studies. Assortative mating with respect to alcohol use.

49 Creating a bridge to known candidate genes
The list of SNPs/genes was compiled from two lists sent to us.
List 1: Popular established genetic variants used most often by behavioral scientists. Compiled by Terrie Moffitt and Avshalom Caspi, October 2011. Several genes were also added to the list because they were suggested by Robert Wallace, October 2011.
List 2: Cognitive age-related candidate genes. A list provided by Carol Prescott and Jack McArdle.

50 Example of Candidate Gene List

51 What's Next?
Exome chip data (~16,300 samples from 2006, 2008 & 2010 collection waves): over 200,000 functional mutations measured at once; rare variants (needs a completely different analysis approach, e.g. SKAT). Fancy multigene estimates of risk (Risk Index). Measured genetic heritability. Gene-environment interactions. Many consortia conference calls.

52 Illumina Exome Chip
SNP set: number of candidates; number of successful designs
Coding Content: 275,[...]; [...],094
GWAS Tag SNPs: 5,763; 5,325
Grid of Common Variants: 5,710; 5,286
Randomly Selected Synonymous SNPs: 5,000; 4,651
AIM African Ancestry: 3,388; 3,241
AIM Native American Ancestry: 1,[...]
HLA Tags: 2,536; 2,459
ESP Requests: 1,[...]
Fingerprint SNPs, MicroRNA Target Sites, Mitochondrial Variants, Chromosome Y, Indels: [...]
Additional notes: An additional set of 8,242 SNPs that were unique to the 1000 Genomes Project and populations under-represented in the design was added. For 1,000 SNPs, assays were generated on both strands in order to facilitate QC efforts and future development of methods for genotyping of rare variants.

53 Estimating the genetic relationship from genome-wide SNPs
$A_{ijk}$ is the genetic relationship between individuals j and k at SNP i.
N is the number of SNPs.
$p_i$ is the allele frequency at SNP i.
$x_{ij}$ is an indicator variable that takes the value 0, 1 or 2 if the genotype of the j-th individual at SNP i is bb, Bb or BB.
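The formula itself did not survive in this transcription. The sketch below assumes the commonly used estimator that matches the symbols defined on the slide: the per-SNP relationship is (x_ij - 2 p_i)(x_ik - 2 p_i) / (2 p_i (1 - p_i)), and the overall relationship between j and k is its average over the N SNPs. Treat that formula as an assumption rather than a quotation of the slide; the function name and toy genotypes are hypothetical, allele frequencies are estimated from the sample itself, the same expression is used on the diagonal for simplicity, and monomorphic SNPs must be removed beforehand.

```python
def genetic_relationship_matrix(genotypes):
    """Average standardized genotype cross-product for every pair of individuals.

    genotypes[j][i] is the count (0, 1, 2) of allele B for individual j at SNP i.
    """
    n_ind, n_snp = len(genotypes), len(genotypes[0])
    # Sample allele frequency of B at each SNP.
    freqs = [sum(row[i] for row in genotypes) / (2 * n_ind) for i in range(n_snp)]
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(n_ind):
            total = 0.0
            for i, p in enumerate(freqs):
                total += ((genotypes[j][i] - 2 * p) * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[j][k] = total / n_snp
    return grm


genotypes = [
    [0, 1, 2],   # individual 1
    [0, 1, 2],   # individual 2: identical genotypes to individual 1
    [2, 1, 0],   # individual 3: opposite homozygotes
    [1, 1, 1],   # individual 4: heterozygous everywhere
]
grm = genetic_relationship_matrix(genotypes)
for row in grm:
    print([round(v, 3) for v in row])
```

Pairs who share more alleles than expected from the allele frequencies get positive values, and pairs who share fewer get negative values, which is why the relatedness among unrelated people centers near zero.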

54 An Example of the Genetic Relationship Matrix [Matrix with rows and columns Subject 1 through Subject 6, and so on; cell values not preserved]

55 Histogram of the Genetic Relatedness Among the HRS subjects (N=12367) Mean= 8.382e 05 Median=0.008 Min= Max=0.559

56 Histogram of the Genetic Relatedness used in our first heritability study (N=4367) Mean=0.010 Median=0.011 Min= Max=0.025

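57 Estimate Heritability of BMI Using Unrelated Individuals
Square of z-score difference between individuals j and k: $(z_j - z_k)^2 = \alpha + \beta A_{jk}$
where $\alpha = 2\sigma^2_p$ and $\beta = -2\sigma^2_g$ (more closely related pairs have more similar phenotypes, so the slope is negative).
Heritability $= \sigma^2_g / \sigma^2_p = -\beta / \alpha$
$A_{jk}$: genetic relationship between individuals j and k.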
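The regression on this slide can be sketched as a simple straight-line fit over pairs of individuals. Taking expectations of the squared difference gives 2σ²_p - 2σ²_g A_jk, so the slope is negative and heritability is recovered as minus the slope divided by the intercept. The function name and the five noise-free pairs below are hypothetical and generated from σ²_p = 1 and σ²_g = 0.4; a real analysis would use every pair of unrelated individuals and would need appropriate standard errors.

```python
def heritability_from_pairs(pairs):
    """Regress squared z-score differences on genetic relationship.

    pairs: list of (a_jk, squared_difference) for pairs of individuals.
    Returns (alpha, beta, h2) with h2 = -beta / alpha.
    """
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((a - mean_a) ** 2 for a, _ in pairs)
    sxy = sum((a - mean_a) * (y - mean_y) for a, y in pairs)
    beta = sxy / sxx                      # estimates -2 * sigma2_g
    alpha = mean_y - beta * mean_a        # estimates  2 * sigma2_p
    return alpha, beta, -beta / alpha


# Noise-free toy pairs: squared difference = 2 * 1.0 - 2 * 0.4 * A_jk.
relationships = [-0.02, -0.01, 0.0, 0.01, 0.02]
pairs = [(a, 2.0 - 0.8 * a) for a in relationships]

alpha, beta, h2 = heritability_from_pairs(pairs)
print(round(alpha, 4), round(beta, 4), round(h2, 4))
```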

58 Summary
New GWAS requests are slowing down. Ongoing connection to 10 GWAS consortia. We have finished exome chip cleaning and are working with dbGaP for submission and on new consortia analysis. Candidate Gene List being reviewed. We would like to find a way to offer the genetic relatedness matrix to investigators. Should we begin a series of genetics webinars? What would be helpful to you?

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics S G ection ON tatistical enetics Design and Analysis of Genetic Association Studies Hemant K Tiwari, Ph.D. Professor & Head Section on Statistical Genetics Department of Biostatistics School of Public

More information

Population description. 103 CHB Han Chinese in Beijing, China East Asian EAS. 104 JPT Japanese in Tokyo, Japan East Asian EAS

Population description. 103 CHB Han Chinese in Beijing, China East Asian EAS. 104 JPT Japanese in Tokyo, Japan East Asian EAS 1 Supplementary Table 1 Description of the 1000 Genomes Project Phase 3 representing 2504 individuals from 26 different global populations that are assigned to five super-populations Number of individuals

More information

Genotyping Technology How to Analyze Your Own Genome Fall 2013

Genotyping Technology How to Analyze Your Own Genome Fall 2013 Genotyping Technology 02-223 How to nalyze Your Own Genome Fall 2013 HapMap Project Phase 1 Phase 2 Phase 3 Samples & POP panels Genotyping centers Unique QC+ SNPs 269 samples (4 populations) HapMap International

More information

Resources at HapMap.Org

Resources at HapMap.Org Resources at HapMap.Org HapMap Phase II Dataset Release #21a, January 2007 (NCBI build 35) 3.8 M genotyped SNPs => 1 SNP/700 bp # polymorphic SNPs/kb in consensus dataset International HapMap Consortium

More information

Haplotypes, linkage disequilibrium, and the HapMap

Haplotypes, linkage disequilibrium, and the HapMap Haplotypes, linkage disequilibrium, and the HapMap Jeffrey Barrett Boulder, 2009 LD & HapMap Boulder, 2009 1 / 29 Outline 1 Haplotypes 2 Linkage disequilibrium 3 HapMap 4 Tag SNPs LD & HapMap Boulder,

More information

De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse

De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse SUPPLEMENTARY INFORMATION De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse populations Wong et al. The Supplementary Information contains 4 Supplementary Figures, 3

More information

I/O Suite, VCF (1000 Genome) and HapMap

I/O Suite, VCF (1000 Genome) and HapMap I/O Suite, VCF (1000 Genome) and HapMap Hin-Tak Leung April 13, 2013 Contents 1 Introduction 1 1.1 Ethnic Composition of 1000G vs HapMap........................ 2 2 1000 Genome vs HapMap YRI (Africans)

More information

Genome variation - part 1

Genome variation - part 1 Genome variation - part 1 Dr Jason Wong Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW Day 2 Friday 21 th January 2016 Aims of the session Introduce major

More information

THE HEALTH AND RETIREMENT STUDY: GENETIC DATA UPDATE

THE HEALTH AND RETIREMENT STUDY: GENETIC DATA UPDATE : GENETIC DATA UPDATE April 30, 2014 Biomarker Network Meeting PAA Jessica Faul, Ph.D., M.P.H. Health and Retirement Study Survey Research Center Institute for Social Research University of Michigan HRS

More information

Human Populations: History and Structure

Human Populations: History and Structure Human Populations: History and Structure In the paper Novembre J, Johnson, Bryc K, Kutalik Z, Boyko AR, Auton A, Indap A, King KS, Bergmann A, Nelson MB, Stephens M, Bustamante CD. 2008. Genes mirror geography

More information

SUPPLEMENTAL MATERIAL

SUPPLEMENTAL MATERIAL SUPPLEMENTAL MATERIAL Supplementary Table 1: RT-qPCR primer sequences. Sequences are shown from 5 to 3 direction; all primers are designed using mouse genome as reference. 36B4-F; TGAAGCAAAGGAAGAGTCGGAGGA

More information

Haplotypes Personalized Medicine: Understanding Your Own Genome Fall 2014

Haplotypes Personalized Medicine: Understanding Your Own Genome Fall 2014 Haplotypes 02-223 Personalized Medicine: Understanding Your Own Genome Fall 2014 Terminology Review llele: different forms of genecc variacons at a given gene or genecc locus Locus 1 has two alleles, and

More information

Supplementary Figure 1 a

Supplementary Figure 1 a Supplementary Figure 1 a b GWAS second stage log 10 observed P 0 2 4 6 8 10 12 0 1 2 3 4 log 10 expected P rs3077 (P hetero =0.84) GWAS second stage (BBJ, Japan) First replication (BBJ, Japan) Second replication

More information

The Whole Genome TagSNP Selection and Transferability Among HapMap Populations. Reedik Magi, Lauris Kaplinski, and Maido Remm

The Whole Genome TagSNP Selection and Transferability Among HapMap Populations. Reedik Magi, Lauris Kaplinski, and Maido Remm The Whole Genome TagSNP Selection and Transferability Among HapMap Populations Reedik Magi, Lauris Kaplinski, and Maido Remm Pacific Symposium on Biocomputing 11:535-543(2006) THE WHOLE GENOME TAGSNP SELECTION

More information

Understanding genetic association studies. Peter Kamerman

Understanding genetic association studies. Peter Kamerman Understanding genetic association studies Peter Kamerman Outline CONCEPTS UNDERLYING GENETIC ASSOCIATION STUDIES Genetic concepts: - Underlying principals - Genetic variants - Linkage disequilibrium -

More information

Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies

Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies Lisa J. Strug, PhD Guest Lecturer Biosta)s)cs Laboratory Course (CHL5207/8) March 5, 2015 Gene Mapping in the News Study Finds Gene

More information

Statistical Tools for Predicting Ancestry from Genetic Data

Statistical Tools for Predicting Ancestry from Genetic Data Statistical Tools for Predicting Ancestry from Genetic Data Timothy Thornton Department of Biostatistics University of Washington March 1, 2015 1 / 33 Basic Genetic Terminology A gene is the most fundamental

More information

Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Supplementary information

Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Supplementary information Fast and accurate genotype imputation in genome-wide association studies through pre-phasing Supplementary information Bryan Howie 1,6, Christian Fuchsberger 2,6, Matthew Stephens 1,3, Jonathan Marchini

More information

Sequence variation Introductory bioinformatics for human genomics workshop, UNSW

Sequence variation Introductory bioinformatics for human genomics workshop, UNSW Sequence variation Dr Jason Wong Prince of Wales Clinical School Introductory bioinformatics for human genomics workshop, UNSW Day 2 Friday 29 th January 2016 Aims of the session Introduce major human

More information

Office Hours. We will try to find a time

Office Hours.   We will try to find a time Office Hours We will try to find a time If you haven t done so yet, please mark times when you are available at: https://tinyurl.com/666-office-hours Thanks! Hardy Weinberg Equilibrium Biostatistics 666

More information

Genome-wide association studies (GWAS) Part 1

Genome-wide association studies (GWAS) Part 1 Genome-wide association studies (GWAS) Part 1 Matti Pirinen FIMM, University of Helsinki 03.12.2013, Kumpula Campus FIMM - Institiute for Molecular Medicine Finland www.fimm.fi Published Genome-Wide Associations

More information

Genotype quality control with plinkqc Hannah Meyer

Genotype quality control with plinkqc Hannah Meyer Genotype quality control with plinkqc Hannah Meyer 219-3-1 Contents Introduction 1 Per-individual quality control....................................... 2 Per-marker quality control.........................................

More information

Human Genetics and Gene Mapping of Complex Traits

Human Genetics and Gene Mapping of Complex Traits Human Genetics and Gene Mapping of Complex Traits Advanced Genetics, Spring 2015 Human Genetics Series Thursday 4/02/15 Nancy L. Saccone, nlims@genetics.wustl.edu ancestral chromosome present day chromosomes:

More information

Derrek Paul Hibar

Derrek Paul Hibar Derrek Paul Hibar derrek.hibar@ini.usc.edu Obtain the ADNI Genetic Data Quality Control Procedures Missingness Testing for relatedness Minor allele frequency (MAF) Hardy-Weinberg Equilibrium (HWE) Testing

More information

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics Genetic Variation and Genome- Wide Association Studies Keyan Salari, MD/PhD Candidate Department of Genetics How many of you did the readings before class? A. Yes, of course! B. Started, but didn t get

More information

Human Population Differentiation Is Strongly Correlated with Local Recombination Rate

Human Population Differentiation Is Strongly Correlated with Local Recombination Rate Human Population Differentiation Is Strongly Correlated with Local Recombination Rate Alon Keinan 1,2,3 *, David Reich 1,2 1 Department of Genetics, Harvard Medical School, Boston, Massachusetts, United

More information

Analysing Alu inserts detected from high-throughput sequencing data

Analysing Alu inserts detected from high-throughput sequencing data Analysing Alu inserts detected from high-throughput sequencing data Harun Mustafa Mentor: Matei David Supervisor: Michael Brudno July 3, 2013 Before we begin... Even though I'll only present the minimal

More information

Population differentiation analysis of 54,734 European Americans reveals independent evolution of ADH1B gene in Europe and East Asia

Population differentiation analysis of 54,734 European Americans reveals independent evolution of ADH1B gene in Europe and East Asia Population differentiation analysis of 54,734 European Americans reveals independent evolution of ADH1B gene in Europe and East Asia Kevin Galinsky Harvard T. H. Chan School of Public Health American Society

More information

Human Population Differentiation is Strongly Correlated With Local Recombination Rate

Human Population Differentiation is Strongly Correlated With Local Recombination Rate Human Population Differentiation is Strongly Correlated With Local Recombination Rate The Harvard community has made this article openly available. Please share how this access benefits you. Your story

More information

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros DNA Collection Data Quality Control Suzanne M. Leal Baylor College of Medicine sleal@bcm.edu Copyrighted S.M. Leal 2016 Blood samples For unlimited supply of DNA Transformed cell lines Buccal Swabs Small

More information

IL1B-CGTC haplotype is associated with colorectal cancer in. admixed individuals with increased African ancestry

IL1B-CGTC haplotype is associated with colorectal cancer in. admixed individuals with increased African ancestry IL1B-CGTC haplotype is associated with colorectal cancer in admixed individuals with increased African ancestry María Carolina Sanabria-Salas 1, 2,*, Gustavo Hernández-Suárez 1, Adriana Umaña- Pérez 2,

More information

Genotype Prediction with SVMs

Genotype Prediction with SVMs Genotype Prediction with SVMs Nicholas Johnson December 12, 2008 1 Summary A tuned SVM appears competitive with the FastPhase HMM (Stephens and Scheet, 2006), which is the current state of the art in genotype

More information

H3A - Genome-Wide Association testing SOP
