Genome-wide association study identifies multiple susceptibility loci for pulmonary fibrosis


Correction notice
Nat. Genet. 45, (2013); published online 14 April 2013; corrected online 1 October 2013

Genome-wide association study identifies multiple susceptibility loci for pulmonary fibrosis

Tasha E Fingerlin, Elissa Murphy, Weiming Zhang, Anna L Peljto, Kevin K Brown, Mark P Steele, James E Loyd, Gregory P Cosgrove, David Lynch, Steve Groshong, Harold R Collard, Paul J Wolters, Williamson Z Bradford, Karl Kossen, Scott D Seiwert, Roland M du Bois, Christine Kim Garcia, Megan S Devine, Gunnar Gudmundsson, Helgi J Isaksson, Naftali Kaminski, Yingze Zhang, Kevin F Gibson, Lisa H Lancaster, Joy D Cogan, Wendi R Mason, Toby M Maher, Philip L Molyneaux, Athol U Wells, Miriam F Moffatt, Moises Selman, Annie Pardo, Dong Soon Kim, James D Crapo, Barry J Make, Elizabeth A Regan, Dinesha S Walek, Jerry J Daniel, Yoichiro Kamatani, Diana Zelenika, Keith Smith, David McKean, Brent S Pedersen, Janet Talbert, Raven N Kidd, Cheryl R Markin, Kenneth B Beckman, Mark Lathrop, Marvin I Schwarz & David A Schwartz

In the version of this supplementary file initially posted online, there was an error in footnote b of Supplementary Tables 5 and 6. In both tables, this footnote should have read: Minor allele in combined case and control group listed first. This error has been corrected in this file as of 1 October 2013.

Supplementary Information

Genome-wide association study identifies multiple susceptibility loci for pulmonary fibrosis

Tasha E. Fingerlin, Elissa Murphy, Weiming Zhang, Anna L. Peljto, Kevin K. Brown, Mark P. Steele, James E. Loyd, Gregory P. Cosgrove, David Lynch, Steve Groshong, Harold R. Collard, Paul J. Wolters, Williamson Z. Bradford, Karl Kossen, Scott D. Seiwert, Roland M. du Bois, Christine Kim Garcia, Megan S. Devine, Gunnar Gudmundsson, Helgi J. Isaksson, Naftali Kaminski, Yingze Zhang, Kevin F. Gibson, Lisa H. Lancaster, Joy D. Cogan, Wendi R. Mason, Toby M. Maher, Philip L. Molyneaux, Athol U. Wells, Miriam F. Moffatt, Moises Selman, Annie Pardo, Dong Soon Kim, James D. Crapo, Barry J. Make, Elizabeth A. Regan, Dinesha S. Walek, Jerry J. Daniel, Yoichiro Kamatani, Diana Zelenika, Keith Smith, David McKean, Brent S. Pedersen, Janet Talbert, Raven N. Kidd, Cheryl R. Markin, Kenneth B. Beckman, Mark Lathrop, Marvin I. Schwarz, David A. Schwartz

Table of Contents

Supplementary Table 1: Characteristics of GWAS and Replication Samples
Supplementary Table 2: Specific IIP diagnosis information for GWAS discovery samples
Supplementary Table 3: Sample origin for GWAS and replication IIP cases genotyped
Supplementary Table 4: Association information for all 198 SNPs chosen for replication
Supplementary Table 5: Genotype counts and Hardy-Weinberg Equilibrium (HWE) P-values among cases and controls in the discovery set for all 198 SNPs taken into replication set
Supplementary Table 6: Genotype counts and Hardy-Weinberg Equilibrium (HWE) P-values among cases and controls in the replication set for all SNPs successfully genotyped in replication
Supplementary Table 7: Specific IIP diagnosis information for replication samples
Supplementary Table 8: Association information for SNPs chosen for replication among the GWAS cases (n=1127) and controls (n=2832) homozygous for the H1 haplotype (non-H2 carriers) on chromosome 17q21
Supplementary Table 9: Association information for SNPs chosen for replication among the replication cases (n=617) and controls (n=1138) homozygous for the H1 haplotype on chromosome 17q21
Supplementary Table 10: Association information for imputed SNPs across genome-wide significant loci
Supplementary Table 11: Adjusted association information for all genome-wide significant SNPs in meta-analysis using joint genotypes from subset of GWAS cases, all replication cases and all replication controls
Supplementary Figure 1: Quantile-Quantile (Q-Q) plot of observed vs. expected p-value distribution for GWAS across 439,828 high quality SNPs
Supplementary Figure 2: Locus-specific plots with genotyped (circles) and imputed (squares) SNP results (based on GWAS discovery cases and controls only) for 6 loci reaching genome-wide significance in the GWAS discovery analysis and meta-analysis of the discovery and replication results
Supplementary Figure 3: Locus-specific plots with genotyped (circles) and imputed (squares) SNP results (based on GWAS discovery cases and controls only) for 4 loci reaching genome-wide significance in the meta-analysis of the discovery and replication results
Supplementary Figure 4: Linkage disequilibrium among the genome-wide significant SNPs at 11p15 and rs

Supplementary Table 1: Characteristics of GWAS and Replication Samples

                     Discovery GWAS                                    Replication
                     Cases (n=1616)  Controls a (n=4683)  P-value b    Cases (n=876)  Controls (n=1890)  P-value b
Male e e-24
Age in years (SD)    65.5 (9.5)      NA                   NA           63.4 (9.7)     58.4 (9.7)         1.02e-35

a Age information not available on out-of-study controls
b Cases and controls compared via chi-squared test of association (Male) or two-sample t-test (Age)
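The age comparison in footnote b can be approximated from the summary statistics alone. The sketch below is illustrative only: it assumes a pooled-variance (Student) t-test, which the source does not specify, and it uses a normal approximation for the P-value. The function name is hypothetical. Because the table's P-value was presumably computed from unrounded data with the t distribution, the figure printed here need not match it exactly.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic (pooled variance) from summary statistics.

    Assumption: equal-variance t-test; the source only says "two-sample t-test".
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df

# Replication age summaries from Supplementary Table 1: mean, SD, n
t_stat, df = pooled_t_from_summary(63.4, 9.7, 876, 58.4, 9.7, 1890)

# Two-sided P-value under a normal approximation (not the exact t distribution)
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"t = {t_stat:.2f}, df = {df}, approximate two-sided P = {p_approx:.2e}")
```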

Supplementary Table 2: Specific IIP diagnosis information for GWAS discovery samples.

                Genotyped (N=1914)            Included after QC (N=1616)
                Sporadic a     Familial b     Sporadic a     Familial b
IPF             1055 (55%)     445 (24%)      948 (58%)      303 (19%)
NSIP            51 (3%)        60 (3%)        44 (3%)        47 (3%)
COP             3 (<1%)        3 (<1%)        3 (<1%)        3 (<1%)
RB-ILD          8 (<1%)        3 (<1%)        8 (<1%)        3 (<1%)
DIP             5 (<1%)        0              5 (<1%)        0
Unclassified    226 (12%)      55 (3%)        218 (13%)      34 (2%)

a No known family history
b At least 2 affected relatives (3rd degree or closer)
QC: Quality control; IPF: Idiopathic pulmonary fibrosis; NSIP: Non-specific interstitial pneumonia; COP: Cryptogenic organizing pneumonia; RB-ILD: Respiratory bronchiolitis-associated interstitial lung disease; DIP: Desquamative interstitial pneumonia

Supplementary Table 3: Sample origin for GWAS and replication IIP cases genotyped.

GWAS Cases (n=1914)                               Replication Cases (n=1027)
Familial Pulmonary Fibrosis               566     United Kingdom*            222
National Jewish Health                    238     Duke                       238
InterMune IPF Trials                      720     U. Texas Southwestern      192
University of California San Francisco     66     Pittsburgh                 232
NHLBI LTRC/LGRC                           219     Vanderbilt                 143
Vanderbilt                                105

7 Supplementary Table 4: Association information for all 198 SNPs chosen for replication. Blank Replication and Joint columns correspond to SNPs not successfully genotyped in replication. Chr. SNP a Chr. 1 MAF b Case MAF b Control Discovery GWAS rs (1.12,1.41) rs (1.1,1.35) rs (1.13,1.39) rs (0.62,0.85) rs (0.33,0.64) Chr. 2 rs (1.14,1.48) rs (0.73,0.90) rs (1.17,1.54) rs (0.77,0.92) rs (0.72,0.89) rs (0.72,0.89) rs (1.21,1.66) rs (1.34,2.26) rs (1.34,2.04) rs (1.34,2.03) rs (1.33,2.01) rs (1.30,1.94) rs (1.32,1.98) rs (1.15,1.53) rs (1.09,1.34) OR c MAF b (95% CI) P-value d Case MAF b Control Replication 8.81e (0.74,1.03) 4.05e (0.91,1.21) 6.45e (0.92,1.22) 6.21e (0.80,1.20) 3.75e e (0.79,1.17) 4.30e (1.09,1.45) 4.54e (0.83,1.22) 9.17e (0.93,1.18) 1.31e (0.91,1.21) 1.33e (0.91,1.21) 2.74e (0.89,1.40) 5.06e (0.64,1.45) 7.16e (0.73,1.39) 7.42e e (0.77,1.42) 3.61e (0.80,1.47) 1.75e (0.78,1.43) 3.65e (0.94,1.42) 9.83e (0.97,1.31) Meta Analysis OR c (95% CI) P-value d P-value d e e e-05 6

8 Chr. 3 rs (1.17,1.53) rs (1.15,1.44) rs (1.10,1.31) rs (0.70,0.88) rs (0.80,0.94) rs (1.16,1.37) rs (1.19,1.43) rs (1.19,1.42) rs (1.19,1.42) rs (0.74,0.90) Chr. 4 rs (0.23,0.66) rs (0.23,0.61) rs (0.69,0.87) rs (0.69,0.87) rs (0.71,0.9) rs (0.73,0.87) rs (1.18,1.42) rs (0.7,0.85) rs (0.73,0.86) rs (0.73,0.85) rs (0.75,0.89) rs (0.74,0.89) rs (0.74,0.89) rs (0.75,0.88) rs (1.15,1.36) 3.15e (0.73,1.06) 3.53e (0.90,1.25) 7.37e (0.99,1.30) 9.67e (0.86,1.17) 6.14e (0.88,1.11) 3.60e (1.06,1.36) 3.90e (1.17,1.52) 3.71e (1.21,1.55) 3.20e (1.23,1.58) 4.99e (0.80,1.04) 7.79e (0.88,2.7) 2.34e (0.61,1.58) 3.52e (0.76,1.06) 3.35e e (0.82,1.13) 6.66e (0.80,1.03) 5.27e (1.25,1.64) 3.73e (0.82,1.09) 7.54e (0.85,1.08) 7.59e (0.79,1.00) 8.49e (0.79,1.01) 4.72e (0.79,1.02) 5.42e (0.79,1.01) 2.93e (0.83,1.06) 5.35e (0.93,1.18) e e e e e e e e e e e e e e e e e e e e e-05 7

9 rs (0.77,0.91) rs (0.74,0.88) rs (1.12,1.37) rs (0.67,0.86) Chr. 5 rs (0.67,0.79) rs (0.70,0.84) rs (0.79,0.93) rs (1.14,1.44) rs (1.10,1.33) rs (0.70,0.89) rs (0.70,0.88) rs (0.66,0.84) rs (0.67,0.87) rs (0.62,0.85) Chr. 6 rs (1.15,1.56) rs (1.15,1.42) rs (0.70,0.85) rs (1.32,1.55) rs (1.20,1.43) rs (0.71,0.87) rs (1.21,1.57) rs (1.20,1.56) rs (1.21,1.57) rs (1.20,1.55) rs (1.23,1.54) 4.19e (0.86,1.09) 5.73e (0.83,1.07) 6.16e (0.74,1.00) 9.30e e (0.65,0.83) 8.93e (0.73,0.96) 3.80e (0.94,1.19) 2.68e (0.85,1.18) 7.65e (0.85,1.14) 6.49e (0.74,1.04) 7.08e (0.68,0.94) 2.96e (0.74,1.04) 8.39e (0.81,1.17) 5.04e (0.77,1.20) 4.64e (1.00,1.52) 2.92e (1.13,1.54) 3.41e (0.71,0.94) 1.14e (1.13,1.42) 6.41e (0.95,1.22) 1.54e (0.72,0.95) 4.73e (0.86,1.26) 2.76e (0.93,1.35) 2.58e (0.98,1.44) 3.38e (0.96,1.40) 2.26e e e e e e e e e e e e e e e e e-05 8

10 rs (1.21,1.51) rs (0.75,0.89) rs (1.09,1.28) rs (1.12,1.33) rs (1.14,1.51) Chr. 7 rs (1.13,1.33) rs (1.23,1.67) rs (1.18,1.60) rs (1.13,1.45) rs (1.16,1.36) rs (1.20,1.41) rs (1.10,1.31) rs (1.15,1.38) rs (1.14,1.44) Chr. 8 rs (1.45,2.19) rs (1.11,1.36) rs (1.17,1.72) rs (1.41,2.73) rs (0.68,0.86) rs (1.11,1.36) rs (1.18,1.55) rs (0.77,0.91) rs (0.77,0.91) rs (0.77,0.91) rs (0.73,0.87) Chr e e (0.86,1.10) 8.43e e (0.95,1.23) 6.11e (0.84,1.26) 3.96e (0.92,1.16) 8.71e (0.90,1.41) 7.92e (0.80,1.24) 8.90e (0.81,1.16) 5.87e (1.00,1.26) 6.72e (0.98,1.24) 5.10e (0.86,1.10) 1.02e (0.90,1.17) 6.39e (0.85,1.19) 5.74e (0.86,1.59) 7.76e (0.81,1.10) 5.97e (0.9,1.59) 8.65e (0.45,1.60) 6.35e (0.89,1.25) 6.92e (0.87,1.16) 1.18e (0.93,1.39) 1.90e (0.78,0.99) 1.83e e (0.81,1.03) 2.92e (0.76,0.97) e e e e e e e e e e e e-07 9

11 rs (0.30,0.69) rs (0.25,0.68) rs (1.29,1.95) rs (1.10,1.29) Chr. 10 rs (1.15,1.60) rs (1.16,1.46) rs (1.15,1.46) rs (0.75,0.9) rs (0.75,0.9) rs (0.68,0.87) rs (1.13,1.36) rs (0.74,0.88) rs (0.74,0.87) rs (0.74,0.87) rs (0.74,0.87) rs (0.66,0.85) rs (0.66,0.85) rs (1.12,1.33) rs (1.12,1.33) Chr. 11 rs (1.15,1.35) rs (1.15,1.35) rs (1.27,1.50) rs (0.72,0.86) rs (0.66,0.82) rs (0.61,0.83) 3.54e (0.59,1.58) 9.57e (0.56,1.64) 4.12e (0.74,1.35) 1.99e (0.84,1.06) 7.33e (1.1,1.66) 7.92e (0.79,1.10) 8.57e (0.78,1.09) 3.40e (0.88,1.12) 3.23e (0.88,1.12) 6.25e (0.82,1.16) 2.50e (0.90,1.17) 1.24e (0.72,0.91) 4.73e (0.77,0.97) 4.65e (0.77,0.98) 2.82e (0.77,0.97) 8.46e e (0.80,1.13) 6.47e (0.89,1.15) 4.01e (0.93,1.19) 1.86e (1.00,1.26) 1.90e (1.03,1.3) 9.29e (0.98,1.25) 3.10e (0.90,1.16) 5.90e (0.86,1.15) 1.69e (0.68,1.05) e e e e e e e e e e e-05 10

12 rs (1.40,1.65) rs (1.39,1.64) rs (0.64,0.75) rs (0.67,0.8) rs (0.71,0.89) rs (0.59,0.70) rs (0.73,0.86) rs (0.71,0.84) rs (0.71,0.84) rs (0.72,0.84) rs (0.71,0.84) rs (0.63,0.80) rs (0.69,0.82) rs (0.74,0.88) rs (1.18,1.57) rs (1.1,1.30) rs (1.18,1.60) rs (1.18,1.60) Chr. 12 rs (1.14,1.46) rs (0.65,0.86) Chr. 13 rs (0.58,0.83) rs (0.72,0.88) Chr. 14 rs (1.11,1.38) rs (0.45,0.75) rs (0.76,0.91) 5.46e (1.39,1.76) 1.62e (1.39,1.77) 4.17e (0.78,0.98) 8.47e (0.81,1.03) 4.73e (0.84,1.14) 1.26e (0.69,0.87) 8.58e e (0.76,0.96) 1.48e (0.76,0.96) 7.52e (0.72,0.91) 7.77e e (0.59,0.84) 3.69e (0.69,0.88) 3.98e (0.75,0.95) 4.42e (1.01,1.52) 2.85e (0.96,1.21) 4.66e (0.81,1.25) 4.21e (0.83,1.30) 8.70e (0.94,1.35) 5.05e (0.94,1.38) 4.79e e (0.70,0.92) 6.84e (0.98,1.34) 7.43e (0.59,1.14) 2.91e (0.83,1.08) 1.49e e e e e e e e e e e e e e e e e e e e e e

13 rs (1.09,1.28) Chr. 15 rs (0.71,0.85) rs (0.71,0.85) rs (0.77,0.91) rs (0.76,0.90) rs (1.14,1.33) rs (0.71,0.84) rs (0.77,0.91) rs (1.29,2.17) rs (0.60,0.85) rs (0.72,0.89) rs (1.15,1.36) rs (1.15,1.36) rs (1.15,1.36) rs (1.14,1.36) rs (1.15,1.36) rs (1.14,1.35) rs (1.17,1.63) rs (1.17,1.63) Chr. 16 rs (1.12,1.38) rs (1.13,1.54) rs (0.63,0.87) Chr. 17 rs (0.65,0.8) rs (0.7,0.85) rs (0.68,0.83) 9.45e e (0.73,0.94) 3.49e (0.75,0.96) 2.14e (0.78,0.99) 6.53e e (1.07,1.36) 1.86e (0.74,0.93) 8.14e e (0.56,1.25) 3.91e (0.88,1.39) 3.65e (0.90,1.21) 1.11e (0.89,1.13) 1.12e (0.87,1.11) 9.74e (0.89,1.13) 3.36e (0.92,1.19) 1.03e (1.00,1.27) 3.05e (0.93,1.18) 9.68e (0.84,1.39) 8.39e (0.79,1.31) 3.45e (0.88,1.20) 9.21e (0.89,1.38) 8.65e (0.76,1.18) 9.26e (0.71,0.95) 1.57e (0.79,1.04) 1.49e e e e e e e e e e e-05 12

14 rs (0.68,0.82) rs (0.64,0.79) rs (0.71,0.86) rs (0.64,0.78) rs (0.64,0.78) rs (0.64,0.79) rs (0.72,0.86) rs (0.65,0.80) rs (0.64,0.79) rs (0.64,0.79) rs (0.68,0.83) rs (0.64,0.79) rs (0.71,0.86) rs (0.64,0.80) rs (0.65,0.80) Chr. 18 rs (1.11,1.37) Chr. 19 rs (1.18,1.41) rs (1.18,1.40) rs (0.71,0.88) rs (1.10,1.29) Chr. 20 rs (1.18,1.57) rs (0.78,0.92) rs (1.13,1.38) Chr.21 rs (1.35,2.57) 9.18e (0.73,0.97) 7.07e (0.58,0.79) 3.42e (0.65,0.86) 3.39e (0.58,0.79) 2.52e (0.58,0.79) 3.87e (0.61,0.83) 7.60e (0.70,0.89) 1.29e (0.60,0.81) 9.61e (0.57,0.77) 1.04e (0.58,0.78) 6.95e (0.73,0.97) 2.33e (0.60,0.82) 3.48e (0.77,0.99) 5.19e (0.59,0.81) 7.86e (0.62,0.84) 8.30e (0.94,1.27) 9.57e (1.15,1.47) 1.22e (1.10,1.41) 1.92e (0.80,1.09) 5.00e (0.86,1.08) 5.23e (0.94,1.47) 5.27e (0.80,1.01) 9.58e (0.98,1.29) 9.90e (0.68,1.86) e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e

rs (1.14,1.46) Chr e (0.89,1.25) rs e (1.21,1.63) (0.83,1.32) rs e-06 (1.23,1.63) rs e (1.07,1.31) (0.94,1.24)

a Bolded SNPs were genome-wide significant in meta-analysis
b MAF: Minor allele frequency; minor allele defined as minor allele in combined case and control group
c OR: Odds ratio for the minor allele; CI: Confidence interval
d Adjusted for sex
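The quantities defined in footnotes b and c can be illustrated from genotype counts. The following is a minimal sketch, not the authors' analysis: the table reports sex-adjusted odds ratios, whereas this computes an unadjusted allelic odds ratio with a Wald 95% confidence interval. The function names and the genotype counts are hypothetical.

```python
import math

def minor_allele_frequency(n_minor_hom, n_het, n_major_hom):
    """Minor allele frequency from genotype counts."""
    n_alleles = 2 * (n_minor_hom + n_het + n_major_hom)
    return (2 * n_minor_hom + n_het) / n_alleles

def allelic_odds_ratio(case_counts, control_counts, z=1.96):
    """Unadjusted allelic OR for the minor allele with a Wald 95% CI.

    Each counts tuple is (minor homozygotes, heterozygotes, major homozygotes).
    """
    a = 2 * case_counts[0] + case_counts[1]        # minor alleles in cases
    b = 2 * case_counts[2] + case_counts[1]        # major alleles in cases
    c = 2 * control_counts[0] + control_counts[1]  # minor alleles in controls
    d = 2 * control_counts[2] + control_counts[1]  # major alleles in controls
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

# Hypothetical genotype counts, for illustration only
cases = (10, 80, 110)
controls = (10, 80, 310)

print("Case MAF:", minor_allele_frequency(*cases))
print("Control MAF:", minor_allele_frequency(*controls))
odds_ratio, ci_low, ci_high = allelic_odds_ratio(cases, controls)
print(f"OR = {odds_ratio:.2f} ({ci_low:.2f},{ci_high:.2f})")
```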

16 Supplementary Table 5: Genotype counts and Hardy-Weinberg Equilibrium (HWE) P-values among cases and controls in the discovery set for all 198 SNPs taken into replication set. Chr. Cases Controls SNP Position a Alleles b Minor c Hets d Major e HWE P f Minor c Hets d Major e HWE P f Chr. 1 rs A/G rs A/G rs C/A rs G/A < rs G/A < Chr. 2 rs A/G rs A/G rs C/T rs G/A rs T/C rs A/G rs C/T < rs A/G rs C/T rs T/C rs T/C rs G/T rs A/G rs A/C < rs C/A Chr. 3 rs C/T < rs T/C rs C/A rs T/C rs G/A rs G/A rs T/C rs T/C rs C/T

17 rs T/C Chr. 4 rs A/G < rs T/C < rs C/A rs G/A rs C/T rs G/A rs G/T rs T/C rs C/T rs T/C rs T/C rs A/G rs T/C rs T/C rs T/C rs T/G rs A/G rs C/T rs C/T Chr. 5 rs C/A < rs T/C rs T/G rs G/A rs G/A rs A/G rs C/T rs A/G rs G/A rs T/C Chr. 6 rs G/A rs C/T rs C/T rs G/T rs A/G rs A/G rs A/G

18 rs T/C rs A/G rs G/A rs A/G rs G/A rs T/C rs A/G rs C/T rs A/G Chr. 7 rs C/T rs T/C rs C/T rs C/T rs G/A rs A/C rs G/A rs T/G rs G/A Chr. 8 rs C/T < <0.01 rs A/G rs A/G < rs A/G < rs T/G rs T/C rs C/A <0.01 rs G/A rs G/T rs A/G rs A/G Chr. 9 rs A/G rs A/G < rs T/C < <0.01 rs G/A Chr. 10 rs G/A rs C/A rs C/T

19 rs T/C rs T/C rs T/C rs C/T rs G/T rs G/A rs A/G rs G/A rs T/C rs C/A rs G/T rs C/T Chr. 11 rs G/A rs C/T rs C/T rs C/T rs A/G rs G/A rs T/C rs T/C rs C/T rs C/T rs G/T rs T/C rs A/G rs C/T rs G/A rs C/T rs T/C rs A/G rs C/A rs C/T rs G/A rs T/G rs T/C rs A/G Chr. 12 rs A/G

20 rs G/A Chr. 13 rs T/C rs A/G Chr. 14 rs T/C rs C/T rs C/T rs C/T Chr. 15 rs A/G rs T/C rs G/A rs G/A rs T/C rs G/A rs A/C rs G/A rs C/T rs A/C rs C/A rs C/T rs T/C rs G/A rs G/A rs A/G rs C/T rs C/T Chr. 16 rs G/T rs A/C rs G/A Chr. 17 rs G/A rs G/A rs G/A rs C/T rs A/G rs T/C <0.01 rs G/A

rs A/G rs G/A rs G/A rs G/A rs C/T rs A/G rs A/G rs G/A rs G/A rs A/G rs C/T Chr. 18 rs T/C Chr. 19 rs G/A rs A/G rs C/T rs T/C Chr. 20 rs T/C rs T/C rs T/G Chr. 21 rs C/T < rs G/T Chr. 23 rs A/G rs A/G rs G/A

a Genomic position based on NCBI Build 36
b Minor allele in combined case and control group listed first.
c Minor: Count of minor allele homozygous subjects
d Het: Count of heterozygous subjects
e Major: Count of major allele (more frequent allele) homozygous subjects
f P-value for HWE goodness-of-fit test
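A Hardy-Weinberg goodness-of-fit P-value can be derived from the three genotype counts reported per SNP (Minor, Hets, Major). The sketch below assumes a 1-degree-of-freedom chi-square test; the source says only "goodness-of-fit test", so the exact procedure used by the authors may differ. The function name and the example counts are hypothetical.

```python
import math

def hwe_chi_square(n_minor_hom, n_het, n_major_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Returns (chi-square statistic, P-value).
    """
    n = n_minor_hom + n_het + n_major_hom
    q = (2 * n_minor_hom + n_het) / (2 * n)  # minor allele frequency
    p = 1.0 - q
    if q == 0.0 or p == 0.0:
        raise ValueError("SNP is monomorphic; the HWE test is undefined")
    observed = (n_minor_hom, n_het, n_major_hom)
    expected = (n * q * q, 2 * n * p * q, n * p * p)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: the first set is exactly in equilibrium, the second has too few heterozygotes
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(40, 20, 40))
```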

Supplementary Table 6: Genotype counts and Hardy-Weinberg Equilibrium (HWE) P-values among cases and controls in the replication set for all SNPs successfully genotyped in replication.

Chr. Cases Controls SNP Position a Alleles b Minor c Hets d Major e HWE P f Minor c Hets d Major e HWE P f Chr. 1 rs A/G rs T/C g rs C/A rs G/A rs G/A Chr. 2 rs T/C g rs A/G rs C/T rs C/T g rs T/C rs T/C g rs C/T rs A/G rs C/T rs A/G g rs G/T rs T/C g rs A/C rs C/A Chr. 3 rs C/T rs T/C rs C/A rs T/C rs T/C g rs C/T g rs T/C < rs A/G g rs C/T rs T/C Chr. 4 rs A/G

23 rs T/C rs C/A rs C/T rs G/A rs G/T rs T/C rs C/T rs T/C rs T/C rs A/G rs T/C rs T/C rs T/C rs T/G rs T/C g rs C/T Chr. 5 rs G/T g rs A/G g rs A/C g rs G/A rs C/T g rs T/C g rs C/T rs T/C g rs C/T g rs A/G g Chr. 6 rs G/A rs C/T <0.01 rs C/T rs G/T rs A/G rs A/G rs T/C g rs A/G g rs A/G rs G/A rs T/C rs G/A g

24 rs T/C g Chr. 7 rs C/T rs T/C rs C/T rs G/A g rs G/A rs A/C rs G/A rs A/C g rs G/A Chr. 8 rs G/A g rs A/G rs A/G rs A/G rs T/G rs A/G g rs C/A rs G/A rs A/G < <0.01 rs A/G Chr. 9 rs A/G rs T/C g rs T/C rs C/T g Chr. 10 rs G/A rs C/A rs C/T rs T/C rs A/G g rs A/G g rs C/T rs G/T rs C/T g rs T/C g rs G/A rs C/A

25 rs G/T rs C/T Chr. 11 rs G/A rs C/T rs G/A g rs C/T rs T/C g rs C/T g rs A/G g rs A/G g rs G/A g rs C/T rs C/A g rs T/C rs C/T rs G/A rs C/T rs T/C g rs C/A rs C/T rs G/A rs T/G rs A/G g rs A/G Chr. 12 rs A/G rs G/A Chr. 13 rs A/G Chr. 14 rs T/C rs C/T rs C/T Chr. 15 rs T/C g rs A/G g rs C/T g rs T/C rs C/T g <

26 rs C/T g rs C/T rs A/C rs C/A rs C/T rs T/C rs G/A rs G/A rs A/G rs C/T rs C/T Chr. 16 rs G/T rs A/C < rs G/A Chr. 17 rs G/A rs C/T g rs C/T rs A/G rs T/C rs G/A rs A/G rs G/A rs G/A rs C/T g rs G/A g rs T/C g rs A/G rs G/A rs G/A rs T/C g rs G/A g Chr. 18 rs T/C Chr. 19 rs G/A rs A/G rs C/T rs A/G g

Chr. 20 rs T/C rs T/C rs T/G Chr. 21 rs C/T rs G/T Chr. 23 rs A/G rs G/A

a Genomic position based on NCBI Build 36
b Minor allele in combined case and control group listed first.
c Minor: Count of minor allele homozygous subjects
d Het: Count of heterozygous subjects
e Major: Count of major allele (more frequent allele) homozygous subjects
f P-value for HWE goodness-of-fit test
g Alleles listed are on the opposite strand from those from the GWAS panel listed in Supplementary Table 3

Supplementary Table 7: Specific IIP diagnosis information for replication samples.

                Genotyped (N=1027)            Included after QC (N=876)
                Sporadic a     Familial b     Sporadic a     Familial b
IPF             881 (86%)      32 (3%)        749 (86%)      25 (3%)
NSIP            66 (6%)        1 (<1%)        58 (7%)        1 (<1%)
COP             6 (1%)         0              5 (1%)         0
RB-ILD          3 (<1%)        0              2 (<1%)        0
DIP             6 (1%)         0              5 (1%)         0
Unclassified    31 (3%)        1 (<1%)        30 (3%)        1 (<1%)

a No known family history
b At least 2 affected relatives (3rd degree or closer)
QC: Quality control; IPF: Idiopathic pulmonary fibrosis; NSIP: Non-specific interstitial pneumonia; COP: Cryptogenic organizing pneumonia; RB-ILD: Respiratory bronchiolitis-associated interstitial lung disease; DIP: Desquamative interstitial pneumonia

29 Supplementary Table 8: Association information for SNPs chosen for replication among the GWAS cases (n=1127) and controls (n=2832) homozygous for the H1 haplotype (non-h2 carriers) on chromosome 17q21. Chr. SNP a Position b OR c 95% CI d P-value e Chr. 1 rs (1.09,1.39) 4.10e-04 rs (1.07,1.38) 1.44e-03 Chr. 2 rs (0.72,0.93) 2.53e-03 rs (1.18,1.65) 7.33e-04 rs (0.75,0.93) 5.88e-04 rs (0.68,0.88) 3.47e-04 rs (0.68,0.88) 3.27e-04 rs (1.07,1.38) 4.68e-04 Chr. 3 rs (1.05,1.39) 7.25e-03 rs (1.06,1.32) 3.06e-03 rs (0.72,0.96) 2.60e-02 rs (1.08,1.31) 1.56e-04 rs (1.2,1.47) 1.41e-06 rs (1.17,1.46) 3.27e-05 rs (1.18,1.47) 1.47e-05 rs (1.18,1.46) 2.12e-05 rs (0.77,0.97) 2.07e-02 Chr. 4 rs (0.69,0.91) 1.00e-03 rs (0.7,0.87) 2.67e-04 rs (1.17,1.47) 3.56e-05 rs (0.65,0.83) 2.01e-05 rs (0.68,0.84) 7.15e-06 rs (0.68,0.83) 3.57e-07 rs (0.69,0.86) 1.60e-05 rs (0.69,0.86) 1.35e-05 rs (0.69,0.86) 1.71e-05 rs (0.71,0.86) 2.39e-05 rs (1.17,1.44) 2.45e-06 rs (0.74,0.9) 1.21e-04 rs (0.68,0.84) 1.25e-05 rs (1.09,1.38) 1.64e-03 Chr. 5 rs (0.68,0.82) 3.13e-09 rs (0.69,0.87) 3.52e-05 rs (0.78,0.95) 9.33e-04 rs (1.09,1.38) 5.08e-04 rs (0.58,0.85) 1.99e-04 28

30 Chr. 6 rs (1.05,1.52) 1.15e-02 rs (1.09,1.41) 3.27e-03 rs (0.72,0.92) 8.17e-04 rs (1.29,1.58) 4.28e-11 rs (1.18,1.46) 2.04e-06 rs (0.69,0.88) 2.66e-04 rs (1.11,1.47) 2.24e-03 rs (1.1,1.46) 3.58e-03 rs (0.79,0.98) 9.43e-03 rs (1.02,1.25) 1.73e-02 rs (1.06,1.31) 1.64e-03 Chr. 7 rs (1.09,1.33) 4.74e-04 rs (1.12,1.36) 1.99e-04 rs (1.17,1.43) 6.20e-06 rs (1.08,1.33) 1.48e-03 rs (1.15,1.43) 3.12e-05 Chr. 8 rs (1.08,1.39) 2.63e-03 rs (1.06,1.36) 5.61e-03 rs (0.79,0.96) 1.26e-02 rs (0.79,0.96) 1.13e-02 rs (0.78,0.95) 9.25e-03 rs (0.74,0.92) 1.46e-03 Chr. 9 rs (1.05,1.29) 3.68e-03 Chr. 10 rs (0.7,0.86) 1.92e-05 rs (0.7,0.86) 1.87e-05 rs (1.12,1.4) 2.07e-04 rs (0.72,0.88) 1.57e-04 rs (0.72,0.88) 1.03e-05 rs (0.72,0.88) 1.19e-05 rs (0.71,0.87) 5.84e-06 rs (0.63,0.86) 1.20e-04 rs (0.63,0.86) 1.54e-04 rs (1.11,1.38) 5.49e-04 rs (1.1,1.36) 8.66e-04 Chr. 11 rs (1.1,1.34) 2.95e-04 rs (1.1,1.34) 2.78e-04 rs (1.24,1.51) 8.36e-09 rs (0.69,0.86) 1.05e-05 rs (0.64,0.83) 1.24e-05 rs (1.41,1.73) 1.21e-16 rs (1.4,1.72) 2.97e-16 rs (0.63,0.77) 2.37e-12 29

31 rs (0.64,0.8) 1.98e-10 rs (0.68,0.9) 1.55e-03 rs (0.58,0.71) 3.89e-17 rs (0.7,0.86) 5.02e-05 rs (0.68,0.84) 5.88e-06 rs (0.68,0.83) 4.78e-06 rs (0.69,0.84) 4.43e-06 rs (0.68,0.84) 8.05e-06 rs (0.59,0.79) 3.07e-08 rs (0.67,0.82) 7.44e-07 rs (0.71,0.87) 9.81e-05 rs (1.14,1.62) 1.14e-03 rs (1.11,1.36) 3.39e-04 rs (1.16,1.69) 1.38e-03 rs (1.16,1.69) 1.36e-03 Chr. 13 rs (0.68,0.87) 2.26e-05 Chr. 14 rs (1.08,1.4) 3.33e-03 rs (0.76,0.95) 1.52e-03 rs (1.1,1.34) 1.25e-04 Chr. 15 rs (0.72,0.88) 6.29e-05 rs (0.71,0.88) 5.27e-05 rs (0.77,0.94) 1.84e-03 rs (0.76,0.92) 4.47e-04 rs (1.13,1.38) 5.36e-05 rs (0.7,0.86) 3.22e-06 rs (0.77,0.94) 4.79e-03 rs (0.68,0.88) 2.21e-04 rs (1.11,1.36) 3.86e-04 rs (1.11,1.36) 3.40e-04 rs (1.12,1.37) 2.46e-04 rs (1.1,1.37) 1.32e-03 rs (1.11,1.37) 3.28e-04 rs (1.09,1.35) 8.27e-04 Chr. 16 rs (1.07,1.39) 6.65e-04 rs (1.1,1.63) 2.51e-04 Chr. 17 rs (0.82,1.04) 2.75e-01 rs (0.89,1.29) 5.54e-01 Chr. 18 rs (1.08,1.4) 2.42e-03 Chr. 19 rs (1.17,1.44) 3.09e-06 rs (1.14,1.41) 1.23e-05 rs (0.69,0.9) 2.20e-04 30

rs (1.08,1.32) 4.15e-04 Chr. 20 rs (0.76,0.93) 7.97e-04 rs (1.1,1.4) 2.64e-03 Chr. 23 rs (1.29,1.85) 2.00e-05 rs (1.3,1.83) 3.79e-06 rs (1.03,1.31) 3.44e-03

a SNPs with <5 homozygotes for the rare allele in either cases or controls excluded.
b Based on NCBI Build 36
c OR: Odds ratio for the minor allele
d CI: Confidence Interval
e Adjusted for sex

33 Supplementary Table 9: Association information for SNPs chosen for replication among the replication cases (n=617) and controls (n=1138) homozygous for the H1 haplotype on chromosome 17q21. Chr. SNP a Position b OR c 95% CI d P-value e Chr. 1 rs (0.67,1.01) 0.06 rs (0.9,1.27) 0.46 rs (0.94,1.34) 0.20 rs (0.73,1.21) 0.63 Chr. 2 rs (0.75,1.22) 0.72 rs (1.07,1.53) rs (0.77,1.24) 0.86 rs (0.89,1.2) 0.64 rs (0.85,1.21) 0.91 rs (0.84,1.2) 0.94 rs (0.81,1.42) 0.61 rs (0.44,1.31) 0.32 rs (0.62,1.4) 0.74 rs (0.71,1.53) 0.84 rs (0.73,1.56) 0.74 rs (0.71,1.52) 0.85 rs (0.78,1.31) 0.92 rs (0.86,1.24) 0.74 Chr. 3 rs (0.7,1.11) 0.28 rs (0.76,1.13) 0.46 rs (0.87,1.22) 0.72 rs (0.8,1.16) 0.69 rs (0.86,1.15) 0.97 rs (1.03,1.41) rs (1.16,1.6) rs (1.21,1.65) 1.65e-05 rs (1.22,1.67) 7.24e-06 rs (0.83,1.15) 0.77 Chr. 4 rs (0.98,3.75) rs (0.79,2.53) 0.24 rs (0.77,1.16) 0.58 rs (0.84,1.24) 0.83 rs (0.8,1.09) 0.38 rs (1.21,1.69) 3.52e-05 rs (0.79,1.11) 0.44 rs (0.85,1.15) 0.9 rs (0.79,1.05) 0.21 rs (0.74,1.01)

34 rs (0.75,1.02) rs (0.75,1.02) rs (0.84,1.13) 0.76 rs (0.88,1.18) 0.83 rs (0.84,1.12) 0.69 rs (0.84,1.14) 0.78 rs (0.7,1.01) Chr. 5 rs (0.63,0.84) 1.69e-05 rs (0.69,0.95) rs (0.9,1.21) 0.58 rs (0.73,1.1) 0.3 rs (0.76,1.09) 0.31 rs (0.8,1.21) 0.86 rs (0.64,0.96) 0.02 rs (0.66,1.0) rs (0.78,1.22) 0.82 Chr. 6 rs (0.96,1.4) 0.14 rs (0.74,1.02) rs (1.05,1.39) rs (0.92,1.25) 0.38 rs (0.71,1.0) rs (0.77,1.24) 0.86 rs (0.81,1.29) 0.88 rs (0.87,1.39) 0.43 rs (0.81,1.29) 0.85 rs (0.84,1.14) 0.78 rs (0.89,1.22) 0.63 rs (0.71,1.18) 0.51 Chr. 7 rs (0.86,1.14) 0.9 rs (0.92,1.57) 0.19 rs (0.78,1.33) 0.9 rs (0.82,1.27) 0.88 rs (1.06,1.42) rs (1.03,1.38) rs (0.84,1.16) 0.89 rs (0.88,1.22) 0.71 rs (0.79,1.18) 0.73 Chr. 8 rs (0.77,1.59) 0.59 rs (0.78,1.12) 0.44 rs (0.72,1.49) 0.84 rs (0.4,1.88) 0.71 rs (0.83,1.24) 0.9 rs (0.76,1.09) 0.31 rs (0.87,1.44)

35 rs (0.81,1.08) 0.35 rs (0.84,1.11) 0.62 rs (0.77,1.03) 0.12 Chr. 9 rs (0.51,1.77) 0.88 rs (0.57,2.14) 0.76 rs (0.62,1.31) 0.59 rs (0.82,1.1) 0.52 Chr. 10 rs (1.1,1.79) rs (0.8,1.21) 0.87 rs (0.8,1.21) 0.86 rs (0.87,1.17) 0.93 rs (0.87,1.17) 0.92 rs (0.76,1.16) 0.55 rs (0.82,1.14) 0.68 rs (0.69,0.93) rs (0.78,1.05) 0.18 rs (0.79,1.06) 0.23 rs (0.79,1.05) 0.21 rs (0.87,1.31) 0.52 rs (0.87,1.19) 0.83 rs (0.91,1.23) 0.47 Chr. 11 rs (1.02,1.36) rs (1.06,1.42) rs (0.97,1.3) 0.14 rs (0.92,1.25) 0.37 rs (0.83,1.18) 0.88 rs (0.65,1.09) 0.19 rs (1.33,1.78) 5.33e-09 rs (1.34,1.79) 3.19e-09 rs (0.74,0.98) rs (0.79,1.06) 0.25 rs (0.83,1.2) 1 rs (0.61,0.82) 3.65e-06 rs (0.78,1.04) 0.15 rs (0.78,1.04) 0.14 rs (0.72,0.96) rs (0.66,0.9) rs (0.71,0.95) rs (0.97,1.61) rs (0.9,1.19) 0.61 Chr. 12 rs (0.83,1.29) 0.76 rs (0.71,1.14) 0.38 Chr. 13 rs (0.66,0.93)

Chr. 14 rs (0.92,1.36) 0.25 rs (0.65,1.36) 0.73 rs (0.82,1.14) 0.71 Chr. 15 rs (0.67,0.91) rs (0.68,0.93) rs (0.71,0.96) rs (1.07,1.43) rs (0.7,0.92) rs (0.57,1.49) 0.74 rs (1.01,1.76) rs (0.89,1.28) 0.49 rs (0.9,1.2) 0.57 rs (0.88,1.17) 0.83 rs (0.89,1.18) 0.76 rs (0.88,1.21) 0.73 rs (0.98,1.32) rs (0.88,1.19) 0.75 rs (0.78,1.43) 0.73 rs (0.73,1.35) 0.97 Chr. 16 rs (0.79,1.16) 0.64 rs (0.85,1.47) 0.42 rs (0.82,1.39) 0.63 Chr. 17 rs (0.79,1.12) 0.49 rs (1.09,1.93) rs (1.08,1.66) Chr. 18 rs (0.89,1.29) 0.49 Chr. 19 rs (1.02,1.38) rs (0.98,1.32) rs (0.77,1.14) 0.52 rs (0.76,1.02) Chr. 20 rs (0.92,1.59) 0.17 rs (0.83,1.09) 0.48 rs (1.08,1.51) Chr. 21 rs (0.48,1.74) 0.78 rs (0.89,1.35) 0.39 Chr. 23 rs (0.9,1.56) 0.24 rs (0.88,1.24) 0.64

a SNPs with <5 homozygotes for the rare allele in either cases or controls excluded.
b Based on NCBI Build 36
c OR: Odds ratio for the minor allele
d CI: Confidence Interval
e Adjusted for sex

38 Supplementary Table 10: Association information for imputed SNPs across genome-wide significant loci. Cases Controls Chr SNP a Position b Allele A AA c AB c BB c AA c AB c BB c P-value d Chr. 3 rs A e-09 rs A e-09 rs C e-09 rs A e-09 rs A e-09 rs * C e-08 rs A e-08 rs C e-09 rs C e-09 rs C e-09 rs A e-09 rs * C e-09 rs * C e-09 rs C e-09 rs G e-09 Chr. 4 rs C e-08 rs * C e-08 rs C e-08 rs * C e-09 rs C e-09 rs C e-08 37

39 Chr. 5 rs C e-11 rs * A e-14 rs * C e-08 rs A e-09 Chr. 6 rs * G e-18 rs C e-10 rs * A e-09 rs A e-08 rs A e-08 Chr. 7 rs C e-08 rs C e-10 rs * A e-10 rs A e-13 Chr. 11 rs A e-14 rs G e-14 rs C e-14 rs C e-14 rs A e-14 rs * C e-14 rs * A e-08 rs C e-09 rs * T e-24 rs * T e-23 rs * C e-18 rs * C e-12 rs C e-11 38

40 rs868903* C e-25 rs A e-10 rs C e-12 rs * A e-08 rs A e-09 rs C e-09 rs C e-09 rs * C e-09 rs * A e-10 rs A e-19 rs A e-12 rs * C e-09 rs * T e-09 rs C e-10 rs G e-08 rs C e-08 rs * A e-08 rs * A e-10 rs C e-10 rs A e-10 rs G e-14 Chr. 15 rs * A e-08 rs * C e-08 rs A e-10 rs C e-10 rs A e-10 rs A e-10 rs A e-10 rs A e-10 39

rs * A e-10 rs C e-10 rs A e-10 rs A e-09 rs C e-08 rs A e-09 rs A e-08 Chr. 19 rs * A e-09 rs * A e-08

a SNPs that were genotyped as part of GWAS marked with *; imputed results presented for all SNPs to make p-values comparable between the imputed and genotyped SNPs.
b Based on NCBI Build 36
c Genotypic counts based on imputed genotypes
d P-value based on SNPTEST v2

42 Supplementary Table 11: Adjusted association information for all genome-wide significant SNPs in meta-analysis using joint genotypes from subset of GWAS cases, all replication cases and all replication controls. Joint Analysis a Joint Analysis Adjusted for top SNP b OR OR Gene (95% CI) P-value (95% CI) P-value GWAS P-value and Meta-analysis P-value < 5x10-8 Chr. 5p15 rs TERT 0.75 (0.677,0.822) 3.39e-09 N/A N/A Chr. 6p24 rs DSP 1.30 (1.184,1.431) 5.33e-08 N/A N/A Chr. 7q22 rs (1.082,1.315) 4.06e-04 N/A N/A Chr. 11p15 rs e-10 (0.666,0.810) (0.934,1.162) 0.46 rs MUC e-21 (1.459,1.778) (0.944,1.182) 0.34 rs MUC e-21 (1.464,1.784) (0.944,1.183) 0.34 rs MUC e-05 (0.747,0.906) (0.988,1.225) 0.08 rs e-03 (1.042,1.271) (0.846,1.054) 0.31 rs e-03 (0.782,0.957) (1.015,1.268) 0.03 rs e-06 (0.626,0.828) (0.766,1.034) 0.13 rs e-06 (0.716,0.877) (0.905,1.130) 0.85 Chr. 15q14-15 rs e-04 N/A N/A rs Chr. 17q21 rs rs rs Chr. 19p13 rs rs DISP2 MAPT MAPT MAPT DPP9 DPP9 (0.765,0.924) 0.85 (0.763,0.940) 0.69 (0.604,0.776) 0.69 (0.605,0.776) 0.70 (0.617,0.791) 1.35 (1.214,1.495) 1.31 (1.179,1.446) 1.83e (0.804,1.080) e-09 N/A N/A 2.72e e (0.171,2.704) 1.69 (0.720,3.939) e-08 N/A N/A 3.01e (0.641,1.208) 0.43 Joint Analysis Adjusted for age c OR (95% CI) P-value 0.76 (0.685,0.839) 1.29 (1.170,1.425) 1.18 (1.068,1.308) 0.75 (0.681,0.834) 1.57 (1.413,1.735) 1.57 (1.415,1.737) 0.85 (0.772,0.942) 1.10 (0.995,1.224) 0.90 (0.809,0.998) 0.75 (0.649,0.869) 0.80 (0.719,0.888) 0.84 (0.756,0.921) 0.84 (0.754,0.938) 0.71 (0.621,0.805) 0.71 (0.621,0.804) 0.72 (0.636,0.822) 1.32 (1.185,1.471) 1.28 (1.154,1.427) 7.49e e e e e e e e e e e e e e e e-06 41

43 GWAS (5x10-8 < P-value <.0001) and Meta-analysis (P-value < 5x10-8 ) Chr. 3q26 rs e-04 (1.089,1.335) (0.794,1.081) 0.33 rs MYNN e-08 (1.212,1.507) (0.708,1.250) 0.67 rs e-09 (1.231,1.524) (0.139,1.966) 0.34 rs LRRC (1.242,1.535) 2.32e-09 N/A N/A Chr. 4q22 rs FAM13A 1.32 (1.179,1.481) 1.66e-06 N/A N/A Chr. 5p15 rs TERT e-03 (0.742,0.928) (0.83,1.06) 0.32 Chr. 6p24 rs Chr. 10q24 rs rs rs Chr. 11p15 rs rs rs Chr. 13q34 rs Chr. 15q14-15 rs rs Chr. 17q21 rs rs rs rs rs rs DSP OBFC1 OBFC1 OBFC1 TOLLIP MUC5B ATP11A DISP2 IVD CRHR1, C17orf69 IMP5 KIAA1267 KIAA1267 KIAA (0.699,0.880) 0.84 (0.761,0.929) 0.86 (0.781,0.947) 0.86 (0.780,0.945) 1.17 (1.062,1.288) 0.78 (0.706,0.858) 0.82 (0.747,0.907) 0.80 (0.708,0.893) 0.84 (0.753,0.929) 1.14 (1.037,1.262) 0.77 (0.683,0.872) 0.69 (0.604,0.776) 0.76 (0.676,0.853) 0.69 (0.613,0.784) 0.67 (0.590,0.758) 0.68 (0.599,0.769) 3.60e e e (0.76,1.03) 0.88 (0.751,1.040) 1.25 (0.533,2.915) e-03 N/A N/A 1.49e e e (0.920,1.136) 1.03 (0.927,1.150) 1.10 (0.990,1.228) e-04 N/A N/A 8.26e e e e e e e e (0.80,1.07) 1.00 (0.87,1.17) 4.29 (2.315,7.940) 0.71 (0.173,2.882) 1.18 (0.920,1.506) 0.90 (0.553,1.462) 0.42 (0.190,0.938) 0.37 (0.127,1.104) e (1.052,1.300) 1.34 (1.197,1.502) 1.37 (1.228,1.532) 1.38 (1.238,1.543) 1.29 (1.144,1.451) 0.85 (0.753,0.950) 0.79 (0.703,0.892) 0.84 (0.758,0.933) 0.88 (0.799,0.976) 0.88 (0.798,0.974) 1.13 (1.025,1.254) 0.79 (0.713,0.872) 0.84 (0.759,0.928) 0.78 (0.695,0.885) 0.84 (0.749,0.930) 1.15 (1.041,1.276) 0.82 (0.721,0.930) 0.71 (0.622,0.806) 0.77 (0.682,0.869) 0.72 (0.635,0.820) 0.69 (0.607,0.786) 0.70 (0.616,0.798) 3.65e e e e e e e e e e e e e e e e e e e-08 42

rs NSF e (0.622,0.804) (0.778,1.507) (0.639,0.834) 3.48e-06 rs NSF e (0.625,0.809) (0.792,1.499) (0.647,0.846) 1.00e-05 rs WNT e (0.633,0.817) (0.800,1.381) (0.659,0.858) 2.20e-05

a Based on joint analysis of a subset of the GWAS cases (n=859) and all replication cases (n=876) compared to replication controls (n=1890) to allow for adjustment for rs , which is not on GWAS panel, and age; GWAS cases were re-genotyped for Supplemental Table 2 SNPs and rs using the same platform and at the same time as replication cases and controls.
b Each SNP was tested for association in a logistic regression model that also included the most highly associated SNP from the meta-analysis at that locus in addition to sex. The exception is chromosome 11p15, where each SNP was tested for association in a logistic regression model that also included rs
c Each SNP was tested for association in a logistic regression model that also included age in addition to sex.

Supplementary Figure 1: Quantile-Quantile (Q-Q) plot of observed vs. expected p-value distribution for GWAS across 439,828 high quality SNPs.
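A Q-Q plot of this kind pairs the sorted observed -log10 P-values with the quantiles expected under the null (uniform) distribution. The sketch below assumes the common (i - 0.5)/n plotting positions, which the source does not state. The function name and the toy P-values are hypothetical and are not taken from the study.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10 P pairs for a Q-Q plot.

    Both coordinates are sorted from most to least significant.
    P-values must be strictly greater than 0.
    """
    n = len(p_values)
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    # Expected quantiles under a uniform null, using (i + 0.5)/n for i = 0..n-1
    expected = [-math.log10((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, observed))

# Toy P-values for illustration only
for expected, observed in qq_points([0.5, 0.05, 0.005, 0.9]):
    print(f"expected = {expected:.3f}, observed = {observed:.3f}")
```

Points that lie above the diagonal at the upper end indicate more small P-values than expected by chance.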

Supplementary Figure 2: Locus-specific plots with genotyped (circles) and imputed (squares) SNP results (based on GWAS discovery cases and controls only) for 6 loci reaching genome-wide significance in the GWAS discovery analysis and meta-analysis of the discovery and replication results. All P values are based on imputation analysis for comparability of levels of statistical significance (see Online Methods). For each plot, the -log10 P values (y axis) of the SNPs are shown according to their chromosomal positions (x axis). The significant loci are on chromosomes 5p15 (a), 6p24 (b), 7q22 (c), 11p15 (d), 15q14-15 (e), and 19p13 (f). The estimated recombination rates (cM/Mb) from the HapMap Project (NCBI Build 36) are shown as light blue lines, and the genomic locations of genes within the regions of interest in the NCBI Build 36 human assembly are shown as arrows. SNPs shown in red, orange, green, light blue and blue have r² ≥ 0.8, r² ≥ 0.6, r² ≥ 0.4, r² ≥ 0.2 and r² < 0.2 with the most highly-associated SNP, respectively.
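The color coding in the caption can be read as a simple binning of each SNP's r² with the most highly-associated SNP. The sketch below makes that mapping explicit. The comparison operators are an assumption: the transcription lost them, and the thresholds are read here as "at least 0.8", "at least 0.6", and so on, down to "below 0.2". The function name `ld_color` is hypothetical.

```python
def ld_color(r2):
    """Map an r^2 value with the lead SNP to the color named in the caption.

    Assumption: each bin is inclusive at its lower bound.
    """
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("r^2 must lie between 0 and 1")
    if r2 >= 0.8:
        return "red"
    if r2 >= 0.6:
        return "orange"
    if r2 >= 0.4:
        return "green"
    if r2 >= 0.2:
        return "light blue"
    return "blue"


for value in (0.95, 0.65, 0.45, 0.25, 0.05):
    print(value, ld_color(value))
```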


Supplementary Figure 3: Locus-specific plots with genotyped (circles) and imputed (squares) SNP results (based on GWAS discovery cases and controls only) for 4 loci reaching genome-wide significance in the meta-analysis of the discovery and replication results. All P values are based on imputation analysis for comparability of levels of statistical significance (see Online Methods). For each plot, the -log10 P values (y axis) of the SNPs are shown according to their chromosomal positions (x axis). The significant loci are on chromosomes 3q26 (a), 4q22 (b), 10q24 (c), and 13q34 (d). The estimated recombination rates (cM/Mb) from the HapMap Project (NCBI Build 36) are shown as light blue lines, and the genomic locations of genes within the regions of interest in the NCBI Build 36 human assembly are shown as arrows. SNPs shown in red, orange, green, light blue and blue have r² ≥ 0.8, r² ≥ 0.6, r² ≥ 0.4, r² ≥ 0.2 and r² < 0.2 with the most highly-associated SNP, respectively.


Supplementary Figure 4: Linkage disequilibrium among the genome-wide significant SNPs at 11p15 and rs . Color is D′: red indicates a D′ estimate of 1, white a D′ estimate of 0. Numbers in squares correspond to r² × 100. Estimates are based on joint case and control genotypes as used in analyses for Table 3 and Supplementary Table 8.
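Because the figure reports two different linkage disequilibrium measures, a short worked sketch may help show how they can diverge. It computes D′ and r² for a pair of biallelic SNPs from the standard definitions, taking the haplotype frequency and the two allele frequencies as inputs. The input frequencies are invented for illustration and the function name `ld_measures` is hypothetical; in practice haplotype frequencies are estimated from genotype data, which this sketch does not attempt.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Illustrative frequencies: allele A only ever occurs together with allele B.
d_prime, r2 = ld_measures(p_ab=0.2, p_a=0.2, p_b=0.5)
print(f"D'={d_prime:.2f} r2={r2:.2f} (r2 x 100 = {100 * r2:.0f})")
```

In this example D′ equals 1 while r² is only 0.25, which is why a square in such a plot can be fully red yet carry a small number: D′ reaches 1 whenever one of the four haplotypes is absent, whereas r² also requires the allele frequencies to match.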

Supplementary Figure 1. Study design of a multi-stage GWAS of gout.

Supplementary Figure 1. Study design of a multi-stage GWAS of gout. Supplementary Figure 1. Study design of a multi-stage GWAS of gout. Supplementary Figure 2. Plot of the first two principal components from the analysis of the genome-wide study (after QC) combined with

More information

Familial Breast Cancer

Familial Breast Cancer Familial Breast Cancer SEARCHING THE GENES Samuel J. Haryono 1 Issues in HSBOC Spectrum of mutation testing in familial breast cancer Variant of BRCA vs mutation of BRCA Clinical guideline and management

More information

Nature Genetics: doi: /ng.3143

Nature Genetics: doi: /ng.3143 Supplementary Figure 1 Quantile-quantile plot of the association P values obtained in the discovery sample collection. The two clear outlying SNPs indicated for follow-up assessment are rs6841458 and rs7765379.

More information

Understanding genetic association studies. Peter Kamerman

Understanding genetic association studies. Peter Kamerman Understanding genetic association studies Peter Kamerman Outline CONCEPTS UNDERLYING GENETIC ASSOCIATION STUDIES Genetic concepts: - Underlying principals - Genetic variants - Linkage disequilibrium -

More information

H3A - Genome-Wide Association testing SOP

H3A - Genome-Wide Association testing SOP H3A - Genome-Wide Association testing SOP Introduction File format Strand errors Sample quality control Marker quality control Batch effects Population stratification Association testing Replication Meta

More information

Using the Association Workflow in Partek Genomics Suite

Using the Association Workflow in Partek Genomics Suite Using the Association Workflow in Partek Genomics Suite This user guide will illustrate the use of the Association workflow in Partek Genomics Suite (PGS) and discuss the basic functions available within

More information

Lecture 3: Introduction to the PLINK Software. Summer Institute in Statistical Genetics 2015

Lecture 3: Introduction to the PLINK Software. Summer Institute in Statistical Genetics 2015 Lecture 3: Introduction to the PLINK Software Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2015 1 / 1 PLINK Overview PLINK is a free, open-source whole genome association analysis

More information

Lecture 3: Introduction to the PLINK Software. Summer Institute in Statistical Genetics 2017

Lecture 3: Introduction to the PLINK Software. Summer Institute in Statistical Genetics 2017 Lecture 3: Introduction to the PLINK Software Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 20 PLINK Overview PLINK is a free, open-source whole genome

More information

Supplementary Figure 1. Linkage disequilibrium (LD) at the CDKN2A locus

Supplementary Figure 1. Linkage disequilibrium (LD) at the CDKN2A locus rs3731249 rs3731217 Supplementary Figure 1. Linkage disequilibrium (LD) at the CDKN2A locus. Minimal correlation was observed (r 2 =0.0007) in Hapmap CEU individuals between B ALL risk variants rs3731249

More information

Analysis of genome-wide genotype data

Analysis of genome-wide genotype data Analysis of genome-wide genotype data Acknowledgement: Several slides based on a lecture course given by Jonathan Marchini & Chris Spencer, Cape Town 2007 Introduction & definitions - Allele: A version

More information

Supplementary Figures

Supplementary Figures 1 Supplementary Figures exm26442 2.40 2.20 2.00 1.80 Norm Intensity (B) 1.60 1.40 1.20 1 0.80 0.60 0.40 0.20 2 0-0.20 0 0.20 0.40 0.60 0.80 1 1.20 1.40 1.60 1.80 2.00 2.20 2.40 2.60 2.80 Norm Intensity

More information

The Hardy-Weinberg Principle. Essential Learning Objectives 1.A.1 (g) and 1.A.1 (h)

The Hardy-Weinberg Principle. Essential Learning Objectives 1.A.1 (g) and 1.A.1 (h) The Hardy-Weinberg Principle Essential Learning Objectives 1.A.1 (g) and 1.A.1 (h) Evolution of Populations Individuals do not evolve, but rather, populations evolve Scientists use mathematical models

More information

A candidate gene study of the type I interferon pathway implicates IKBKE and IL8 as risk loci for SLE

A candidate gene study of the type I interferon pathway implicates IKBKE and IL8 as risk loci for SLE A candidate gene study of the type I interferon pathway implicates IKBKE and IL8 as risk loci for SLE Johanna K. Sandling, Sophie Garnier, Snaevar Sigurdsson, Chuan Wang, Gunnel Nordmark, Iva Gunnarsson,

More information

THE HEALTH AND RETIREMENT STUDY: GENETIC DATA UPDATE

THE HEALTH AND RETIREMENT STUDY: GENETIC DATA UPDATE : GENETIC DATA UPDATE April 30, 2014 Biomarker Network Meeting PAA Jessica Faul, Ph.D., M.P.H. Health and Retirement Study Survey Research Center Institute for Social Research University of Michigan HRS

More information

Supplementary Figure 1 a

Supplementary Figure 1 a Supplementary Figure 1 a b GWAS second stage log 10 observed P 0 2 4 6 8 10 12 0 1 2 3 4 log 10 expected P rs3077 (P hetero =0.84) GWAS second stage (BBJ, Japan) First replication (BBJ, Japan) Second replication

More information

Supplementary Figure 2.Quantile quantile plots (QQ) of the exome sequencing results Chi square was used to test the association between genetic

Supplementary Figure 2.Quantile quantile plots (QQ) of the exome sequencing results Chi square was used to test the association between genetic SUPPLEMENTARY INFORMATION Supplementary Figure 1.Description of the study design The samples in the initial stage (China cohort, exome sequencing) including 216 AMD cases and 1,553 controls were from the

More information

PLINK gplink Haploview

PLINK gplink Haploview PLINK gplink Haploview Whole genome association software tutorial Shaun Purcell Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA Broad Institute of Harvard & MIT, Cambridge,

More information

Genome-wide association study identifies a susceptibility locus for HCVinduced hepatocellular carcinoma. Supplementary Information

Genome-wide association study identifies a susceptibility locus for HCVinduced hepatocellular carcinoma. Supplementary Information Genome-wide association study identifies a susceptibility locus for HCVinduced hepatocellular carcinoma Vinod Kumar 1,2, Naoya Kato 3, Yuji Urabe 1, Atsushi Takahashi 2, Ryosuke Muroyama 3, Naoya Hosono

More information

Supplementary Information. Werner Koch, Petra Hoppmann, Jakob C. Mueller, Albert Schömig & Adnan Kastrati

Supplementary Information. Werner Koch, Petra Hoppmann, Jakob C. Mueller, Albert Schömig & Adnan Kastrati Supplementary Information Werner Koch, Petra Hoppmann, Jakob C. Mueller, Albert Schömig & Adnan Kastrati The Supplementary Information has the following sections in order: 1. Supplementary Methods 2. Supplementary

More information

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011 EPIB 668 Genetic association studies Aurélie LABBE - Winter 2011 1 / 71 OUTLINE Linkage vs association Linkage disequilibrium Case control studies Family-based association 2 / 71 RECAP ON GENETIC VARIANTS

More information

Supplementary table 1. Study design

Supplementary table 1. Study design Supplementary table 1. Study design Population GWAS genotyping platform N Case/Controls After genotyping quality controls Genotyped SNPs Analyzed SNPs (overlapping between populations) Statistical Power*

More information

Supplementary table 1: List of sequences of primers used in sequenom assay

Supplementary table 1: List of sequences of primers used in sequenom assay Supplementary table 1: List of sequences of primers used in sequenom assay SNP_ID 2nd-PCRP Sequence 1st-PCRP Sequence Allele specific (iplex) iplex primer primer Direction ROCK2 1 rs978906 ACGTTGGATGATAAAGCTCTCTCGGCAGTC

More information

Wu et al., Determination of genetic identity in therapeutic chimeric states. We used two approaches for identifying potentially suitable deletion loci

Wu et al., Determination of genetic identity in therapeutic chimeric states. We used two approaches for identifying potentially suitable deletion loci SUPPLEMENTARY METHODS AND DATA General strategy for identifying deletion loci We used two approaches for identifying potentially suitable deletion loci for PDP-FISH analysis. In the first approach, we

More information

Data cleaning for HGDP45 was performed as in Conrad et al. (2006) (Figure S1). Genotyping of

Data cleaning for HGDP45 was performed as in Conrad et al. (2006) (Figure S1). Genotyping of Supplementary Methods Data cleaning Data cleaning for HGDP45 was performed as in Conrad et al. (2006) (Figure S1). Genotyping of 48 SNPs was attempted; three SNPs did not pass the quality checks of Conrad

More information

Office Hours. We will try to find a time

Office Hours.   We will try to find a time Office Hours We will try to find a time If you haven t done so yet, please mark times when you are available at: https://tinyurl.com/666-office-hours Thanks! Hardy Weinberg Equilibrium Biostatistics 666

More information

Genetic data concepts and tests

Genetic data concepts and tests Genetic data concepts and tests Cavan Reilly September 21, 2018 Table of contents Overview Linkage disequilibrium Quantifying LD Heatmap for LD Hardy-Weinberg equilibrium Genotyping errors Population substructure

More information

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics Genetic Variation and Genome- Wide Association Studies Keyan Salari, MD/PhD Candidate Department of Genetics How many of you did the readings before class? A. Yes, of course! B. Started, but didn t get

More information

Algorithms for Genetics: Introduction, and sources of variation

Algorithms for Genetics: Introduction, and sources of variation Algorithms for Genetics: Introduction, and sources of variation Scribe: David Dean Instructor: Vineet Bafna 1 Terms Genotype: the genetic makeup of an individual. For example, we may refer to an individual

More information

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros DNA Collection Data Quality Control Suzanne M. Leal Baylor College of Medicine sleal@bcm.edu Copyrighted S.M. Leal 2016 Blood samples For unlimited supply of DNA Transformed cell lines Buccal Swabs Small

More information

Genome-wide association study identifies five loci associated with susceptibility to pancreatic cancer in Chinese populations. Supplementary Materials

Genome-wide association study identifies five loci associated with susceptibility to pancreatic cancer in Chinese populations. Supplementary Materials Genome-wide association study identifies five loci associated with susceptibility to pancreatic cancer in Chinese populations Supplementary Materials Chen Wu 1, 22, Xiaoping Miao 2, 22, Liming Huang 1,

More information

Why do we need statistics to study genetics and evolution?

Why do we need statistics to study genetics and evolution? Why do we need statistics to study genetics and evolution? 1. Mapping traits to the genome [Linkage maps (incl. QTLs), LOD] 2. Quantifying genetic basis of complex traits [Concordance, heritability] 3.

More information

Supplementary Figure 1. Quantile quantile plot for the combined analysis of cohorts 1 and 2.

Supplementary Figure 1. Quantile quantile plot for the combined analysis of cohorts 1 and 2. Supplementary Figure 1 Quantile quantile plot for the combined analysis of cohorts 1 and 2. Quantile quantile plot of the observed log 10 (P values) versus the expectation under the null hypothesis. Data

More information

INTRODUCTION TO MOLECULAR GENETICS. Andrew McQuillin Molecular Psychiatry Laboratory UCL Division of Psychiatry 22 Sept 2017

INTRODUCTION TO MOLECULAR GENETICS. Andrew McQuillin Molecular Psychiatry Laboratory UCL Division of Psychiatry 22 Sept 2017 INTRODUCTION TO MOLECULAR GENETICS Andrew McQuillin Molecular Psychiatry Laboratory UCL Division of Psychiatry 22 Sept 2017 Learning Objectives Understand: The distinction between Quantitative Genetic

More information

Topics in Statistical Genetics

Topics in Statistical Genetics Topics in Statistical Genetics INSIGHT Bioinformatics Webinar 2 August 22 nd 2018 Presented by Cavan Reilly, Ph.D. & Brad Sherman, M.S. 1 Recap of webinar 1 concepts DNA is used to make proteins and proteins

More information

1b. How do people differ genetically?

1b. How do people differ genetically? 1b. How do people differ genetically? Define: a. Gene b. Locus c. Allele Where would a locus be if it was named "9q34.2" Terminology Gene - Sequence of DNA that code for a particular product Locus - Site

More information

Supplementary Information

Supplementary Information Supplementary Information Two new susceptibility loci for Kawasaki disease identified through genome-wide association analysis Yi-Ching Lee 1,2, Ho-Chang Kuo 3,4, Jeng-Sheng Chang,Luan-Yin Chang 6,1, Li-Min

More information

B) You can conclude that A 1 is identical by descent. Notice that A2 had to come from the father (and therefore, A1 is maternal in both cases).

B) You can conclude that A 1 is identical by descent. Notice that A2 had to come from the father (and therefore, A1 is maternal in both cases). Homework questions. Please provide your answers on a separate sheet. Examine the following pedigree. A 1,2 B 1,2 A 1,3 B 1,3 A 1,2 B 1,2 A 1,2 B 1,3 1. (1 point) The A 1 alleles in the two brothers are

More information

4.1.1 Association of SNP variants within PARK2-PACRG gene regulatory region with leprosy susceptibility sharing chromosomal region 6q26

4.1.1 Association of SNP variants within PARK2-PACRG gene regulatory region with leprosy susceptibility sharing chromosomal region 6q26 4.1 GENETIC VARIATIONS IN PARK2 AND PACRG GENE REGULATORY REGIONS AND THEIR INTERACTION WITH IMPORTANT IMMUNO-REGULATORY GENES IN THE OUTCOME OF LEPROSY PARK2 and PACRG gene regulatory region was saturated

More information

Two-locus models. Two-locus models. Two-locus models. Two-locus models. Consider two loci, A and B, each with two alleles:

Two-locus models. Two-locus models. Two-locus models. Two-locus models. Consider two loci, A and B, each with two alleles: The human genome has ~30,000 genes. Drosophila contains ~10,000 genes. Bacteria contain thousands of genes. Even viruses contain dozens of genes. Clearly, one-locus models are oversimplifications. Unfortunately,

More information

(c) Suppose we add to our analysis another locus with j alleles. How many haplotypes are possible between the two sites?

(c) Suppose we add to our analysis another locus with j alleles. How many haplotypes are possible between the two sites? OEB 242 Midterm Review Practice Problems (1) Loci, Alleles, Genotypes, Haplotypes (a) Define each of these terms. (b) We used the expression!!, which is equal to!!!!!!!! and represents sampling without

More information

Fingerlin et al. BMC Genetics (2016) 17:74 DOI /s

Fingerlin et al. BMC Genetics (2016) 17:74 DOI /s Fingerlin et al. BMC Genetics (2016) 17:74 DOI 10.1186/s12863-016-0377-2 RESEARCH ARTICLE Genome-wide imputation study identifies novel HLA locus for pulmonary fibrosis and potential role for auto-immunity

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Eigenvector plots for the three GWAS including subpopulations from the NCI scan.

Nature Genetics: doi: /ng Supplementary Figure 1. Eigenvector plots for the three GWAS including subpopulations from the NCI scan. Supplementary Figure 1 Eigenvector plots for the three GWAS including subpopulations from the NCI scan. The NCI subpopulations are as follows: NITC, Nutrition Intervention Trial Cohort; SHNX, Shanxi Cancer

More information

Let s call the recessive allele r and the dominant allele R. The allele and genotype frequencies in the next generation are:

Let s call the recessive allele r and the dominant allele R. The allele and genotype frequencies in the next generation are: Problem Set 8 Genetics 371 Winter 2010 1. In a population exhibiting Hardy-Weinberg equilibrium, 23% of the individuals are homozygous for a recessive character. What will the genotypic, phenotypic and

More information

Introduction to statistics for Genome- Wide Association Studies (GWAS) Day 2 Section 8

Introduction to statistics for Genome- Wide Association Studies (GWAS) Day 2 Section 8 Introduction to statistics for Genome- Wide Association Studies (GWAS) 1 Outline Background on GWAS Presentation of GenABEL Data checking with GenABEL Data analysis with GenABEL Display of results 2 R

More information

>3 Sequencing coverage (x)

>3 Sequencing coverage (x) Number of rice hybrids 400 300 200 100 0 0.5-1 1-1.5 1.5-2 2-2.5 2.5-3 >3 Sequencing coverage (x) Supplementary Figure 1 Sequencing coverage of 1495 rice hybrids. The genomes of 1495 hybrid varieties were

More information

An Introduction to Population Genetics

An Introduction to Population Genetics An Introduction to Population Genetics THEORY AND APPLICATIONS f 2 A (1 ) E 1 D [ ] = + 2M ES [ ] fa fa = 1 sf a Rasmus Nielsen Montgomery Slatkin Sinauer Associates, Inc. Publishers Sunderland, Massachusetts

More information

Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies

Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies Lisa J. Strug, PhD Guest Lecturer Biosta)s)cs Laboratory Course (CHL5207/8) March 5, 2015 Gene Mapping in the News Study Finds Gene

More information

Genome-wide analyses in admixed populations: Challenges and opportunities

Genome-wide analyses in admixed populations: Challenges and opportunities Genome-wide analyses in admixed populations: Challenges and opportunities E-mail: esteban.parra@utoronto.ca Esteban J. Parra, Ph.D. Admixed populations: an invaluable resource to study the genetics of

More information

Genes in Populations: Hardy Weinberg Equilibrium. Biostatistics 666

Genes in Populations: Hardy Weinberg Equilibrium. Biostatistics 666 Genes in Poulations: Hardy Weinberg Equilibrium Biostatistics 666 Previous Lecture: Primer In Genetics How information is stored in DNA How DNA is inherited Tyes of DNA variation Common designs for genetic

More information

Supporting Information

Supporting Information Supporting Information De Jager et al. 10.1073/pnas.0813310106 SI Methods Genotyping. In the initial screen of the Brigham and Women s Hospital samples, SNPs were genotyped using the iplex Sequenom MassARRAY

More information

Supplementary Fig. 1. Location of top two candidemia associated SNPs in CD58 gene

Supplementary Fig. 1. Location of top two candidemia associated SNPs in CD58 gene Supplementary Figures Supplementary Fig. 1. Location of top two candidemia associated SNPs in CD58 gene locus. The region encompass CD58 and three long non-coding RNAs (RP4-655J12.4, RP5-1086K13.1 and

More information

Module 2: Introduction to PLINK and Quality Control

Module 2: Introduction to PLINK and Quality Control Module 2: Introduction to PLINK and Quality Control 1 Introduction to PLINK 2 Quality Control 1 Introduction to PLINK 2 Quality Control Single Nucleotide Polymorphism (SNP) A SNP (pronounced snip) is a

More information

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics S G ection ON tatistical enetics Design and Analysis of Genetic Association Studies Hemant K Tiwari, Ph.D. Professor & Head Section on Statistical Genetics Department of Biostatistics School of Public

More information

Reviewers' comments: Reviewer #1 (Remarks to the Author):

Reviewers' comments: Reviewer #1 (Remarks to the Author): Reviewers' comments: Reviewer #1 (Remarks to the Author): This is an interesting paper and a demonstration that diversity in the allelic spectrum, such as those in founder populations, can be leveraged

More information

Association studies (Linkage disequilibrium)

Association studies (Linkage disequilibrium) Positional cloning: statistical approaches to gene mapping, i.e. locating genes on the genome Linkage analysis Association studies (Linkage disequilibrium) Linkage analysis Uses a genetic marker map (a

More information

Mapping and Mapping Populations

Mapping and Mapping Populations Mapping and Mapping Populations Types of mapping populations F 2 o Two F 1 individuals are intermated Backcross o Cross of a recurrent parent to a F 1 Recombinant Inbred Lines (RILs; F 2 -derived lines)

More information

Derrek Paul Hibar

Derrek Paul Hibar Derrek Paul Hibar derrek.hibar@ini.usc.edu Obtain the ADNI Genetic Data Quality Control Procedures Missingness Testing for relatedness Minor allele frequency (MAF) Hardy-Weinberg Equilibrium (HWE) Testing

More information

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work Genome 371, 1 March 2010, Lecture 13 Association Mapping Mendelian versus Complex Phenotypes How to Perform an Association Study Why Association Studies (Can) Work Introduction to LOD score analysis Common

More information

What is genetic variation?

What is genetic variation? enetic Variation Applied Computational enomics, Lecture 05 https://github.com/quinlan-lab/applied-computational-genomics Aaron Quinlan Departments of Human enetics and Biomedical Informatics USTAR Center

More information

Genome wide association studies. How do we know there is genetics involved in the disease susceptibility?

Genome wide association studies. How do we know there is genetics involved in the disease susceptibility? Outline Genome wide association studies Helga Westerlind, PhD About GWAS/Complex diseases How to GWAS Imputation What is a genome wide association study? Why are we doing them? How do we know there is

More information

Hardy-Weinberg Principle

Hardy-Weinberg Principle Name: Hardy-Weinberg Principle In 1908, two scientists, Godfrey H. Hardy, an English mathematician, and Wilhelm Weinberg, a German physician, independently worked out a mathematical relationship that related

More information

SNP calling. Jose Blanca COMAV institute bioinf.comav.upv.es

SNP calling. Jose Blanca COMAV institute bioinf.comav.upv.es SNP calling Jose Blanca COMAV institute bioinf.comav.upv.es SNP calling Genotype matrix Genotype matrix: Samples x SNPs SNPs and errors A change in a read may due to: Sample contamination Cloning or PCR

More information

COVER PAGE SAMPLE DNA GENOTYPING FINAL REPORT. Date. Month XX, 20XX. Reviewed & approved by: Steve Granger, Ph.D, Chief Scientific Officer

COVER PAGE SAMPLE DNA GENOTYPING FINAL REPORT. Date. Month XX, 20XX. Reviewed & approved by: Steve Granger, Ph.D, Chief Scientific Officer 5962 La Place Court, Suite 275, Carlsbad, CA 92008 USA Tel 800.790.2258 Fax 760.448.5397 www.salimetrics.com COVER PAGE Project Name XXXXXXXXXXXXX Document Final Report Analysis Genotyping Researcher Name

More information

Human Genetics and Gene Mapping of Complex Traits

Human Genetics and Gene Mapping of Complex Traits Human Genetics and Gene Mapping of Complex Traits Advanced Genetics, Spring 2015 Human Genetics Series Thursday 4/02/15 Nancy L. Saccone, nlims@genetics.wustl.edu ancestral chromosome present day chromosomes:

More information

Statistical challenges to genome-wide association study

Statistical challenges to genome-wide association study 1 Statistical challenges to genome-wide association study Naoyuki Kamatani, M.D., Ph.D. 1. Director and Professor, Institute of Rheumatology, Tokyo Women s Medical University 2. Director, Medical Informatics

More information

Imputation. Genetics of Human Complex Traits

Imputation. Genetics of Human Complex Traits Genetics of Human Complex Traits GWAS results Manhattan plot x-axis: chromosomal position y-axis: -log 10 (p-value), so p = 1 x 10-8 is plotted at y = 8 p = 5 x 10-8 is plotted at y = 7.3 Advanced Genetics,

More information

Genotype quality control with plinkqc Hannah Meyer

Genotype quality control with plinkqc Hannah Meyer Genotype quality control with plinkqc Hannah Meyer 219-3-1 Contents Introduction 1 Per-individual quality control....................................... 2 Per-marker quality control.........................................

More information

Whole Genome Sequencing. Biostatistics 666

Whole Genome Sequencing. Biostatistics 666 Whole Genome Sequencing Biostatistics 666 Genomewide Association Studies Survey 500,000 SNPs in a large sample An effective way to skim the genome and find common variants associated with a trait of interest

More information

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1 Human SNP haplotypes Statistics 246, Spring 2002 Week 15, Lecture 1 Human single nucleotide polymorphisms The majority of human sequence variation is due to substitutions that have occurred once in the

More information

CUMACH - A Fast GPU-based Genotype Imputation Tool. Agatha Hu

CUMACH - A Fast GPU-based Genotype Imputation Tool. Agatha Hu CUMACH - A Fast GPU-based Genotype Imputation Tool Agatha Hu ahu@nvidia.com Term explanation Figure resource: http://en.wikipedia.org/wiki/genotype Allele: one of two or more forms of a gene or a genetic

More information

Cross Haplotype Sharing Statistic: Haplotype length based method for whole genome association testing

Cross Haplotype Sharing Statistic: Haplotype length based method for whole genome association testing Cross Haplotype Sharing Statistic: Haplotype length based method for whole genome association testing André R. de Vries a, Ilja M. Nolte b, Geert T. Spijker c, Dumitru Brinza d, Alexander Zelikovsky d,

More information

GENOME WIDE ASSOCIATION STUDY OF INSECT BITE HYPERSENSITIVITY IN TWO POPULATION OF ICELANDIC HORSES

GENOME WIDE ASSOCIATION STUDY OF INSECT BITE HYPERSENSITIVITY IN TWO POPULATION OF ICELANDIC HORSES GENOME WIDE ASSOCIATION STUDY OF INSECT BITE HYPERSENSITIVITY IN TWO POPULATION OF ICELANDIC HORSES Merina Shrestha, Anouk Schurink, Susanne Eriksson, Lisa Andersson, Tomas Bergström, Bart Ducro, Gabriella

More information

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations Topics How to track evolution allele frequencies Hardy Weinberg principle applications Requirements for genetic equilibrium Types of natural selection Population genetic polymorphism in populations, pp.

More information

Genome-wide association studies (GWAS) Part 1

Genome-wide association studies (GWAS) Part 1 Genome-wide association studies (GWAS) Part 1 Matti Pirinen FIMM, University of Helsinki 03.12.2013, Kumpula Campus FIMM - Institiute for Molecular Medicine Finland www.fimm.fi Published Genome-Wide Associations

More information

Prioritization: from vcf to finding the causative gene

Prioritization: from vcf to finding the causative gene Prioritization: from vcf to finding the causative gene vcf file making sense A vcf file from an exome sequencing project may easily contain 40-50 thousand variants. In order to optimize the search for

More information

