Genome-wide association study identifies five loci associated with susceptibility to pancreatic cancer in Chinese populations. Supplementary Materials


Chen Wu 1,22, Xiaoping Miao 2,22, Liming Huang 1, Xu Che 3, Guoliang Jiang 4, Dianke Yu 1, Xianghong Yang 5, Guangwen Cao 6, Zhibin Hu 7, Yongjian Zhou 8, Chaohui Zuo 9, Chunyou Wang 10, Xianghong Zhang 11, Yifeng Zhou 12, Xianjun Yu 13, Wanjin Dai 5, Zhaoshen Li 14, Hongbing Shen 7, Luming Liu 15, Yanling Chen 16, Sheng Zhang 17, Xiaoqi Wang 18, Kan Zhai 1, Jiang Chang 1, Yu Liu 1, Menghong Sun 19, Wei Cao 5, Jun Gao 14, Ying Ma 5, Xiongwei Zheng 20, Siu Tim Cheung 18, Yongfeng Jia 21, Jian Xu 1, Wen Tan 1, Ping Zhao 3, Tangchun Wu 2, Chengfeng Wang 3,23, and Dongxin Lin 1,23

Correspondence should be addressed to D. Lin or C. Wang.

1 State Key Laboratory of Molecular Oncology and 3 Department of Abdominal Surgery, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing;
2 Key Laboratory for Environment and Health, School of Public Health and 10 Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, all in Wuhan, Hubei;
4 Departments of Radiation Oncology, 13 Pancreas & Hepatobiliary Surgery, 15 Integrative Oncology and 19 Pathology, Cancer Hospital, Fudan University, Shanghai;
5 Department of Pathology, Shengjing Hospital, China Medical University, Shenyang, Liaoning;
6 Department of Epidemiology, Second Military Medical University and 14 Department of Gastroenterology, the First Affiliated Hospital, Second Military Medical University, all in Shanghai;
7 Department of Epidemiology and Biostatistics, Cancer Center, Nanjing Medical University, Nanjing, Jiangsu;
8 Departments of Gastrointestinal Surgery and 16 Hepatobiliary & Pancreatic Surgery, Union Hospital of Fujian Medical University, and 17 Department of Pathology, the First Affiliated Hospital of Fujian Medical University, all in Fuzhou, Fujian;
9 Department of Gastroduodenal & Pancreatic Surgery, Hunan Province Tumor Hospital, Changsha, Hunan;
11 Department of Experimental Pathology, Hebei Medical University, Shijiazhuang, Hebei;
12 Laboratory of Cancer Molecular Genetics, Medical College of Soochow University, Suzhou, Jiangsu;
18 Department of Surgery, The University of Hong Kong, Hong Kong;
20 Department of Pathology, Fujian Provincial Cancer Hospital, Fuzhou, Fujian;
21 Department of Pathology, Affiliated Hospital, Inner Mongolia School of Medicine, Huhhot, Inner Mongolia.

22 These authors contributed equally to this article.
23 These authors jointly directed this work.

Supplementary Figure 1. Plots for genetic matching of three principal components derived from the PCA of 981 cases with pancreatic ductal adenocarcinoma, 1,991 controls, and 206 unrelated HapMap individuals. (a) PC1 versus PC2 for cases and controls, (b) PC1 versus PC3 for cases and controls, (c) PC2 versus PC3 for cases and controls, and (d) PC1 versus PC2 for cases, controls and HapMap individuals including 57 YRIs, 60 CEUs, 44 JPTs, and 45 CHBs. The case-control matching and the low lambda GC value (1.059) indicated minimal population stratification.
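The lambda GC value quoted in the legend is a genomic inflation factor. As a minimal sketch, assuming the usual definition (median observed 1-df chi-square statistic divided by the median of the null chi-square distribution), it could be computed from a list of association P values as follows. The function name `genomic_inflation` and the evenly spaced input are illustrative assumptions, not the authors' code or data.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Return lambda GC for two-sided P values in the open interval (0, 1)."""
    norm = NormalDist()
    # A two-sided P value corresponds to a 1-df chi-square statistic z**2,
    # where z is the standard normal quantile at 1 - p/2.
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical, evenly spaced P values standing in for a scan with no inflation.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(round(genomic_inflation(null_like), 3))
```

A value close to 1, such as the 1.059 reported here, means the bulk of the test statistics follow the null distribution.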

Supplementary Figure 2. Quantile-Quantile plot of observed P values for association. The red circles represent the distribution of P values for the association of 666,141 autosomal SNPs in 981 cases with pancreatic ductal adenocarcinoma and 1,991 controls.
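For readers who want to reproduce a plot of this kind, the sketch below builds the coordinates of a quantile-quantile plot on the -log10 scale. It assumes the common convention that the i-th smallest of n P values is paired with the expected value i/(n + 1); the function name `qq_points` and the four input values are illustrative only.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, smallest P value first."""
    observed = sorted(p_values)
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        # Expected value of the rank-th order statistic of n uniform draws.
        expected = rank / (n + 1)
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Hypothetical P values; a real scan would supply one value per SNP.
pts = qq_points([0.5, 0.01, 0.2, 0.9])
for expected, observed in pts:
    print(f"{expected:.3f}\t{observed:.3f}")
```

Points lying on the diagonal indicate agreement with the null distribution; departures in the upper tail mark the strongest associations.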

Supplementary Table 1. Associations of SNPs with pancreatic cancer risk at P < 10^-6 in the discovery phase

SNP   Minor allele   Chr.   Position   Gene   Location   MAF (Cases)   MAF (Controls)   OR* (95% CI)   P   OR (95% CI)   P
rs    C    BACH1      3'-UTR       ( )   ( )
rs    G    MIPEPP2    downstream   ( )   ( )
rs    G    BACH1      intron       ( )   ( )
rs    G    UQCRFS1    downstream   ( )   ( )
rs    A    BACH1      intron       ( )   ( )
rs    T    BACH1      intron       ( )   ( )
rs    T    BACH1      intron       ( )   ( )
rs    A    BACH1      intron       ( )   ( )
rs    T    LOC        upstream     ( )   ( )
rs    A    BACH1      intron       ( )   ( )
rs    G    DAB2       intron       ( )   ( )
rs    C    LOC        downstream   ( )   ( )
rs    G    ETAA1      upstream     ( )   ( )
rs    A    BAI3       upstream     ( )   ( )
rs    C    DAB2       intron       ( )   ( )
rs    A    SLC1A1     upstream     ( )   ( )
rs    C    BACH1      intron       ( )   ( )
rs    T    PRLHR      downstream   ( )   ( )
rs    G    TFF1       downstream   ( )   ( )
rs    G    ADAMTS1    downstream   ( )   ( )
rs    C    ZNF678     upstream     ( )   ( )
rs    G    LAMA3      upstream     ( )   ( )
rs    T    ARID1B     upstream     ( )   ( )
rs    C    NA         NA           ( )   ( )
rs    T    DAB2       intron       ( )   ( )
rs    C    DAB2       intron       ( )   ( )
rs    G    RBMX       upstream     ( )   ( )
rs    G    NA         NA           ( )   ( )
rs    G    HNRNPAI    downstream   ( )   ( )
rs    A    TFRC       downstream   ( )   ( )
rs    G    FAM19A5    upstream     ( )   ( )
rs    A    DGKH       intron       ( )   ( )
rs    T    LOC        upstream     ( )   ( )

Note: Chr. = chromosome; MAF = minor allele frequency; NA = no gene annotated.
*Odds ratio (OR) and 95% confidence interval (CI) were calculated by logistic regression and adjusted for age and sex.
The second OR and 95% CI were calculated by logistic regression and adjusted for sex, age, and the first three principal components of population stratification.

Supplementary Figure 3. Linkage disequilibrium (LD) structures of the SNPs on chromosomes 21q21.3 and 5p13.1 significantly associated with pancreatic cancer risk in Chinese populations. LD maps are shown by the r² parameter for 981 pancreatic cancer cases and 1,991 controls. (a) LD block consisting of the SNPs identified by GWA analysis: rs , rs and rs , rs , rs372883, rs117214, rs , and rs (r² = ). (b) LD block consisting of the SNPs identified by GWA analysis: rs , rs , rs , and rs (r² = ).
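The r² parameter in the legend measures how strongly two SNPs are correlated. The source does not say how r² was estimated (LD software typically works from inferred haplotypes), so the following is only a rough sketch under a simpler assumption: r² is approximated as the squared Pearson correlation between per-individual allele dosages coded 0, 1 or 2. The function name `ld_r2` and the toy genotypes are hypothetical.

```python
def ld_r2(dosages_a, dosages_b):
    """Approximate LD r^2 as the squared Pearson correlation of allele dosages."""
    n = len(dosages_a)
    if n == 0 or n != len(dosages_b):
        raise ValueError("both SNPs need the same, non-zero number of genotypes")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Toy genotypes (hypothetical): the first pair is perfectly correlated,
# the second pair is uncorrelated.
print(ld_r2([0, 1, 2, 1], [2, 1, 0, 1]))
print(ld_r2([0, 0, 2, 2], [0, 2, 0, 2]))
```

Values near 1 indicate that the SNPs tag the same association signal, which is how the blocks in panels (a) and (b) should be read.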

Supplementary Table 2. Associations of 25 selected SNPs with pancreatic cancer risk in both discovery and replication phases

SNP   Risk allele   Chr.   Position   Gene   SNP location   Study   MAF (Cases)   MAF (Controls)   P   OR (95% CI)

rs    T    BACH1    3'-UTR
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    MIPEPP2    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    T    UQCRFS1    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    C    LOC    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    T    DAB2    intron
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    C    LOC    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    ETAA1    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    A    BAI3    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    A    SLC1A1    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    T    PRLHR    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    A    TFF1    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    ADAMTS1    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    C    ZNF678    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    LAMA3    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    C    ARID1B    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    C    NA    NA
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    RBMX    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    NA    NA
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    A    HNRNPAI    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    A    TFRC    downstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    FAM19A5    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    G    DGKH    intron
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    C    LOC    upstream
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    C    NR5A2    intron
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

rs    T    CLPTM1L    intron
    GWA study (additive model)        ( )
    Replication (additive model)      ( )
    Pooled sample (additive model)    ( )
    Pooled sample (dominant model)    ( )
    Pooled sample (recessive model)   ( )
    Heterogeneity test P=             ( )*

Note: Chr. = chromosome; MAF = minor allele frequency; NA = no gene annotated.
Odds ratios (ORs) and their 95% confidence intervals (CIs) were calculated by logistic regression and adjusted for age and sex.
The test for heterogeneity compares the ORs from the GWA study (additive model) and the replication (additive model).
*OR from the meta-analysis of the GWA study (additive model) and the replication (additive model).
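The footnotes mention a heterogeneity test and a meta-analysis of the two additive-model ORs but do not name the methods. As a hedged sketch, assuming an inverse-variance fixed-effect meta-analysis and Cochran's Q test (common choices, not confirmed by the source), the combination of two studies could look like this. The function names and the input ORs are hypothetical; standard errors are recovered from the 95% CI limits.

```python
import math

Z_975 = 1.959964  # standard normal quantile used for a 95% CI


def log_or_and_se(or_value, ci_low, ci_high):
    """Convert an OR and its 95% CI into a log OR and its standard error."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return math.log(or_value), se


def fixed_effect_meta(gwa, replication):
    """Pool two (OR, CI low, CI high) tuples; return pooled OR, Q and its P value."""
    estimates = [log_or_and_se(*gwa), log_or_and_se(*replication)]
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    # Cochran's Q; with two studies it has 1 degree of freedom.
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    # Survival function of the 1-df chi-square distribution.
    p_het = math.erfc(math.sqrt(q / 2.0))
    return math.exp(pooled), q, p_het


# Hypothetical inputs, not values from Supplementary Table 2.
pooled_or, q_stat, p_het = fixed_effect_meta((1.30, 1.10, 1.54), (1.22, 1.08, 1.38))
print(f"pooled OR = {pooled_or:.2f}, Q = {q_stat:.3f}, P(het) = {p_het:.3f}")
```

A large heterogeneity P value means the discovery and replication ORs are compatible with a single underlying effect, which is the condition under which a pooled OR is straightforward to interpret.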

Supplementary Table 3. Cumulative association of five significant SNPs with pancreatic cancer risk in the combined sample

Number of hazard alleles   Cases (n = 3,514) No. (%)   Controls (n = 4,732) No. (%)   OR* (95% CI)       P
0                          89 (2.5)                    286 (6.0)                      1.00 (Reference)
                           (15.4)                      1,064 (22.5)                   1.64 ( )
                           (29.8)                      1,609 (34.0)                   2.11 ( )
                           (30.7)                      1,175 (24.8)                   2.97 ( )
                           (16.8)                      505 (10.7)                     3.77 ( )
                           (4.8)                       93 (2.0)                       6.27 ( )
P for trend test

Note: due to genotyping failure of some DNA samples, the numbers of cases and controls in the analysis are not equal to their total numbers.
*Odds ratio (OR) and 95% confidence interval (CI) were calculated by logistic regression and adjusted for age and sex.

Supplementary Table 4. Comparison of allele frequencies of SNPs associated with pancreatic cancer identified in our study among populations of different ethnicity

SNP (allele)   HapMap Phase II datasets: CEU, CHB, JPT, YRI   GWA study: Chinese, European ancestry
rs (T/C)   0.43/ / / / /
rs (G/T)   0.00/ / / / /
rs (T/A)   0.49/ / / / /
rs (A/G)   0.66/ / / / /
rs (C/T)   0.32/ / / / / /0.62
rs (C/G)   0.29/ / / / / /0.67
rs (A/G)   0.58/ / / / /
rs (C/T)   0.71/ / / / / /0.30
rs (C/T)   0.57/ / / / / /
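The ORs in Supplementary Table 3 come from logistic regression adjusted for age and sex, which cannot be redone from the table alone. As a rough cross-check only, the sketch below computes unadjusted (crude) ORs against the zero-allele reference row directly from the published column percentages; because each column's total cancels, percentages can stand in for counts. The function name `crude_odds_ratio` is illustrative, and rounding in the percentages limits the precision.

```python
def crude_odds_ratio(case_pct, control_pct, ref_case_pct, ref_control_pct):
    """Unadjusted OR of a category versus the reference, from column percentages."""
    return (case_pct / control_pct) / (ref_case_pct / ref_control_pct)


# Reference row of Supplementary Table 3: 2.5% of cases, 6.0% of controls.
REF = (2.5, 6.0)
# (cases %, controls %) for the remaining rows, in table order.
ROWS = [(15.4, 22.5), (29.8, 34.0), (30.7, 24.8), (16.8, 10.7), (4.8, 2.0)]

crude = [round(crude_odds_ratio(c, k, *REF), 2) for c, k in ROWS]
print(crude)
```

The crude values track the adjusted ORs in the table closely for the first four rows; the last row differs more, which is expected given the small percentages involved and the age and sex adjustment in the published estimates.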

Supplementary Table 5. Geographic distribution of pancreatic ductal adenocarcinoma patients and controls

Regions (provinces or cities)   No. of cases recruited   No. of controls recruited
Beijing
Liaoning
Shanghai
Zhejiang
Fujian
Hunan
Jiangsu
Hubei
Sichuan & Chongqing
Guangxi
Shandong
Hebei                           88                       0
Inner Mongolia
Hong Kong                       57                       0
Shan xi                         19                       0
Yunnan                          16                       0
Total                           3,615                    4,941

Supplementary Table 6. Selected characteristics of cases and controls in this study

                   Discovery phase                          Replication phase
                   Cases (n=1,012)   Controls (n=2,064)     Cases* (n=2,603)   Controls (n=2,877)
Age, mean (SD)     60.3 (11.6)       61.3 (8.6)             58.3 (11.0)        59.9 (8.7)
Sex
  Male             806 (79.6)        1,701 (82.4)           1,709 (65.7)       1,930 (67.1)
  Female           206 (20.4)        363 (17.6)             894 (34.3)         947 (32.9)

*DNA samples of 814 cases (511 males and 303 females) were isolated from surgically removed, paraffin-embedded normal pancreatic tissues adjacent to tumors.

Supplementary Table 7. Comparison of minor allele frequencies for SNPs genotyped using blood DNA or adjacent normal tissue DNA in the replication phase and their association with pancreatic cancer risk

SNP   Source of DNA   MAF   OR* (95% CI)   P
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )
rs    Blood ( )    Tissue ( )

*Odds ratios (ORs) and their 95% confidence intervals (CIs) were calculated by logistic regression and adjusted for age and sex.