Multiple Sclerosis: Recent Insights from Genomics
Bruce Cree, MD, PhD, MAS
University of California San Francisco

Disclosure
Bruce Cree has received personal compensation for consulting from AbbVie, Biogen Idec, EMD Serono, MedImmune, Novartis, Genzyme/Sanofi-Aventis, and Teva, and has received contracted research support (including clinical trials) from Acorda, Biogen Idec, EMD Serono, Hoffmann-La Roche, MedImmune, Novartis, and Teva.

Multiple Sclerosis Genetics: Take-Home Points
The Major Histocompatibility Complex (MHC) is the primary MS susceptibility locus.
HLA-DRB1*15:01 is the major MS susceptibility allele at the MHC.
The MHC also contains a major protective allele: HLA-A*02:01.
Alleles of the IL-2 receptor α (IL2RA) and IL-7 receptor α (IL7R) genes were the first confirmed non-HLA MS risk genes identified in a genome-wide association screen.
There are over 15 other common alleles of genes that each contribute a small fraction of MS susceptibility.
The majority of these genes have immunological functions.
Together these alleles and SNP associations explain nearly 5% of MS heritability.
Rare alleles may account for the remaining, as yet unidentified, MS heritability.

Why Study MS Genetics?
Understanding the genetic variations that contribute to MS susceptibility teaches us about the biology of MS.
Identify targets for treatment.
Predict who is at risk.
Predict who may become disabled.
Optimize individual treatment.

Multiple Sclerosis as a Genetic Disease
MS clusters in families.
Shared environment has no detectable effect on MS susceptibility in spouses or adoptees.
Identical twins have a 30% risk of MS, compared with 3-5% for fraternal twins and non-twin siblings.
[Figure: single-gene disorders (mutation, dominant inheritance pattern) contrasted with complex disorders (polymorphisms in several genes, complex inheritance pattern, with contributions from environment and post-genomic modifications). Panels compare the impact of mutations and of polymorphisms on disease, and the genetic risk in different families (Family 1, Family 2, Family 3). Modified from Peltonen & McKusick. Science 291:1224, 2001.]

Multiple Sclerosis as a Genetic Disease: Study Designs
Linkage determines whether the inheritance of specific chromosome segments predisposes to disease in a family.
It is an efficient method for determining chromosomal regions of interest using families.
It requires only a few thousand DNA markers.
The number of families needed depends on the odds ratio of the genetic risk factor.
For genetic risk factors with odds ratios < 2, thousands of families are needed.
[Figure: Multiple Sclerosis genomic regions of interest, 2005 genome-wide linkage screen; chromosome map with HLA-DRB1 labeled. Sawcer S et al. Am J Hum Genet. 2005.]

Whole Genome Association Screen
Case-control design: compare single nucleotide polymorphisms (SNPs) in two populations, affected individuals (cases) and non-affected individuals (controls).

Whole Genome Screen
A powerful technique to identify common alleles that contribute modest or weak effects to the disease state.
Requires very large numbers of polymorphisms across the genome.
Requires large samples of cases and controls.
Susceptible to population stratification bias.
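
The case-control comparison described above can be sketched for a single SNP. This is a minimal illustration, not the consortium's pipeline: it assumes a plain 2x2 table of allele counts and an uncorrected Pearson chi-square test, and the counts and the function name `allelic_association` are hypothetical.

```python
import math

def allelic_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """Return (odds_ratio, chi_square, p_value) for a 2x2 table of allele counts."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    # Odds of carrying the minor allele in cases divided by the odds in controls
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table (1 degree of freedom, no continuity correction)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical counts: 1,000 cases and 1,000 controls, two alleles per person
or_, chi2, p = allelic_association(460, 1540, 400, 1600)
print(f"OR = {or_:.2f}, chi-square = {chi2:.2f}, p = {p:.2g}")
```

A genome-wide screen repeats a test of this kind at every SNP, which is why the significance threshold has to be far stricter than for a single test.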

First IMSGC Whole Genome Association Screen
The first iteration of the consortium used samples from San Francisco, Boston, and Cambridge.
Affymetrix 500K chip, 931 trios.
334,923 SNPs × 931 trios = 1 billion genotypes.
OR = 2.0: 99% power to find P < 10^-3
OR = 1.5: 8% power to find P < 10^-3
OR = 1.2: 3% power to find P < 10^-3
[Figure: Multiple Sclerosis genomic regions of interest, 2007 genome-wide association screen; chromosome map with IL7R, HLA-DRB1, and IL2R labeled. IMSGC et al. NEJM. 2007.]
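
The "1 billion genotypes" figure can be checked with a short calculation. The sketch assumes that each trio contributes three genotyped individuals (an affected child and both parents); that reading is an assumption, since the slide gives only the SNP and trio counts.

```python
snps = 334_923
trios = 931
individuals_per_trio = 3  # assumption: affected child plus two parents

snp_trio_pairs = snps * trios
genotypes = snp_trio_pairs * individuals_per_trio

print(f"SNP x trio pairs: {snp_trio_pairs:,}")
print(f"Individual genotypes: {genotypes:,} (about {genotypes / 1e9:.2f} billion)")
```

Under that assumption the product rounds to roughly one billion, matching the slide.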

IMSGC GWAS
This large screen revealed that common alleles, meaning SNPs whose minor allele was present in >2% of the population, were associated with only small increases in MS risk.
Odds ratios ranged from 1.1 to 1.6.
This explained why prior linkage studies failed.
It became clear that the initial IMSGC dataset was substantially underpowered to detect alleles with odds ratios in this range.
What was needed was a 10X increase in power to detect these modest MS alleles.
But where was one to find ~10,000 MS patients and 10,000 controls?

IMSGC and the WTCCC2
23 research groups from 15 countries.
19,614 MS cases with European ancestry.
9772 cases and 17376 controls passed stringent DNA QC.
441,547 autosomal SNPs.
A novel variance component method was used to control for differences in population structure between cases and controls.
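
Why odds ratios of 1.1 to 1.6 demand far larger samples can be illustrated with a rough power calculation. This is a sketch under stated assumptions, not the power analysis used by the consortium: it uses a normal approximation to a two-sided test of allele frequencies in unrelated cases and controls, ignores the negligible opposite tail, and the allele frequency, sample sizes, and function names are hypothetical.

```python
from statistics import NormalDist

def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by a control frequency and an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)

def approx_power(n_cases, n_controls, control_freq, odds_ratio, alpha):
    """Approximate power of a two-sided allelic z-test (normal approximation)."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each person carries two alleles
    case_alleles, control_alleles = 2 * n_cases, 2 * n_controls
    se = (p1 * (1 - p1) / case_alleles + p0 * (1 - p0) / control_alleles) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p1 - p0) / se - z_crit)

# Hypothetical scenario: control allele frequency 0.2, OR = 1.2, P < 10^-3
for n in (1_000, 10_000):
    power = approx_power(n, n, 0.2, 1.2, 1e-3)
    print(f"{n:>6} cases and controls: power = {power:.2f}")
```

In this toy setting a tenfold larger sample moves a weak effect from mostly undetectable to almost always detected, which is the motivation for pooling cohorts across many groups.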

IMSGC and WTCCC2
[Figure. IMSGC 2011, Nature 2011;476:214-9]
Outside the MHC, 102 SNPs identified 93 distinct regions associated with MS susceptibility with p < 1 × 10^-4.
Replication analysis with these 102 SNPs was performed using 4218 cases and 7296 controls.
The same allele was over-represented in the replication dataset for 98 of the 102 SNPs.
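
The strength of the replication result can be gauged with a simple sign test. The sketch assumes the garbled SNP count on this slide reads 102 (so that 98 of 102 SNPs replicated in the same direction) and that, if the discovery signals were noise, each SNP would match direction with probability 0.5. Both are assumptions, and `sign_test_p` is a hypothetical helper name.

```python
from math import comb

def sign_test_p(successes, trials):
    """One-sided binomial probability of at least `successes` matches when p = 0.5."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials

# Assumed reading of the slide: 98 of 102 SNPs with the same allele over-represented
p = sign_test_p(98, 102)
print(f"P(at least 98 of 102 by chance) = {p:.1e}")
```

A probability this small indicates that the discovery associations were overwhelmingly real in aggregate, even though individual SNPs may still fail formal replication.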

MS Genes Mapped by Chromosome
non-MHC loci validated, 22 new loci discovered
[Figure: plot of -log10 p value by chromosome (1-22), with annotation rows for each locus: Novel, Tags GWAS SNP, Tags Functional SNP, Immune System, Nearest Gene, No. Regional Genes. IMSGC 2011, Nature 2011;476:214-9]

Study design: 81,162 subjects
Discovery study: GWAS meta-analysis v3; 14,802 MS cases; 26,703 control subjects; >8.6 million SNPs; 15 strata of data.
Carried forward to replication: any SNP with p<.5, and identification of secondary and tertiary signals in each locus.
Replication study: custom MS Chip; 20,512 MS cases; 19,145 control subjects; ~9, SNPs; 8 strata of data.
Joint analysis: 35,314 MS cases; 45,848 control subjects.
Initial result: 159 genetic variants reach genome-wide significance.
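
The sample sizes in the study design can be cross-checked against the stated totals. The extraction dropped digits from some counts, so this sketch assumes the discovery set is 14,802 cases and 26,703 controls and the replication set is 20,512 cases and 19,145 controls; those readings are reconstructions chosen because they reproduce the joint totals on the slide.

```python
# Assumed readings of the partially garbled counts (see lead-in)
discovery = {"cases": 14_802, "controls": 26_703}
replication = {"cases": 20_512, "controls": 19_145}

joint_cases = discovery["cases"] + replication["cases"]
joint_controls = discovery["controls"] + replication["controls"]
total_subjects = joint_cases + joint_controls

print(f"Joint analysis: {joint_cases:,} cases, {joint_controls:,} controls")
print(f"Total subjects: {total_subjects:,}")
```

With these readings the sums match the joint-analysis figures (35,314 cases, 45,848 controls) and the headline total of 81,162 subjects.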

159 associated variants
[Figure: MS Genetic Map 2014; SNPs and genes plotted by chromosome.]

Multiplicity of associations in each susceptibility locus
Example: CXCR5 locus, 3 independent associations.
First effect: p = 3.1 × 10^-10 (rs12365699).
Second effect: p = 2.2 × 10^-9 (rs658976).
Third effect: p = 3.1 × 10^-5 (chr11:118783424).
[Figure: three regional association plots of -log10(p value) and recombination rate (cM/Mb) against position on chromosome 11 (118.4 to 119 Mb), with SNPs colored by r^2. Genes in the region: MLL, PHLDB1, DDX6, CXCR5, UPK2, SLC37A4, C2CD2L, PDZD3, TTC36, TREH, BCL9L, FOXR1, HYOU1, HINFP, CBL, TMEM25, MIR4492, CCDC84, HMBS, ABCG4. Other diseases annotated in the region: systemic lupus erythematosus, primary biliary cirrhosis, glioma.]
20% of susceptibility loci have more than one association (pie chart: 1 effect, 80%; 2 effects, 16%; 3 effects, 4%).
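
The three CXCR5 effects can be placed on the -log10 scale used by the regional plots. The sketch reads the flattened p-values as 3.1 × 10^-10, 2.2 × 10^-9, and 3.1 × 10^-5, which is an assumption about the garbled exponents, and compares them with the conventional genome-wide threshold of 5 × 10^-8, which is not stated on the slide.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional threshold; an assumption, not from the slide

def neg_log10(p):
    """Convert a p-value to the -log10 scale used on association plots."""
    return -math.log10(p)

# Assumed readings of the three effects at the CXCR5 locus
effects = {"first": 3.1e-10, "second": 2.2e-9, "third": 3.1e-5}

for name, p in effects.items():
    status = "meets" if p < GENOME_WIDE_ALPHA else "is below"
    print(f"{name} effect: -log10(p) = {neg_log10(p):.2f}, "
          f"{status} the genome-wide threshold")
```

Secondary and tertiary effects are typically found by conditioning on the stronger signals within a locus that is already established, so they are often evaluated against a less stringent threshold than a new locus would be.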

How much of MS heritability is explained?
42-45% of MS heritability is explained with the current set of variants.
[Figure: heritability explained in the Sweden, US, and UK datasets, split into MHC variants, replicated variants, and non-replicated variants. Values as extracted: 7.17%, 6.18%, 9.4%, 11.78%, 26.57%, 24.56%, 1.18%, 7.14%, 27.83%.]

Pathways: old vs. new MS map
Cytokine-cytokine receptor interaction
T cell receptor signaling pathway
JAK-STAT signaling pathway
Intestinal immune network for IgA production
Natural killer cell mediated cytotoxicity
Legend: new list of genes (2014); previously published gene list (2013).

MS Genetics: From SNP Identification to Function
Each entry lists gene; variant; effect and location; odds ratio; putative mechanism; reference.
MHC; HLA-DRB1*15:01; OR = 3.0 to 7.0; encephalitogenic immune response, location of lesion; many references.
IL7R; rs6897932 C; susceptibility (exon 6); OR = 1.2; low-level skipping of exon 6, changes in the soluble/membrane-bound ratio with higher sIL7R; Gregory et al. Nat Genet, 2007.
IL2RA; rs214286 T; susceptibility (intronic); OR = 1.15; changes in the soluble/membrane-bound ratio with higher sIL2RA; Maier et al. PLoS Genet, 2009.
TNFRSF1A; rs18693 G; susceptibility (intronic); OR = 1.6; skipping of exon 6, changes in the soluble/membrane-bound ratio with higher sTNFR1; Gregory et al. Nature, 2012.
CD58; rs23747 G; protective (intronic); OR = 0.8; higher membrane expression of CD58 and correction of CD4+ regulatory cell function; De Jager et al. PNAS, 2009.
IRF8; rs17445836 A; protective (intronic); OR = 0.8; widespread effect on type I interferon transcriptional responses; De Jager et al. PNAS, 2009.
TYK2; rs34536443 C; protective (exon 21); OR = 0.63; decreased TYK2 kinase activity and cytokine shift towards Th2; Couturier et al. Brain, 2011.
CD6; rs1782933 A; susceptibility (intragenic); OR = 1.12; decreased expression of full-length CD6 in CD4+ cells, leading to altered proliferation; Kofler et al. J Immunol, 2011.
EVI5; rs1181217 A; susceptibility (intragenic); OR = 1.15; regulation of the adjacent gene GFI1; Martin et al. Nature Struct Molec Biol, 2011.
CYP27B1; rs12368653 A; susceptibility (intergenic); OR = 1.1; under-expression in tolerogenic dendritic cells (DC2); Shahijanian et al. Hum Molec Genet, 2013.

MS variants affect gene expression
MS susceptibility: p = 1.6 × 10^-8 (rs35218683).
IFITM3 eQTL in T cells: p = 1.7 × 10^-31 (rs35218683).
IFITM3 eQTL in monocytes: p = 2.2 × 10^-28 (rs35218683).
Testable hypothesis of proximal functional consequence: a cis-eQTL (schematic: Gene A, Gene B, SNP, Gene C).
[Figure: regional plots of -log10(p value) and recombination rate (cM/Mb) against position on chr11 (Mb), with SNPs colored by r^2. Genes in the region: NLRP6, ATHL1, IFITM2, IFITM3, IFITM5, IFITM1.]

Integrated Map: B cell Network
[Figure: integrated map with a general histone marks zone, an eQTL-rich zone, and a specific protein binding zone. Courtesy S. Baranzini]

MS Genomic Map
>159 MS susceptibility variants.
Multiple variants in a given locus.
Genetic risk is distributed across the immune system and brain.
[Figure: tissue panels: brain, MS PBMC, CD4 T cell, monocyte.]

Conclusions
A new resource for the MS community: the MS Genomic Map.
Final version in 2015.
Next steps:
Translate results into biological insights and drug discovery.
Support the development of primary prevention strategies.