Complete Sample to Analysis Solutions for DNA Methylation Discovery using Next Generation Sequencing
SureSelect Human/Mouse Methyl-Seq
Kyeong Jeong, PhD
February 5, 2013
CAG EMEAI DGG/GSD/GFO Agilent Restricted

DNA Methylation: Background
DNA methylation:
o Enzymatic modification of cytosine in CG dinucleotides
o Maintained through cell division
o Both DNA strands are methylated
o Platform for methyl-binding proteins
  - Protein recruitment leads to compact, silent chromatin that is unavailable for transcription initiation
o Gene silencing, imprinting, X-inactivation, tissue-specific repression
o Genome stability

Common targets for DNA methylation
o Differentially methylated regions (DMRs)
o CpG islands (e.g. 4~8% of tissue-specific differentially methylated regions, or T-DMRs)
o CpG island shores (~2 kb away from islands; e.g. 76% of T-DMRs are in shores)
[Figure: HS3ST4 (heparan sulfate D-glucosaminyl 3-O-sulfotransferase 4)]
Irizarry RA et al. Nature Genetics 2009

DNA methylation: Significance
Role in cancer
o Hypomethylation in heterochromatin: genomic instability
o Hypermethylation in tumor suppressor genes (TSGs): transcriptional repression of the TSG
Robertson K. Nature Reviews Genetics, 2005

DNA methylation: Significance
Role in other diseases
o Neurodevelopmental disorders
  - X-linked α-thalassemia and mental retardation (ATRX syndrome)
  - Fragile X syndrome
  - ICF (Immune deficiency, Centromeric instability, and Facial abnormalities)
o Imprinting disorders
  - Prader-Willi syndrome
  - Angelman syndrome
  - Beckwith-Wiedemann syndrome

Current methods
o NGS-based assay
  - MethylC-Seq: Whole genome shotgun sequencing with bisulfite-treated DNA (1 bp resolution)
  - RRBS (Reduced Representation Bisulfite Sequencing): Methylation-insensitive restriction enzyme & bisulfite treatment (1 bp resolution)
  - MeDIP-Seq: Antibody for methylated DNA (150 bp resolution)
  - MBD-Seq: Methyl-binding domain protein (150 bp resolution)
  - MRE-seq: Restriction enzyme to detect unmethylated DNA (1 bp resolution)
o Microarray assay
  - Single base methylation assay (450k or 27k)
  - CHARM (Comprehensive High-throughput Arrays of Relative Methylation)
  - Antibody-based assay
LIMITATIONS
o Cost of WGS
o Bias from enzyme / antibody
o Lack of single base pair resolution
o Content limitations

SureSelect XT Human Methyl-Seq
o Reduced bias - Other methods use enzymes or antibodies that can bias towards specific sequences or methylation states.
o Discovery tool - Probes are not methylation-state dependent, so no prior knowledge of the methylation state of the targeted regions is needed.
o Comprehensive design - Not limited to CpG islands. Targets key methylation regions: CpG islands, promoters and DMRs.
o High sensitivity - Ability to distinguish individual CpG sites.
CONTENT - 84 Mb design, 3.7M CpGs
o CpG islands
o Cancer, tissue-specific DMRs
o GENCODE promoters
o Known DMRs or regulatory features in:
  - Shores and shelves (±4 kb)
  - DNase I hypersensitive sites
  - Under-methylated regions
  - RefSeq genes
  - Ensembl regulatory features

SureSelect XT Human Methyl-Seq (84 Mb Design, 3.7M CpGs)

Site classification              Covered regions (bp)    CpGs covered by baits
CpG islands                      19,605,556              1,679,870
Cancer-, tissue-specific DMRs    9,773,047               293,619
GENCODE promoters                36,974,007              1,272,026
DMRs or regulatory features      48,021,626              2,057,280

Notes on site classifications:
o CpG islands: ~91% of UCSC-annotated CpG islands
o Cancer-, tissue-specific DMRs: ~23,000 DMRs; most are from Irizarry RA et al. Nat. Genet. 2009 Feb;41(2):178-86
o GENCODE promoters: ~141,000 promoters; 1 kb upstream from TSS; all genes in GENCODE v7 are included except repeat regions
o DMRs or regulatory features: ~482,000 features in
  - CpG island shores/shelves (±4 kb)
  - Enhancers
  - Ensembl regulatory regions
  - DNase I hypersensitive sites

SureSelect XT Mouse Methyl-Seq

Site classification                           Number of targets    Total bases covered
CpG islands                                   16,027               10,512,276 bp
Tissue-specific DMRs                          33,456               10,452,692 bp
Ensembl regulatory features                   171,796              91,799,015 bp
Open Regulatory Annotation (ORegAnno)         14,951               9,983,957 bp

Notes on site classifications:
o Ensembl regulatory features:
  - CpG shores and shelves (±4 kb)
  - DNase I hypersensitive sites
  - Histone modifications
  - TFBS
  - Polymerase
o Open Regulatory Annotation (ORegAnno):
  - Promoters
  - Enhancers
  - TFBS
  - Regulatory polymorphisms
Provided by Dr. Druley (Washington Univ.) / 109 Mb / Early Access

SureSelect Target Enrichment
Focus on regions of methylation significance
Whole genome vs. SureSelect: 3,200,000,000 bp / 84,000,000 bp = 38x efficiency

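The 38x figure is simply the ratio of the genome size to the design size. A minimal sketch of that arithmetic follows; the two sizes come from the slide, while the function and variable names are illustrative only. It treats the ratio as an upper bound, since it assumes every read lands on target.

```python
# Sizes quoted on the slide; names below are illustrative, not from any Agilent tool.
GENOME_BP = 3_200_000_000  # approximate human genome size used on the slide
TARGET_BP = 84_000_000     # SureSelect Human Methyl-Seq design size


def fold_reduction(genome_bp: int, target_bp: int) -> float:
    """Return how many times smaller the target is than the whole genome."""
    if target_bp <= 0:
        raise ValueError("target_bp must be positive")
    return genome_bp / target_bp


fold = fold_reduction(GENOME_BP, TARGET_BP)
target_fraction = TARGET_BP / GENOME_BP

print(f"Fold reduction in sequence space: {fold:.1f}x")
print(f"Fraction of genome targeted: {target_fraction:.4f}")
```

In practice the realized saving is somewhat lower than this ratio, because some reads fall outside the targeted regions.
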
SureSelect Target Enrichment
Workflow (in order):
o DNA shearing
o End repair
o A-tailing
o Methylated adapter ligation
o Hybridization (24 hr)
o Capture / wash
o Bisulfite treatment (sodium bisulfite)
o PCR
o Index PCR
o Sequencing
o Bismark alignment

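The bisulfite step is what makes methylation readable by sequencing: unmethylated cytosines are converted to uracil and read as T after PCR, while methylated cytosines are protected and still read as C. The following is a minimal in-silico sketch of that chemistry on a single strand; the function name and the example sequence are hypothetical and are not part of the SureSelect or Bismark software.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate bisulfite treatment plus PCR on one DNA strand.

    Unmethylated C is read as T; methylated C (0-based positions given in
    methylated_positions) stays C. All other bases are unchanged.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical 8-base fragment; only the C at position 1 is methylated.
original = "ACGTCCGA"
converted = bisulfite_convert(original, methylated_positions={1})
print(original)
print(converted)
```

Because capture happens before bisulfite treatment in the workflow above, the baits hybridize to unconverted DNA, which is consistent with the probes being independent of methylation state.
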
Methyl-Seq analysis workflow
o Bisulfite treatment
o Sequencing
o Demultiplexing
o Preprocessing
o Alignment
o % Methylation computation
o Capture performance
o Summary QC

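The "% Methylation computation" step can be sketched as follows, assuming the common definition for bisulfite data: at each cytosine, the number of reads showing C (methylated) divided by the number of reads showing C or T. The site coordinates and counts are made up for illustration, and the function name is hypothetical.

```python
def percent_methylation(methylated_reads: int, unmethylated_reads: int):
    """Percent methylation at one cytosine: 100 * C reads / (C + T reads).

    Returns None when the site has no coverage, so that uncovered sites
    are not confused with 0% methylation.
    """
    total = methylated_reads + unmethylated_reads
    if total == 0:
        return None
    return 100.0 * methylated_reads / total


# Hypothetical per-site counts: (chromosome, position) -> (C reads, T reads)
site_counts = {
    ("chr1", 10468): (18, 2),
    ("chr1", 10470): (5, 15),
    ("chr1", 10483): (0, 0),
}

levels = {
    site: percent_methylation(c_reads, t_reads)
    for site, (c_reads, t_reads) in site_counts.items()
}

for site, level in levels.items():
    print(site, level)
```

A minimum read depth per site is usually applied before interpreting these values, since a percentage computed from very few reads is unreliable.
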
Capture performance
8 Gb of sequencing
o Percentage of reads in targeted regions: 86.0%
o Percentage of reads in targeted regions +/- 100 bp: 95.3%
o Percentage of genome targeted: 2.7%
o Percentage of targeted bases covered by at least 1 read: 98.2%
o Percentage of targeted bases covered by at least 5 reads: 94.8%
o Percentage of targeted bases covered by at least 10 reads: 90.0%

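The "covered by at least N reads" metrics are breadth-of-coverage figures. A minimal sketch of how such a figure could be derived from per-base read depths is shown here; the depth values are invented for illustration and the function name is hypothetical, not taken from any QC tool.

```python
def breadth_of_coverage(depths, min_depth: int) -> float:
    """Percentage of targeted bases with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Hypothetical per-base depths over a tiny 10-base target.
depths = [0, 3, 7, 12, 25, 40, 9, 5, 1, 15]

at_least_1 = breadth_of_coverage(depths, 1)
at_least_5 = breadth_of_coverage(depths, 5)
at_least_10 = breadth_of_coverage(depths, 10)

print(at_least_1, at_least_5, at_least_10)
```
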
Whole genome data vs. SureSelect Methyl-Seq
[Figure: whole genome bisulfite sequencing vs. SureSelect Methyl-Seq; R = 0.93 and R = 0.99; panel label: Tissue-specific DMRs]
Whole genome bisulfite sequencing data: Lister R. et al. 2009 (IMR90: fetal lung fibroblasts)
http://www.chem.agilent.com/library/posters/public/agbt_methylseq_poster_feb2012.pdf

SureSelect Methyl-Seq vs. Illumina 450K array
[Figure: R = 0.96]

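The R values on the comparison slides are read here as Pearson correlation coefficients between per-site methylation levels from two platforms; the slides do not state the exact statistic, so that is an assumption. A minimal sketch of the calculation, using invented methylation percentages for five shared CpG sites:

```python
import math


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical % methylation at the same five CpG sites on two platforms.
capture_levels = [5.0, 20.0, 55.0, 80.0, 95.0]
array_levels = [8.0, 18.0, 60.0, 75.0, 97.0]

r = pearson_r(capture_levels, array_levels)
print(f"R = {r:.2f}")
```

Only sites measured on both platforms can enter such a comparison, so the reported R describes the shared subset of CpGs rather than the full design.
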
Colon cancer cells (HCT116 vs. methyltransferase DKO)
o HCT116: colon cancer cell line
o HCT116 DKO: methyltransferase double knockout (DNMT1 -/- & DNMT3b -/-)
[Figure: HCT116 and HCT116 DKO, Methyl-Seq vs. Illumina 450K]
Highly sensitive and accurate detection of DNA methylation changes

Methylation at single base-pair resolution (GNAS: G-protein alpha subunit)
[Figure tracks: HCT116 SureSelect, HCT116 DKO SureSelect, HCT116 450K, HCT116 DKO 450K; annotation: "DMR???"]
o Minimize missing information
o Detect gradual changes
o More confidence in subtle changes

Applications for non-CpG methylation
o Stem cells
  - Lister R. et al. 2009. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature
o Mouse genome
  - Xie W. et al. 2012. Base-Resolution Analyses of Sequence and Parent-of-Origin Dependent DNA Methylation in the Mouse Genome. Cell
o Nucleosome positioning / chromatin accessibility
  - Kelly T. K. et al. 2012. Genome-wide mapping of nucleosome positioning and DNA methylation within individual DNA molecules. Genome Research

Conclusions
o SureSelect Methyl-Seq Target Enrichment Platform:
  - Comprehensive
  - Robust
  - Cost-effective
o SureSelect Methyl-Seq allows for single base-pair resolution.
o Excellent concordance with published whole genome data.
o Content is focused on the most important regions of the human methylome.

SureSelect: Complete Omics Solution
o DNA: Genetic variation
o RNA: Gene expression
o Methyl: Effects on gene expression

Acknowledgements
University of Washington:
o John Stamatoyannopoulos
  - Tony Shafer
  - Eric Haugen
Washington University:
o Todd Druley
University of California San Diego:
o Kun Zhang
Johns Hopkins University:
o Andy Feinberg
o Sarven Sabunciyan

Thank You!