Impact of gdna Integrity on the Outcome of DNA Methylation Studies

Similar documents
Implementation of Automated Sample Quality Control in Whole Exome Sequencing

Evaluating the Agilent 4200 TapeStation System for High Throughput Sequencing Quality Control

Evaluation with RNA ScreenTape Assays

DNA. bioinformatics. epigenetics methylation structural variation. custom. assembly. gene. tumor-normal. mendelian. BS-seq. prediction.

Method development for DNA methylation on a single cell level

Practical solutions. for DNA methylation analysis

ab Post-Bisulfite DNA Library Preparation Kit (For Illumina )

Optimizing FFPE DNA preparation for use in SureSelect XT2

Performance characteristics of the High Sensitivity DNA kit for the Agilent 2100 Bioanalyzer

EpiNext High-Sensitivity Bisulfite-Seq Kit (Illumina)

ACCEL-NGS 2S DNA LIBRARY KITS

MassARRAY. Quantitative Methylation Analysis. High Resolution Profiling. Simplified with EpiTYPER.

For Research Use Only Ver

Tech Note. Using the ncounter Analysis System with FFPE Samples for Gene Expression Analysis. ncounter Gene Expression. Molecules That Count

Single Cell Genomics

Quick-DNA. DNA from any Sample

Assay Standards Working Group Nov 2012 Assay Standards Working Group Recommendations, November 2012

10 RXN 50 RXN 500 RXN

Gene Regulation Solutions. Microarrays and Next-Generation Sequencing

FFPE DNA Extraction kit

Welcome to the NGS webinar series

Nature Methods: doi: /nmeth Supplementary Figure 1

Next Generation Sequencing. Target Enrichment

ChIP-seq/Functional Genomics/Epigenomics. CBSU/3CPG/CVG Next-Gen Sequencing Workshop. Josh Waterfall. March 31, 2010

Archer ALK, RET, ROS1 Fusion Detection v1 Illumina Platform

Introduction Bioo Scientific

SUPPLEMENTARY MATERIAL AND METHODS

scgem Workflow Experimental Design Single cell DNA methylation primer design

TECHNICAL BULLETIN. SeqPlex DNA Amplification Kit for use with high throughput sequencing technologies. Catalog Number SEQX Storage Temperature 20 C

DNA Workflow. Marine Biological Laboratory. Mark Bratz Applications Scientist, Promega Corporation. August 2016

ZR-96 DNA Sequencing Clean-up Kit Catalog Nos. D4052 & D4053

SNP GENOTYPING WITH iplex REAGENTS AND THE MASSARRAY SYSTEM

XactEdit Cas9 Nuclease with NLS User Manual

Zymoclean Gel DNA Recovery Kit Catalog Nos. D4001T, D4001, D4002, D4007 & D4008

Axygen AxyPrep Magnetic Bead Purification Kits. A Corning Brand

Complete Sample to Analysis Solutions for DNA Methylation Discovery using Next Generation Sequencing

Preparation of Reduced Representation Bisulfite Sequencing (RRBS) libraries

High-Resolution Quantitative Methylation Profiling with EpiTYPER and the MassARRAY System

ab Methylated DNA Quantification Kit (Colorimetric)

For Research Use Only Ver

Getting the Most from Your DNA Analysis from Purification to Downstream Analysis. Eric B. Vincent, Ph.D. February 2013

Eluted DNA is high quality and inhibitor-free making it ideal for PCR and genotyping.

Cancer Genetics Solutions

High-yield, Scalable Library Preparation with the NEBNext Ultra II FS DNA Library Prep Kit

Nature Genetics: doi: /ng Supplementary Figure 1. Characteristics of MHBs in the human genome.

RIPTIDE HIGH THROUGHPUT RAPID LIBRARY PREP (HT-RLP)

NEBNext. Multiplex Oligos for Illumina (Dual Index Primers Set 2)

Automated size selection of NEBNext Small RNA libraries with the Sage Pippin Prep

Procedure & Checklist - Preparing Asymmetric SMRTbell Templates

EZ-96 DNA Methylation-Gold Kit Catalog No. D5008 (Deep-Well Format)

Introductory Next Gen Workshop

Application Note. Genomics. Abstract. Authors. Kirill Gromadski Ruediger Salowsky Susanne Glueck Agilent Technologies Waldbronn, Germany

SMARTer Ultra Low RNA Kit for Illumina Sequencing Two powerful technologies combine to enable sequencing with ultra-low levels of RNA

Zymoclean Large Fragment DNA Recovery Kit Catalog Nos. D4045 & D4046

Non-Organic-Based Isolation of Mammalian microrna using Norgen s microrna Purification Kit

Gene Expression Technology

Gene Expression Profiling and Validation Using Agilent SurePrint G3 Gene Expression Arrays

Single Cell 3 Reagent Kits v2 Quick Reference Cards

NEBNext. for Ion Torrent LIBRARY PREPARATION KITS

BIOO LIFE SCIENCE PRODUCTS

QIAGEN s NGS Solutions for Biomarkers NGS & Bioinformatics team QIAGEN (Suzhou) Translational Medicine Co.,Ltd

Figure S1. Unrearranged locus. Rearranged locus. Concordant read pairs. Region1. Region2. Cluster of discordant read pairs, bundle

KAPA HiFi HotStart Uracil+ ReadyMix PCR Kit

Quantitative analysis of PCR fragments with the Agilent 2100 Bioanalyzer. Application Note. Odilo Mueller. Abstract

ZR-96 DNA Clean & Concentrator -5 Catalog Nos. D4023 & D4024

SOLiD Total RNA-Seq Kit SOLiD RNA Barcoding Kit

Single Cell 3 Reagent Kits v2 Quick Reference Cards

PAXgene Blood DNA PAXgene Blood DNA Tube (IVD) for clinical use PAXgene Blood DNA System (RUO) for research use. Explore more at

II First Strand cdna Synthesis Kit

Unlocking your FFPE archive

YeaStar Genomic DNA Kit Catalog No. D2002

MMLV Reverse Transcriptase 1st-Strand cdna Synthesis Kit

Pinpoint Slide DNA Isolation System Catalog No. D3001

NEBNext Multiplex Oligos for Illumina (Index Primers Set 2)

Axygen AxyPrep Magnetic Bead Purification Kits. A Corning Brand

Page Bioanalyzer Assays and Applications

NRAS Codon 61 Mutation Analysis Reagents

ProtoScript. First Strand cdna Synthesis Kit DNA AMPLIFICATION & PCR. Instruction Manual. NEB #E6300S/L 30/150 reactions Version 2.

Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Subtractive hybridization is a powerful technique particularly well-suited for the identification of target cdnas that correspond to rare

Genome Reagent Kits v2 Quick Reference Cards

DNA Microarray Technology

Lab methods: Exome / Genome. Ewart de Bruijn

EZ-96 DNA Methylation-Lightning Kit Catalog No. D5033 (Deep-Well Format)

with drmid Dx for Illumina NGS systems

3 Designing Primers for Site-Directed Mutagenesis

Bioinformatics Advice on Experimental Design

EXTRACTME Kits, Product Brochure

EZ-96 DNA Methylation-Direct MagPrep Catalog Nos. D5044 & D5045

Amplicon Sequencing Template Preparation

ReliaPrep FFPE gdna Miniprep System

Pyrosequencing. Alix Groom

qpcr Quantitative PCR or Real-time PCR Gives a measurement of PCR product at end of each cycle real time

High-Resolution Oligonucleotide- Based acgh Analysis of Single Cells in Under 24 Hours

EPIGENTEK. EpiQuik In Situ Histone H3-K27 Tri-Methylation Assay Kit. Base Catalog # P-3014T PLEASE READ THIS ENTIRE USER GUIDE BEFORE USE

Targeted Sequencing in the NBS Laboratory

Problem Set 8. Answer Key

Comparing the Agilent 2100 Bioanalyzer Performance to Traditional DNA Analysis Techniques

# SimpleChIP ChIP-seq Multiplex Oligos for Illumina (Dual Index Primers) 1 Kit (96 assays) Store at -20 C

Transcription:

Impact of gdna Integrity on the Outcome of DNA Methylation Studies Application Note Nucleic Acid Analysis Authors Emily Putnam, Keith Booher, and Xueguang Sun Zymo Research Corporation, Irvine, CA, USA Introduction A growing area of epigenetics research focuses on changes in gene expression that occur without altering the primary sequence, exclusively based on DNA, RNA, or protein modifications. Epigenetic modifications have been shown to change based on nutrition and environmental factors, and have been implicated in human diseases such as cancer and aging. In some cases, epigenetic changes are heritable. Several regulators involved in epigenetics are known, including: DNA methylation and hydroxymethylation Histone modification and chromatin remodeling Small and large regulatory RNAs RNA methylation DNA methylation is the most widely studied modification involved in epigenetics. In mammalian cells, DNA methylation mainly involves the transfer of a methyl group from S-adenosyl methionine to the carbon 5 position of a cytosine residue to produce 5-methylcytosine. DNA methylation is mainly implicated in the repression of transcriptional activity. DNA methylation studies are performed to discover new disease markers, drug targets, and to further explain gene expression data, for example, understanding phenotypes that classical genetic studies are unable to resolve. This Application Note focuses on 5-methylcytosine (5-mC) DNA methylation and its detection. By far, the most commonly used method for DNA methylation analysis is bisulfite sequencing due to its high resolution detection when combined with sequencing. Advances in next generation sequencing made it possible to perform bisulfite sequencing at a genome-wide scale. The DNA methylation analysis based on reduced representation bisulfite sequencing, a high-throughput technique, allows analysis on a single nucleotide level 1. This method is based on the enrichment of genome regions with high CpG content (sites within the genome where a cytosine is next to a guanine) using a combination of restriction enzymes and bisulfite sequencing, and is applicable to any species with a reference genome.

In addition to the genome-wide bisulfite sequencing, targeted studies are performed to validate genome-wide discoveries with a larger group of samples 2. The next generation targeted bisulfite sequencing approach allows high throughput, and provides single-base pair resolution. It is known that the quality of the DNA starting material is crucial for next generation sequencing 3. The experiments described in this Application Note were carried out to verify that the gdna integrity of the starting material has an impact on DNA methylation analysis based on next generation bisulfite sequencing. Therefore, the initial gdna quality, as indicated by a numerical measure of DNA integrity (DIN) 4, was determined with the Agilent 22 TapeStation system and the Agilent Genomic DNA ScreenTape assay. DIN was correlated to the next generation bisulfite sequencing results, to determine if the initial DNA integrity had an effect on the quality of the DNA methylation studies. Experimental Material Restriction enzymes, TaqαI and MspI, were purchased from New England Biolabs (Ipswich, MA, USA). FFPE DNA MiniPrep, Quick-DNA Universal, DNA Clean & Concentrator-5, ZR-96 DNA Clean & Concentrator, Zymoclean Gel DNA Recovery, EZ DNA Methylation Lightning kits, and Rosefinch software were from Zymo Research Corp. (Irvine, CA, USA). NuSieve 3:1 Agarose Lonza was obtained from (Basel, Switzerland). The HiSeq system and the MiSeq Reagent Micro Kit v2 were from Illumina, Inc. (San Diego, CA, USA). The Access Array System was from Fluidigm, Inc. (South San Fransisco, CA, USA). The Agilent 22 TapeStation system (p/n G2965AA) with Agilent TapeStation Analysis software, Agilent Genomic DNA ScreenTape (p/n 567 5365), and Agilent Genomic DNA Reagents (p/n 567-5366), D1 ScreenTape (p/n 567-5582) and D1 Reagents (p/n 567-5583) were obtained from Agilent Technologies, Inc. (Santa Clara, CA, USA). Genome-wide DNA methylation analysis Human brain tissue from the same donor was used for the genome-wide methylation analysis. Genomic DNA was extracted from fresh frozen tissue using the Quick-DNA Universal kit, and was used as a control for the experiments. Four tissue samples were formalin fixed, paraffin embedded (FFPE) using varying lengths of paraffin treatment times (3 to 12 minutes). Following FFPE treatment, gdna from the four samples was extracted using the ZR FFPE DNA MiniPrep kit. A 2 ng amount of extracted gdna was sequentially digested using 6 units of TaqαI and 3 units of MspI. Next, the DNA fragments were ligated to pre-annealed adapters containing 5 -methyl-cytosine instead of cytosine. The adaptor-ligated fragments of 15 to 25 bp and 25 to 35 bp were recovered from a 2.5 % agarose gel. The fragments were then bisulfite-treated. Preparative-scale PCR was performed, and the resulting products were purified for HiSeq sequencing. Targeted DNA methylation analysis Genomic DNA was extracted from 282 mammalian samples, and bisulfite converted. Then, targeted amplification was performed using unbiased, bisulfite specific primers to cover CpG sites in the specified regions of interest. Primers were created with a dedicated bisulfite-converted DNA-specific primer design tool (proprietary Zymo Research software). Multiplex amplification of all samples using region of interest specific primer pairs and the Access Array system was performed. The amplification conditions were chosen for PCR amplicons ideally ranging from 1 bp to 3 bp. The resulting amplicons were pooled for harvesting and subsequent bar-coding. Samples were purified, then prepared for massively parallel sequencing and paired-end sequencing. Sequence reads were identified using standard base-calling software, then analyzed using proprietary Zymo Research software. CpG site coverage was calculated as a percentage of CpG site covered by sequencing reads versus total number of CpG sites for the designed assays. DNA analysis with the 22 TapeStation system DNA analysis was performed using the 22 TapeStation system in combination with the Genomic DNA or D1 ScreenTape assay, according to the manufacturer s instructions 5,6. 2

Results and Discussion Genome-wide DNA methylation analysis Human brain tissue from the same donor was used for the genome-wide DNA methylation analysis. Genomic DNA was extracted from fresh frozen tissue and was used as a control for the following experiments. In addition, four tissue samples were subjected to FFPE using varying lengths of paraffin treatment times to generate varying levels of DNA degradation. The gdna integrity of this starting material was determined using the Genomic DNA ScreenTape assay and the 22 TapeStation system. Figure 1 shows the obtained gel image, the DIN values indicating the gdna integrity, and the DNA concentrations. The numerical assessment ranged from 1 1. A high DIN value indicates highly intact gdna, and a low DIN degraded gdna 4. DNA extraction from FFPE samples has proven to be especially challenging 7. The most common issues are cross-linking and fragmentation of DNA during tissue processing and storage, affecting the yield and quality of gdna obtained. Figure 1 demonstrates that, as expected, the gdna yield and quality obtained for FFPE samples was lower, compared to the fresh frozen sample. The obtained DIN values for the FFPE samples ranged from 4. to 7.1. The gdna extracted from the fresh frozen sample had a relative high gdna quality with DIN 8.3, and the highest DNA concentration. Following this initial quality control step, the five gdna samples were subjected to the workflow for genome-wide DNA methylation analysis based on reduced representation bisulfite sequencing, as outlined in Figure 2. 48,5 1 FFPE DIN 4. 5.8 7.3 7.1 8.3 Conc. (ng/µl) 4.3 3.2 7.5 25.4 26.4 Figure 1. Analysis of gdna extracted from four FFPE samples (lanes 1 to 4) and one fresh frozen sample (last lane) from human brain tissue. gdna integrity (DIN) and concentration were determined using the Agilent 22 TapeStation system and the Agilent Genomic DNA ScreenTape assay. gdna preparation Restriction enzyme digest Adapter ligation Size selection Bisulfite conversion PCR amplification Sequencing Figure 2. Genome-wide DNA methylation analysis workflow based on reduced representation bisulfite sequencing. 3

The gdna samples were digested using a combination of TaqαI (T CGA) and MspI (C CGG). Neither restriction enzyme is sensitive to CpG methylation. Fragments of different sizes with CpG at the end were generated. Next, methylated sequence adapters were ligated to the DNA fragments. Then, a size selection of fragments was performed to enrich the informative sequences. The following bisulfite treatment of the size-selected DNA fragments resulted in the conversion of unmethylated cytosine residues to uracil. 5-Methylcytosine is protected from conversion. The bisulfite converted DNA was then amplified using PCR, and a barcode sequence was added for sample identification post-sequencing. Following the PCR amplification step, the obtained samples were analyzed using the 22 TapeStation system and the D1 ScreenTape assay. In all PCR reactions, DNA of approximately 2 bp was detected with different concentrations. Figure 3 demonstrates that the DNA library yield after PCR amplification varied depending on the DIN value of the initial gdna sample. A low DNA integrity of the starting material resulted in a low DNA yield after amplification. Finally, the obtained DNA libraries from the five samples were subjected to next generation sequencing. Due to the bisulfite conversion, the methylated cytosine residues are read as cytosine. The nonmethylated cytosines are converted to uracil, and read as thymine. Special software was used for alignment and data analysis. The alignment to the reference genome allowed the software to identify base pairs within the genome that were methylated. To demonstrate the effect of DIN integrity on the quality of the sequencing results, the number of covered CpG sites was determined and compared to the DNA integrity of the initial gdna samples (Figure 4). [bp] 1,5 2 25 FFPE Initial DIN 4. 5.8 7.3 7.1 8.3 Conc. (ng/µl).37.68 2.29 3.42 2.42 Figure 3. Analysis of DNA samples after PCR amplification from four FFPE samples (lanes 1 to 4) and one fresh frozen sample (last lane) from human brain tissue. The analysis was performed using the Agilent 22 TapeStation system and the Agilent D1 ScreenTape assay. The table shows DIN values of the initial gdna samples, and the determined DNA concentration. No. of cytosines in millions 7 6 5 4 3 2 1 4. 5.8 7.3 DIN 7.1 8.3 Figure 4. Correlation of the gdna integrity and the covered CpG sites from four FFPE samples (blue) and one fresh frozen sample (green) from human brain tissue. 4

Figure 4 demonstrates that the total number of covered CpG sites varied, depending on initial DIN value of the gdna samples. For the fresh frozen sample, a CpG coverage of slightly above six million was obtained; a similar value was obtained for the FFPE sample with a DIN of 7.3. It can be observed that, with decreasing DNA integrity of the gdna starting material, the CpG coverage decreases. Hexbin plots were used for the analysis of the obtained large data sets. Figure 5 compares the methylation levels for overlapping CpG sites for the four FFPE samples with DIN ranging from 4 to 7.3 to the fresh frozen sample with a DIN value of 8. The number of overlapping CpG sites comparing the gdna obtained from the FFPE sample with DIN 4 as starting material, with the DNA extracted from the fresh frozen tissue (DIN 8) was low (n = 66,291) compared to the samples with higher DIN values. The number of overlapping CpG sites increased with increasing DNA integrity. Even though the number of CpG sites overlapping was low, the correlated methylation levels were still fairly high. The correlation was good for all samples (R >.9). Figure 5 clearly demonstrates that the initial DNA integrity can significantly impact the success of genome-wide DNA methylation studies. DIN 8. 1..8.6.4.2 DIN 4. 5.4 4.8 4.2 3.6 3. 2.4 1.8 1.2 log 1 (N) DIN 8. 1..8.6.4.2 DIN 5.8 5.4 4.8 4.2 3.6 3. 2.4 1.8 1.2 log 1 (N).6.6.2.4.6 DIN 4..8 1..2.4.6 DIN 5.8.8 1. DIN 8. DIN 7.1 1. 1. DIN 7.3 5.4.8 4.8 4.2.8.6 3.6 3..6.4 2.4 1.8.4.2 1.2.2.6.2.4.6.8 1..2.4.6 DIN 7.1 DIN 7.3 log 1 (N) DIN 8..8 1. 5.6 4.8 4. 3.2 2.4 1.6.8 log 1 (N) Figure 5. Comparison of the DNA methylation levels for overlapping CpG sites for the four FFPE samples with DIN ranging from 4 to 7.3 to the fresh frozen sample with a DIN value of 8 using Hexbin plots. The correlation between two samples as indicated by the Pearson s R correlation coefficient is displayed at the top. Colors, at the indicated location on the plot, represent point densities, with red indicating points comprised from many values, and blue showing points from only a few values. 5

Targeted DNA methylation analysis In addition to the genome-wide bisulfite sequencing, targeted studies are performed to validate genome-wide discoveries with a larger group of samples. Figure 6 shows the targeted bisulfite next generation sequencing workflow. For this Application Note, gdna was extracted from 382 mammalian samples, and bisulfite converted. Then, targeted amplification was performed using unbiased, bisulfite-specific primers to cover CpG sites in the specified regions of interest. Due to the DNA degradation induced by the harsh conditions of the bisulfite conversion, multiplex amplification of all samples with conditions for short PCR amplicons ideally ranging from 1 bp to 3 bp was performed. The resulting amplicons were bar-coded using PCR to enable sample identification post-sequencing. Parallel and paired-end sequencing was applied. The CpG site coverage was calculated as a percentage of CpG site covered by sequencing reads versus the total number of CpG sites for the designed assays. The total number of covered CpG sites was compared to the DNA integrity determined for the initial gdna starting material. Figure 7 shows the comparison of the determined DIN with the percentage of total CpG sites covered in targeted bisulfite sequencing assays. gdna preparation Bisulfite conversion Targeted amplification Adapter PCR Sequencing Figure 6. DNA methylation analysis workflow based on targeted bisulfite next generation sequencing. CpG sites (%) 1 9 8 7 6 5 4 3 2 1 1 2 3 4 5 DIN 6 7 8 9 1 Figure 7. Correlation of the gdna integrity (DIN) and the covered CpG sites obtained using targeted next generation bisulfite sequencing. For most of the samples, the higher the gdna integrity, the higher the obtained CpG coverage. However, a few gdna samples with high DIN resulted in low CpG coverage. This is probably due to additional alterations of the gdna during extraction from the FFPE tissue. It is known that formalin causes cross-linking of nucleic acids and proteins, increasing the susceptibility of DNA to mechanical stress and decreasing the accessibility to enzymes. In addition, formalin may be oxidized to formic acid, which causes DNA depurination and DNA strand breaks. Therefore, the fixation conditions can significantly impact the quality of the extracted DNA 7. For some gdna samples with low DIN, a high CpG coverage was achieved by using a higher amount of gdna starting material, or by increasing the sequencing depth for each of the amplicons. The gdna integrity of the starting material can significantly impact the outcome of targeted DNA methylation studies. A high DIN integrity ensures a better assay success with the CpG coverage. Using the Genomic DNA ScreenTape assay and the 22 TapeStation system for quality control of the gdna starting material allows optimizing of the experimental conditions to improve the sequencing results. For example, increasing the amount of input DNA or increasing the sequencing depth for each of the amplicons might help to overcome low quality starting material. 6

Conclusion The Agilent 22 TapeStation system and the Agilent Genomic DNA ScreenTape assay provide an optimal tool for quality control of gdna samples extracted from FFPE tissue for DNA methylation analysis. The automatically determined DIN correlates with the CpG coverage, a key quality metric. The DIN can be used as quality criteria to determine how to handle individual gdna samples for downstream workflows, and to ensure successful DNA methylation analysis. References 1. Gu, H.; et al. Genome-scale DNA methylation mapping of clinical samples at single-nucleotide resolution. Nature Methods 21, 7(2), pp 133-136. 2. Ashktorab, H; et al. DNA methylome profiling identifies novel methylated genes in African American patients with colorectal neoplasia. Epigenetics 214, 9(4), pp 53 512. 3. DNA quality control of formalin-fixed paraffin-embedded and fresh-frozen tissues prior to target-enrichment and next generation sequencing, Agilent Technologies, publication number 5991-483EN, 212. 4. DNA Integrity Number (DIN) with the Agilent 22 TapeStation System & Genomic DNA ScreenTape, Agilent Technologies, publication number G5991-5258EN, 214. 5. Agilent Genomic DNA ScreenTape System Quick Guide, Agilent Technologies, publication number G2964-94 rev.c, 214. 6. Agilent D1 ScreenTape System Quick Guide Agilent Technologies, publication number G2964-932 Rev. C, 214. 7. Tang, W.; et al. DNA Extraction from Formalin-Fixed, Paraffin-Embedded Tissue, Cold Spring Harbor Protocols 29, doi:1.111/pdb.prot5138. 7

www.agilent.com/genomics/ tapestation For Research Use Only. Not for use in diagnostic procedures. The information contained within this document is subject to change without prior notice. Agilent Technologies, Inc., 215 Published in the USA, December 1, 215 5991-6427EN