NEXT GENERATION SEQUENCING Whole Gene Sequencing

NEXT GENERATION SEQUENCING: Whole Gene Sequencing. Ingrid Faé. Educational Session 3: Next generation sequencing. Stockholm, Friday, June 27th, 2014. Department for Blood Group Serology and Transfusion Medicine

Second generation sequencing (Ion Torrent); Third generation sequencing (PacBio); Quality Assurance

Preliminary question: Are mutations in exons 2, 3 and 4 the only actionable mutations in the entire HLA gene? What impact do mutations in the other exons and introns of the HLA gene have on protein folding and the subsequent presentation of antigens? The ultimate solution for preventing ambiguities in genotyping is to sequence the entire HLA gene.

Ion Torrent PGM: Chemistry and detection; Whole gene approach; Workflow; Advantages/disadvantages

Chemistry

Detection

Super high resolution for single molecule-sequence-based typing of classical HLA loci at the 8-digit level using next generation sequencers. T. Shiina, S. Suzuki, Y. Ozaki, H. Taira, E. Kikkawa, A. Shigenari, A. Oka, T. Umemura, S. Joshita, O. Takahashi, Y. Hayashi, M. Paumen, Y. Katsuyama, S. Mitsunaga, M. Ota, J. K. Kulski & H. Inoko. Tissue Antigens, 2012, 80, 305-316

Work Flow gDNA Amplicon Library RNA Library Fragmentation Size dependent Fragmentation (long amplicons) End repair (small amplicons) Prepare WT or miRNA End repair (for physical shearing) Adapter ligation & nick repair Adapter ligation & nick repair Hybr./Adapter ligation Adapter ligation & nick repair Size selection Size selection Reverse Transcription Size selection Amplification (if needed) Amplification (if needed) Size selection Amplification (if needed) Qualify & quantify Qualify & quantify Amplification Qualify & quantify Qualify & quantify

Fragmentation: Enzymatic fragmentation -> blunt ends; Physical fragmentation -> end repair. http://www.tebubio.com/userfiles/image/035/035%20i_endit_dnafragments_1.jpg

Adapter Ligation & Nick Repair

E-Gel

Emulsion PCR

Begin at the beginning: E. coli library

First Own Library

HLA typing on 314 chip

HLA typing on 316 chip

Analysis Software Solutions: HLA TypeStream™ Analysis Software (Life Technologies); NGSengine (GenDx); Omixon; Conexio

NGSengine

Omixon

HLA TypeStream™ Analysis Software

Coverage

Match List

Flagged Positions

Advantages: Whole gene sequencing possible; Clonal sequences; Automation; Chip size; Low to high throughput; Costs

Disadvantages: IMGT/HLA database currently incomplete; Single urgent samples; Emulsion PCR; Length of reads; GC-rich regions; Coverage; Phase; Amplification bias. Remedy -> third generation sequencing?

3rd Generation Sequencing: Reaction of single molecules is measured; Less starting material; No PCR -> no PCR bias (uneven amplification of different alleles); Genome of single cells; Released signal: real-time detection (proton or fluorophore); HeliScope Sequencer; PacBio (SMRT) DNA Sequencing

Advantages: Long reads; Unambiguous de novo phasing of long-range sequencing reads; Reduced sample manipulation

Disadvantages: High-priced equipment; Errors, while frequent, occur in random locations and base composition; Similar length of amplicon in one run

Quality Assurance: PCR primer design; Loss of alleles; Quantification of DNA/PCR product; Multiplex PCR monitoring; Creation of artefacts should be prevented

Validation
Analytical sensitivity: the minimum detectable concentration of the analyte.
Specificity: freedom from interference by any element or compound other than the analyte.
Precision: a measure of random errors; it may be expressed as repeatability or reproducibility.
Repeatability: the closeness of agreement between mutually independent test results obtained with the same method on identical test material in the same laboratory by the same operator using the same equipment within short intervals of time.

Quality check
Total Bases: Number of filtered and trimmed base pairs reported in the output BAM file.
Key Signal: Percentage of Live ISPs with a key signal that is identical to the library key signal.
Filtered (Low quality): Low or unrecognizable signal.

Quality check
AQ20: The percentage of reads that have a predicted quality score of Q20 or better. The AQ20 score is the predicted quality of a Phred-like score of 20 or better, or one error in 100 bp.
AQ17: The AQ17 Read Lengths graph is a histogram of read lengths, in bp units, that have a Phred-like score of 17 or better, or one error in 50 bp.
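The "one error in 100 bp" and "one error in 50 bp" figures follow from the standard Phred relation P = 10^(-Q/10). The short sketch below only illustrates that arithmetic; the function names are made up for this example and are not part of any Ion Torrent software.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Error probability for a Phred-like quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def bases_per_error(q: float) -> float:
    """Expected number of bases per single error at quality score Q."""
    return 1 / phred_to_error_probability(q)


def error_probability_to_phred(p: float) -> float:
    """Inverse relation: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (17, 20):
        print(
            f"Q{q}: error probability {phred_to_error_probability(q):.4f}, "
            f"about one error in {bases_per_error(q):.0f} bp"
        )
```

Q17 gives roughly one error in 50 bases and Q20 exactly one in 100, which matches the AQ17 and AQ20 descriptions above.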

Acceptance of Data: Criteria for acceptance of data must be specified: Read length; Minimal allele ratio; Coverage. Examples

Read Length

Minimal Allele Ratio

Coverage
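The three acceptance criteria named above (read length, minimal allele ratio, coverage) can be expressed as an explicit, automated check. The following is a minimal sketch under stated assumptions: all threshold values are hypothetical placeholders rather than figures from the talk, the class and function names are invented for illustration, and the allele-ratio test assumes a heterozygous locus with reads already assigned to two alleles.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AcceptanceCriteria:
    # Hypothetical placeholder thresholds; each laboratory must specify its own.
    min_read_length: int = 150      # mean read length in bp
    min_coverage: int = 50          # lowest read depth at any position
    min_allele_ratio: float = 0.2   # minor-allele reads / all reads


@dataclass
class LocusMetrics:
    locus: str
    mean_read_length: float
    min_coverage: int
    allele_read_counts: Tuple[int, int]  # reads assigned to each of two alleles


def minimal_allele_ratio(counts: Tuple[int, int]) -> float:
    """Fraction of reads supporting the less-represented allele."""
    total = sum(counts)
    return min(counts) / total if total else 0.0


def check_locus(m: LocusMetrics, c: AcceptanceCriteria) -> List[str]:
    """Return a list of failed criteria; an empty list means the data are accepted."""
    failures = []
    if m.mean_read_length < c.min_read_length:
        failures.append(f"{m.locus}: read length {m.mean_read_length} < {c.min_read_length}")
    if m.min_coverage < c.min_coverage:
        failures.append(f"{m.locus}: coverage {m.min_coverage} < {c.min_coverage}")
    ratio = minimal_allele_ratio(m.allele_read_counts)
    if ratio < c.min_allele_ratio:
        failures.append(f"{m.locus}: allele ratio {ratio:.2f} < {c.min_allele_ratio}")
    return failures


# Made-up example values, for illustration only.
criteria = AcceptanceCriteria()
good = LocusMetrics("HLA-A", 210.0, 120, (480, 520))
bad = LocusMetrics("HLA-B", 190.0, 35, (900, 100))

if __name__ == "__main__":
    for metrics in (good, bad):
        problems = check_locus(metrics, criteria)
        print(metrics.locus, "accepted" if not problems else problems)
```

Returning the list of failed criteria, rather than a single pass/fail flag, keeps a record of why a run was rejected, which is useful for quality assurance documentation.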

Contamination: Negative controls; Extended contamination control; Barcode change

External Quality Controls: Dedicated for NGS; Whole gene sequencing; Amplicon sequencing technique. Format of results: Alleles; FASTQ files; Raw data of the device

Summary: 2nd Generation Sequencing -> advantages: Whole gene sequencing; High throughput; Costs. 3rd Generation Sequencing: Long reads; Phase keeping. Quality Assurance