Sanger Sequencing: Troubleshooting Guide

Similar documents
Sanger sequencing troubleshooting guide. GATC Biotech AG

Diagnosis Sanger. Interpreting and Troubleshooting Chromatograms. Volume 1: Help! No Data! GENEWIZ Technical Support

Quick Reference Guide

Tips DNA Sequencing (Revision Date: 09 July 2007)

2. Pyrosequencing Assay Design

Tips for sequencing DNA

GENOMICS RESEARCH LABORATORY Sample Submission Requirements

The Agilent Total RNA Isolation Kit

7. Troubleshooting Guidelines

FastFire qpcr PreMix (Probe)

Comparison of ExoSAP-IT and ExoSAP-IT Express reagents to alternative PCR cleanup methods

Interpretation of sequence results

Guide to Sanger Sequencing at RAMAC

GenBuilder TM Plus Cloning Kit User Manual

GenBuilder TM Plus Cloning Kit User Manual

ZR-96 Genomic DNA Clean & Concentrator -5 Catalog Nos. D4066 & D4067

Roche Molecular Biochemicals Technical Note No. LC 9/2000

Genomic DNA Clean & Concentrator -25 Catalog Nos. D4064 & D4065

GenBuilder TM Cloning Kit User Manual

User Bulletin. ABI PRISM 373 DNA Sequencer. Using the ABI PRISM 373 BigDye Filter Wheel. Contents. Introduction. February 16, 1998 (updated 06/2001)

1

Guidelines for Developing Robust and Reliable PCR Assays

Cold Fusion Cloning Kit. Cat. #s MC100A-1, MC101A-1. User Manual

Guide to Sanger Sequencing at RAMAC

1. COMPONENTS. PyroStart Fast PCR Master Mix (2X) (#K0211 for 250 reactions of 20µl) 2. STORAGE 3. DESCRIPTION

Sanger Sequencing Portfolio

GeneCopoeia TM. All-in-One qpcr Mix For universal quantitative real-time PCR. User Manual

Genomic Sequencing. Genomic Sequencing. Maj Gen (R) Suhaib Ahmed, HI (M)

Efficient Method for Isolation of High Quality Concentrated Cellular RNA with Extremely Low Levels of Genomic DNA Contamination Application


QUANTITATIVE RT-PCR PROTOCOL (SYBR Green I) (Last Revised: April, 2007)

THUNDERBIRD SYBR qpcr Mix

BENG 183 Trey Ideker (the details )

Bead Type (NaI) Gel Extraction Kits

Design. Construction. Characterization

Real Time Quantitative PCR Assay Validation, Optimization and Troubleshooting

GenepHlow Gel Extraction Kit

Assay Design Considerations, Optimization and Validation

Zymo-Spin Technology. The microcentrifuge column that revolutionized nucleic acid clean-up

Polymerase Chain Reaction-361 BCH

Seminar: Sanger Sequencing

Presto Stool DNA Extraction Kit

Troubleshooting of Real Time PCR Ameer Effat M. Elfarash

Description...1 Components...1 Storage... 1 Technical Information...1 Protocol...2 Examples using the kit...4 Troubleshooting...

BIOO LIFE SCIENCE PRODUCTS

SEQUENCING SERVICES. McGill University and Génome Québec Innovation Centre FEBRUARY 28, Version 3.5

Executive Summary. clinical supply services

Genetics and Genomics in Medicine Chapter 3. Questions & Answers

Presto Food DNA Extraction Kit

Roche Molecular Biochemicals Technical Note No. LC 10/2000

qpcr Quantitative PCR or Real-time PCR Gives a measurement of PCR product at end of each cycle real time

Functional Genomics Research Stream. Research Meeting: June 19, 2012 SYBR Green qpcr, Research Update

SunScript TM One Step RT-qPCR Kit

Accura High-Fidelity Polymerase

SANGER SEQUENCING WHITE PAPER

Roche Molecular Biochemicals Technical Note No. LC 12/2000

Presto Soil DNA Extraction Kit

Quick and easy 7 minute protocol to select for 300 bp, 200 bp, 150 bp, 100 bp, 50 bp DNA fragments or perform a double size selection

Quick and easy 7 minute protocol to select for 300 bp, 200 bp, 150 bp, 100 bp, 50 bp DNA fragments or perform a double size selection

RTS Wheat Germ LinTempGenSet, His 6 -tag Manual

PureSpin DNA Clean Up Kit

Polymerase Chain Reaction (PCR)

Lecture Four. Molecular Approaches I: Nucleic Acids

P HENIX. PHENIX PCR Enzyme Guide Tools For Life Science Discovery RESEARCH PRODUCTS

HiPer Real-Time PCR Teaching Kit

Dig System for Starters

Bootcamp: Molecular Biology Techniques and Interpretation

Optimizing crna fragmentation for microarray experiments using the Agilent 2100 bioanalyzer. Application Deborah Vitale.

Presto Mini Plasmid Kit

TB Green Premix Ex Taq II (Tli RNaseH Plus)

Certificate of Analysis. Quality Control Parameters. 2 -deoxyadenosine-5 -triphosphate. MW = g /mol

SYBR Premix Ex Taq II (Tli RNaseH Plus), ROX plus

RTS 100 E. coli LinTempGenSet, Histag

Performance characteristics of the High Sensitivity DNA kit for the Agilent 2100 Bioanalyzer

AxyPrep PCR Clean-up Kit AxyPrep-96 PCR Clean-up Kit

Polymerase Chain Reaction: Application and Practical Primer Probe Design qrt-pcr

Functional Genomics Research Stream. Research Meetings: November 2 & 3, 2009 Next Generation Sequencing

Introduction. Kit components. 50 Preps GF-PC-050. Product Catalog No. 200 Preps GF-PC Preps GF-PC Preps SAMPLE

CloneDirect Rapid Ligation Kit

SYBR Premix Ex Taq II (Tli RNaseH Plus), Bulk

Biol/Chem 475 Spring 2007

Real Time PCR (qpcr) Dott. Finetti Luca

7 Synthesizing the ykkcd Mutant Toxin Sensor RNA in vitro

DNA Labeling Kits Fluorescein-dCTP and -dutp Instruction Manual

PCR OPTIMIZATION AND TROUBLESHOOTING

PCR PRIMER DESIGN SARIKA GARG SCHOOL OF BIOTECHNOLGY DEVI AHILYA UNIVERSITY INDORE INDIA

DNA Labeling Kits Texas Red -dctp and -dutp Instruction Manual

Finishing Drosophila Ananassae Fosmid 2728G16

Genomic DNA Clean & Concentrator -10 Catalog Nos. D4010 & D4011

Gene-Spin DNA Purification Kit - V3

Experiment (5): Polymerase Chain Reaction (PCR)

We begin with a high-level overview of sequencing. There are three stages in this process.

AMPURE PCR PURIFICATION PAGE 1 OF 7

Real-Time PCR Principles and Applications

BIOO LIFE SCIENCE PRODUCTS

EZ-10 SPIN COLUMN HANDBOOK


All-in-One qpcr Mix. User Manual. For universal quantitative real-time PCR

AmpliTaq 360 DNA Polymerase

SunScript One Step RT-PCR Kit

Transcription:

If you need help analysing your Sanger sequencing output, this guide can help. CONTENTS 1 Introduction... 2 2 Sequence Data Evaluation... 2 3 Troubleshooting... 4 3.1 Reviewing the Sequence... 4 3.1.1 Electropherogram... 4 3.1.2 Raw Sequence... 4 3.2 Result Evaluation... 5 3.2.1 Failed sequence... 5 3.2.2 Weak sequence... 6 3.2.3 Short sequence (or shorter than expected)... 7 3.2.4 Multiple sequences... 8 3.2.5 Artifacts... 9 4 Review AGRF submission... 10 5 Reviewing Experimental Setup of SEQ Reaction... 10 5.1 DNA Template Review... 10 5.2 Primer Design Review... 11 6 Contact AGRF Sequencing... 11 Page 1 of 11

1 Introduction This document highlights some common problems associated with DNA sequencing as well as the possible causes and solutions for these problems. Pictures of sequence traces are provided where possible along with the information describing the problem, how to identify the problem, the cause, and the potential solution for the problem. Other problems can occur with sequence data, but the following are those seen most commonly. Use this guide as the AGRF recommended data review/troubleshooting process. 2 Sequence Data Evaluation For each sample processed, the following are provided: Filename.ab1: The raw chromatogram trace file Filename.seq: A text file of the sequence, as generated by the sequencing instruments Filename.fa: A quality trimmed FASTA formatted text file Filename.bn: A BLAST file (GenBank) of the quality trimmed FASTA file The filename.ab1 file contains annotation of the sample, the raw data trace and the analysed electropherogram. Basecalling and analysis algorithms are applied to the raw data to create the analysed data trace. When evaluating or trouble-shooting sequence data, it is important to look at the raw, and analysed data traces. The raw data trace should show an even distribution of peaks across the read and no residual dyes (Figure 1). The analysed data trace should show sharp, evenly spaced peaks across the read and a clear baseline (Figure 2). AGRF recommends the use of Applied Biosystem s free Sequence Scanner software (available for download at - https://www.thermofisher.com/au/en/home/life-science/ sequencing/sanger-sequencing/sanger-dna-sequencing/sanger-sequencing-dataanalysis.html) Figure 1: Raw Data Page 2 of 11

Figure 2: Analysed data It is equally important to look at data values displayed in the annotation file (Figure 3). It is useful to check the following: Average signal to noise ratio indicates labelling efficiency and should fall between be 100 and 750. The base call start indicates the scan point at which the read commences at and should be ~600 to 800. The end point should be ~13,000 to 14,000 or at the end of the read. The number of QV bases >=20 should be ~950 to 1000 (less for shorter PCR fragments) Figure 3: Annotation file which shows values for signal strength and start/end points Page 3 of 11

3 Troubleshooting When troubleshooting sequencing data, follow the workflow below to try to identify the cause of your problem. The following steps in this section use Sequencing Analysis Software or Sequence Scanner Software. 3.1 Reviewing the Sequence 3.1.1 Electropherogram Select the Electropherogram tab and review the sequence for data quality. Check the following: Well-defined peak resolution minimal fluorescence overlap from one peak to the next with a sharp peak top. Uniform peak spacing peak spacing is consistent throughout the trace. Signal-to-noise ratios and variation in peak heights High signal to noise ratio and even peak height characterize good quality sequence. (Please note: the analysed view is re-scaled, the peak heights are not representative of raw fluorescence detected by the AB 3730xl.) 3.1.2 Raw Sequence Select the Raw tab and review the unprocessed fluorescence data to assess the signal quality. Check the following: Artifacts Are there any artifacts, such as four-color spikes? Peak heights Are peaks well-resolved, with acceptable heights? Data start points Do any data start points deviate from others in the same submission? Length of read Was the expected length of read obtained? Does the signal stop suddenly? Baseline Is there background noise for all the peaks? Zoom in horizontally and vertically to verify the baseline noise. Page 4 of 11

3.2 Result Evaluation Use the following examples as a guide to try to identify an explanation of your results. Please note that this list is not exhaustive, but include the most common results seen at the AGRF. 3.2.1 Failed sequence No noticeable fluorescence peaks in raw data Only background noise seen in electropherogram Problem Probable Cause Solution No sequence detected No priming site present Confirm the primer site is present in the template. Redesign or use a different primer Primers have degraded through freeze-thaw cycles Make up new primer stocks Inefficient primer binding Redesign primer Insufficient amount of DNA template Re-quantify DNA and increase the amount of DNA if required DNA template has degraded or Inhibitory contaminant in your samples e.g. salts, phenol, EDTA, ethanol Re-extract DNA template or clean-up template. Page 5 of 11

3.2.2 Weak sequence Very low peak height in the raw data trace Base calls fade before the end of the read and the signal-to-noise ratios are very low Problem Probable Cause Solution Low peaks throughout trace Insufficient amount of DNA template Inhibitory contaminant in your samples (e.g. salts, phenol, EDTA, ethanol) Quantitate the DNA Increase the amount of DNA template, clean-up DNA template Insufficient amount of primer or inefficient primer binding Check primer dilution and/or redesign primer Page 6 of 11

3.2.3 Short sequence (or shorter than expected) Very high peaks in the raw data trace that fade off abruptly Poor quality sequence at start leading to shorter than expected sequence length Problem Probable Cause Solution Sequence starts well but signal drops gradually (Ski-sloping) Primer or Template ratio is incorrect or contaminant is present in template Re-examine template and primer concentration Re-extract or clean-up template Repetitive region - Repeat regions, especially GC and GT repeats, can cause the signal to fade either due to depletion or slippage or secondary structure Add (1ul) DMSO to the sequencing reaction Sequence complementary strand the Sequence starts well but signal stops abruptly Secondary structure - GC and AT rich templates can cause the DNA to loop and form hairpins Linearized DNA - restriction enzymes may have cut the template Add (1ul) DMSO to the sequencing reaction to help relax the structure Design primers close to the hairpin Run product out on an agarose gel to check Page 7 of 11

3.2.4 Multiple sequences Overlapping peaks in all or part of the electropherogram that maintain correct base-spacing Overlapping peaks after a homopolymer region Problem Probable Cause Solution Overlapping peaks in all or part of the sequence Overlapping peaks following stretch of mononucleotide sequence Mixed plasmid preparation Multiple PCR products Re-isolate the the DNA from a pure colony and re-sequence Check PCR template on gel for single band Frame shift mutation Use a different primer after the mutation or sequence the complementary strand Primer-dimer contamination Multiple priming sites Multiple primers in reaction Optimise PCR amplification or redesign primer Make sure primer only has one priming site Ensure only one primer has been used Primer with N-1 contamination Re-synthesize primer with PAGE purification Enzyme slippage occurs giving varying lengths of the same sequence after this region (n-1, n-2 and n-3 populations) Sequence the complementary strand Page 8 of 11

3.2.5 Artifacts Peaks of excess dye present in the raw data trace Large broadened peaks that obscure the sequence Problem Probable Cause Solution Large peaks obscuring the real sequence Sample peaks become lumpy and increasingly unreadable early in the sequence (before 500bp) Dye blobs caused by unincorporated BigDye Terminator (BDT) and are typically seen at 70bp and 120bp. Usually seen in failed or weak sequences. Real sequence can still be read underneath these blobs If related to individual samples this is due to a contaminant in the sample For CS submissions, review cleanup method used and/or add more DNA template and less BDT. For PD submissions, please notify AGRF staff and a free re-run will be provided. Clean up template DNA Page 9 of 11

4 Review AGRF submission Take some time to review the samples in each submission batch. For example, does the problem occur in: o Specific samples o Specific submissions o Samples extracted during the same process, or different processes o Samples stored under similar conditions or under different conditions etc. Is the symptom present in other samples of the same submission? Are there any differences in how the templates of other samples in the same submission were prepared? Was the same primer used in all samples of the submission? How was the template stored after preparation for this submission? 5 Reviewing Experimental Setup of SEQ Reaction Based on the results from Sequence Data Evaluation, use the tables below to review your experimental setup. 5.1 DNA Template Review Recommendation Run an agarose gel to detect any contaminating DNA or RNA. Measure the A260/A280 ratio of your samples. Quantitate the DNA template using the absorbance at 260 nm (A260). Dilute or concentrate the DNA as needed to obtain an A260 reading between 0.05 and 1.00. Comment Purified DNA should run as a single band on an agarose gel. Note: Uncut plasmid DNA can run as three bands: supercoiled, nicked, and linear. Note: RNA contamination up to 1 μg can be tolerated in the sequencing reaction, but it affects DNA quantitation greatly. For pure preparations of DNA (in TE), the A260/A280* ratio is 1.8. Very clean samples in pure water can give a ratio of 1.5 to 1.6. Smaller ratios may indicate the presence of protein or organic contaminants. Ratios less than 1.8 may still produce high quality results. Quantitation by agarose gel electrophoresis may not be accurate because ethidium bromide incorporation is not consistent and the method of comparing the standard and sample brightness is subjective. A260 values below 0.05 or above 1.00 are not accurate because Beer s law generally applies only within a certain concentration range. Outside of this concentration range, the relationship between absorbance and concentration is nonlinear. *A260 and A280 are the optical spectrometer measurement of absorbance at the wavelengths of 260 nm and 280 nm respectively. A260 is frequently used to measure DNA/RNA concentration and A280 is used to measure protein concentration. A ratio of A260/A280 > 1.8 suggests little protein contamination in a DNA/RNA sample. Page 10 of 11

5.2 Primer Design Review Recommendation Ensure that the primer has Tm >45 C. Ensure that primers are at least 18 bases long. Ensure that there are no known secondary hybridization sites on the target DNA. Choose primers that do not have runs of identical nucleotides, especially 4 or more Gs. Choose primers with G-C content in the range of 30 to 80%, preferably 50 to 55%. Design primers to minimize the potential for secondary structure and/or hybridization Purify primers by HPLC to reduce the quantity of n-1 primers. Comment If the Tm is too low, it may result in poor priming and low or no signal Primers that are too short may have Tms that are too low. Secondary hybridization sites on the target DNA can result in double peaks throughout the sequence Runs of identical nucleotides in primers can cause n+1 or n-1 effects. Also, these primers may be more difficult to synthesize. If the G-C content is too low, the Tm may be too low. If so, increase the primer length beyond 18 bases to obtain a Tm>45 C. Primer-dimer formation from hybridization can result in mixed sequence at the beginning of the sequence. Secondary structure in the primer, particularly at the 3 end can result in poor priming and low or no signal. Primers containing contaminants or synthesized primers of the wrong length can cause problems in sequencing reactions, such as failed reactions, noisy data, or poor sequencing results. If the primer is a short oligo that contains n-1 primers, HPLC cannot always remove the n-1 contaminants. 6 Contact AGRF Sequencing If you have not resolved your problem, please contact AGRF Sequencing for further support. Contact Details Email: sequencing@agrf.org.au Phone: 07 3365 8815 Page 11 of 11