Genome Resequencing. Rearrangements. SNPs, Indels, CNVs. De novo genome Sequencing. Metagenomics. Exome Sequencing. RNA-seq Gene Expression


1 High Throughput Short Read Sequencing: Illumina. Applications: Genome Resequencing; De novo genome Sequencing; SNPs, Indels, CNVs; Rearrangements; Metagenomics; RNA-seq (Gene Expression, Splice Isoform Abundance); Exome Sequencing; DNA Methylation; ChIP-SEQ; 3D Organization; Genotyping; Small RNA

4 Hi-C Lieberman-Aiden 2010

5 Hi-C Lieberman-Aiden 2010

6 Dovetail Sequencing (Putnam 2015)

7 10X Genomics

9 25x linked read coverage

10 60 kb deletion

14 Long Read Sequencing: PacBio, (Nanopore). Applications: Genome Resequencing; Rearrangements; De novo genome Sequencing; RNA-seq (Gene Expression, Splice Isoform Abundance); SNPs, Indels, CNVs; Metagenomics; Exome Sequencing; DNA Methylation; ChIP-SEQ; 3D Organization; Genotyping; Small RNA

15 Illumina sequencing workflow: Library Construction -> Cluster Formation -> Sequencing -> Data Analysis

16 Fragmentation. Mechanical shearing (BioRuptor, Covaris): DNA, RNA. Enzymatic (Fragmentase, RNase III): DNA, RNA. Chemical (Mg2+, Zn2+): RNA.

17 DNA library construction: Fragmented DNA -> End Repair (blunt-end fragments with 5' phosphate and 3' OH) -> A-Tailing (fragments with single A overhangs) -> Adapter Ligation via T overhangs (DNA fragments with adapter ends)

18 Enrichment of library fragments: PCR Amplification

19 Illumina Sequencing Technology: Sequencing By Synthesis (SBS) Technology. DNA ( ug) -> Library preparation -> Single molecule array -> Cluster generation -> Sequencing

20 TruSeq Chemistry: Flow Cell. 8 channels. Surface of flow cell coated with a lawn of oligo pairs.

21 Sequencing: 1.6 Billion Clusters Per Flow Cell (image scale bars: 20 Microns, 100 Microns)

22 Sequencing (image scale bar: 100 Microns)

23 Patterned Flowcell

24 HiSeq 3000: 478 million nanowells per lane

27 What will go wrong? Cluster identification; bubbles; synthesis errors

28 What will go wrong? Synthesis errors: Phasing & Pre-Phasing problems
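
To make the phasing and pre-phasing problem concrete, here is a minimal sketch of how a cluster loses synchrony over a run. It assumes a deliberately simplified model that is not from the slides: in every cycle each strand independently falls one base behind with probability `phasing`, jumps one base ahead with probability `prephasing`, and otherwise extends by exactly one base. The function name and the default rates are hypothetical illustration values, and this is not Illumina's actual correction algorithm.

```python
def in_phase_fraction(cycles, phasing=0.002, prephasing=0.001):
    """Return, for each cycle, the fraction of strands still in phase.

    Simplified model (an assumption for illustration): per cycle a strand
    adds 0 bases (phasing), 1 base (normal) or 2 bases (pre-phasing).
    A strand is "in phase" after cycle c if it has added exactly c bases.
    """
    if phasing < 0 or prephasing < 0 or phasing + prephasing > 1:
        raise ValueError("rates must be non-negative and sum to at most 1")
    normal = 1.0 - phasing - prephasing

    # dist[k] = fraction of strands that have incorporated k bases so far
    dist = [1.0]
    fractions = []
    for cycle in range(1, cycles + 1):
        new = [0.0] * (len(dist) + 2)
        for k, frac in enumerate(dist):
            new[k] += frac * phasing          # strand lags behind
            new[k + 1] += frac * normal       # strand stays in step
            new[k + 2] += frac * prephasing   # strand runs ahead
        dist = new
        fractions.append(dist[cycle])
    return fractions


if __name__ == "__main__":
    fractions = in_phase_fraction(150)
    for cycle in (1, 50, 100, 150):
        print(f"cycle {cycle:3d}: {fractions[cycle - 1]:.3f} of strands in phase")
```

Even with small per-cycle rates the in-phase fraction drops steadily, which is one reason base quality tends to decline toward the end of long reads.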

29 Illumina SAV viewer

30 base composition

31 fluorescence intensity

32 amplicon mix

33 amplicon

34 amplicon mix Q30

36 If you can put adapters on it, we can sequence it!

37 Know your sample: single-stranded Adapter Ligation

38 No need to be scared of HTS. Maher Al Rwahnih, UCD Foundation Plant Services. UC Davis Center for Plant Diversity/Herbarium: the Herbarium archives contain over 300,000 dried specimens. Search for Grapevine Red Blotch-Associated Virus: virus traces found by PCR.

39 Studying historic Bean varieties from herbarium samples - GBS (Genotyping-By-Sequencing) - 60-year-old herbarium samples (Sarah Dohle, Gepts Lab)

40 Quantitation & QC methods
- Intercalating dye methods (PicoGreen, Qubit, etc.): specific to dsDNA; accurate at low levels of DNA; great for pooling of indexed libraries to be sequenced in one lane; requires standard curve generation and many accurate pipetting steps
- Bioanalyzer: quantitation is good for a rough estimate; invaluable for library QC; High-sensitivity DNA chip allows quantitation of low DNA levels
- qPCR: most accurate quantitation method; more labor-intensive; must be compared to a control
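
Pooling indexed libraries for one lane means turning a mass concentration (from a dye assay) and a mean fragment size (from the Bioanalyzer) into molarity. The sketch below shows that arithmetic. It assumes the common approximation of 660 g/mol per base pair of dsDNA; the function names, sample names and input values are hypothetical and not from the slides.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (common approximation)


def library_nanomolar(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration from ng/uL to nM."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    # 1 ng/uL = 1e-3 g/L; dividing by g/mol gives mol/L; times 1e9 gives nM
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)


def pooling_volumes(libraries, fmol_each):
    """Volume (uL) of each library that contributes `fmol_each` femtomoles.

    `libraries` maps a name to (ng/uL, mean fragment size in bp).
    1 nM equals 1 fmol/uL, so volume = fmol / nM.
    """
    volumes = {}
    for name, (conc, size_bp) in libraries.items():
        nanomolar = library_nanomolar(conc, size_bp)
        if nanomolar == 0:
            raise ValueError(f"library {name!r} has zero concentration")
        volumes[name] = fmol_each / nanomolar
    return volumes


if __name__ == "__main__":
    # Hypothetical measurements for two indexed libraries
    libs = {"lib_A": (2.0, 400), "lib_B": (5.0, 550)}
    for name, vol in pooling_volumes(libs, fmol_each=20.0).items():
        print(f"{name}: {vol:.2f} uL")
```

Because the result depends on the mean fragment size, an inaccurate size estimate skews the pool as much as an inaccurate concentration does.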

41 Optional: PCR-free libraries. PCR-free library (if concentration allows): reduction of PCR bias against e.g. GC-rich or AT-rich regions, especially for metagenomic samples. OR library enrichment by PCR: ideal combination is high input and low cycle number, with a low-bias polymerase.

42 Library QC by Bioanalyzer: predominant species of appropriate MW; minimal primer dimers or adapter dimers; minimal higher-MW material

43 Library QC by Bioanalyzer. Trace labels: Beautiful; 100% Adapters (~125 bp); Beautiful

44 Library QC: examples for successful libraries; adapter contamination at ~125 bp

45 RNA-seq targeted sequencing:
- Capture-seq (Mercer et al. 2014)
- Nimblegen and Illumina
- Low quality DNA (FFPE)
- Lower read numbers: 10 million reads
- Targeting lowly expressed genes

46 THIRD GENERATION DNA SEQUENCING: Single Molecule Real Time (SMRT) sequencing. Sequencing of a single DNA molecule by a single polymerase. Very long reads: average reads over 8 kb, up to 30 kb. High error rate (~13%). Complementary to the short, accurate reads of Illumina.
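
The ~13% error rate quoted above can be put on the same Phred scale as the Q30 metric shown for the Illumina runs. The following sketch uses the standard Phred definition, Q = -10 * log10(p); the function names are hypothetical, and the 8 kb read length is simply the average quoted on the slide.

```python
import math


def phred_to_error(q):
    """Per-base error probability for a Phred quality score."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred quality score for a per-base error probability (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def expected_errors(read_length_bp, error_rate):
    """Expected number of erroneous bases in a single read."""
    return read_length_bp * error_rate


if __name__ == "__main__":
    print(f"Q30 error probability: {phred_to_error(30):.4f}")
    print(f"13% error as Phred score: Q{error_to_phred(0.13):.1f}")
    print(f"Expected errors in an 8 kb read at 13%: {expected_errors(8000, 0.13):.0f}")
```

Q30 corresponds to one error per 1,000 bases, whereas a 13% raw error rate is below Q10, which illustrates why the two read types are described as complementary.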

47 70 nm aperture Zero Mode Waveguide

48 Damien Pelt

49 First Sequencing of CGG-repeat Alleles in Human Fragile X Syndrome using PacBio RS Sequencer
Paul Hagerman, Biochemistry and Molecular Medicine, SOM
- Single-molecule sequencing of pure CGG array: first for disease-relevant allele (Loomis et al. (2012) Genome Research); applicable to many other tandem repeat disorders.
- Direct genomic DNA sequencing of methyl groups: direct epigenetic sequencing (paper under review).
- Discovered 100% bias toward methylation of 20 CGG-repeat allele in female; first direct methylated DNA sequencing in human CGG disease.
- DoD STTR award with PacBio. Basis of R01 applications.
[Figure: labels 36 CGG, 95; bases C, A, G, T; axis: Nucleotide position]

51 Thank you!