- Kory Jefferson
3 Background
5 Wikipedia
7 Lee and Mahadevan, JCB, 2009
8 History (Platform Comparison)
9 P Park, Nature Reviews Genetics, 2009
10 P Park, Nature Reviews Genetics, 2009
11 Rozowsky et al., Nature Biotechnology, 2009
12 Chromatin Immunoprecipitation (ChIP)
13 PJ Farnham, 2009
15 VS.
16 P Park, 2009
17 [Figure: transcription factors (TF1, TF2, TF3, TF4) bound to DNA; region of DNA isolated via ChIP]
18 PJ Farnham, Nature Reviews Genetics, 2009
19 Reference Samples
21 Auerbach et al, PNAS, 2009
22 Bioinformatics
29 Pepke et al., Nature Methods, 2009
30 Rozowsky et al., Nature Biotechnology, 2009
31 P Park, 2009
32 Transcriptome Analysis using RNA-Seq
CBB752, Lukas Habegger, 04/07/2010
33 Outline
- Background
- Comparison of RNA-Seq to previous methods
- Informatics
  - Mapping of RNA-Seq reads
  - Calculating gene expression values
- Transcriptome analysis of human embryonic stem cells undergoing neural differentiation
34 Background
- The transcriptome is the complete set of transcripts in a cell population
  - For a specific developmental stage or physiological condition
- Understanding the transcriptome is essential to
  - Interpret the functional elements of the genome
  - Understand development and disease
35 Aims of Transcriptomics
The key aims of transcriptomics are:
- To catalog the types of transcripts present in a cell population
  - mRNAs
  - Long non-coding RNAs
  - Small RNAs
- To determine the transcriptional structure of genes
- To quantify the abundance of transcript isoforms
37 Previous methods: Sequence-based approaches
- Sanger sequencing of cDNA/EST libraries
  - Low throughput
  - Expensive
  - Not very quantitative
- Tag-based methods
  - SAGE (Serial Analysis of Gene Expression)
  - MPSS (Massively Parallel Signature Sequencing)
- Limitations:
  - Expensive (based on Sanger sequencing)
  - Unable to distinguish transcript isoforms
38 Previous methods: Hybridization-based approaches
- Gene expression arrays
- Exon arrays
- Tiling arrays
- Limitations:
  - Cross-hybridization
  - Resolution
  - Dynamic range
Source: Wikipedia
39 Overview of an RNA-Seq experiment
Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10, (2009).
40 Benefits of RNA-Seq
- Ability to distinguish different isoforms
- Ability to distinguish allelic gene expression
- Detection of RNA editing events
- De novo transcript assembly
- Detection of fusion transcripts
- High throughput
- Nucleotide-level resolution
41 Comparison between methods
Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10, (2009).
43 Mapping RNA-Seq reads
- Short vs. long reads
- Short-read mappers
  - Algorithms based on seed-based indexing
  - Algorithms based on the Burrows-Wheeler Transform (BWT)
44 Index-based short-read mappers
Adapted from Trapnell, C. & Salzberg, S.L. How to map billions of short reads onto genomes. Nat Biotech 27, (2009).
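As a rough illustration of the seed-and-extend idea behind index-based mappers, the sketch below indexes every k-mer of a toy reference and then verifies candidate positions by counting mismatches. The function names (`build_seed_index`, `map_read`), the toy sequences, and the parameter choices are assumptions made for this example; real mappers use spaced seeds, quality scores, and far more compact indexes.

```python
from collections import defaultdict

def build_seed_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index

def map_read(read, reference, index, k, max_mismatches=1):
    """Return sorted start positions where the read aligns with at most
    max_mismatches substitutions (no indels in this sketch)."""
    hits = set()
    # Cut the read into non-overlapping seeds of length k. With two seeds and
    # at most one mismatch, at least one seed must match the reference exactly.
    for offset in range(0, len(read) - k + 1, k):
        seed = read[offset:offset + k]
        for seed_pos in index.get(seed, []):
            start = seed_pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # the read would hang off the end of the reference
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)

# Toy data, invented for illustration only.
reference = "ACGTACGTTAGCCGATAGGCTA"
index = build_seed_index(reference, k=4)
print(map_read("TAGCCGAT", reference, index, k=4))  # exact match
print(map_read("TAGCCGTT", reference, index, k=4))  # one substitution
```

The index trades memory for speed: lookup of each seed is a dictionary access, and only the few candidate positions it returns are compared base by base.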
45 BWT-based short-read mappers
Adapted from Flicek, P. & Birney, E. Sense from sequence reads: methods for alignment and assembly. Nat Meth 6, S6-S12 (2009).
Adapted from Trapnell, C. & Salzberg, S.L. How to map billions of short reads onto genomes. Nat Biotech 27, (2009).
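To make the BWT approach more concrete, here is a minimal sketch of exact matching by backward search over a Burrows-Wheeler transformed reference. The names `bwt_index` and `find_exact` and the toy reference are assumptions for this example. The suffix array is built by naive sorting, which is only practical for tiny inputs, and mismatch handling (which real BWT-based mappers add on top) is left out.

```python
def bwt_index(text):
    """Build the suffix array and the lookup tables used by backward search."""
    text += "$"  # sentinel; sorts before A, C, G and T
    # Naive suffix array: fine for a toy string, far too slow for a genome.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = character preceding each sorted suffix (wraps around for i == 0).
    bwt = "".join(text[i - 1] for i in suffix_array)
    # first_row[c]: number of characters in the text strictly smaller than c.
    first_row, total = {}, 0
    for c in sorted(set(text)):
        first_row[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in first_row}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return suffix_array, first_row, occ

def find_exact(pattern, suffix_array, first_row, occ):
    """Return sorted start positions of exact occurrences of pattern."""
    lo, hi = 0, len(suffix_array)  # half-open range of matching suffixes
    for c in reversed(pattern):    # extend the match one base to the left
        if c not in first_row:
            return []
        lo = first_row[c] + occ[c][lo]
        hi = first_row[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))

# Toy data, invented for illustration only.
tables = bwt_index("ACGTACGTTAGCCGATAGGCTA")
print(find_exact("ACGT", *tables))
print(find_exact("TAG", *tables))
```

Each pattern base costs two table lookups regardless of reference length, which is why this family of mappers scales to very large read sets.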
46 Mapping of RNA-Seq reads
[Figure labels: DNA; cDNA/RNA]
- Map reads to the reference genome and splice junction library
Adapted from Pepke, S., Wold, B. & Mortazavi, A. Computation for ChIP-seq and RNA-seq studies. Nat Meth 6, S22-S32 (2009).
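One plausible way to build a splice junction library from annotated exons is sketched below: for every ordered pair of exons in a gene, join the last `read_length - 1` bases of the upstream exon to the first `read_length - 1` bases of the downstream exon, so that any read crossing the junction fits inside the joined sequence. The function name `build_junction_library`, the coordinate convention (0-based, half-open, forward strand), and the toy genome are assumptions for this example and not necessarily the construction used in the lecture.

```python
from itertools import combinations

def build_junction_library(genome, exons, read_length):
    """Return {(donor_end, acceptor_start): junction_sequence}.

    exons: sorted, non-overlapping (start, end) pairs of one gene,
           0-based half-open coordinates on the forward strand.
    """
    flank = read_length - 1  # a read must overlap each side by at least 1 base
    library = {}
    for (s1, e1), (s2, e2) in combinations(exons, 2):
        left = genome[max(s1, e1 - flank):e1]      # end of the upstream exon
        right = genome[s2:min(e2, s2 + flank)]     # start of the downstream exon
        library[(e1, s2)] = left + right
    return library

# Toy gene, invented for illustration: three exons separated by two introns.
genome = "ACGTAC" + "GTAAAG" + "TTGCAA" + "GTCCAG" + "GGATCC"
exons = [(0, 6), (12, 18), (24, 30)]
library = build_junction_library(genome, exons, read_length=4)
for junction, sequence in sorted(library.items()):
    print(junction, sequence)

# A read spanning exon 1 and exon 2 is found in the library, not in the genome.
read = "ACTT"
print(read in genome, read in library[(6, 12)])
```

Including non-adjacent exon pairs, as here, lets reads from exon-skipping isoforms map as well, at the cost of a larger library.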
47 TopHat: Identification of novel splice junctions Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics,
48 Transcript assembly
- Assemble transcripts directly based on RNA-Seq data
- Advantage: Detect structural alterations that are not present in the reference genome
- Computationally intensive
- Not suitable for lowly expressed genes
50 Calculating exon/gene expression values
- Reads per kilobase per million mapped reads (RPKM)
- Composite gene model
  [Figure labels: Isoform 1; Isoform 2; Composite Model]
- Transcript isoform quantification
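The RPKM calculation can be sketched in a few lines. The example below assumes that the composite gene model is the union of the exons of all isoforms, and it uses the standard RPKM definition: reads mapped to the model, divided by the model length in kilobases and by the total number of mapped reads in millions. The function names, coordinates, and read counts are made up for illustration.

```python
def composite_model(isoforms):
    """Merge the exons of all isoforms into a union of non-overlapping intervals.

    isoforms: list of isoforms, each a list of (start, end) exons,
              0-based half-open coordinates.
    """
    exons = sorted(exon for isoform in isoforms for exon in isoform)
    merged = []
    for start, end in exons:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching exon: extend the current interval.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]

def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of model per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)

# Hypothetical gene with two isoforms that share the first exon.
isoform_1 = [(100, 300), (500, 700)]
isoform_2 = [(100, 300), (600, 900)]
model = composite_model([isoform_1, isoform_2])
model_length = sum(end - start for start, end in model)

print(model)         # union of exons
print(model_length)  # length in base pairs
print(rpkm(read_count=1200, length_bp=model_length, total_mapped_reads=20_000_000))
```

Normalizing by length and by sequencing depth is what makes values comparable across genes and across samples; quantifying individual isoforms needs more than this, since reads in shared exons are ambiguous.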
51 Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Meth 5, (2008).
53 Transcriptome analysis of human embryonic stem cells undergoing neural differentiation
- Technologies:
  - 454 (2M reads)
  - Illumina (250M reads)
    - Single-end
    - Paired-end
- Samples:
  - Human embryonic stem cells
  - Neuronal precursors
  - 3 stages
54 Example: Long/short RNA-Seq reads
Wu, J.Q., Habegger, L. et al. Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proceedings of the National Academy of Sciences 107, (2010).
55 Differential gene expression
[Figure panels: Embryonic stem cells (hESC): RPKM; Neural progenitor cells (N2): RPKM]
Adapted from Wu, J.Q., Habegger, L. et al. Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proceedings of the National Academy of Sciences 107, (2010).
56 Differential gene expression
[Figure panels: Embryonic stem cells (hESC): RPKM; Neural progenitor cells (N2): RPKM]
NCAM: Neural Cell Adhesion Molecule
Adapted from Wu, J.Q., Habegger, L. et al. Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proceedings of the National Academy of Sciences 107, (2010).
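A simple way to compare RPKM values between two samples, such as hESC and N2, is a log2 fold change. The sketch below is only an illustration of that arithmetic and is not the statistical method of the cited study. The pseudocount, the function name `log2_fold_change`, and the gene names and values are all invented for this example.

```python
import math

def log2_fold_change(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2 ratio of sample B over sample A.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and damps ratios between very small RPKM values.
    """
    return math.log2((rpkm_b + pseudocount) / (rpkm_a + pseudocount))

# Made-up RPKM values: (sample A, sample B) per hypothetical gene.
expression = {
    "gene_up": (1.0, 15.0),
    "gene_flat": (7.0, 7.0),
    "gene_down": (31.0, 3.0),
}
for gene, (rpkm_a, rpkm_b) in expression.items():
    print(gene, round(log2_fold_change(rpkm_a, rpkm_b), 2))
```

A positive value means higher expression in the second sample; deciding whether a change is significant would additionally require replicates and a statistical test.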
57 Differential splicing
[Figure panels: Embryonic stem cells (hESC): RPKM; Neural progenitor cells (N2): RPKM]
SLK: Serine/Threonine Kinase 2
Adapted from Wu, J.Q., Habegger, L. et al. Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proceedings of the National Academy of Sciences 107, (2010).
58 Fraction of genes detected
[Figure panels: N2 and hESC at 1x coverage; N2 and hESC at 5x coverage]
Adapted from Wu, J.Q., Habegger, L. et al. Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proceedings of the National Academy of Sciences 107, (2010).
59 Number of splice junctions detected
[Figure panels: Known splice junctions; Novel splice junctions]
Adapted from Wu, J.Q., Habegger, L. et al. Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proceedings of the National Academy of Sciences 107, (2010).
60 Summary
- RNA-Seq has many advantages compared to conventional methods
  - Higher resolution
  - Larger dynamic range
- Method of choice to study the structure and dynamics of the transcriptome
  - Connectivity of exons
  - Alternative splicing