Application of NGS (next-generation sequencing) for studying RNA regulation. Sung Wook Chi. Sungkyunkwan University (SKKU), Samsung Medical Center (SMC)


Application of NGS (next-generation sequencing) for studying RNA regulation
Sung Wook Chi
Samsung Advanced Institute of Health Sciences and Technology (SAIHST), Sungkyunkwan University (SKKU)
Samsung Research Institute for Future Medicine, Samsung Medical Center (SMC)

Application of NGS for studying RNA regulation
1. Next-generation sequencing: genomics, the Human Genome Project, and NGS
2. Overview of RNA regulation
3. RNA-Seq: sequencing of RNAs by high-throughput methods
4. Ribosome profiling: translation
5. HITS-CLIP: RNA-protein interactions
6. Integrative analysis of RNA regulation

I. Next-generation Sequencing

Sequencing Technology : History

Human Genome Project (1990-2003), announcement of June 2000
"Decoding the human genome will lead to new ways to prevent, diagnose, treat, and cure disease. Today's announcement represents the starting point for a new era of genetic medicine."
Human genome (~3 billion bp): 13 years, ~3 billion dollars
Nature. 2001 Feb 15; 409(6822):860-921

Human Genome Project (~3 billion bp): 13 years, ~3 billion dollars

Chain-termination method (Sanger method)
- DNA is fragmented
- Fragments are cloned into a plasmid vector
- Cycle sequencing reaction
- Separation by electrophoresis
- Readout with fluorescent tags

Sequencing Technology : History

Next-Generation Sequencing (NGS) (2005 ~ )
Human genome (~3 billion bp)
- 1990-2003 (13 years), ~3 billion dollars (Nature. 2001 Feb 15; 409(6822):860-921)
- 2008 (2 months), < 1 million dollars (Nature 452, 872-876 (2008))

Sequencing in 2003 vs. sequencing today: 100 per run vs. >100 million per run

Next-generation sequencing (NGS)
- 1000 Genomes Project (2008~): human genetic variation
- Personalized medicine (Forbes, Jun. 3 2010)

NGS platforms

Emulsion PCR (454, SOLiD): Fragments with adaptors are PCR-amplified within a water droplet in oil. One primer is attached to the surface of a bead.

Pyrosequencing (454) PPi -> light TCAGGTTTTTTAA
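The "PPi -> light" readout can be illustrated with a toy flowgram. This is a minimal sketch, assuming nucleotides are flowed one at a time in a fixed cyclic order and that the light signal is proportional to the number of bases incorporated in that flow. The function name `flowgram` and the flow order "TACG" are assumptions made for illustration, not details taken from the slide.

```python
from itertools import cycle


def flowgram(read, flow_order="TACG"):
    """Return (nucleotide, signal) pairs for a read under a cyclic flow order.

    The signal of a flow is the length of the homopolymer run incorporated
    in that flow (0 if the flowed nucleotide does not match the next base).
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {unknown}")

    signals = []
    pos = 0
    flows = cycle(flow_order)
    while pos < len(read):
        base = next(flows)
        run = 0
        # Incorporate every consecutive copy of the flowed nucleotide.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


# Sequence shown on the slide, treated here as the strand being synthesized.
for base, signal in flowgram("TCAGGTTTTTTAA"):
    print(base, signal)
```

Homopolymer runs such as the stretch of T's yield a single, stronger signal rather than several separate ones, which is why run lengths have to be inferred from signal intensity.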

SOLiD (Ligation)

Illumina/Solexa - Bridge amplification DNA fragments are flanked with adaptors. A flat surface coated with two types of primers, corresponding to the adaptors. Amplification proceeds in cycles, with one end of each bridge tethered to the surface.

Illumina/Solexa

Base calling from image (Illumina)

NGS technologies comparison


Sequencing Technology : History

Ion Torrent PGM / Ion Proton: sequencing-by-synthesis with native DNA polymerase and dNTPs, pH detection

Pacific Biosciences: Single Molecule, Real-Time (SMRT) sequencing (Eid et al. 2008)

Oxford Nanopore: electronic, single-molecule sequencing
- Nanopore = very small hole
- Electrical current flows through the hole
- Introduce the analyte of interest into the hole
- Identify the analyte by the disruption or block to the electrical current

NGS platforms
Illumina (~10 days)
- HiSeq: 600 Gb; 3 billion total, 187 million reads/lane; 2 x 100 bp
- Genome Analyzer IIx: 95 Gb; 320 million total, 40 million/lane; 2 x 150 bp
- MiSeq: 7 Gb; 15 million total; 2 x 250 bp
SOLiD (~8 days)
- SOLiD 5500xl, SOLiD 5500, SOLiD 4
Ion Torrent (1 day)
- Ion Proton: 10-100 Gb
- Ion PGM: 20 Mb-1 Gb
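The Illumina yields quoted above can be sanity-checked with simple arithmetic. This is a rough sketch, assuming that each "total" is a number of read pairs (clusters) sequenced from both ends, and that the three sets of figures belong to HiSeq, Genome Analyzer IIx and MiSeq in that order. The helper name `yield_gb` is hypothetical.

```python
def yield_gb(read_pairs, read_length_bp, reads_per_pair=2):
    """Approximate run yield in gigabases: pairs x reads per pair x read length."""
    return read_pairs * reads_per_pair * read_length_bp / 1e9


# (read pairs per run, read length in bp), as read off the slide.
runs = {
    "HiSeq": (3_000_000_000, 100),
    "Genome Analyzer IIx": (320_000_000, 150),
    "MiSeq": (15_000_000, 250),
}

for name, (pairs, length) in runs.items():
    print(f"{name}: {yield_gb(pairs, length):.1f} Gb")
```

Under these assumptions the computed yields (600, 96 and 7.5 Gb) land close to the quoted 600, 95 and 7 Gb.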

Application of next-generation sequencing to DNA and RNA (figure labels: HITS-CLIP, bisulfite sequencing, RNA-Seq)
http://www.insightpharmareports.com/reports/2007/85_next_gen_sequencing/overview.asp

II. Overview of RNA regulation

Controlling mRNA
DNA -> (transcription) -> pre-mRNA -> (processing) -> mRNA -> (translation) -> protein
Regulated steps: 1) initiation 2) elongation 3) splicing / poly(A) 4) stability 5) repression 6) Local
Figure labels: nucleus (chromatin structure, RNA polymerase II); cytoplasm (mRNA, ribosome)

Why RNA regulation? Complexity!
                      Human     Mouse     Worm
Coding genes (DNA)    21,550    22,670    20,389
Transcripts (RNA)     126,019   77,177    29,873
RNA variants arise by, e.g., alternative splicing and alternative polyadenylation.
Counts: Ensembl (March 2011). Genome references: Nature. 2001 Feb 15; 409(6822):860-921 (human); Nature. 2002 Dec;420, 520-562 (mouse); Science. 1998 Dec;282(5396):p2011 (worm).
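The complexity argument can be made concrete as a transcripts-per-gene ratio. This is a small worked example, assuming the six numbers on the slide pair up as coding genes and transcripts for human, mouse and worm in that order. That pairing is inferred from the slide layout, not stated explicitly.

```python
# Counts as read off the slide (Ensembl, March 2011); the pairing is an assumption.
counts = {
    "Human": {"coding_genes": 21_550, "transcripts": 126_019},
    "Mouse": {"coding_genes": 22_670, "transcripts": 77_177},
    "Worm": {"coding_genes": 20_389, "transcripts": 29_873},
}

# Average number of annotated transcripts per coding gene.
ratios = {
    species: c["transcripts"] / c["coding_genes"] for species, c in counts.items()
}

for species, ratio in ratios.items():
    print(f"{species}: {ratio:.2f} transcripts per coding gene")
```

The gene counts are similar across the three species, while the transcripts-per-gene ratio falls from human to mouse to worm. That is the sense in which complexity sits at the RNA level.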

Why RNA regulation? : Complexity in regulation
Human genes, Gencode project version 7 (Nov, 2010):
- Coding genes: 40% (20,540)
- Non-coding RNAs: 60% in total. Regulatory functions!
  - Other non-coding genes: 29% (15,103)
  - Pseudogene: 21% (10,870)
  - snRNA: 4% (1,944)
  - miRNA: 3% (1,756)
  - lincRNA: 3% (1,351)
Science. 2008 Mar 28;319(5871):1785-6.
Complexity!

Regulation at RNA level
- Transcript variants (alternative splicing, alternative poly(A)): RNA binding proteins
- Posttranscriptional regulation (mRNA stability, translational regulation): RNA binding proteins (long non-coding RNAs, small RNAs)

III. RNA-Seq

NGS for RNA study

RNA families (figure)
- Coding: poly(A) mRNA, non-poly(A) mRNA
- Non-coding, structural and regulatory: rRNA, ribosome associated, replisome associated, DNA repair associated, telomeric, DNA methylation (piRNA), microRNA, TSS associated, anti-sense, enhancer RNA, lincRNA

Construction of small RNA library

Identification of small RNAs by NGS

The evolution of transcriptomics
Hybridization-based:
- 1995: P. Brown et al., gene expression profiling using spotted cDNA microarrays: expression levels of known genes
- 2002: Affymetrix, whole-genome expression profiling using tiling arrays: identifying and profiling novel genes and splicing variants
- 2008: many groups, mRNA-seq: direct sequencing of mRNAs using next-generation sequencing (NGS) techniques
RNA-seq is still a technology under active development.

RNA-seq basics
- RNA-seq, a.k.a. Whole Transcriptome Shotgun Sequencing (WTSS)
- High-throughput sequencing of cDNAs (reverse transcriptase, PCR, cDNA library)
- Number of reads from RNA transcripts: direct evidence of transcript variants and gene expression
- Types of libraries available: total RNA sequencing, poly(A)+ RNA sequencing, small RNA sequencing

Overview of RNA-Seq library I
- RNA purification
- mRNA selection
  - Poly(A) selection: 1) RT with oligo(dT) 2) oligo(dT) beads
  - rRNA depletion: beads with rRNA probes
- Fragmentation
- RT (reverse transcriptase)
- Linker ligation
- PCR
- cDNA
Zeng W, Mortazavi A. Technical considerations for functional sequencing assays. Nat Immunol. Sep;13(9):802-7 (2012)

Overview of RNA-Seq library II

RNA-Seq library using random priming: poly(A)-selected (oligo(dT) beads) or rRNA-depleted (beads with rRNA probes) RNA

RNA-Seq analysis Haas BJ, Zody MC. Advancing RNA-Seq analysis. Nat Biotechnol. May;28(5):421-3. (2010)

Problem of mapping RNA-Seq reads

Mapping RNA-Seq reads (TopHat)

Quantification of transcripts: The expression levels of known transcripts (exon model) are measured by the number of reads per kilobase of transcript per million mapped reads (RPKM). Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, 621-628 (2008).

RPKM (expression values): Reads Per Kilobase of exon model per Million mapped reads (FPKM is the analogous measure that counts fragments instead of reads)
$$\mathrm{RPKM} = \frac{10^{9} \cdot C}{N \cdot L}$$
C = the number of reads mapped onto the gene's exons
N = total number of mapped reads in the experiment
L = the summed length of the gene's exons in base pairs
Mortazavi A et al. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008.
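The RPKM definition (10^9 x C / (N x L)) translates directly into code. This is a minimal sketch. The function name `rpkm` and the example numbers are made up for illustration and do not come from the cited paper.

```python
def rpkm(exon_reads, total_mapped_reads, exon_length_bp):
    """Reads per kilobase of exon model per million mapped reads.

    exon_reads          C: reads mapped onto the gene's exons
    total_mapped_reads  N: total mapped reads in the experiment
    exon_length_bp      L: summed exon length of the gene, in base pairs
    """
    if total_mapped_reads <= 0 or exon_length_bp <= 0:
        raise ValueError("N and L must be positive")
    return 1e9 * exon_reads / (total_mapped_reads * exon_length_bp)


# Hypothetical gene: 500 reads on 2.5 kb of exons, 20 million mapped reads.
value = rpkm(exon_reads=500, total_mapped_reads=20_000_000, exon_length_bp=2_500)
print(f"RPKM = {value:.1f}")
```

The factor 10^9 combines the two normalizations: 10^3 for "per kilobase" and 10^6 for "per million reads". This makes values comparable across genes of different length and libraries of different depth.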

RNA-Seq vs. microarray: RNA-Seq vs. tiling arrays for Saccharomyces cerevisiae cells
- Agree for genes with medium levels of expression
- Correlation is very low for genes with either low or high expression levels

Performance of RNA-Seq 1. Reproducibility 2. Dynamic range 3. Accuracy and Sensitivity Nature Methods - 5, 621-628 (2008)

Regulation at RNA level: RNA-Seq
- Transcript variants (alternative splicing, alternative poly(A)): RNA binding proteins
- Posttranscriptional regulation (mRNA stability, translational regulation): RNA binding proteins (long non-coding RNAs, small RNAs)

IV. Ribosome profiling

Overview of ribosomal footprinting
- Freeze ribosomes on the mRNA with cycloheximide
- Digest the mRNP with micrococcal nuclease
- Isolate ribosomes with the protected mRNA fragment by centrifugation

Ribosome profiling analysis
- Genome annotation: AUG, STOP, micro-peptides, non-coding genes
- Measure translational regulation
- Visualize translation genome-wide
Ingolia et al., Science 2009
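One common way to measure translational regulation from such data is translation efficiency: the footprint density of a gene divided by its mRNA density from a matched RNA-Seq library. This is a hedged sketch, assuming both densities are expressed as RPKM. The function names and the counts are illustrative and are not values from the cited study.

```python
def density_rpkm(reads, total_mapped_reads, length_bp):
    """Read density normalized for feature length and library depth (RPKM)."""
    return 1e9 * reads / (total_mapped_reads * length_bp)


def translation_efficiency(fp_reads, fp_total, mrna_reads, mrna_total, cds_length_bp):
    """Ratio of ribosome footprint density to mRNA density over the same CDS."""
    footprint = density_rpkm(fp_reads, fp_total, cds_length_bp)
    mrna = density_rpkm(mrna_reads, mrna_total, cds_length_bp)
    if mrna == 0:
        raise ValueError("no mRNA reads: translation efficiency is undefined")
    return footprint / mrna


# Hypothetical gene with a 1.5 kb coding sequence.
te = translation_efficiency(
    fp_reads=300, fp_total=10_000_000,
    mrna_reads=600, mrna_total=30_000_000,
    cds_length_bp=1_500,
)
print(f"translation efficiency = {te:.2f}")
```

Because the same coding-sequence length appears in both densities, the ratio mainly reflects how heavily each mRNA molecule is loaded with ribosomes, independent of transcript abundance.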

Library of ribosomal footprinting

Ribosome profiling analysis
- Reading frame
- Coding potential or translation efficiency
- Non-coding RNAs
Ingolia et al., Cell 2011
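Reading-frame information shows up as a three-nucleotide periodicity of footprint positions. This is a toy sketch, assuming each footprint is summarized by its 5' end coordinate on the transcript and that the ribosomal P-site sits a fixed offset downstream. The 12 nt offset is an assumption that would need calibrating per dataset, and the positions below are invented.

```python
from collections import Counter


def frame_counts(five_prime_positions, cds_start, p_site_offset=12):
    """Count footprints per reading frame (0, 1, 2) relative to the CDS start.

    five_prime_positions: transcript coordinates of footprint 5' ends
    cds_start:            coordinate of the first nucleotide of the start codon
    p_site_offset:        assumed distance from the 5' end to the P-site
    """
    counts = Counter()
    for pos in five_prime_positions:
        p_site = pos + p_site_offset
        if p_site >= cds_start:  # ignore footprints upstream of the CDS
            counts[(p_site - cds_start) % 3] += 1
    return [counts[frame] for frame in range(3)]


# Invented footprint 5' ends on a transcript whose CDS starts at position 100.
positions = [88, 91, 94, 97, 100, 89, 103, 95]
per_frame = frame_counts(positions, cds_start=100)
print(per_frame)
```

A strong excess in one frame is the kind of signal used to argue that a region is actually being translated, while a flat distribution suggests otherwise.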

Regulation at RNA level: Ribosome profiling
- Transcript variants (alternative splicing, alternative poly(A)): RNA binding proteins
- Posttranscriptional regulation (mRNA stability, translational regulation): RNA binding proteins (long non-coding RNAs, small RNAs)

V. HITS-CLIP (CLIP-Seq): High-Throughput Sequencing of Cross-Linking ImmunoPrecipitation

What regulates RNA? RNA-protein complexes!
- Transcript variants and posttranscriptional regulation: RNP (RNA binding protein)
- Figure labels: Argonaute : miRNA; Nova1 -/- vs. WT; P5; E10.5
- Are RNA-protein complexes critical to functions?
Neuron. 2000 Feb;25(2):359-71; Science 305 (5689): 1437-1441, 2004

RNA-protein complex: critical to functions! What are their RNA targets? A biochemical method is needed to identify RNA-protein interactions in vivo.

HITS (high-throughput sequencing) CLIP (CLIP-Seq)
1. UV cross-linking of protein-RNA complexes (+UV / -UV)
2. RNase digestion
3. IP with stringent conditions
4. Labeling RNA
5. SDS-PAGE (figure markers: 97, 66, 46 kDa)
6. RNA purification
7. RT-PCR
8. High-throughput sequencing
9. Align tags to genome
10. Cluster tags: genome-wide map
Science. 2003, 302:1212-15; Nature. 2008 Nov 27;456(7221):464-9
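The last computational steps, aligning tags and grouping them into clusters, can be sketched as merging overlapping intervals. This is a simplified illustration, assuming tags are already aligned and represented as (chromosome, strand, start, end) with half-open coordinates. The function name, the `min_tags` threshold and the coordinates are hypothetical and do not reproduce the published pipeline.

```python
def cluster_tags(tags, min_tags=2):
    """Merge overlapping aligned tags on the same chromosome and strand.

    tags: iterable of (chrom, strand, start, end), end exclusive.
    Returns clusters supported by at least `min_tags` tags.
    """
    clusters = []
    # Sorting groups tags by chromosome, then strand, then start coordinate.
    for chrom, strand, start, end in sorted(tags):
        last = clusters[-1] if clusters else None
        if (
            last is not None
            and last["chrom"] == chrom
            and last["strand"] == strand
            and start < last["end"]  # overlaps the current cluster
        ):
            last["end"] = max(last["end"], end)
            last["n_tags"] += 1
        else:
            clusters.append(
                {"chrom": chrom, "strand": strand, "start": start, "end": end, "n_tags": 1}
            )
    return [c for c in clusters if c["n_tags"] >= min_tags]


# Invented tags: three overlapping tags, one isolated tag, one on the other strand.
tags = [
    ("chr8", "+", 100, 140),
    ("chr8", "+", 120, 160),
    ("chr8", "+", 150, 190),
    ("chr8", "+", 500, 540),
    ("chr8", "-", 110, 150),
]
for cluster in cluster_tags(tags):
    print(cluster)
```

Requiring several tags per cluster is one simple way to separate reproducible binding sites from isolated background tags. Real analyses typically also weigh reproducibility across replicates.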

Application of HITS-CLIP
- HITS-CLIP can be used for any RNA-binding protein: genome-wide map of in vivo RNA-protein interactions
- How about the Argonaute-miRNA complex? miRNA target sites are difficult to identify

miRNA target recognition
- miRNA (~1500, human); RISC (Ago)
- Seed pairing rule: partial pairing, 6mer seed match
- Target (~1000)
- Function: target gene repression -> biological function, cellular phenotype, disease
- Unable to precisely identify miRNA target sites!
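The 6mer seed match rule can be sketched as a string search: take the seed (miRNA nucleotides 2-7), reverse-complement it, and look for that site in the target RNA. This is a minimal illustration. The miRNA sequence is let-7a as commonly annotated (verify it against a database before relying on it), and the target sequence is entirely made up.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=7):
    """Target-side sequence (5'->3') that pairs with miRNA positions start..end."""
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, target):
    """Return 0-based positions in the target where a 6mer seed match begins."""
    site = seed_site(mirna)
    hits = []
    i = target.find(site)
    while i != -1:
        hits.append(i)
        i = target.find(site, i + 1)
    return hits


mirna = "UGAGGUAGUAGGUUGUAUAGUU"        # let-7a, as commonly annotated
target = "GGCAUACCUCAAGUCCGUACCUCUU"    # hypothetical 3' UTR fragment

print(seed_site(mirna))
print(find_seed_matches(mirna, target))
```

Because a 6-nucleotide motif occurs by chance roughly once every few kilobases, sequence matching alone yields many candidate sites. This is the ambiguity the slide points to when it says target sites cannot be identified precisely.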

Ago HITS-CLIP: developed with Robert Darnell, Rockefeller Univ (2008)
1) Transcriptome-wide mapping of Ago-mRNA bound regions
2) Match Ago-mRNA to Ago-miRNA
Workflow (figure): UV (+UV / -UV) -> RNase digestion -> IP with stringent conditions -> labeling RNA -> SDS-PAGE -> nitrocellulose transfer -> Ago-mRNA and Ago-miRNA -> sequencing
Modified from Patel D, Nature. 2008, 456(7224):921-6.
Chi SW, Zang JB, Mele A, and Darnell RB. Nature. 2009 Jul 23;460(7254):479-86

Ago-miRNA-mRNA ternary map (figure)
- Ago-miRNA and Ago-mRNA tags from two antibodies (Ago antibody 1, Ago antibody 2); labels A-E
- Top 20 miRNAs (~90% of Ago-miRNAs)
- Ago-mRNA clusters (~10,000): Ago footprint, peak, seed

miRNA functions predicted by Ago ternary maps: biological function of miRNA! Chi et al., Nature. 2009 Jul 23;460(7254):479-86

Regulation at RNA level: HITS-CLIP
- Transcript variants (alternative splicing, alternative poly(A)): RNA binding proteins
- Posttranscriptional regulation (mRNA stability, translational regulation): RNA binding proteins (long non-coding RNAs, small RNAs)

VI. Integrative analysis of RNA regulation

RNA-binding proteins and diseases Trends Genet. 2008 Aug;24(8):416-25.

Integration of HITS-CLIP maps: understanding combinatorial RNA regulation (figure tracks on Chr8: Ago, Ptbp2, Elavl, Nova)

Interplay between miRNAs and PTB (Mol Cell, 2007, T. Maniatis) (Cell, 2013, X. Fu: Direct Conversion of Fibroblasts to Neurons by Reprogramming PTB-Regulated MicroRNA Circuits)

Genomics of RNA regulation: a functional link in the post-genomic era
- DNA: genomics (GWAS, CNV, ChIP-Seq); variation / information
- RNA (transcription): transcriptomics (RNA-Seq, HITS-CLIP)
- Translation: ribosome profiling
- Protein: proteomics
- Disease / phenotype

Thank you
Credit: Together Design, London