Application of NGS (next-generation sequencing) for studying RNA regulation. Sung Wook Chi. Sungkyunkwan University (SKKU), Samsung Medical Center (SMC)


1 Application of NGS (next-generation sequencing) for studying RNA regulation. Sung Wook Chi. Samsung Advanced Institute of Health Sciences and Technology (SAIHST), Sungkyunkwan University (SKKU); Samsung Research Institute for Future Medicine, Samsung Medical Center (SMC)

2 Application of NGS for studying RNA regulation
1. Next-generation sequencing: genomics, the Human Genome Project, and NGS
2. Overview of RNA regulation
3. RNA-Seq: sequencing of RNAs by high-throughput methods
4. Ribosome profiling: translation
5. HITS-CLIP: RNA-protein interactions
6. Integrative analysis of RNA regulation

3 I. Next-generation Sequencing

4 Sequencing Technology : History

5 Human Genome Project, June 2000: "Decoding the human genome will lead to new ways to prevent, diagnose, treat, and cure disease. Today's announcement represents the starting point for a new era of genetic medicine." Human genome (~3 billion bp): 13 years, ~3 billion dollars. Nature Feb 15; 409(6822):

6 Human Genome Project (~3 billion bp): 13 years, ~3 billion dollars

7 Chain-termination method (Sanger method)
- DNA is fragmented
- Fragments are cloned into a plasmid vector
- Cycle sequencing reaction
- Separation by electrophoresis
- Readout with fluorescent tags

8 Sequencing Technology : History

9 Next-Generation Sequencing (NGS) (2005~): human genome (~3 billion bp)
- Human Genome Project (Nature Feb 15; 409(6822):): 13 years, ~3 billion dollars
- NGS (Nature 452, (2008)): 2 months, < 1 million dollars

10 Sequencing in 2003 vs. sequencing today: 100 per run vs. >100 million per run

11 Next-Generation Sequencing (NGS): 1000 Genomes Project (2008~): human genetic variation. Forbes, Jun: Personalized Medicine

12 NGS platforms

13 Emulsion PCR (454, SOLiD): Fragments, with adaptors, are PCR amplified within a water drop in oil. One primer is attached to the surface of a bead.

14 Pyrosequencing (454): PPi -> light; example sequence TCAGGTTTTTTAA
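As a rough illustration of the PPi -> light readout: nucleotides are flowed one at a time, and the light signal in each flow scales with the number of identical bases incorporated in a row. The sketch below is only a toy model under stated assumptions: the flow order "TACG", the function name flowgram, and the idealized noise-free integer signal are illustrative choices, not details taken from the slide.

```python
def flowgram(read, flow_order="TACG"):
    """Idealized pyrosequencing flowgram for a read (the synthesized strand).

    Returns a list of (flowed_nucleotide, signal) pairs, where signal is the
    homopolymer length incorporated in that flow (0 means no light).
    flow_order is an assumed cyclic nucleotide flow order.
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {sorted(unknown)}")

    flows = []
    pos = 0          # next base of the read still to be incorporated
    flow_index = 0   # which nucleotide is flowed next
    while pos < len(read):
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        # Every consecutive matching base is incorporated within the same flow.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        flows.append((base, run))
        flow_index += 1
    return flows


if __name__ == "__main__":
    for base, signal in flowgram("TCAGGTTTTTTAA"):
        print(base, signal)
```

In this toy model the run of six T's produces a single flow with signal 6, which is why homopolymer length has to be inferred from light intensity rather than counted directly.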

15 SOLiD (Ligation)

16 Illumina/Solexa, bridge amplification: DNA fragments are flanked with adaptors. A flat surface is coated with two types of primers, corresponding to the adaptors. Amplification proceeds in cycles, with one end of each bridge tethered to the surface.

17 Illumina/Solexa

18 Base calling from image (Illumina)

19 NGS technologies comparison

20 NGS technologies comparison

21 Sequencing Technology : History

22 Ion Torrent PGM / Ion Proton: sequencing-by-synthesis, native DNA pol/dNTPs, pH detection

23 Pacific Biosciences: Single Molecule, Real-Time sequencing (Eid et al. 2008)

24 Oxford Nanopore: electronic and single-molecule
- Nanopore = very small hole
- Electrical current flows through the hole
- Introduce the analyte of interest into the hole
- Identify the analyte by the disruption or block to the electrical current

25 NGS platforms
Illumina (~10 days)
- HiSeq: 600 Gb; 3 billion reads total, 187 million reads/lane; 2 x 100 bp
- Genome Analyzer IIx: 95 Gb; 320 million reads total, 40 million/lane; 2 x 150 bp
- MiSeq: 7 Gb; 15 million reads total; 2 x 250 bp
SOLiD (~8 days)
- SOLiD 5500xl, SOLiD 5500, SOLiD 4
Ion Torrent (1 day)
- Ion Proton: Gb-scale output
- Ion PGM: 1 Gb to 20 Mb
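The Illumina figures on this slide can be sanity-checked with simple arithmetic: output is roughly the number of reads times the bases sequenced per read (two ends for paired-end runs). The sketch below is illustrative only; the function name run_yield_gb and the assumption that the "total" read counts refer to paired-end clusters are not stated on the slide.

```python
def run_yield_gb(n_reads, read_length_bp, paired_end=True):
    """Approximate run output in gigabases (Gb).

    n_reads is taken as the number of clusters (assumption); a paired-end
    run sequences read_length_bp bases from each of the two ends.
    """
    ends = 2 if paired_end else 1
    return n_reads * ends * read_length_bp / 1e9


if __name__ == "__main__":
    print("HiSeq:", run_yield_gb(3_000_000_000, 100), "Gb")
    print("Genome Analyzer IIx:", run_yield_gb(320_000_000, 150), "Gb")
    print("MiSeq:", run_yield_gb(15_000_000, 250), "Gb")
```

Under these assumptions the estimates come out at 600 Gb, 96 Gb, and 7.5 Gb, close to the 600 Gb, 95 Gb, and 7 Gb quoted for the three instruments.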

26 Application of Next-generation Sequencing: DNA (bisulfite sequencing), RNA (RNA-Seq, HITS-CLIP)

27 II. Overview of RNA regulation

28 Controlling mRNA: DNA -> (transcription) -> pre-mRNA -> (processing) -> mRNA -> (translation) -> protein. Control points: 1) initiation, 2) elongation, 3) splicing / poly(A), 4) stability, 5) repression, 6) local. Nucleus: RNA polymerase II, chromatin structure. Cytoplasm: mRNA, ribosome.

29 Why RNA regulation? Human vs. mouse vs. worm (Ensembl, March 2011)
Coding genes (DNA): human …,550; mouse 22,670; worm 20,…
Transcripts (RNA): human …,019; mouse 77,177; worm 29,873
RNA variants, e.g. alternative splicing, alternative polyadenylation. Complexity!
Nature Feb 15; 409(6822): Nature Dec;420, Science Dec;282(5396):p

30 Why RNA regulation? Complexity in regulation. Human genes (Gencode project version 7, Nov 2010): coding genes 40% (20,540); non-coding RNAs 60%: other non-coding genes 29% (15,103), pseudogene 21% (10,870), snRNA 4% (1,944), lincRNA 3% and miRNA 3% (1,351 and 1,756). Regulatory functions! Complexity!

31 Why RNA regulation? Complexity in regulation. Human genes (Gencode project version 7, Nov 2010): non-coding RNAs 60%, coding genes 40%. Regulatory functions! Science Mar 28;319(5871): Complexity!

32 Regulation at RNA level: transcript variants (alternative splicing; alternative poly(A): RNA-binding proteins); post-transcriptional regulation (mRNA stability; translational regulation: RNA-binding proteins, long non-coding RNA, small RNAs)

33 III. RNA-Seq

34 NGS for RNA study

35 RNA families (figure). Labels: coding (poly(A) mRNA, non-poly(A) mRNA); non-coding (structural, regulatory); rRNA, microRNA, piRNA, lincRNA, antisense RNA, enhancer RNA; TSS-associated, ribosome-associated, replisome-associated, DNA repair-associated, telomeric, DNA methylation (piRNA)

36 Construction of small RNA library

37 Identification of small RNAs by NGS

38 The evolution of transcriptomics
- 1995 (hybridization-based): P. Brown et al., gene expression profiling using spotted cDNA microarrays: expression levels of known genes
- 2002 (hybridization-based): Affymetrix, whole-genome expression profiling using tiling arrays: identifying and profiling novel genes and splicing variants
- 2008: many groups, mRNA-Seq: direct sequencing of mRNAs using next-generation sequencing (NGS) techniques
RNA-Seq is still a technology under active development.

39 RNA-Seq basics
- RNA-Seq, also known as Whole Transcriptome Shotgun Sequencing (WTSS)
- High-throughput sequencing of cDNAs: reverse transcriptase, PCR, cDNA library
- Number of reads from RNA transcripts: direct evidence of transcript variants and gene expression
- Types of libraries available: total RNA sequencing, poly(A)+ RNA sequencing, small RNA sequencing

40 Overview of RNA-Seq library I
- RNA purification
- mRNA selection: poly(A) selection (1) RT with oligo(dT), 2) oligo(dT) beads) or rRNA depletion (beads with rRNA probes)
- Fragmentation
- RT (reverse transcriptase)
- Linker ligation
- PCR
- cDNA
Zeng W, Mortazavi A. Technical considerations for functional sequencing assays. Nat Immunol. Sep;13(9):802-7 (2012)

41 Overview of RNA-Seq library II

42 RNA-Seq library using random priming: poly(A) selected (oligo(dT) beads) or rRNA depleted (beads with rRNA probes)

43 RNA-Seq analysis. Haas BJ, Zody MC. Advancing RNA-Seq analysis. Nat Biotechnol. May;28(5): (2010)

44 Problem of mapping RNA-Seq reads

45 Mapping RNA-Seq reads (TopHat)

46 Quantification of transcripts: The expression levels of known transcripts (exon model) are measured by the number of reads per kilobase of transcript per million mapped reads (RPKM). Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods 5, (2008).

47 RPKM (expression values): Reads Per Kilobase of exon model per Million mapped reads (FPKM when fragments are counted instead of reads). Mortazavi A et al. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008.
$\mathrm{RPKM} = \dfrac{10^{9}\, C}{N \cdot L}$
C = the number of reads mapped onto the gene's exons; N = total number of mapped reads in the experiment; L = the sum of the exons in base pairs.
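A minimal sketch of the RPKM calculation, using the slide's symbols C, N, and L. The function name rpkm and the example numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def rpkm(c_reads_on_exons, n_total_mapped_reads, l_exon_length_bp):
    """RPKM = 10^9 * C / (N * L).

    c_reads_on_exons:      C, reads mapped onto the gene's exons
    n_total_mapped_reads:  N, total mapped reads in the experiment
    l_exon_length_bp:      L, summed exon length in base pairs
    """
    if n_total_mapped_reads <= 0 or l_exon_length_bp <= 0:
        raise ValueError("N and L must be positive")
    return 1e9 * c_reads_on_exons / (n_total_mapped_reads * l_exon_length_bp)


if __name__ == "__main__":
    # Hypothetical gene: 1,000 reads on 2 kb of exons in a 20-million-read run.
    print(rpkm(1_000, 20_000_000, 2_000))
```

The factor 10^9 combines the two normalizations: 10^3 for "per kilobase" and 10^6 for "per million reads". In the hypothetical example, 1,000 reads over 2 kb is 500 reads per kilobase, and dividing by 20 (million reads) gives 25.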

48 RNA-Seq vs. microarray: RNA-Seq vs. tiling arrays for Saccharomyces cerevisiae cells. The two agree for genes with medium levels of expression, but the correlation is very low for genes with either low or high expression levels.

49 Performance of RNA-Seq: 1. Reproducibility 2. Dynamic range 3. Accuracy and sensitivity. Nature Methods 5, (2008)

50

51 Regulation at RNA level (RNA-Seq): transcript variants (alternative splicing; alternative poly(A): RNA-binding proteins); post-transcriptional regulation (mRNA stability; translational regulation: RNA-binding proteins, long non-coding RNA, small RNAs)

52 IV. Ribosome profiling

53 Overview of ribosomal footprinting
- Freeze ribosomes on the mRNA with cycloheximide
- Digest the mRNP with micrococcal nuclease
- Isolate ribosomes with the protected mRNA fragment by centrifugation

54 Ribosome profiling analysis (Ingolia et al., Science 2009)
- Genome annotation: AUG, STOP, micro-peptides, non-coding genes
- Measure translational regulation
- Visualize translation genome-wide

55 Library of ribosomal footprinting

56 Ribosome profiling analysis (Ingolia et al., Cell 2011)
- Reading frame
- Coding potential or translation efficiency
- Non-coding RNAs
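Two of the quantities listed above can be sketched in a few lines. Translation efficiency is commonly summarized as ribosome footprint density divided by mRNA density, and reading frame shows up as a 3-nucleotide periodicity of footprint positions relative to the start codon. Everything below is an assumption-laden illustration: the function names, the use of RPKM-like densities as inputs, and the 12-nt offset from the footprint 5' end to the ribosomal P site are illustrative choices, not values given on the slide.

```python
from collections import Counter


def translation_efficiency(footprint_density, mrna_density):
    """Ratio of ribosome footprint density to mRNA-Seq density for one gene."""
    if mrna_density <= 0:
        raise ValueError("mRNA density must be positive")
    return footprint_density / mrna_density


def frame_counts(footprint_starts, cds_start, p_site_offset=12):
    """Count footprints falling in reading frames 0, 1, and 2.

    footprint_starts: 5' end coordinates of footprints on the transcript
    cds_start:        coordinate of the first base of the start codon
    p_site_offset:    assumed distance from the 5' end to the P site
    """
    counts = Counter(
        (start + p_site_offset - cds_start) % 3 for start in footprint_starts
    )
    return [counts.get(frame, 0) for frame in range(3)]


if __name__ == "__main__":
    print(translation_efficiency(50.0, 25.0))
    print(frame_counts([88, 91, 94, 95], cds_start=100))
```

A strong excess of footprints in one frame suggests active translation of that frame, which is one way footprint data can hint at coding potential.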

57 Regulation at RNA level (ribosome profiling): transcript variants (alternative splicing; alternative poly(A): RNA-binding proteins); post-transcriptional regulation (mRNA stability; translational regulation: RNA-binding proteins, long non-coding RNA, small RNAs)

58 V. HITS-CLIP (CLIP-Seq): HIgh-Throughput Sequencing of Cross-Linking ImmunoPrecipitation

59 What regulates RNA? RNA-protein complexes (RNP, RNA-binding proteins) act on transcript variants and post-transcriptional regulation. Examples: Argonaute: miRNA; Nova1. Figure panels: Nova1 -/- vs. WT, P5; E10.5. Are they critical to functions? Neuron Feb;25(2): Science 305 (5689): , 2004

60 RNA-protein complex: critical to functions! What are their RNA targets? We need a biochemical method to identify RNA-protein interactions in vivo.

61 HITS (high-throughput sequencing) CLIP (CLIP-Seq). Science. 2003, 302: Nature Nov 27;456(7221):464-9
- UV cross-linking of the protein-RNA complex
- RNase digestion
- IP with stringent conditions
- RNA labeling
- SDS-PAGE (+UV vs. -UV)
- RNA purification
- RT-PCR
- High-throughput sequencing
- Align tags to the genome
- Clusters: genome-wide map
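The last two steps (aligning tags, then grouping them into clusters) can be sketched as merging overlapping aligned tags and keeping groups with enough support. This is a simplified illustration, not the published analysis pipeline: the tuple layout, the function name cluster_tags, the min_tags threshold, and ignoring strand are all assumptions made for brevity.

```python
def cluster_tags(tags, min_tags=2):
    """Merge overlapping aligned tags into clusters.

    tags: iterable of (chrom, start, end) with half-open coordinates.
    Returns a list of (chrom, start, end, n_tags) for clusters that contain
    at least min_tags tags. Strand is ignored in this sketch.
    """
    clusters = []
    for chrom, start, end in sorted(tags):
        last = clusters[-1] if clusters else None
        if last and last[0] == chrom and start < last[2]:
            # Tag overlaps the current cluster: extend it and count the tag.
            clusters[-1] = (chrom, last[1], max(last[2], end), last[3] + 1)
        else:
            clusters.append((chrom, start, end, 1))
    return [c for c in clusters if c[3] >= min_tags]


if __name__ == "__main__":
    example_tags = [
        ("chr8", 100, 140), ("chr8", 120, 160), ("chr8", 150, 190),
        ("chr8", 500, 540),
        ("chr1", 10, 50), ("chr1", 30, 70),
    ]
    for cluster in cluster_tags(example_tags):
        print(cluster)
```

Requiring several overlapping tags per cluster is one simple way to separate reproducible binding sites from isolated background tags.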

62 Application of HITS-CLIP
- HITS-CLIP can be used for any RNA-binding protein (RNABP): genome-wide map of in vivo RNA-protein interactions
- How about the Argonaute-miRNA complex? It is difficult to identify miRNA target sites

63 miRNA target recognition: miRNA (~1500, human) in RISC (Ago) recognizes targets (~1000) by partial pairing (seed pairing rule: 6mer seed match). Function: target gene repression, leading to biological function, cellular phenotype, and disease. Unable to precisely identify miRNA target sites!
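The seed pairing rule can be illustrated with a short search for 6mer seed matches: take miRNA positions 2 to 7, reverse-complement them, and look for that 6mer in the target RNA. This is only a sketch; the function names, the example miRNA sequence, and the toy target are illustrative inputs, and real target prediction has to deal with the many false positives that such short matches produce.

```python
# Watson-Crick complement for RNA (G:U wobble pairs are not considered).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match_site(mirna, seed_start=2, seed_end=7):
    """Return the target-side 6mer complementary to miRNA positions 2-7.

    Positions are 1-based from the miRNA 5' end; the site is written 5'->3'.
    """
    seed = mirna[seed_start - 1:seed_end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, target):
    """0-based start positions of all 6mer seed matches in the target RNA."""
    site = seed_match_site(mirna)
    positions = []
    index = target.find(site)
    while index != -1:
        positions.append(index)
        index = target.find(site, index + 1)
    return positions


if __name__ == "__main__":
    example_mirna = "UAAGGCACGCGGUGAAUGCC"   # illustrative input sequence
    toy_target = "AAAUGCCUUGGGUGCCUUA"       # made-up target fragment
    print(seed_match_site(example_mirna))
    print(find_seed_matches(example_mirna, toy_target))
```

Because any given 6mer occurs by chance roughly once every 4 kb of sequence, seed matching alone cannot pinpoint functional sites, which is the limitation the slide highlights.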

64 Ago HITS-CLIP (miRNA, mRNA), developed with Robert Darnell, Rockefeller Univ. (2008): 1) transcriptome-wide mapping of Ago-mRNA bound regions; 2) match Ago-mRNA to Ago-miRNA. Workflow: UV cross-linking, RNase digestion, IP with stringent conditions, RNA labeling, SDS-PAGE (-UV vs. +UV), nitrocellulose transfer, Ago-mRNA and Ago-miRNA, sequencing. Modified from Patel D, Nature. 2008, 456(7224): Chi SW, Zang JB, Mele A, and Darnell RB. Nature Jul 23;460(7254):479-86

65 Ago-miRNA-mRNA ternary map (figure): Ago-miRNA and Ago-mRNA from Ago antibody 1 and Ago antibody 2 (A, B, C, D, E); Ago footprint; seeds of the top 20 miRNAs (~90% of Ago-miRNAs); peaks in Ago-mRNA clusters (~10,000)

66 miRNA functions predicted by Ago ternary maps. Chi et al., Nature Jul 23;460(7254): Biological function of miRNA!

67 Regulation at RNA level (HITS-CLIP): transcript variants (alternative splicing; alternative poly(A): RNA-binding proteins); post-transcriptional regulation (mRNA stability; translational regulation: RNA-binding proteins, long non-coding RNA, small RNAs)

68 VI. Integrative analysis of RNA regulation

69 RNA-binding proteins and diseases Trends Genet Aug;24(8):

70 Integration of HITS-CLIP maps: understanding combinatorial RNA regulation (tracks on Chr8: Ago, Ptbp2, Elavl, Nova)

71 Interplay between miRNAs and PTB (Mol Cell, 2007, T. Maniatis) (Cell, 2013, X. Fu: Direct Conversion of Fibroblasts to Neurons by Reprogramming PTB-Regulated MicroRNA Circuits)

72 Genomics of RNA regulation: a functional link in the post-genomic era. DNA (genomics: GWAS, CNV, ChIP-Seq) -> transcription -> RNA (transcriptomics: RNA-Seq, HITS-CLIP) -> translation (ribosome profiling) -> protein (proteomics). From variation / information to disease / phenotype.

73 Thank you (image credit: Together Design, London)