RNA-Seq analysis workshop. Zhangjun Fei

1 RNA-Seq analysis workshop Zhangjun Fei

2 Outline
- Background of RNA-Seq
- Applications of RNA-Seq (what can RNA-Seq do?)
- Available sequencing platforms and strategies, and which one to choose
- RNA-Seq data analysis
  - Read processing and quality assessment
  - De novo assembly
  - Alignment to reference genome/transcriptome
  - Differentially expressed gene identification

3 Milestones of transcriptome analysis
Year  Milestone
1965  Sequence of the first RNA molecule determined
1977  Development of the Northern blot technique and the Sanger sequencing method
1989  Reports of RT-PCR experiments for transcriptome analysis
1991  First high-throughput EST sequencing study
1992  Introduction of Differential Display for the discovery of differentially expressed genes
1995  Reports of the microarray and Serial Analysis of Gene Expression (SAGE) methods
1996  Suppression subtractive hybridization reported
2005  First next-generation sequencing technology (Roche/454) introduced to the market
2006  First transcriptome sequencing studies using a next-generation technology (Roche/454)

4 New sequencing technologies
- Next generation sequencing: Illumina (HiSeq, NovaSeq), Roche/454, Ion Torrent (Ion Proton), ABI/SOLiD, Helicos
- Third generation sequencing: Pacific Biosciences, Oxford Nanopore, Complete Genomics
- Desktop sequencers: Ion Torrent PGM, Illumina MiSeq, 454 GS Junior

5 RNA-Seq applications

6 RNA-Seq applications
- Accelerating gene discovery and gene family expansion
- Improving genome annotation: identifying novel genes and gene models
- Identifying tissue/condition specific alternative splicing events

7 RNA-Seq applications: alternative splicing. Short reads can't provide the complete structure of an isoform

8 RNA-Seq applications: PacBio long reads (Sequel System)

9 RNA-Seq applications: PacBio long reads, error correction

10 RNA-Seq applications: each sample needs four libraries with different insert sizes: 1-2 kb, 2-3 kb, 3-5 kb and >5 kb

11 RNA-Seq applications

12 RNA-Seq applications: SNP and SSR marker identification, facilitating breeding
SNP discovery in RNA-Seq is more challenging than in DNA:
- Varying levels of coverage depth
- False discovery around splicing junctions due to incorrect mapping

13 RNA-Seq applications: phylogenetic relationship, population structure. Xu et al. (2017) Draft genome of spinach Spinacia oleracea and transcriptome diversity of 120 Spinacia accessions. Nature Communications

14 RNA-Seq applications: selective sweep. Xu et al. (2017) Draft genome of spinach Spinacia oleracea and transcriptome diversity of 120 Spinacia accessions. Nature Communications

15 RNA-Seq applications: expression QTL (eQTL) network. A melon RIL population (Nurit Katzir, unpublished)

16 RNA-Seq applications: mutant gene cloning (BSA RNA-Seq)
[Figure: white fruit x yellow fruit cross through F1, F2 and F3; white pool and yellow pool analyzed by RNA-Seq for SNPs and DE genes; 132 of 189 SNPs in this region]
Feder et al. (2015) A Kelch domain-containing F-box coding gene negatively regulates flavonoid accumulation in Cucumis melo L. Plant Physiol 169:

17 RNA-Seq applications: GWAS. [Figure: distribution of mapped markers associated with the erucic acid trait]

18 RNA-Seq applications: genomic imprinting and allele specific expression

19 RNA-Seq applications: regulatory mode of gene expression in F1 hybrids
Provided by Nabil Elrouby
- Cross: C. maxima (Rimu) x C. moschata (Rifu), giving the interspecific hybrid Shintosa
- Tissues: fruit, root, leaf, stem
- 62-80% trans, 13-24% cis

20 RNA-Seq applications
Genes exhibiting dominant and transgressive expression patterns in Shintosa are enriched with those involved in defense response, response to heat, carotenoid biosynthesis and photosynthesis
[Figure: expression in Cma, F1 and Cmo across root, fruit, leaf and stem for four GO categories: response to heat, carotenoid biosynthesis, defense response, photosynthesis]

21 RNA-Seq applications: non-coding RNAs (lncRNAs, lincRNAs)

22 RNA-Seq applications: gene fusion

23 RNA-Seq applications: gene expression profiling

24 RNA-Seq vs microarray
Problems of microarrays:
- Cross-hybridization
- Stable probe secondary structures
- High background (e.g., nonspecific hybridization)
- Limited dynamic range (e.g., nonlinear and saturable hybridization kinetics)
RNA-Seq (digital expression analysis):
- Allows direct enumeration of transcript molecules
- Digital expression data are absolute, so data can be directly compared across different experiments and laboratories without the need for extensive internal controls or other experimental manipulation
- Provides an open system that allows detection of previously uncharacterized transcripts, as well as rare transcripts

25 RNA-Seq applications: summary
- Accelerating gene discovery and gene family expansion
- Improving genome annotation: identifying novel genes and gene models
- Identifying tissue/condition specific alternative splicing events
- SNP and SSR marker identification
- Phylogenetic relationship, population structure, selective sweep
- Expression QTL analysis
- Mutant gene cloning (BSA RNA-Seq)
- Genome (transcriptome)-wide association study
- Genomic imprinting and allele specific expression analysis
- Identifying non-coding RNAs (lncRNAs, lincRNAs)
- Identifying gene fusion events
- Gene expression profiling analysis

26 Sequencing platforms and strategies

27 Sequencing platforms
- Next generation sequencing: Illumina (HiSeq, NovaSeq), Ion Torrent (Ion Proton), ABI/SOLiD, Roche/454, Helicos
- Third generation sequencing: Pacific Biosciences, Oxford Nanopore, Complete Genomics
- Desktop sequencers: Ion Torrent PGM, Illumina MiSeq, Illumina NextSeq, 454 GS Junior

28 Sequencing platforms
Illumina HiSeq 2000/2500
High-output mode ( M reads/ read pairs per lane)
- Single-end: 50, 100 bp
- Paired-end: 2 x 125 bp
- Run time: 2-11 days
Rapid run mode ( M reads/ read pairs per lane)
- Single-end: 50, 100, 150 bp
- Paired-end: 2 x 100 bp, 2 x 150 bp, 2 x 200 bp, 2 x 250 bp
- Run time: 7-40 hours
Illumina MiSeq
- 50 bp sequencing kit
- 150 bp sequencing kit (e.g. 2 x 75 bp)
- 300 bp sequencing kit (e.g. 2 x 150 bp)
- 500 bp sequencing kit (e.g. 2 x 250 bp)
- 600 bp sequencing kit (e.g. 2 x 300 bp)
- Run time: 5-65 hours

29 Sequencing platforms
Single-end or paired-end:
- For gene expression analysis with a reference genome, single-end is enough
- For de novo assembly, genome annotation and alternative splicing identification, it's better to use paired-end
Strand-specific or non strand-specific:
- Always choose strand-specific RNA-Seq if possible

30 Strand-specific RNA sequencing
- More accurately determines the expression level
- Significantly reduces false positives in identifying alternatively spliced transcripts
- Identifies antisense transcripts: another level of gene regulation in important biological processes
- Determines the transcribed strand of non-coding RNAs (e.g. lincRNAs)

31 Strand-specific RNA-Seq library construction

32 High throughput ssRNA-Seq: up to 96 libraries in two days; paired-end compatible multiplexing

33 Strand specific RNA sequencing: strand-specific sequencing can produce more accurate digital gene expression data than conventional Illumina RNA-Seq.

34 Strand specific RNA sequencing

35 Strand specific RNA sequencing: antisense transcripts
cis-natural antisense transcripts (cis-NATs):
- 1,340 cis-NAT pairs in Arabidopsis (Wang et al., 2005)
- 687 cis-NAT pairs in rice (Osato et al., 2003)
trans-natural antisense transcripts (trans-NATs):
- 1,320 trans-NAT pairs in Arabidopsis (Wang et al., 2006)
Functions: alternative splicing, RNA editing, DNA methylation, genomic imprinting, X-chromosome inactivation

36 Strand specific RNA sequencing: antisense transcript. [Figure: LEFL2040O, reads vs 259 reads; LEFL2002DC, reads vs 1189 reads]

37 Strand specific RNA sequencing: lincRNA (determine the sense strand)

38 RNA-Seq strategies: sequencing depth and number of biological replicates
Most frequently asked questions: "How many samples should I multiplex in one lane?" or "How many reads should I generate for each of my samples?"
- Depends on $$$
- Depends on the quality of the library and the reads (rRNA, tRNA, organelle and adaptor contamination)
Number of biological replicates for expression calls: at least three
Effects of read numbers on expression calls:
- Mature green fruit library (22M reads)
- Randomly select , 1-22M reads from the library and calculate gene expression for each dataset (20 different randomizations)
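The subsampling experiment on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: reads are taken to be already assigned to genes (a plain list with one gene ID per read), raw counts stand in for the expression measure, and the names `subsample_counts`, `subsampling_correlations` and the toy library are hypothetical, not taken from the workshop.

```python
import random
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def subsample_counts(read_genes, n_reads, rng):
    """Draw n_reads reads without replacement and count reads per gene."""
    return Counter(rng.sample(read_genes, n_reads))


def subsampling_correlations(read_genes, n_reads, n_rand=20, seed=0):
    """Correlate per-gene counts of each random subset with the full library."""
    rng = random.Random(seed)
    full = Counter(read_genes)
    genes = sorted(full)
    full_vec = [full[g] for g in genes]
    rs = []
    for _ in range(n_rand):
        sub = subsample_counts(read_genes, n_reads, rng)
        rs.append(pearson(full_vec, [sub.get(g, 0) for g in genes]))
    return rs


# Toy library (assumption): four genes with very different abundance.
library = ["geneA"] * 5000 + ["geneB"] * 500 + ["geneC"] * 50 + ["geneD"] * 5
for depth in (100, 1000, 5000):
    rs = subsampling_correlations(library, depth)
    print(depth, round(sum(rs) / len(rs), 4))
```

With real data the same loop would be run on normalized expression values for every gene, and the correlation would typically be inspected separately for lowly and highly expressed genes, since low-abundance transcripts are the first to become unreliable at shallow depth.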

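39 RNA-Seq (multiplexing). [Figure: correlation (r) of gene expression between randomly selected subsets (0.1M, 1M, 2M, 5M, 10M reads) and the mature green fruit library (22M reads); r values not recoverable]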

40 RNA-Seq (multiplexing)

41 RNA-Seq (multiplexing)

42 Common problems in RNA-Seq experimental design
- No bioinformatics expert is involved in the experimental design. Flaws in the design can cause serious problems for downstream data analysis.
- No biological replicates. Currently most journals require at least three biological replicates.
- Biological replicates collected at different times or in different places.
- For biotic/abiotic stress experiments, no mock control: all treatments are compared to non-treated samples (time 0). The comparison then also picks up circadian clock genes and genes differentially expressed due to different environmental factors and developmental stages.
- Direct comparison of genotypes with totally different genetic backgrounds. Genes may be differentially expressed due to other phenotypes, not the one of interest.

43 RNA-Seq data analysis

44 Read processing: read quality control (FastQC)

45 Read processing: read quality control (FastQC)

46 Read processing: read quality control (FastQC)

47 Read processing
Remove adaptors and all possible contaminants: rRNA, tRNA, organelle (chloroplast and mitochondrion) RNAs, viruses, low quality sequences
[Figure: Arabidopsis 25S ribosomal RNA vs GenBank nr protein database]

48 Read processing
Remove contaminated sequences:
- Align reads to an rRNA and organelle sequence database (bowtie or BWA)
- Contaminating reads affect RPKM values if not removed
Trim adaptor and low quality sequences: FASTX-Toolkit, AdapterRemoval, Trimmomatic, Cutadapt, ConDeTri, ERNE-filter, PRINSEQ, SolexaQA-bwa, Sickle
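To make the trimming step concrete, here is a minimal sketch of 3' adaptor removal followed by 3' quality trimming. It is not how any of the tools listed above is implemented: it uses exact string matching only (real trimmers tolerate mismatches), it assumes Phred+33 quality encoding, and the function names, thresholds and example adaptor sequence are illustrative assumptions.

```python
def trim_adapter(seq, qual, adapter, min_overlap=3):
    """Remove a 3' adapter: a full occurrence, or a partial prefix at the read end."""
    pos = seq.find(adapter)
    if pos == -1:
        # Look for an adapter prefix overlapping the 3' end of the read.
        for k in range(min(len(adapter) - 1, len(seq)), min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def trim_low_quality(seq, qual, min_q=20, offset=33):
    """Trim bases from the 3' end while their Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def clean_read(seq, qual, adapter, min_len=25):
    """Adapter-trim, quality-trim, then drop reads that become too short."""
    seq, qual = trim_adapter(seq, qual, adapter)
    seq, qual = trim_low_quality(seq, qual)
    return (seq, qual) if len(seq) >= min_len else None


# Example adapter sequence and read (assumptions for illustration only).
ADAPTER = "AGATCGGAAGAGC"
read = "ACGTACGTACGTACGTACGTACGTACGTACGT" + ADAPTER
print(clean_read(read, "I" * len(read), ADAPTER))
```

Removing rRNA and organelle reads is a separate step done by alignment, as the slide notes, and is not covered by this sketch.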

49 Read processing

50 RNA-Seq data analysis: de novo transcriptome assembly
- Long reads (454/Sanger): overlap-layout-consensus strategy
- Short reads (Illumina): de Bruijn graph approach
(Martin & Wang, 2011)
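The de Bruijn graph idea used for short reads can be illustrated with a toy example. This is a deliberately simplified sketch, not the algorithm of any particular assembler: it ignores sequencing errors, reverse complements, coverage and branching caused by isoforms or paralogs, and the function names and reads are made up for illustration.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk_unique_path(graph, start):
    """Extend a contig from start while each node has exactly one distinct successor."""
    contig, node, seen = start, start, {start}
    while True:
        nexts = set(graph.get(node, ()))
        if len(nexts) != 1:
            break  # dead end or branch point
        node = nexts.pop()
        if node in seen:
            break  # avoid looping forever on a cycle
        seen.add(node)
        contig += node[-1]
    return contig


# Three overlapping toy reads from one hypothetical transcript.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]
graph = de_bruijn(reads, k=4)
print(walk_unique_path(graph, "ATG"))  # ATGGCGTGCAA
```

In a transcriptome, alternative splicing and paralogs create branch points in this graph, which is one reason short-read assemblers report many more transcripts than there are genes.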

51 De novo transcriptome assembly: long reads (454/Sanger)
Programs: CAP3, TGICL/CAP3, MIRA, Newbler (-cdna), Phrap
Two major problems in existing EST assembly programs and unigene databases:
1) A large portion of different transcripts (mainly alternatively spliced transcripts and paralogs) are incorrectly assembled into the same transcript: type I error (false positives)
2) A large portion of nearly identical sequences are not assembled into one transcript: type II error (false negatives)

52 Example of type I assembly error (paralog)
- In DFCI Tomato Gene Index, AW is a member of TC
- Sequence identity between AW and TC232370: 91.5%
- AW is aligned to tomato chromosome 4
- TC is aligned to tomato chromosome 11

53 Example of type I assembly error (alternative splicing): in DFCI Tomato Gene Index, U95008 is a member of TC226520

54 Example of type II assembly error: in DFCI Tomato Gene Index, two unigenes, TC and TC221582, are identical

55 iAssembler
- Iterative assemblies (assembly of assemblies) using MIRA and CAP3 (four cycles of MIRA followed by one cycle of CAP3) reduce errors where nearly identical sequences are not assembled
- Further assembly error identification:
  1) comparing unigene sequences against themselves to identify nearly identical sequences (type II errors)
  2) aligning EST sequences to their corresponding unigene sequences to identify mis-assembled ESTs (type I errors)
- Both type I and II assembly errors are corrected automatically by the program
- Unigene base errors are then corrected based on the resulting SAM files

56 De novo transcriptome assembly: short reads (Illumina). Trinity, Trans-ABySS, Oases/Velvet, SOAPdenovo-Trans

57 De novo transcriptome assembly: reference-guided de novo assembly. Cufflinks, IsoLasso, Scripture, Traph, StringTie

58 De novo transcriptome assembly Trinity

59 De novo transcriptome assembly: post processing of de novo assemblies
- Remove contaminations (bacteria, virus, fungus)
- Remove assembly errors (mainly redundancy)
- Remove errors caused by library preparation (incomplete digestion of the dUTP-containing 2nd strand during strand-specific RNA-Seq library construction)

60 De novo transcriptome assembly: remove contamination (blastx, blastn)

61 De novo transcriptome assembly: remove contamination (DeconSeq, SeqClean)

62 De novo transcriptome assembly: remove type II assembly error (redundancy) with iAssembler

63 De novo transcriptome assembly: remove transcripts derived from incomplete 2nd strand digestion
[Table: Gene ID, length, antisense, sense; entries UN and comp38294_c0_seq; removed. Values not recoverable]

64 De novo transcriptome assembly: high number of assembled transcripts
- Alternative splicing
- Non-coding RNAs
- Incomplete coverage of full length transcripts
- DFCI gene index

65 RNA-Seq data analysis: alignment
- Align reads to a reference genome: TopHat, HISAT, STAR
- Align reads to a reference transcriptome: bowtie, BWA
If you have a reference genome, it's not a good idea to align the reads to the predicted CDS or cDNA, because UTRs and alternative splicing are incompletely predicted

66 RNA-Seq data analysis: visualization tools. Integrative Genomics Viewer (IGV)

67 RNA-Seq data analysis: read counting and normalization
Read counting: htseq-count, samtools (samtools view -c)
Normalization:
- RPKM: reads per kilobase of exon model per million mapped reads
- FPKM: fragments per kilobase of exon model per million mapped reads
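The RPKM and FPKM definitions translate directly into a formula: count x 10^9 / (exon model length in bp x total mapped reads). A small worked sketch follows; the function names and all numbers are hypothetical examples, chosen only to show the arithmetic and why unremoved contaminating reads change the values.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def fpkm(fragment_count, exon_length_bp, total_mapped_fragments):
    """Same formula, counting fragments (a read pair counts once)."""
    return rpkm(fragment_count, exon_length_bp, total_mapped_fragments)


# Hypothetical gene: 500 reads on a 2,000 bp exon model, 20 million mapped reads.
print(rpkm(500, 2000, 20_000_000))  # 12.5
# Same gene if 2 million contaminating rRNA reads are removed before counting.
print(round(rpkm(500, 2000, 18_000_000), 2))  # 13.89
```

The second call shows the effect mentioned under read processing: the denominator includes every mapped read, so contamination left in the library lowers the RPKM of all genes.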

68 RNA-Seq data analysis: quality control of biological replicates with a sample correlation matrix
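A sample correlation matrix can be computed as sketched below. The choices here are assumptions rather than the workshop's method: Pearson correlation on log2(x + 1) transformed values, with hypothetical sample names and expression values.

```python
from math import log2, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def correlation_matrix(expr):
    """expr maps sample name -> list of expression values (same gene order)."""
    logged = {s: [log2(v + 1) for v in vals] for s, vals in expr.items()}
    names = list(logged)
    return names, [[pearson(logged[a], logged[b]) for b in names] for a in names]


# Hypothetical expression values for four genes in three replicates.
expr = {
    "rep1": [10.0, 200.0, 0.5, 35.0],
    "rep2": [12.0, 180.0, 0.7, 40.0],
    "rep3": [150.0, 3.0, 60.0, 1.0],  # behaves like an outlier
}
names, matrix = correlation_matrix(expr)
for name, row in zip(names, matrix):
    print(name, [round(r, 3) for r in row])
```

A replicate that correlates poorly with the others from the same condition, like `rep3` in this toy input, is a candidate for closer inspection before differential expression analysis.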

69 RNA-Seq data analysis: differentially expressed gene detection
Pair-wise comparison: DESeq, edgeR
Time course data:
- edgeR
- First, data transformation using the getVarianceStabilizedData function in DESeq (to get a normal distribution); then DE gene identification using F tests in limma
Multiple test correction: False Discovery Rate (FDR), q value
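The multiple test correction step can be illustrated with the Benjamini-Hochberg procedure, one common way to control the FDR. The slide does not say which FDR method is meant, so this is an assumption, and the p-values are invented for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
qvals = benjamini_hochberg(pvals)
for p, q in zip(pvals, qvals):
    print(p, round(q, 5), "DE" if q < 0.05 else "not DE")
```

In this toy input the genes with raw p-values 0.039 and 0.041 would pass an uncorrected 0.05 cutoff but not the corrected one, which is the point of adjusting for the thousands of genes tested in an RNA-Seq experiment.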

70 RNA-Seq data analysis Differentially expressed gene detection

Zhangjun Fei, Boyce Thompson Institute for Plant Research; USDA Robert W. Holley Center for Agriculture and Health; Cornell University


More information

Experimental Design. Dr. Matthew L. Settles. Genome Center University of California, Davis

Experimental Design. Dr. Matthew L. Settles. Genome Center University of California, Davis Experimental Design Dr. Matthew L. Settles Genome Center University of California, Davis settles@ucdavis.edu What is Differential Expression Differential expression analysis means taking normalized sequencing

More information

Advanced RNA-Seq course. Introduction. Peter-Bram t Hoen

Advanced RNA-Seq course. Introduction. Peter-Bram t Hoen Advanced RNA-Seq course Introduction Peter-Bram t Hoen Expression profiling DNA mrna protein Comprehensive RNA profiling possible: determine the abundance of all mrna molecules in a cell / tissue Expression

More information

Analysis of RNA-seq Data. Feb 8, 2017 Peikai CHEN (PHD)

Analysis of RNA-seq Data. Feb 8, 2017 Peikai CHEN (PHD) Analysis of RNA-seq Data Feb 8, 2017 Peikai CHEN (PHD) Outline What is RNA-seq? What can RNA-seq do? How is RNA-seq measured? How to process RNA-seq data: the basics How to visualize and diagnose your

More information

Research school methods seminar Genomics and Transcriptomics

Research school methods seminar Genomics and Transcriptomics Research school methods seminar Genomics and Transcriptomics Stephan Klee 19.11.2014 2 3 4 5 Genetics, Genomics what are we talking about? Genetics and Genomics Study of genes Role of genes in inheritence

More information

Form for publishing your article on BiotechArticles.com this document to

Form for publishing your article on BiotechArticles.com  this document to Your Article: Article Title (3 to 12 words) Article Summary (In short - What is your article about Just 2 or 3 lines) Category Transcriptomics sequencing and lncrna Sequencing Analysis: Quality Evaluation

More information

Next Gen Sequencing. Expansion of sequencing technology. Contents

Next Gen Sequencing. Expansion of sequencing technology. Contents Next Gen Sequencing Contents 1 Expansion of sequencing technology 2 The Next Generation of Sequencing: High-Throughput Technologies 3 High Throughput Sequencing Applied to Genome Sequencing (TEDed CC BY-NC-ND

More information

An introduction to RNA-seq. Nicole Cloonan - 4 th July 2018 #UQWinterSchool #Bioinformatics #GroupTherapy

An introduction to RNA-seq. Nicole Cloonan - 4 th July 2018 #UQWinterSchool #Bioinformatics #GroupTherapy An introduction to RNA-seq Nicole Cloonan - 4 th July 2018 #UQWinterSchool #Bioinformatics #GroupTherapy The central dogma Genome = all DNA in an organism (genotype) Transcriptome = all RNA (molecular

More information

Overview of Next Generation Sequencing technologies. Céline Keime

Overview of Next Generation Sequencing technologies. Céline Keime Overview of Next Generation Sequencing technologies Céline Keime keime@igbmc.fr Next Generation Sequencing < Second generation sequencing < General principle < Sequencing by synthesis - Illumina < Sequencing

More information

RNA Seq: Methods and Applica6ons. Prat Thiru

RNA Seq: Methods and Applica6ons. Prat Thiru RNA Seq: Methods and Applica6ons Prat Thiru 1 Outline Intro to RNA Seq Biological Ques6ons Comparison with Other Methods RNA Seq Protocol RNA Seq Applica6ons Annota6on Quan6fica6on Other Applica6ons Expression

More information

RNA-Seq Module 2 From QC to differential gene expression.

RNA-Seq Module 2 From QC to differential gene expression. RNA-Seq Module 2 From QC to differential gene expression. Ying Zhang Ph.D, Informatics Analyst Research Informatics Support System (RISS) MSI Apr. 24, 2012 RNA-Seq Tutorials Tutorial 1: Introductory (Mar.

More information

Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits

Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits Incorporating Molecular ID Technology Accel-NGS 2S MID Indexing Kits Molecular Identifiers (MIDs) MIDs are indices used to label unique library molecules MIDs can assess duplicate molecules in sequencing

More information

Application of NGS (nextgeneration. for studying RNA regulation. Sung Wook Chi. Sungkyunkwan University (SKKU) Samsung Medical Center (SMC)

Application of NGS (nextgeneration. for studying RNA regulation. Sung Wook Chi. Sungkyunkwan University (SKKU) Samsung Medical Center (SMC) Application of NGS (nextgeneration sequencing) for studying RNA regulation Samsung Advanced Institute of Heath Sciences and Technology (SAIHST) Sungkyunkwan University (SKKU) Samsung Research Institute

More information

Integrated NGS Sample Preparation Solutions for Limiting Amounts of RNA and DNA. March 2, Steven R. Kain, Ph.D. ABRF 2013

Integrated NGS Sample Preparation Solutions for Limiting Amounts of RNA and DNA. March 2, Steven R. Kain, Ph.D. ABRF 2013 Integrated NGS Sample Preparation Solutions for Limiting Amounts of RNA and DNA March 2, 2013 Steven R. Kain, Ph.D. ABRF 2013 NuGEN s Core Technologies Selective Sequence Priming Nucleic Acid Amplification

More information

Application of NGS (next-generation sequencing) for studying RNA regulation

Application of NGS (next-generation sequencing) for studying RNA regulation Application of NGS (next-generation sequencing) for studying RNA regulation SAIHST, SKKU Sung Wook Chi In this lecturre Intro: Sequencing Technology NGS (Next-Generation Sequencing) Sequencing of RNAs

More information

Next Generation Sequencing

Next Generation Sequencing Next Generation Sequencing Complete Report Catalogue # and Service: IR16001 rrna depletion (human, mouse, or rat) IR11081 Total RNA Sequencing (80 million reads, 2x75 bp PE) Xxxxxxx - xxxxxxxxxxxxxxxxxxxxxx

More information

Gene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis

Gene expression analysis. Biosciences 741: Genomics Fall, 2013 Week 5. Gene expression analysis Gene expression analysis Biosciences 741: Genomics Fall, 2013 Week 5 Gene expression analysis From EST clusters to spotted cdna microarrays Long vs. short oligonucleotide microarrays vs. RT-PCR Methods

More information

RNA-seq Data Analysis

RNA-seq Data Analysis Lecture 3. Clustering; Function/Pathway Enrichment analysis RNA-seq Data Analysis Qi Sun Bioinformatics Facility Biotechnology Resource Center Cornell University Lecture 1. Map RNA-seq read to genome Lecture

More information

Mapping strategies for sequence reads

Mapping strategies for sequence reads Mapping strategies for sequence reads Ernest Turro University of Cambridge 21 Oct 2013 Quantification A basic aim in genomics is working out the contents of a biological sample. 1. What distinct elements

More information

Matthew Tinning Australian Genome Research Facility. July 2012

Matthew Tinning Australian Genome Research Facility. July 2012 Next-Generation Sequencing: an overview of technologies and applications Matthew Tinning Australian Genome Research Facility July 2012 History of Sequencing Where have we been? 1869 Discovery of DNA 1909

More information

Comparison and Evaluation of Cotton SNPs Developed by Transcriptome, Genome Reduction on Restriction Site Conservation and RAD-based Sequencing

Comparison and Evaluation of Cotton SNPs Developed by Transcriptome, Genome Reduction on Restriction Site Conservation and RAD-based Sequencing Comparison and Evaluation of Cotton SNPs Developed by Transcriptome, Genome Reduction on Restriction Site Conservation and RAD-based Sequencing Hamid Ashrafi Amanda M. Hulse, Kevin Hoegenauer, Fei Wang,

More information

The Ensembl Database. Dott.ssa Inga Prokopenko. Corso di Genomica

The Ensembl Database. Dott.ssa Inga Prokopenko. Corso di Genomica The Ensembl Database Dott.ssa Inga Prokopenko Corso di Genomica 1 www.ensembl.org Lecture 7.1 2 What is Ensembl? Public annotation of mammalian and other genomes Open source software Relational database

More information

Contact us for more information and a quotation

Contact us for more information and a quotation GenePool Information Sheet #1 Installed Sequencing Technologies in the GenePool The GenePool offers sequencing service on three platforms: Sanger (dideoxy) sequencing on ABI 3730 instruments Illumina SOLEXA

More information

Introduction to NGS analyses

Introduction to NGS analyses Introduction to NGS analyses Giorgio L Papadopoulos Institute of Molecular Biology and Biotechnology Bioinformatics Support Group 04/12/2015 Papadopoulos GL (IMBB, FORTH) IMBB NGS Seminar 04/12/2015 1

More information

Welcome to the NGS webinar series

Welcome to the NGS webinar series Welcome to the NGS webinar series Webinar 1 NGS: Introduction to technology, and applications NGS Technology Webinar 2 Targeted NGS for Cancer Research NGS in cancer Webinar 3 NGS: Data analysis for genetic

More information

less sensitive than RNA-seq but more robust analysis pipelines expensive but quantitiatve standard but typically not high throughput

less sensitive than RNA-seq but more robust analysis pipelines expensive but quantitiatve standard but typically not high throughput Chapter 11: Gene Expression The availability of an annotated genome sequence enables massively parallel analysis of gene expression. The expression of all genes in an organism can be measured in one experiment.

More information

SO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS

SO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS SO YOU WANT TO DO A: RNA-SEQ EXPERIMENT MATT SETTLES, PHD UNIVERSITY OF CALIFORNIA, DAVIS SETTLES@UCDAVIS.EDU Bioinformatics Core Genome Center UC Davis BIOINFORMATICS.UCDAVIS.EDU DISCLAIMER This talk/workshop

More information

Genomic resources. for non-model systems

Genomic resources. for non-model systems Genomic resources for non-model systems 1 Genomic resources Whole genome sequencing reference genome sequence comparisons across species identify signatures of natural selection population-level resequencing

More information

Quantifying gene expression

Quantifying gene expression Quantifying gene expression Genome GTF (annotation)? Sequence reads FASTQ FASTQ (+reference transcriptome index) Quality control FASTQ Alignment to Genome: HISAT2, STAR (+reference genome index) (known

More information

COMPUTATIONAL PREDICTION AND CHARACTERIZATION OF A TRANSCRIPTOME USING CASSAVA (MANIHOT ESCULENTA) RNA-SEQ DATA

COMPUTATIONAL PREDICTION AND CHARACTERIZATION OF A TRANSCRIPTOME USING CASSAVA (MANIHOT ESCULENTA) RNA-SEQ DATA COMPUTATIONAL PREDICTION AND CHARACTERIZATION OF A TRANSCRIPTOME USING CASSAVA (MANIHOT ESCULENTA) RNA-SEQ DATA AOBAKWE MATSHIDISO, SCOTT HAZELHURST, CHRISSIE REY Wits Bioinformatics, University of the

More information

Analysis of Differential Gene Expression in Cattle Using mrna-seq

Analysis of Differential Gene Expression in Cattle Using mrna-seq Analysis of Differential Gene Expression in Cattle Using mrna-seq mrna-seq A rough guide for green horns Animal and Grassland Research and Innovation Centre Animal and Bioscience Research Department Teagasc,

More information

Outline General NGS background and terms 11/14/2016 CONFLICT OF INTEREST. HLA region targeted enrichment. NGS library preparation methodologies

Outline General NGS background and terms 11/14/2016 CONFLICT OF INTEREST. HLA region targeted enrichment. NGS library preparation methodologies Eric T. Weimer, PhD, D(ABMLI) Assistant Professor, Pathology & Laboratory Medicine, UNC School of Medicine Director, Molecular Immunology Associate Director, Clinical Flow Cytometry, HLA, and Immunology

More information

SCIENCE CHINA Life Sciences. Comparative analysis of de novo transcriptome assembly

SCIENCE CHINA Life Sciences. Comparative analysis of de novo transcriptome assembly SCIENCE CHINA Life Sciences SPECIAL TOPIC February 2013 Vol.56 No.2: 156 162 RESEARCH PAPER doi: 10.1007/s11427-013-4444-x Comparative analysis of de novo transcriptome assembly CLARKE Kaitlin 1, YANG

More information

Next generation sequencing techniques" Toma Tebaldi Centre for Integrative Biology University of Trento

Next generation sequencing techniques Toma Tebaldi Centre for Integrative Biology University of Trento Next generation sequencing techniques" Toma Tebaldi Centre for Integrative Biology University of Trento Mattarello September 28, 2009 Sequencing Fundamental task in modern biology read the information

More information

RNA

RNA RNA sequencing Michael Inouye Baker Heart and Diabetes Institute Univ of Melbourne / Monash Univ Summer Institute in Statistical Genetics 2017 Integrative Genomics Module Seattle @minouye271 www.inouyelab.org

More information

Analysis Datasheet Exosome RNA-seq Analysis

Analysis Datasheet Exosome RNA-seq Analysis Analysis Datasheet Exosome RNA-seq Analysis Overview RNA-seq is a high-throughput sequencing technology that provides a genome-wide assessment of the RNA content of an organism, tissue, or cell. Small

More information

Finding Genes with Genomics Technologies

Finding Genes with Genomics Technologies PLNT2530 Plant Biotechnology (2018) Unit 7 Finding Genes with Genomics Technologies Unless otherwise cited or referenced, all content of this presenataion is licensed under the Creative Commons License

More information

Assessing De-Novo Transcriptome Assemblies

Assessing De-Novo Transcriptome Assemblies Assessing De-Novo Transcriptome Assemblies Shawn T. O Neil Center for Genome Research and Biocomputing Oregon State University Scott J. Emrich University of Notre Dame 100K Contigs, Perfect 1M Contigs,

More information

SCALABLE, REPRODUCIBLE RNA-Seq

SCALABLE, REPRODUCIBLE RNA-Seq SCALABLE, REPRODUCIBLE RNA-Seq SCALABLE, REPRODUCIBLE RNA-Seq Advances in the RNA sequencing workflow, from sample preparation through data analysis, are enabling deeper and more accurate exploration

More information