Demo of mRNA NGS Concluding Report

Project: Demo Report
Customer: Dr. Demo
Company/Institute: Exiqon AS
Date: 09-Mar-2015

Performed by Exiqon A/S, Company Reg.No.(CVR), Skelstedet 16, DK-2950 Vedbæk, Denmark

Additional files provided with this report:

Content: Sampleinfo.xlsx
Description: Overview of samples and groups.

Content: Pictures
Description: High-resolution copies of the pictures presented in this report (QC plots, volcano plots, heat maps and PCAs).

Content: Data tables (spreadsheet .tsv files)
Description: All tables for genes, isoforms, CDS and TSS. Count data for all samples in .tsv tables. Normalized data (FPKM) for all samples in .tsv tables. Differential expression of all relevant comparisons in .tsv tables. Other relevant .tsv tables (e.g. attribute tables) and GO analysis tables.

Table 1. List of additional data files included with this report.

Files provided on disc drive: An e-mail containing information on encryption of the disc drive will be sent, and the disc drive will be forwarded by courier.

Content: Disc drive
Description: All FASTQ files associated with the project. All BAM files generated in the project, including mapped and unmapped files (use the IGV viewer to visualize).

Table 2. List of data files included on disc drive.

Ref code: 9999

Table of Contents

Summary
Experimental overview
Sample overview
Reference genome
Experimental design
Project workflow
QC & Mapping
QC Summary
Mapping and yields
Results
Identified genes
Principal Component Analysis plot
Heat map and unsupervised clustering
Identification of novel mRNAs
Differentially expressed genes
Differentially expressed novel transcripts
Volcano plot
Gene Ontology Enrichment Analysis
Conclusion and next steps
miRSearch
Data Analysis workflow
Software tools used for the analysis
Material and methods
Library preparation and Next Generation Sequencing
References
Frequently asked questions

Summary

Dear Dr. Demo,

We have now finalized the Next Generation Sequencing analysis of the mRNAs identified in the samples you submitted to Exiqon Services. Next Generation Sequencing libraries were successfully prepared, quantified and sequenced for all your samples. The collected reads were subjected to quality control and downstream analysis. The principal findings are summarized in this document. Additional information and further details on specific RNA transcripts can be found in the files listed in Table 1.

Differential expression analysis of read counts identified a subset of mRNA sequences with significant differences in the number of associated reads between the two experimental groups. We also found a number of putative novel transcripts in your samples, some of which show significant differential expression.

Exiqon's product line offers many tools for further validating potentially regulated mRNAs by qPCR, in situ hybridization or Northern blot, as well as GapmeRs for highly efficient antisense inhibition of mRNA and lncRNA function. For more information, or if you have any questions related to this report, please do not hesitate to contact us at DxServices@exiqon.com.

Kind regards,
Exiqon Services
Exiqon A/S

Experimental overview

Sample overview

The table below lists all the samples processed in this project and their specifications according to the sample submission form. There were a total of 6 samples, split into two experimental groups.

Sample ID   Group     Sequencing batch   File name
Control1    Control   1                  XXX088363_CS_1.fastq
Control2    Control   1                  XXX088363_CS_2.fastq
Control3    Control   1                  XXX088363_CS_3.fastq
Treated1    Treated   1                  XXX088363_TS_4.fastq
Treated2    Treated   1                  XXX088363_TS_5.fastq
Treated3    Treated   1                  XXX088363_TS_6.fastq

Table 3. Sample ID, grouping, sequencing batch and associated FASTQ file.

Reference genome

Annotation of the obtained sequences was performed using the reference annotation listed below.

Organism: Human
Reference genome: H. sapiens, hg19 / GRCh37, UCSC Genome Browser
Annotation reference: Gencode v11, Ensembl

Experimental design

The experiments were performed using the following settings:

Instrument: NextSeq500
Number of reads: 50 million
Read length: 50 bp, paired-end

Project workflow

The figure below outlines the Next Generation Sequencing process for mRNA and whole transcriptome RNA sequencing at Exiqon A/S.

Figure 1. Schematic NGS workflow.

QC & Mapping

The following sections provide a summary of the QC and mapping results obtained for your dataset.

QC Summary

Following sequencing, intensity correction and base calling, an initial QC of the data is performed internally by the sequencer. This includes chastity filtering and quality scoring (Q-score; see the Frequently asked questions section) of each individual base in each read. At this stage the data is separated for paired-end (PE) reads to determine whether the second read differs significantly from the first in terms of overall quality. As illustrated in the figure below (Figure 2), the vast majority of the data has a Q-score greater than 30 (>99.9% correct), indicating that high quality data was obtained for all samples. Read pairs R1 (read 1) and R2 (read 2) are presented separately.
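The relationship between a Q-score and the probability of a wrong base call can be made concrete with a short sketch. The function names below are illustrative helpers and are not part of the sequencer software or the analysis pipeline; the only thing taken from the report is the Q-score definition itself.

```python
import math


def phred_to_error_probability(q_score):
    """Estimated probability that a base call with this Q-score is wrong."""
    return 10 ** (-q_score / 10)


def error_probability_to_phred(p_error):
    """Q-score corresponding to an estimated error probability."""
    return -10 * math.log10(p_error)


def fraction_at_or_above(q_scores, threshold=30):
    """Fraction of Q-scores that reach the given threshold (hypothetical QC helper)."""
    return sum(q >= threshold for q in q_scores) / len(q_scores)
```

For example, `phred_to_error_probability(30)` evaluates to 0.001, which is the "99.9% correct" level quoted for Q30.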

Figure 2. Average read quality of the NGS sequencing data. A Q-score above 30 is considered high quality data (red dotted line).

In the graph below (Figure 3), an overview of the average base quality is shown. As for the average read quality, we found that the vast majority of the bases have a Q-score greater than 30 (>99.9% correct), indicating that high quality data was obtained for all samples.

Figure 3. Average base quality (R1 and R2 Q-scores) of the NGS sequencing data. The vast majority of the bases have a Q-score greater than 30 (>99.9% correct), indicating high quality data.

Mapping and yields

Mapping of the sequencing data is a useful quality control step in the NGS data analysis pipeline, as it helps to evaluate the quality of the samples. For this purpose, we classify the reads into the following classes:

Outmapped reads or high abundance reads: for example rRNA, mtRNA, polyA and polyC homopolymers
Unmapped reads: no alignment possible
Mappable reads: aligning to the reference genome

In a typical experiment it is possible to align 60-90% of the reads to the reference genome. However, this number depends on the quality of the sample and the coverage of the relevant reference genome; if the sample is degraded, fewer reads will be mRNA specific and more material will be degraded rRNA.

The following table and plot summarize the mapping results. In addition to the mapping results, the table also shows the total number of reads obtained for each sample. On average, 65 million reads were obtained from each sample, and genome mapping was on average 91% for all samples. The uniformity of the samples' mapping results suggests that the samples are comparable.

Table 4. Summary of the mapping results for each sample. Columns: Sample; Total reads; Outmapped reads, rRNAs (%); Outmapped reads, other (mtRNA) (%); Mappable reads (%); Unmapped (%). Rows: Control (three samples) and Treated (three samples).

The following plot summarizes the mapping results for each sample.

Figure 4. Summary of mapping results of the reads by sample.

If you want to inspect the mapping in detail, please see the BAM alignment files, which are supplied on the hard disk. The BAM files can be viewed and inspected in any standard genome viewer such as the IGV browser (Robinson et al., 2011; Thorvaldsdóttir et al., 2012).

Results

Below you will find a summary of the principal findings for this project. The complete analysis may be found in the associated files listed in Table 1. For a detailed description of the data analysis process, see the Data Analysis workflow section.

Identified genes

Based on alignment to the reference genome, the number of identified genes per sample was calculated. The reliability of an identified gene increases with the number of identified fragments. When performing the statistical comparison of two groups, we include all genes irrespective of how few calls have been made. As can be seen from the table below, and from Figure 5, all samples included in this study have comparable call rates.

Table 5. Number of genes and isoforms identified in each sample which have a fragment count estimation of at least 10 counts per gene. Columns: Sample ID; Number of genes identified; Number of isoforms identified. Rows: Control (three samples) and Treated (three samples).

The distribution of the calls based on the number of fragments identified is illustrated in the radar plot below. The sample name is indicated on the outer rim of the plot. The numbers of genes with 1, 10, 100 or 1000 fragments are illustrated as colored rings. If one sample yields a significantly lower number of genes in each category, this indicates that the sample deviates from the remaining samples. Overall, the rings in the plot are consistent.

Figure 5. Radar plot showing gene call rates for each sample at different fragment count cutoff values. See the color scale at the top of the figure for specification of cutoff values.

Expression levels are measured as FPKM

FPKM is a unit for measuring expression in NGS experiments. The number of fragments corresponding to a particular gene is normalized to the length of the transcript and to the total number of mapped reads (Fragments Per Kilobase of transcript per Million mapped reads). In the analysis part, the FPKM values are normalized using the median of the geometric mean (Anders & Huber, 2010).
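As a rough illustration of the two normalization ideas mentioned above, the following sketch computes an FPKM value from its definition and derives per-sample scaling factors with the median-of-ratios approach described by Anders & Huber (2010). It is a simplified stand-in under stated assumptions, not the Cuffdiff implementation: the function names and the input layout (one list of counts per sample) are chosen for illustration only, and genes with a zero count in any sample are simply skipped.

```python
import math
from statistics import median


def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def median_of_ratios_size_factors(counts_by_sample):
    """Per-sample size factors from the median ratio to the per-gene geometric mean.

    counts_by_sample: list of equally long count lists, one list per sample.
    """
    n_samples = len(counts_by_sample)
    n_genes = len(counts_by_sample[0])
    ratios = [[] for _ in range(n_samples)]
    for g in range(n_genes):
        gene_counts = [sample[g] for sample in counts_by_sample]
        if min(gene_counts) <= 0:
            continue  # geometric mean is undefined or zero; skip this gene
        log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
        for s in range(n_samples):
            ratios[s].append(gene_counts[s] / math.exp(log_geo_mean))
    return [median(r) for r in ratios]
```

For example, 500 fragments on a 2,000 bp transcript in a library of 25 million mapped fragments gives `fpkm(500, 2000, 25_000_000)`, i.e. 10 FPKM. Dividing each sample's values by its size factor puts the samples on a common scale.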

Principal Component Analysis plot

Principal Component Analysis (PCA) is a method used to reduce the dimension of large data sets and is a useful tool for exploring the naturally arising sample classes based on the expression profile. The top 200 transcripts (genes) that have the largest log2 fold difference based on FPKM counts have been included in the analysis.

If the biological differences between the samples are pronounced, they will account for the primary components of the variation in the data. This leads to separation of samples into different regions of the PCA plot corresponding to their biology. If other factors, e.g. sample quality, introduce more variation in the data, the samples will not cluster according to the biology. The largest component of the variation is plotted along the X-axis and the second largest along the Y-axis. As seen below, the groups cluster on the primary component.

Figure 6. Principal component analysis (PCA) plot. The PCA was performed on all samples passing QC using the top 200 transcripts (genes) that have the largest log2 fold difference based on FPKM counts.
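The gene selection step that feeds the PCA can be sketched as follows. This is a minimal illustration and not Exiqon's in-house code: the function name, the input layout (gene name mapped to a pair of FPKM lists, control first) and the pseudocount of 1 that guards against division by zero are all assumptions made for the example.

```python
import math


def top_genes_by_log2_fold_difference(fpkm_by_gene, n_top=200, pseudocount=1.0):
    """Return the n_top gene names with the largest absolute log2 fold difference.

    fpkm_by_gene maps a gene name to (control_fpkms, treated_fpkms).
    The pseudocount is an assumption of this sketch, not a documented setting.
    """
    def abs_log2_fold(groups):
        control, treated = groups
        mean_control = sum(control) / len(control)
        mean_treated = sum(treated) / len(treated)
        return abs(math.log2((mean_treated + pseudocount) / (mean_control + pseudocount)))

    ranked = sorted(fpkm_by_gene, key=lambda g: abs_log2_fold(fpkm_by_gene[g]), reverse=True)
    return ranked[:n_top]
```

The expression values of the selected genes would then be passed to a PCA routine of choice; ranking on the absolute value keeps both strongly up- and strongly down-regulated transcripts.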

Heat map and unsupervised clustering

The heat map diagram below shows the result of the two-way hierarchical clustering of RNA transcripts and samples, including the top 200 transcripts (genes) that have the largest log2 fold difference based on FPKM counts. Each row represents one RNA transcript and each column represents one sample. The color of each point represents the relative expression level of a transcript across all samples. The color scale is shown at the bottom right: red represents an expression level above the mean; green represents an expression level below the mean.

Figure 7. Heat map and unsupervised hierarchical clustering by sample and transcript. The clustering was performed on all samples passing QC using the top 200 transcripts (genes) that have the largest log2 fold difference based on FPKM counts.

Identification of novel mRNAs

During the transcriptome assembly process, both known and novel transcripts are identified. A novel transcript is characterized as a transcript which contains features not present in the reference annotation. Thus, a novel transcript can be either a new isoform of a known gene or a transcript without any known features. For example, a novel transcript could be the result of a previously unknown splicing event for a known gene, or a previously unknown long noncoding RNA.

Identification of novel transcripts depends upon the reference annotation. For the present study, the hsa hg19 genome from Gencode v11, Ensembl has been used for annotation. Transcripts that are not part of this annotation are classified as novel. In the result files, novel transcripts with known features are classified by listing the known transcripts most closely resembling the novel transcript. Novel transcripts without any known features are given a locally unique name as transcript identifier. In addition, we provide the genomic positions of the features of the novel transcript, e.g. the location and number of exons.

Please see the section Differentially expressed novel transcripts for the differentially expressed novel transcripts, and the associated data files for the full list of identified novel transcripts. The full lists of Coding DNA Sequence (CDS), genes, exon isoforms and differential start site isoforms are presented in these files. The table annotations are complex, but a good reference is available in the Cufflinks manual.

Differentially expressed genes

To identify differentially expressed genes, it is assumed that the number of reads produced by each transcript is proportional to its abundance. Exiqon Services has customized the analysis pipeline based on the Tuxedo suite, including the cufflinks, cuffmerge and cuffdiff steps of the Tuxedo pipeline. For more details, see the Data Analysis workflow section.

Comparison of Control and Treated experimental groups, known mRNA

The table below shows the individual results for the top 20 most differentially expressed known mRNA genes. A full list of differentially expressed transcripts is given in the associated .tsv file folder listed in Table 1.

Columns: Gene_id, Gene, Locus, Control FPKM, Treated FPKM, log2_fc, q_value
XLOC_ PSG5 19:
XLOC_ TRAC,TRAJ20 14:
XLOC_ GREM1 15:
XLOC_ KCNK2 1:
XLOC_ HOXD10,HOXD11 2:
XLOC_ DKK1 10:
XLOC_ CPA4 7:
XLOC_ HAPLN1 5:
XLOC_ LHX9 1:
XLOC_ KIAA :
XLOC_ WNT16 7:
XLOC_ BNC1 15:
XLOC_ FOXE1 9:
XLOC_ RP11-94A24.1 8:
XLOC_ GALNT5 2:
XLOC_ LOX 5:
XLOC_ RP11-265N7 15:
XLOC_ RP11-709B3.2 15:
XLOC_ SLC1A7 1:
XLOC_ ADAMTSL1 9:

Table 6. Known mRNAs: the 20 most differentially expressed mRNAs, with log2 fold change (Log2_FC FPKM) between groups Control and Treated and Benjamini-Hochberg FDR corrected q-values. The list is sorted on Log2_FC. Control and Treated columns are group average FPKM values.
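The q-values reported in the differential expression tables are Benjamini-Hochberg FDR corrected. As a minimal sketch of that correction (a plain re-implementation for illustration, not the code used inside Cuffdiff), the adjusted value for each p-value can be computed like this:

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

A transcript is then called significant when its q-value falls below the chosen cutoff, for instance 0.05 as used for the volcano plot.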

Comparison of Control and Treated experimental groups, isoforms

The table below shows the individual results for the top 20 most differentially expressed isoforms. A full list of differentially expressed transcripts is given in the associated .tsv file folder listed in Table 1.

Columns: Gene_id, Gene, Locus, Control FPKM, Treated FPKM, log2_fc, q_value
XLOC_ THBS1 15:
XLOC_ MXRA5 X:
XLOC_ KIAA :
XLOC_ GREM1 15:
XLOC_ DKK1 10:
XLOC_ ADAMTSL1 9:
XLOC_ DKK1 10:
XLOC_ ITGA11 15:
XLOC_ COL8A1 3:
XLOC_ MIR125B1 11:
XLOC_ SULF1 8:
XLOC_ MYOF 10:
XLOC_ LAMA4 6:
XLOC_ LOX 5:
XLOC_ HAS2 8:
XLOC_ HMGA2 12:
XLOC_ COL8A1 3:
XLOC_ RGMB 5:
XLOC_ COL6A3 2:
XLOC_ ENPP2 8:

Table 7. Isoforms: the 20 most differentially expressed isoforms, with log2 fold change (Log2_FC FPKM) between groups Treated and Control and Benjamini-Hochberg FDR corrected q-values. The list is sorted on Log2_FC. Control and Treated columns are group average FPKM values.

Differentially expressed novel transcripts

The table below lists the top 20 differentially expressed novel transcripts identified in this project. The second column lists the known transcripts most closely resembling each novel transcript. A full list of differentially expressed transcripts is given in the associated .tsv file folder listed in Table 1.

Columns: Gene_id, Gene, Locus, Control, Treated, log2_fc, q_value
XLOC_ KIAA :
XLOC_ RGMB 5:
XLOC_ TWIST2 2:
XLOC_ MEGF6 1:
XLOC_ AC X:
XLOC_ CCDC14 3:
XLOC_ EDA2R X:
XLOC_ FLJ :
XLOC_ SLIT2 4:
XLOC_ WEE1 11:
XLOC_ HOXA9 7:
XLOC_ MACF1 1:
XLOC_ SRSF11 1:
XLOC_ ADAM33 20:
XLOC_ KIF23 15:
XLOC_ PRKY Y:
XLOC_ FKBP10 17:
XLOC_ COL8A1 3:
XLOC_ LOXL2 8:
XLOC_ HAS2 8:

Table 8. Novel transcripts: the 20 most differentially expressed novel transcripts, with log2 fold change (Log2_FC FPKM) between groups Control and Treated and Benjamini-Hochberg FDR corrected q-values. The list is sorted on Log2_FC. Control and Treated columns are group average FPKM values.

Volcano plot

The volcano plot provides a quick visual way to identify the RNA transcripts that display large-magnitude changes which are also statistically significant. The plot is constructed by plotting -log10 of the p-value on the y-axis and the expression fold change between the two experimental groups on the x-axis. There are two regions of interest in the plot: points found towards the top of the plot (high statistical significance) and at the extreme left or right (strongly down- and up-regulated, respectively). Genes that pass the filtering of q-value < 0.05 are indicated on the plot. For the present study, no genes pass this filtering. For volcano plots of other comparisons, please see the additional Figures.

Figure 8. Volcano plot showing the relationship between the p-values and the fold change in normalized expression between the experimental groups Control and Treated.
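The coordinates of a single point in such a plot can be computed as in the sketch below. The function name and the returned dictionary are illustrative, the fold change is taken as Treated over Control on a log2 scale (an assumption about the direction), and both FPKM values are assumed to be positive.

```python
import math


def volcano_point(control_fpkm, treated_fpkm, p_value, q_value, q_cutoff=0.05):
    """Coordinates and significance flag for one transcript in a volcano plot.

    Assumes positive FPKM values; Treated/Control direction is an assumption.
    """
    return {
        "x": math.log2(treated_fpkm / control_fpkm),  # log2 fold change
        "y": -math.log10(p_value),                    # statistical significance
        "significant": q_value < q_cutoff,            # highlighted on the plot
    }
```

A transcript going from 2 to 8 FPKM with p = 0.001 lands at x = 2 and y = 3, i.e. in the upper right region of interest.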

Gene Ontology Enrichment Analysis

Gene Ontology (GO; Gene Ontology Consortium, 2000) enrichment analysis attempts to identify GO terms that are significantly associated with differentially expressed protein coding genes. We investigate whether specific GO terms are more likely to be associated with the differentially expressed mRNAs. Two different statistical tests are used and compared. Firstly, a standard Fisher's test is used to investigate enrichment of terms between the two test groups. Secondly, the Elim method takes a more conservative approach by incorporating the topology of the GO network to compensate for local dependencies between GO terms, which can mask significant GO terms. Comparing the predictions from these two methods can highlight truly relevant GO terms.

The figure below shows a comparison of the results for the GO (Biological process) terms associated with the significantly differentially expressed mRNAs that were identified between groups Control and Treated. The complete GO enrichment analysis for all of the comparisons is presented in the associated GO folder in the full dataset supplied with the report. The Cellular component (CC) and Molecular function (MF) analyses are presented in the associated data folder. In the plot, the majority of overrepresented terms are not statistically significant, but a small number of terms appear to be relevant.

Figure 9. Scatter plot for significantly enriched GO terms predicted to be associated with differentially expressed genes. The plot shows a comparison of the results obtained by the two statistical tests used. Values along the diagonal are consistent between both methods, with values in the bottom left of the plot corresponding to the terms with the most reliable estimates from both methods. The size of each dot is proportional to the number of genes mapping to that GO term, and the coloring represents the number of significantly differentially expressed genes corresponding to that term, with dark red representing more genes and yellow representing fewer.
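The classic Fisher's test for a single GO term can be sketched as a one-sided test on a 2x2 table, which is equivalent to an upper-tail hypergeometric probability. The code below is a minimal illustration of that calculation and does not reproduce the Elim method or the software used for the report; the function and argument names are assumptions chosen to mirror the Annotated and Significant columns of the GO table.

```python
from math import comb


def fisher_enrichment_p(n_genes, n_annotated, n_significant, n_overlap):
    """One-sided p-value for seeing at least n_overlap annotated genes.

    n_genes:       size of the gene universe
    n_annotated:   genes annotated with the GO term
    n_significant: differentially expressed genes
    n_overlap:     differentially expressed genes annotated with the term
    """
    upper = min(n_annotated, n_significant)
    tail = sum(
        comb(n_annotated, k) * comb(n_genes - n_annotated, n_significant - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_genes, n_significant)


def expected_overlap(n_genes, n_annotated, n_significant):
    """Number of annotated significant genes expected by chance."""
    return n_annotated * n_significant / n_genes
```

With a universe of 10 genes, 4 annotated to a term and 3 differentially expressed, observing 2 or more annotated genes among the 3 has probability 1/3, so such a term would not be considered enriched.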

A list of potentially significant GO (Biological process) terms is given in the table below.

Columns: GO.ID, Term, Annotated, Significant, Expected, Rank in Classic Fisher, Classic KS p-value, elimKS p-value
GO: extracellular matrix organization
GO: inflammatory response
GO: homophilic cell adhesion
GO: anatomical structure formation involved
GO: cell adhesion
GO: brain development
GO: regulation of cell migration
GO: positive regulation of neuron differenti
GO: axon guidance
GO: glutamate metabolic process
GO: positive regulation of epithelial cell p
GO: negative regulation of blood coagulation
GO: chemical homeostasis
GO: leukocyte migration
GO: regulation of protein transport
GO: complement activation, classical pathway
GO: monocarboxylic acid biosynthetic process
GO: positive regulation of transport
GO: epithelial to mesenchymal transition
GO: high-density lipoprotein particle remode

Table 9. The top 20 significant GO terms for the genes found to be differentially expressed between Control and Treated and their corresponding annotation for Biological process (BP). The associated network topology is shown in Figure 10.

To illustrate how the different GO terms are linked, a GO network has been created.

Figure 10. GO network generated from the GO terms predicted to be enriched for the Biological process (BP) vocabulary. Nodes are colored from red to yellow, with the node with the strongest support colored red and nodes with no significant enrichment colored yellow. The five nodes with the strongest support are marked with rectangular nodes. A high-resolution version of this graph is found in the supplementary Figures.

Conclusion and next steps

mRNA Next Generation Sequencing libraries were successfully prepared, quantified and sequenced for all your samples. The data passed all QC metrics with high Q-scores, indicating good technical performance of the NGS experiment. A high percentage of the reads could be mapped to the reference genome, indicating that the samples were of high quality.

A large number of novel transcripts were identified. Note, however, that many of these will be novel isoforms or start sites of known genes and transcripts.

It is clear from the unsupervised analysis that the samples cluster according to their biological groups, indicating that the difference between the sample groups is the largest source of variation in the data.

The supervised analysis showed large numbers of significantly differentially expressed mRNAs at the CDS (Coding DNA Sequence) and gene level as well as at the isoform level.

Note: when navigating through these data, transcripts with less than 1-5 FPKM (on average) per group might be difficult to validate in a qPCR experiment.

We would like to help you interpret the data presented in this report and guide you on how best to proceed with subsequent experiments. If you would like to arrange a time to discuss the data with us in more detail, please do not hesitate to contact DxServices@exiqon.com and we will be happy to arrange a phone call with you.

miRSearch

If you are interested in which microRNAs regulate your transcripts, Exiqon offers two options for further data mining of the results:

miRSearch 3.0
An interactive miRSearch database, offering up-to-date information on specific microRNAs, tissues and diseases, as well as co-regulated microRNAs, target genes and much more. miRSearch includes a built-in report feature which allows you to easily collect and store all the relevant information gathered.

XploreRNA
XploreRNA is an advanced database search tool for scientists engaged in transcriptome analysis. The XploreRNA app enables scientists unfamiliar with database searches to access relevant public and proprietary genetic and molecular biology databases through a simple user interface. All databases are cross-annotated, and relevant databases are regularly updated by advanced text mining of the literature, e.g. with respect to new information on microRNA-mRNA interactions. The app provides information from major databases such as Ensembl and miRBase. XploreRNA can be downloaded from the App Store and Google Play. All search results provide literature references with integrated access to PubMed for reading abstracts and original publications.

Data Analysis workflow

Software tools used for the analysis

Our data analysis pipeline is based on the Tuxedo software package, which is a combination of open-source software that implements peer-reviewed statistical methods. In addition, we employ specialized software developed internally at Exiqon to interpret and improve the readability of the final results. The components of our NGS RNA-seq analysis pipeline include Bowtie2, TopHat (v2.0.11) and Cufflinks (v2.2.1) and are described in detail below.

TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns the sequencing reads to the reference genome using the sequence aligner Bowtie2. TopHat also uses the sequence alignments to identify splice junctions for both known and novel transcripts.

Cufflinks takes the alignment results from TopHat and assembles the aligned sequences into transcripts, constructing a map or snapshot of the transcriptome. To guide the assembly process, an existing transcript annotation is used. In addition, we perform fragment bias correction, which seeks to correct for sequence bias introduced during library preparation (see Kasper et al., 2010 and Roberts et al., 2011). Cufflinks assembles aligned reads into different transcript isoforms based on exon usage and also determines the transcriptional start sites (TSSs).

When comparing groups, Cuffdiff is used to calculate the FPKM (number of fragments per kilobase per million mapped fragments) and to test for differential expression and regulation among the assembled transcripts across the submitted samples using the Cufflinks output. Cuffdiff can be used to test differential expression at different levels, from CDS and gene specific, down to the isoform and TSS transcript level. For more information on the Cuffdiff module, see Trapnell et al. (2013).

As a final step, CummeRbund, which is an open source R package, is used in combination with in-house custom software for post-processing of Cufflinks and Cuffdiff results. We use these tools to generate a visual representation of your sequencing results to aid the interpretation of the sequencing data and the analysis results.
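The order of the Tuxedo steps described above can be sketched by assembling the corresponding command lines. This is only an outline under stated assumptions: the report does not list the exact options that were used, all file and directory names are hypothetical, and the sketch builds the commands as argument lists without running them. The flags shown (`-G`, `-g`, `-b`, `-s`, `-L`, `-o`) are standard TopHat, Cufflinks, Cuffmerge and Cuffdiff options for supplying an annotation, guiding assembly, enabling fragment bias correction, naming the reference sequence, labelling groups and setting the output directory.

```python
def build_tuxedo_commands(groups, bowtie2_index, annotation_gtf, genome_fasta):
    """Build TopHat, Cufflinks, Cuffmerge and Cuffdiff commands as argument lists.

    groups maps a group label to a list of (sample_id, read1_fastq, read2_fastq).
    All paths are hypothetical placeholders for illustration.
    """
    commands = []
    bams_by_group = {}
    for group, samples in groups.items():
        for sample_id, read1, read2 in samples:
            tophat_dir = f"tophat_{sample_id}"
            bam = f"{tophat_dir}/accepted_hits.bam"
            # Spliced alignment of the paired-end reads, using the known annotation.
            commands.append(["tophat", "-G", annotation_gtf, "-o", tophat_dir,
                             bowtie2_index, read1, read2])
            # Annotation-guided assembly with fragment bias correction.
            commands.append(["cufflinks", "-g", annotation_gtf, "-b", genome_fasta,
                             "-o", f"cufflinks_{sample_id}", bam])
            bams_by_group.setdefault(group, []).append(bam)
    # assemblies.txt is assumed to list the per-sample transcripts.gtf files.
    commands.append(["cuffmerge", "-g", annotation_gtf, "-s", genome_fasta,
                     "-o", "merged", "assemblies.txt"])
    # Replicates of one group are comma-separated; groups are space-separated.
    commands.append(["cuffdiff", "-o", "cuffdiff_out", "-b", genome_fasta,
                     "-L", ",".join(bams_by_group), "merged/merged.gtf"]
                    + [",".join(bams) for bams in bams_by_group.values()])
    return commands
```

Each returned list could be passed to a process runner once the paths are replaced with real files; keeping the commands as data makes the step order easy to inspect before anything is executed.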

Material and methods

All experiments were conducted at Exiqon Services, Denmark.

Library preparation and Next Generation Sequencing

The library preparation was done using the TruSeq Stranded mRNA Sample preparation kit (Illumina Inc.). The starting material (100 ng) of total RNA was mRNA enriched using the oligo-dT bead system. The isolated mRNA was subsequently fragmented using enzymatic fragmentation. Then first strand synthesis and second strand synthesis were performed and the double stranded cDNA was purified (AMPure XP, Beckman Coulter). The cDNA was end repaired and 3' adenylated, Illumina sequencing adaptors were ligated onto the fragment ends, and the library was purified (AMPure XP). The stranded mRNA libraries were pre-amplified with PCR and purified (AMPure XP). The size distribution of the libraries was validated and their quality inspected on a Bioanalyzer high sensitivity DNA chip (Agilent Technologies). High quality libraries were quantified using qPCR, the concentration was normalized and the samples were pooled according to the project specification (number of reads). The library pool(s) were re-quantified with qPCR, and the optimal concentration of the library pool was used to generate the clusters on the surface of a flowcell before sequencing on a NextSeq 500 instrument using the High Output sequencing kit (150 cycles) according to the manufacturer's instructions (Illumina Inc.).

References

Trapnell, C., et al. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 28(5).
Trapnell, C., et al. (2012) Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 7.
Trapnell, C., et al. (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics (Oxford, England), 25(9).
Langmead, B., et al. (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 10(3): R
Roberts, A., et al. (2011) Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics, 27(17).
Anders, S. and Huber, W. (2010) Differential expression analysis for sequence count data. Genome Biology, 11: R106.
Goff, L., et al. (2012)
Robinson, J.T., et al. (2011) Integrative genomics viewer. Nature Biotechnology, 29.
Thorvaldsdóttir, H., et al. (2012) Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics.
Kasper D., et al. (2010) Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Research, 38(12).
Roberts, A., et al. (2011) Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biology, 12: R22.
Marinov, G.K., et al. (2014) From single-cell to cell-pool transcriptomes: Stochasticity in gene expression and RNA splicing. Genome Research, 24.
Kellis, M., et al. (2013) Defining functional DNA elements in the human genome. PNAS, 111.

Frequently asked questions

Question: What is Q-score?
Answer: A quality score (or Q-score) is a prediction of the probability of an incorrect base call: Q-score = -10 × log10(P(~X)), where P(~X) is the estimated probability of the base call being wrong. A quality score of 10 indicates an error probability of 0.1, a quality score of 20 indicates an error probability of 0.01, a quality score of 30 indicates an error probability of 0.001, and so on.

Question: What is the difference between FPKM and RPKM?
Answer: RPKM stands for Reads per Kilobase of exon per Million mapped reads; FPKM stands for Fragments per Kilobase of exon per Million mapped fragments. The term fragments refers to the cDNA fragments present during library preparation. Both RPKM and FPKM are normalized numbers which tell you something about the relative abundance of, for example, an assembled transcript. In paired-end sequencing, two reads are produced per cDNA fragment, whereas only one read is produced per cDNA fragment in single-end sequencing. Thus, single-end versus paired-end sequencing will affect the value of RPKM but not FPKM. Consequently, FPKM is preferred over RPKM, as it provides values that are comparable between single-end and paired-end sequencing.

Question: What does 1 FPKM mean in terms of abundance?
Answer: This is difficult to estimate and highly variable according to cell type and the total number of mRNAs in a given cell. For example, a single cell analysis of the cell line GM12878 estimated that one transcript copy corresponds to 10 FPKM (Marinov 2014). Others find that FPKMs are not directly comparable among different subcellular fractions, as they reflect relative abundances within a fraction rather than average absolute transcript copy numbers per cell (Kellis 2013). Depending on the total amount of RNA in a cell, one transcript copy per cell corresponds to between 0.5 and 5 FPKM in PolyA+ whole-cell samples according to current estimates, with the upper end of that range corresponding to small cells with little RNA and vice versa.

Question: What is a novel RNA transcript?
Answer: A novel transcript is characterized as a transcript containing features that are not present in the reference annotation. Identification of novel transcripts therefore depends on the reference annotation.

Question: A novel transcript identified seems to be a known gene when I look it up in the genome browser. Why is that?
Answer: Most novel transcripts are not new genes but different isoforms of previously annotated genes. A novel transcript is most commonly a novel combination of exons or a different start site.


More information

Next-generation sequencing technologies

Next-generation sequencing technologies Next-generation sequencing technologies NGS applications Illumina sequencing workflow Overview Sequencing by ligation Short-read NGS Sequencing by synthesis Illumina NGS Single-molecule approach Long-read

More information

Introduction to transcriptome analysis using High Throughput Sequencing technologies. D. Puthier 2012

Introduction to transcriptome analysis using High Throughput Sequencing technologies. D. Puthier 2012 Introduction to transcriptome analysis using High Throughput Sequencing technologies D. Puthier 2012 A typical RNA-Seq experiment Library construction Protocol variations Fragmentation methods RNA: nebulization,

More information

02 Agenda Item 03 Agenda Item

02 Agenda Item 03 Agenda Item 01 Agenda Item 02 Agenda Item 03 Agenda Item SOLiD 3 System: Applications Overview April 12th, 2010 Jennifer Stover Field Application Specialist - SOLiD Applications Workflow for SOLiD Application Application

More information

SCALABLE, REPRODUCIBLE RNA-Seq

SCALABLE, REPRODUCIBLE RNA-Seq SCALABLE, REPRODUCIBLE RNA-Seq SCALABLE, REPRODUCIBLE RNA-Seq Advances in the RNA sequencing workflow, from sample preparation through data analysis, are enabling deeper and more accurate exploration

More information

Quantifying gene expression

Quantifying gene expression Quantifying gene expression Genome GTF (annotation)? Sequence reads FASTQ FASTQ (+reference transcriptome index) Quality control FASTQ Alignment to Genome: HISAT2, STAR (+reference genome index) (known

More information

Supplementary Information for Single-cell sequencing of the small-rna transcriptome

Supplementary Information for Single-cell sequencing of the small-rna transcriptome Supplementary Information for Single-cell sequencing of the small-rna transcriptome Omid R. Faridani 1,6,*, Ilgar Abdullayev 1,2,6, Michael Hagemann-Jensen 1,3, John P. Schell 4, Fredrik Lanner 4,5 and

More information

Gene expression microarrays and assays. Because your results can t wait

Gene expression microarrays and assays. Because your results can t wait Gene expression microarrays and assays Because your results can t wait A simple path from data to decision-making The power of expression microarrays Transcriptome-wide analysis can be complex. Matching

More information