oqtans A Galaxy-Integrated Workflow for Quantitative Transcriptome Analysis from NGS Data

Size: px
Start display at page:

Download "oqtans A Galaxy-Integrated Workflow for Quantitative Transcriptome Analysis from NGS Data"

Transcription

1 oqtans A Galaxy-Integrated Workflow for Quantitative Transcriptome Analysis from NGS Data Géraldine Jean <gjean@tue.mpg.de> Sebastian J. Schultheiss <sebi@tue.mpg.de> Machine Learning in Biology, Rätsch Lab, FML of the Max Planck Society Tübingen, Germany Presented at ISMB 2011, Vienna, Austria c oqtans online quantitative transcript analysis

2 Galaxy Approach Persistent, transparent, reproducible approach to bioinformatics research Integration made simple with xml wrappers of command line tools Accessible: web service, download, cloud J. Goecks et al D. Blankenberg et al E. Afgan et al S. Koskovsky Pond et al W. Miller et al J. Taylor et al D. Blankenberg et al M. Giardine et al. 2005

3 RNA-Seq Analysis c oqtans online quantitative transcript analysis Read Mapping Transcript Prediction Quantification Common analysis tasks compare two samples (wildtype, mutant) identify new transcripts Lee et al Nucleic Acids Res Yamashita et al Genome Res Daines et al Genome Res Ramani et al Genome Res Tang et al Nature Methods Grabherr et al Nature Biotech Li et al Science Gerstein et al Science

4 oqtans Galaxy Tools c oqtans online quantitative transcript analysis Read Mapping Transcript Prediction Quantification PALMapper* mtim* rquant* BWA SplAdder* rdiff* TopHat * developed by the MLB Rätsch Research Group, FML, Tübingen ASP* mgene.web* Cufflinks GFF Toolkit* Cufflinks/ Cuffdiff DESeq GOrilla SAFT* Plus EasySVM, KIRMES and more

5 oqtans Transcriptome analysis toolsuite PALMapper Read Mapping Transcript Prediction Quantification PALMapper: highly accurate short-read mapper using base quality and splice site predictions G. Jean et al Curr Protoc Bioinformatics

6 oqtans Transcriptome analysis toolsuite Read Mapping mtim Transcript Prediction Segmentation Quantification mtim: reconstructs exon-intron structure from alignments and splice site predictions SplAdder: adds isoforms to known annotation based on splice graph G. Zeller et al i.p.

7 oqtans Transcriptome analysis toolsuite Read Mapping Transcript Prediction Quantification rquant rquant: estimates biases in library prep, sequencing, and read mapping; accurately determines the abundances of transcripts R. Bohnert & G. Rätsch 2010 Nucleic Acids Res

8 oqtans Transcriptome analysis toolsuite Read Mapping Transcript Prediction rdiff/deseq Quantification rdiff/deseq: determines significant differences in transcript/gene expression between experiments using statistical tests O. Stegle et al Nature Preceedings S. Anders and W. Huber 2010 Genome Biology

9 oqtans Transcriptome analysis toolsuite Read Mapping Transcript Prediction SAFT, Quantification GOrilla,... SAFT: The simple alignment filtering tool evaluates alignment accuracy A. Kahles et al i.p. GOrilla: Gene ontology enrichment analysis and visualization tool E. Eden et al BMC Bioinformatics

10 oqtans Live Demonstration Read Mapping Transcript Prediction Quantification

11 oqtans Live Demonstration PALMapper DESeq D. melanogaster GOrilla Illumina, 75 nt RNA-seq reads 3-day-old male adult (7.4 million) 3-day-old female adult (12.4 million)

12 Intron prediction accuracy of read alignments 0.90 PALMapper TopHat min 0.45 C. elegans 75 nt RNA-seq reads (24 million) min 0 F-score Runtime

13 Intron and transcript accuracy evaluation 0.90 PALMapper mtim C. elegans TopHat Cufflinks F-Score nt RNA-seq reads (24 million) Introns Transcripts

14 oqtans Availability: Our Server External compute cluster 21 nodes, 168 CPUs our Galaxy test instance All tools described here, and more!

15 oqtans Availability: Source Code Free, open-source packages of our own tools Including Galaxy Tool Wrappers Fabric scripts to install on any Galaxy instance Community Tool Shed

16 oqtans Availability: Cloud Computing Demo cloud instance with all oqtans tools AMI at Amazon Web Services for EC2 Cloudman to launch any number of instances as a compute cluster AMI in your own virtualizer (e.g. Virtual Box) search for oqtans

17 oqtans Availability: Cloud Computing

18 Live Demo Results

19 Live Demo Results PALMapper DESeq Compute resources: 19 4xlarge GOrilla Time: Alignments: 20 minutes Quantitative analysis: 10 minutes Cost on Amazon EC2: approx. 7.82

20 Jonas Behr, Regina Bohnert, Philipp Drewe, Nico Görnitz André Kahles, Pramod Mudrakarta, Vipin T. Sreedharan, Georg Zeller, Gunnar Rätsch