
CGE Pipeline
Jose Luis Bellod Cisneros, PhD Student
Content
1. The User System
2. The Batch Upload
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Future Plans
7. Q&A

CGE Pipeline: Content
1. The Batch Upload
2. The User System
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Exercises

Data submission (Batch Upload)
The METADATA (Excel, CSV, TSV...)
- Sequencing platform: Illumina, Ion Torrent...
- Organism, source, strain...
- Pathogenic
- Usage restrictions: Private (Delete?), Public
- Spatio-temporal location: country, city..., coordinates, collection date
ISOLATE FILES
- Raw read files or contigs
Upload to the CGE Server (Services, CGE DB)
1. Store files
2. Run services
3. Visualize/get results

Metadata Excel File

CGE Pipeline: Content
1. The User System
   Exercise 0: Create a user account
2. The Batch Upload
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Exercises

The CGE Pipeline
Why do we make a pipeline?
- For speed
- For user convenience
- For standardisation
- To enable comparison between isolates
How did we make a pipeline?
- We created a user platform
- We created a system for storing user-uploaded data
- We created a MySQL database to store all information regarding user and isolate associations
- We created an interface to talk to the MySQL database
  1. Add an isolate
  2. Add a service to an isolate
  3. Etc.
- We created a script to process the user-provided input and execute the services in a specific order
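
The two database operations named on this slide (add an isolate, add a service to an isolate) could look roughly like the sketch below. It is only an illustration: the table and column names are hypothetical, and Python's built-in sqlite3 stands in for the MySQL database so the example runs without a server.

```python
import sqlite3

def create_schema(conn):
    # Hypothetical schema: the real CGE tables are not shown in the slides.
    conn.executescript("""
        CREATE TABLE isolates (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            owner TEXT NOT NULL,
            sample_name TEXT NOT NULL
        );
        CREATE TABLE isolate_services (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            isolate_id INTEGER NOT NULL REFERENCES isolates(id),
            service TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending'
        );
    """)

def add_isolate(conn, owner, sample_name):
    # 1. Add an isolate and associate it with a user.
    cur = conn.execute(
        "INSERT INTO isolates (owner, sample_name) VALUES (?, ?)",
        (owner, sample_name),
    )
    conn.commit()
    return cur.lastrowid

def add_service(conn, isolate_id, service):
    # 2. Add a service to an isolate; it starts in the 'pending' state.
    cur = conn.execute(
        "INSERT INTO isolate_services (isolate_id, service) VALUES (?, ?)",
        (isolate_id, service),
    )
    conn.commit()
    return cur.lastrowid

def services_for(conn, isolate_id):
    # List the services registered for one isolate, in insertion order.
    rows = conn.execute(
        "SELECT service, status FROM isolate_services "
        "WHERE isolate_id = ? ORDER BY id",
        (isolate_id,),
    )
    return rows.fetchall()
```

Usage: open a connection with sqlite3.connect(":memory:"), call create_schema once, then add_isolate followed by add_service for each service to run on that isolate.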

The CGE Pipeline (flowchart)
1. Input -> Assembly, KmerFinder
2. Assembly worked?
   No -> Report Failure
   Yes -> ContigAnalyzer, ResFinder
3. KmerFinder done?
   No -> Wait
   Yes -> MLST, PlasmidFinder, pmlst, VirulenceFinder
4. All services done?
   No -> Wait
   Yes -> Report Results
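
One reading of the flowchart can be sketched as a small driver function. This is not the actual CGE script: run_service is a hypothetical callback that runs one service and returns True on success, and the sketch is synchronous, so the "Wait" loops of the flowchart reduce to plain sequencing.

```python
def run_pipeline(sample, run_service):
    """Run the services in the order suggested by the flowchart.

    run_service(name, sample) is a hypothetical callback returning True
    when the named service succeeds.
    """
    done, failed = [], []

    def run(name):
        ok = run_service(name, sample)
        (done if ok else failed).append(name)
        return ok

    # Assembly and KmerFinder both start from the input.
    assembly_ok = run("Assembler")
    run("KmerFinder")

    # "Assembly worked?" No -> Report Failure.
    if not assembly_ok:
        return {"status": "failure", "done": done, "failed": failed}

    # Yes -> services that follow the assembly step.
    for name in ("ContigAnalyzer", "ResFinder"):
        run(name)

    # "KmerFinder done?" Yes -> the remaining typing services.
    for name in ("MLST", "PlasmidFinder", "pmlst", "VirulenceFinder"):
        run(name)

    # "All services done?" Yes -> Report Results.
    return {"status": "results", "done": done, "failed": failed}
```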

The User System https://cge.cbs.dtu.dk/cge/user/login/user_manager.php?action=create

The User System https://cge.cbs.dtu.dk/tools_new/client/platform/sample/

The User System https://cge.cbs.dtu.dk/services/cge/submitservice.php

The CGE Pipeline
Service: Description
Assembler: A de novo assembly pipeline
KmerFinder: A species typing algorithm based on K-mer profiles
ContigAnalyzer: In-house service for annotation of contig metrics
MLST: A multilocus sequence typing algorithm using BLAST and an MLST database
PlasmidFinder: A plasmid identification algorithm using BLAST and a plasmid replicon database
ResFinder: An antimicrobial resistance gene identification algorithm using BLAST and a resistance gene database
pmlst: A plasmid multilocus sequence typing algorithm that identifies the plasmid multilocus sequence type in the input sample
VirulenceFinder: A virulence gene identification algorithm using BLAST and a virulence gene database

Isolate Overview

DEMO Isolate Overview

Map Visualisation: Cluster View. Color-based clusters (Influenza)

Map Visualisation: Cluster View. Predefined visualizations (Influenza)

Map Visualisation: Single Isolate View. Grouped by isolate/location (Dengue Fever)

Map Visualisation

Color Isolates Map Visualisation

DEMO Map Visualisation

Link Map Visualisation

Exercises The Batch Upload

CGE Pipeline: Content
1. The User System
   Exercise 0: Create a user account
2. The Batch Upload
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Exercises

The Batch Upload
Batch Upload Website: https://cge.cbs.dtu.dk/services/cge/batch.php
Exercise 1: Fill out and upload
Remember to download the Excel template!
1. Provide mandatory values:
   - sample_name
   - file_names (contig file name)
   - pre_assembled (yes)
   - Other mandatory fields
2. Upload the Excel file
3. Upload the contigs file (Link)
4. Press Submit
If you are unsure how to fill out the template, take a look at this
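
Before uploading, a row of the template can be checked for the mandatory values named in this exercise. The sketch below is an assumption-laden illustration: it treats one template row as a plain dictionary, and it only knows the three field names listed on the slide (the "other mandatory fields" are not named there, so they are not checked).

```python
# Only the fields named on the slide; the template has further mandatory
# fields that are not listed here.
MANDATORY_FIELDS = ("sample_name", "file_names", "pre_assembled")

def missing_fields(row):
    """Return the mandatory fields that are absent or blank in one row.

    row is a dict mapping column name to cell value (hypothetical
    representation of one line of the metadata template).
    """
    missing = []
    for field in MANDATORY_FIELDS:
        value = row.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing
```

Usage: missing_fields({"sample_name": "s1", "file_names": "contigs.fasta", "pre_assembled": "yes"}) returns an empty list, meaning the row is ready as far as these three fields are concerned.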

The CGE Pipeline: Accessing the Pipeline Results
https://cge.cbs.dtu.dk/cge/user/isolate/isolate_manager.php
Exercise 2: View pipeline results
Genus and Species = ???
MLST = ???
Human Pathogen Pred. = ???
Resistance Genes = ???

The CGE Pipeline: Accessing the Pipeline Results
https://cge.cbs.dtu.dk/cge/user/isolate/isolate_manager.php
Exercise 2: View pipeline results
Genus and Species = Pseudomonas aeruginosa
MLST = Unknown ST
Human Pathogen Pred. = 0.753
Resistance Genes = Aminoglycoside, Betalactam, Phenicol

Isolate Overview: Accessing the list of isolates
https://cge.cbs.dtu.dk/services/cge/map.php
Exercise 3: Browse and color a set of samples
1. Load the Salmonella data (Link): click on the Metadata File button and upload the Excel file salmonella_metadata.xlsx
2. Use the search box to find two different subsets (e.g. chicken and pepper)
3. Color them using the palette at the top-right corner
4. Make a search query to show both groups using the OR operator
5. Restore colors
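
The effect of an OR query on the isolate list can be illustrated with a few lines of code. The tool's real query syntax is not described in the slides beyond the OR operator, so the parsing below (terms separated by " OR ", case-insensitive substring match over all metadata values) is an assumption made for the example.

```python
def matches(row, query):
    """Return True if any OR-separated term occurs in the row's values.

    row is a dict of metadata values; the query format is a guess at how
    a search box with an OR operator might behave.
    """
    terms = [t.strip().lower() for t in query.split(" OR ") if t.strip()]
    text = " ".join(str(v).lower() for v in row.values())
    return any(term in text for term in terms)

def search(rows, query):
    # Keep only the rows that match at least one term of the query.
    return [row for row in rows if matches(row, query)]
```

Usage: search(rows, "chicken OR pepper") keeps the isolates whose metadata mention either source, which is the combined group the exercise asks for.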

Map Visualisation
https://cge.cbs.dtu.dk/services/cge/map.php
Exercise 4: Visualise isolates on the map
1. Go back to Exercise 3 if you don't have the data ready
2. Use the search box to color two different subsets (e.g. human and animal)
3. Click on the Map tab
4. Explore the Clusters view (zoom in/out)
5. Click on the Isolates view and find your two coloured subsets
6. Apply different filters to the isolates

Map Visualisation (Hexagonal binning)
https://cge.cbs.dtu.dk/tools/client/hexmap
Exercise 5: Filter timeline
1. Go back to Exercise 3 if you don't have the data ready
2. Load the Salmonella data
3. Click on the Map tab
4. Select human as source
5. Filter the timeline to see isolates from 2010 to the present
6. Inspect the distribution of isolates over time
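
What the timeline filter does to the data can be sketched offline: keep isolates of one source within a year range and count them per year. The column names "source" and "collection_date", and the ISO date format, are assumptions for illustration; the real template may name or format them differently.

```python
from collections import Counter
from datetime import date

def isolates_per_year(rows, source, start_year, end_year=None):
    """Count isolates of one source per collection year in a year range.

    rows are dicts with hypothetical keys "source" and "collection_date"
    (ISO format, e.g. "2011-03-02"). end_year defaults to the current year.
    """
    if end_year is None:
        end_year = date.today().year
    counts = Counter()
    for row in rows:
        year = date.fromisoformat(row["collection_date"]).year
        same_source = row["source"].lower() == source.lower()
        if same_source and start_year <= year <= end_year:
            counts[year] += 1
    # Sort by year so the distribution reads chronologically.
    return dict(sorted(counts.items()))
```

Usage: isolates_per_year(rows, "human", 2010) mirrors the exercise, selecting human as source and restricting the timeline to 2010 onwards.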

Acknowledgments: Martin Thomsen, Johanne Arhenfeldt, Vanessa Jurtz