CGE Pipeline

1 CGE Pipeline
Jose Luis Bellod Cisneros, PhD Student

Content
1. The User System
2. The Batch Upload
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Future Plans
7. Q&A

2 CGE Pipeline: Content
1. The Batch Upload
2. The User System
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Exercises

3 Data submission (batch upload)
The metadata (Excel, CSV, TSV...):
- Sequencing platform: Illumina, Ion Torrent...
- Organism, source, strain...
- Pathogenic
- Usage restrictions: private (delete?) or public
- Spatio-temporal location: country, city..., coordinates, collection date
The isolate files: raw read files or contigs
After upload, the CGE server, services and CGE DB:
1. Store files
2. Run services
3. Visualize/get results

4 Metadata Excel File

5 CGE Pipeline: Content
1. The User System (Exercise 1: Create a user account)
2. The Batch Upload
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Exercises

6 The CGE Pipeline
Why do we make a pipeline?
- For speed
- For user convenience
- For standardisation
- To enable comparison between isolates
How did we make a pipeline?
- We created a user platform
- We created a system for storing user-uploaded data
- We created a MySQL database to store all information regarding user and isolate associations
- We created an interface to talk to the MySQL database
  1. Add an isolate
  2. Add a service to an isolate
  3. Etc.
- We created a script to process the user-provided input and execute the services in a specific order
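
The database interface mentioned on this slide (add an isolate, add a service to an isolate) could look roughly like the sketch below. This is only an illustration: the table names, column names and function names are hypothetical, and Python's built-in sqlite3 is used as a stand-in for MySQL so the example runs without a database server.

```python
import sqlite3

def create_schema(conn):
    # Hypothetical schema: one row per isolate, one row per service run on it.
    conn.executescript("""
        CREATE TABLE isolates (
            id INTEGER PRIMARY KEY,
            owner TEXT NOT NULL,
            sample_name TEXT NOT NULL
        );
        CREATE TABLE isolate_services (
            id INTEGER PRIMARY KEY,
            isolate_id INTEGER NOT NULL REFERENCES isolates(id),
            service TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'queued'
        );
    """)

def add_isolate(conn, owner, sample_name):
    # Register an isolate for a user and return its database id.
    cur = conn.execute(
        "INSERT INTO isolates (owner, sample_name) VALUES (?, ?)",
        (owner, sample_name),
    )
    conn.commit()
    return cur.lastrowid

def add_service(conn, isolate_id, service):
    # Attach a service (for example KmerFinder) to an existing isolate.
    cur = conn.execute(
        "INSERT INTO isolate_services (isolate_id, service) VALUES (?, ?)",
        (isolate_id, service),
    )
    conn.commit()
    return cur.lastrowid

def services_for(conn, isolate_id):
    # List (service, status) pairs for one isolate, in insertion order.
    rows = conn.execute(
        "SELECT service, status FROM isolate_services "
        "WHERE isolate_id = ? ORDER BY id",
        (isolate_id,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
create_schema(conn)
iso = add_isolate(conn, "demo_user", "sample_01")
add_service(conn, iso, "KmerFinder")
add_service(conn, iso, "ResFinder")
print(services_for(conn, iso))
```

Keeping the user and isolate associations behind a few small functions like these means the pipeline script never has to build SQL itself.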

7 The CGE Pipeline
Flow of the pipeline:
1. Input: start Assembly and KmerFinder.
2. Assembly worked? If no, report failure. If yes, run ContigAnalyzer and ResFinder.
3. KmerFinder done? If no, wait. If yes, run MLST, PlasmidFinder, pMLST and VirulenceFinder.
4. All services done? If no, wait. If yes, report results.
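
The flow on this slide can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the actual CGE script: `run_service` is a hypothetical hook that runs one named service and raises an exception on failure, and the grouping of services into stages follows one reading of the flowchart.

```python
from concurrent.futures import ThreadPoolExecutor

# Stage grouping is an assumption based on the flowchart.
AFTER_ASSEMBLY = ["ContigAnalyzer", "ResFinder"]
AFTER_SPECIES = ["MLST", "PlasmidFinder", "pMLST", "VirulenceFinder"]

def run_pipeline(sample, run_service):
    """Run the services in flowchart order.

    run_service(name, sample) is a hypothetical callable that returns a
    result and raises an exception if the service fails.
    """
    results = {}
    with ThreadPoolExecutor() as pool:
        # Assembly and KmerFinder start together on the input.
        assembly = pool.submit(run_service, "Assembly", sample)
        species = pool.submit(run_service, "KmerFinder", sample)
        try:
            results["Assembly"] = assembly.result()
        except Exception as exc:
            # "Assembly worked? No": report failure and stop.
            return {"status": "failed", "error": str(exc)}
        first = {n: pool.submit(run_service, n, sample) for n in AFTER_ASSEMBLY}
        # "KmerFinder done?": block until the species prediction is ready.
        results["KmerFinder"] = species.result()
        second = {n: pool.submit(run_service, n, sample) for n in AFTER_SPECIES}
        # "All services done?": wait for everything before reporting.
        for name, future in {**first, **second}.items():
            results[name] = future.result()
    return {"status": "done", "results": results}

def fake_service(name, sample):
    # Placeholder service used only for this demonstration.
    return f"{name} finished for {sample}"

report = run_pipeline("sample_01", fake_service)
print(report["status"], sorted(report["results"]))
```

Waiting on KmerFinder before the typing services reflects that those services depend on knowing the species first.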

9 The User System

10 The User System

11 The User System

12 The CGE Pipeline
Service: Description
- Assembler: A de novo assembly pipeline
- KmerFinder: A species typing algorithm based on K-mer profiles
- ContigAnalyzer: In-house service for annotation of contig metrics
- MLST: A multilocus sequence typing algorithm using BLAST and an MLST database
- PlasmidFinder: A plasmid identification algorithm using BLAST and a plasmid replicon database
- ResFinder: An antimicrobial resistance gene identification algorithm using BLAST and a resistance gene database
- pMLST: A plasmid multilocus sequence typing algorithm that identifies the plasmid multilocus sequence type in the input sample
- VirulenceFinder: A virulence gene identification algorithm using BLAST and a virulence gene database

14 Isolate Overview

15 DEMO Isolate Overview

17 Cluster View Map Visualisation Color-based clusters Influenza

18 Cluster View Map Visualisation Influenza Predefined Visualizations

19 Single isolate View Map Visualisation Grouped by Isolate/location Dengue Fever

20 Map Visualisation

21 Color Isolates Map Visualisation

22 DEMO Map Visualisation

23 Link Map Visualisation

24 Exercises The Batch Upload

25 CGE Pipeline: Content
1. The User System (Exercise 0: Create a user account)
2. The Batch Upload
3. The Pipeline
4. The List Tool
5. The Map Tool
6. Exercises

26 The Batch Upload
Exercise 1: Fill out and upload (Batch Upload Website)
Remember to download the Excel template!
1. Provide mandatory values:
   - sample_name
   - file_names (contig file name)
   - pre_assembled (yes)
   - Other mandatory fields
2. Upload the Excel file
3. Upload the contigs file (Link)
4. Press Submit
If you are unsure how to fill out the template, take a look at this
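
Before uploading, it can help to check that the mandatory values are filled in. The sketch below is not part of the CGE tools; it assumes the metadata has been saved as tab-separated text, checks only the three mandatory fields named on this slide, and uses a hypothetical function name and made-up file names.

```python
import csv
import io

# Only the mandatory fields named on the slide; the template has others.
MANDATORY = ("sample_name", "file_names", "pre_assembled")

def check_metadata(text, delimiter="\t"):
    """Return (line_number, missing_fields) for every incomplete row."""
    problems = []
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        missing = [f for f in MANDATORY if not (row.get(f) or "").strip()]
        if missing:
            problems.append((line_no, missing))
    return problems

# Made-up example: the second data row lacks a sample name.
demo = (
    "sample_name\tfile_names\tpre_assembled\n"
    "sample_01\tcontigs.fsa\tyes\n"
    "\tcontigs2.fsa\tyes\n"
)
print(check_metadata(demo))  # [(3, ['sample_name'])]
```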

27 The CGE Pipeline: Accessing the Pipeline Results
Exercise 2: View pipeline results
- Genus and Species = ???
- MLST = ???
- Human Pathogen Pred. = ???
- Resistance Genes = ???

28 The CGE Pipeline: Accessing the Pipeline Results
Exercise 2: View pipeline results
- Genus and Species = Pseudomonas aeruginosa
- MLST = Unknown ST
- Human Pathogen Pred. =
- Resistance Genes = Aminoglycoside, Betalactam, Phenicol

29 The CGE Pipeline Accessing The Pipeline Results Exercise 2 View pipeline results

30 Isolate Overview: Accessing the list of isolates
Exercise 3: Browse and color a set of samples
1. Load the Salmonella data (Link): click on the Metadata File button and upload the Excel file salmonella_metadata.xlsx
2. Use the search box to find two different subsets (e.g. chicken and pepper)
3. Color them using the palette at the top-right corner
4. Make a search query to show both groups using the OR operator
5. Restore colors
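
To illustrate what an OR query does, here is a small sketch. It is not the list tool's actual search code, and the query syntax (terms separated by " OR ", matched case-insensitively against every metadata value) is an assumption, as are the sample records.

```python
def matches(isolate, query):
    """True if any OR-separated term occurs in any metadata value."""
    terms = [t.strip().lower() for t in query.split(" OR ") if t.strip()]
    haystack = " ".join(str(v) for v in isolate.values()).lower()
    return any(term in haystack for term in terms)

# Made-up records standing in for rows of the isolate list.
isolates = [
    {"sample_name": "s1", "source": "chicken"},
    {"sample_name": "s2", "source": "pepper"},
    {"sample_name": "s3", "source": "human"},
]

selected = [i["sample_name"] for i in isolates if matches(i, "chicken OR pepper")]
print(selected)  # ['s1', 's2']
```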

31 Map Visualisation
Exercise 4: Visualise isolates on the map
Go back to Exercise 3 if you don't have the data ready
1. Use the search box to color two different subsets (e.g. human and animal)
2. Click on the Map tab
3. Explore the Clusters view (zoom in/out)
4. Click on the Isolates view and find your two coloured subsets
5. Apply different filters to the isolates

32 Map Visualisation (Hexagonal binning)
Exercise 5: Filter timeline
Go back to Exercise 3 if you don't have the data ready
1. Load the Salmonella data
2. Click on the Map tab
3. Select human as source
4. Filter the timeline to see isolates from 2010 to the present
5. Inspect the distribution of isolates over time
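
The source and timeline filters in this exercise amount to a simple selection on two metadata fields. A minimal sketch, with hypothetical field names (`source`, `collection_date`) and made-up records, not the map tool's own code:

```python
from datetime import date

def filter_isolates(isolates, source, since, until=None):
    """Keep isolates from one source collected between since and until."""
    until = until or date.today()  # "to the present" by default
    return [
        i for i in isolates
        if i["source"] == source and since <= i["collection_date"] <= until
    ]

# Made-up records for illustration only.
isolates = [
    {"sample_name": "s1", "source": "human", "collection_date": date(2008, 5, 1)},
    {"sample_name": "s2", "source": "human", "collection_date": date(2012, 3, 9)},
    {"sample_name": "s3", "source": "animal", "collection_date": date(2015, 7, 2)},
]

recent_human = filter_isolates(isolates, "human", since=date(2010, 1, 1))
print([i["sample_name"] for i in recent_human])  # ['s2']
```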

33 Acknowledgments
- Martin Thomsen
- Johanne Ahrenfeldt
- Vanessa Jurtz