From Bands to Base Pairs: Implementation of WGS in a PulseNet Laboratory


2 From Bands to Base Pairs: Implementation of WGS in a PulseNet Laboratory. Sara Wagner, Microbiologist, WI State Lab of Hygiene. InFORM Meeting, Nov 19, 2015

3 Objectives: Describe WGS implementation at WSLH (What has gone well? What challenges have been experienced?). Discuss the impact of WGS in WI (impact on WSLH; impact on WI Division of Public Health). Discuss the future of WGS at WSLH. Generate discussion: what are other labs experiencing with WGS implementation?

4 WGS Implementation at WSLH: Start-up Considerations. Instrumentation; Building/Space Considerations; Start-up Cost/Funding Issues

5 Instrumentation at WSLH: Illumina MiSeq. The interface is user friendly. We are able to access Illumina's online portal and cloud storage, BaseSpace. The service contract is expensive but has been worth the money for WSLH.

6 Building/Space Considerations: MiSeq instruments have very specific requirements for laboratory placement: 24 in. of space on each side and above, and 4 in. behind the machine. The instrument should sit on a surface that minimizes vibrations and not near equipment that causes vibrations. The room needs to be temperature controlled; the MiSeq is sensitive to high temperatures.

7 Building/Space Considerations: Dedicated space is important in order to minimize the risk of contamination during pre-amplification. Storage is needed at different temperatures as well: -20 °C freezers, refrigerators, and room temperature. Reagents are stored in a variety of locations.

8 Start-Up Cost/Funding Issues
Initial cost is high just for equipment*:
  Illumina MiSeq machine and service contract: $115,000
  Qubit 2.0 Fluorometer: $2,035
  NanoDrop 2000 UV-Vis: $10,000
Reagent cost is also high:
  Nextera XT DNA Sample Preparation Kit, 96 samples: $2,755
  Nextera XT Index Kit for 96 indices: $950
  MiSeq Reagent Kit v2, 500 cycles: $1,100
The reagent kit is a one-time-use reagent.
Achieving ongoing funding can be difficult.
*Costs of reagents and platforms may vary by site.

9 Impact of WGS on WSLH: Workflow; Staffing/Training; New Language and Concepts; Data Analysis; Ongoing Costs and Funding Sources

10 Workflow Differences
PFGE:
  Turnaround time: ~2 days
  Pure isolates; uploading; PFGE gels
  Can run up to 11 specimens on a gel, based on the size of the gel being used
  Throughput depends on the number of CHEF Mappers
  Isolates receive pattern names after uploading to the PulseNet national database
WGS:
  Turnaround time: ~4 days
  Pure isolates; extraction; library prep; MiSeq run; uploading
  Can run up to 16 isolates with one 500-cycle kit, based on the size of the genome and desired coverage
  Throughput depends on the number of MiSeq machines
  Isolates are uploaded to NCBI and shared with CDC PulseNet; cluster data can be requested
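The dependence of isolates per run on genome size and desired coverage can be sketched with simple arithmetic. This is a rough illustration only: the run yield of about 8 Gb and the 100x coverage target are assumptions chosen for the example, not figures from the slides, and the function name is hypothetical.

```python
def max_isolates_per_run(run_yield_bases, genome_size_bases, target_coverage):
    """Largest number of isolates that can each reach the target coverage."""
    bases_per_isolate = genome_size_bases * target_coverage
    return int(run_yield_bases // bases_per_isolate)


# Assumed values for illustration only (not taken from the presentation).
ASSUMED_RUN_YIELD = 8_000_000_000   # ~8 Gb from one 500-cycle kit (assumption)
ASSUMED_COVERAGE = 100              # desired average depth (assumption)

for label, genome_size in [("~5 Mb genome", 5_000_000), ("~3 Mb genome", 3_000_000)]:
    n = max_isolates_per_run(ASSUMED_RUN_YIELD, genome_size, ASSUMED_COVERAGE)
    print(f"{label}: up to {n} isolates per run")
```

Under these assumed numbers, a smaller genome or a lower coverage target allows more isolates on the same cartridge, which is the trade-off the slide describes.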

11 Workflow Differences: There is a maintenance component to WGS. Washes need to be performed to keep the machine working optimally: post-run wash, template line wash, maintenance wash, and stand-by wash. As with any computer, the MiSeq needs to be restarted. Data needs to be moved from the MiSeq to a server for longer storage.

12 Staffing and Training: Training has taken more time than expected (availability of time for trainer and trainee; long learning curve; certification panel available). It has been difficult to work WGS into the normal rotation. It is much easier to have dedicated people or a dedicated rotation; this may be a possibility when there is higher throughput.

13 New Language and Concepts: Controls vs. Metrics
PFGE uses control strains on each gel vs. standard control.
WGS uses metrics to determine the validity of a run:
  Cluster density (number of clusters on the flow cell)
  Clusters passing filter (number of clusters with optimum intensity)
  Q30 (quality score)
  FWHM (focusing metric)
  Coverage (calculated off of the total number of sequences)
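Two of these metrics reduce to short formulas. A Phred quality score Q corresponds to an error probability of 10^(-Q/10), so Q30 means roughly 1 error in 1,000 base calls, and average coverage is total sequenced bases divided by genome size. The sketch below assumes FASTQ quality strings in the common Phred+33 encoding; the function names and sample strings are made up for illustration and this is not how the instrument software itself reports the values.

```python
def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def fraction_q30(quality_strings, offset=33):
    """Fraction of bases with Phred score >= 30 (assumes Phred+33 encoding)."""
    total = 0
    passing = 0
    for qual in quality_strings:
        scores = [ord(ch) - offset for ch in qual]
        total += len(scores)
        passing += sum(1 for s in scores if s >= 30)
    return passing / total if total else 0.0


def mean_coverage(total_bases, genome_size):
    """Average depth: total sequenced bases divided by genome length."""
    return total_bases / genome_size


# Made-up quality strings: 'I' = Q40, '?' = Q30, '5' = Q20, '#' = Q2.
example_quals = ["II??", "55##"]
print(f"Fraction of bases >= Q30: {fraction_q30(example_quals):.2f}")
print(f"Error probability at Q30: {error_probability(30):.4f}")
print(f"Coverage for 500 Mb of reads on a 5 Mb genome: {mean_coverage(500_000_000, 5_000_000):.0f}x")
```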


16 Data Analysis: A variety of applications can be run from BaseSpace. FastQC provides data on individual specimens and contains graphical representations of sequence statistics.

17 Data Analysis: hqSNP (high-quality single-nucleotide polymorphism) trees. These trees are not something most labs can currently create without bioinformatician support. Bootstrap values are a numerical indicator of confidence in the placement of the specimens on the tree. How many SNPs or alleles apart is significant? This can depend on the analysis of the data and the reference being used.
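To make "SNPs apart" concrete, the sketch below counts differing positions between already-aligned sequences. It is a deliberately simplified illustration, not the hqSNP pipeline used by CDC or any lab: real pipelines map reads to a reference and filter low-quality sites first. The isolate names and sequences are invented.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count positions where two aligned sequences differ, skipping N and gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_distances(aligned):
    """SNP distance for every pair of isolates in a name -> sequence mapping."""
    return {
        (x, y): snp_distance(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }


# Invented example data; real alignments span thousands of variable sites.
isolates = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACCTANGTAT",
}
for pair, dist in pairwise_distances(isolates).items():
    print(pair, dist)
```

Because the count depends on which sites are kept and which reference is used, the same pair of isolates can yield different distances under different analyses, which is why no single cutoff applies.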

18 Ongoing Costs and Funding Issues: There is a high cost per test; a run costs around $1,600 in our hands. It is most cost effective to run as many specimens as possible on a cartridge. There are a variety of places to try to gain funding in the Epidemiology and Laboratory Capacity (ELC) Grant: Advanced Molecular Detection (AMD), PulseNet, FoodCORE, and COE.
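The effect of filling a cartridge can be shown with the figures from the slides: a run cost of about $1,600 and a maximum of 16 isolates per 500-cycle kit. The sketch below assumes the run cost is fixed regardless of how many isolates are loaded, which is a simplification since per-sample library prep costs scale with the number of isolates; the function name is hypothetical.

```python
def cost_per_isolate(run_cost, isolates_on_run):
    """Run cost split evenly across the isolates on one cartridge."""
    if isolates_on_run < 1:
        raise ValueError("a run needs at least one isolate")
    return run_cost / isolates_on_run


RUN_COST = 1600.0  # approximate run cost quoted in the presentation

for n in (4, 8, 12, 16):
    print(f"{n:>2} isolates on the run: ${cost_per_isolate(RUN_COST, n):,.2f} per isolate")
```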

19 Impact of WGS on WDPH. New language/technology: epis will need to learn how to interpret WGS data, dealing with heat maps/SNP data vs. cluster names. Managing data that has been de-identified: PNUSA numbers vs. laboratory numbers. Cluster investigation: increased follow-up (more clusters identified) compared to PFGE; decreased follow-up (can differentiate within a PulseNet cluster). There will need to be epi training on WGS interpretation.


21 Future Considerations: Troubleshooting; Bioinformaticist vs. BioNumerics v7.5; Data storage (BaseSpace): sharing, costs; Increased lab footprint/space demands; Streamlining workflow

22 Troubleshooting: Problems that arise are not always easy to troubleshoot. The same MiSeq error can come from multiple issues. Example: a best focus error can happen from a bad motor, over-clustering, debris, a dirty flow cell, etc. Reagent issues also arise. Illumina and CDC have been very helpful with diagnosing a variety of problems that we have had.

23 Bioinformaticist vs. BioNumerics v7.5: Currently WSLH does not have a bioinformaticist on staff. We have been relying on CDC as well as the NY State Department of Health to provide data interpretation for us. BioNumerics v7.5 will bring a lot of the analysis into the laboratory. Having a bioinformatician on staff may become necessary in the future, especially if you intend to use WGS for non-PulseNet organisms. CDC is working on WGS pipelines for other organisms.

24 Data Storage: Illumina gives you 1 TB of storage in the BaseSpace cloud for free. No current prices are available for how much more storage will cost. Even just saving the fastq.gz files, the raw sequences, takes 5-6 GB for each run. BaseSpace allows for sharing data easily. Data management after the 1 TB limit will need to be incorporated: what data is important to save, and what data can be discarded?
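The storage figures above give a quick estimate of how long the free tier lasts. The sketch assumes 1 TB is counted as 1,000 GB and that only the fastq.gz files are kept; the runs-per-month figure is a made-up planning input, not a number from the presentation.

```python
def runs_until_full(quota_gb, gb_per_run):
    """Whole runs that fit in the quota if only fastq.gz files are kept."""
    return int(quota_gb // gb_per_run)


FREE_QUOTA_GB = 1000           # 1 TB free tier, counted as 1,000 GB (assumption)
ASSUMED_RUNS_PER_MONTH = 4     # hypothetical throughput for planning

for gb_per_run in (5, 6):
    runs = runs_until_full(FREE_QUOTA_GB, gb_per_run)
    months = runs / ASSUMED_RUNS_PER_MONTH
    print(f"{gb_per_run} GB per run: about {runs} runs, roughly {months:.0f} months at the assumed pace")
```

Keeping additional run output beyond the fastq.gz files would shorten this estimate, which is why deciding what to retain matters.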

25 Increased Lab Footprint: WGS takes up a lot of space within the lab to begin with. More space will be necessary if all PFGE specimens are run on WGS, and more machines may be necessary in the future. Space needs to be vibration free and temperature controlled. Reagents need to be kept in a -20 °C freezer. WSLH is already experiencing issues with finding enough space for reagents and kits during high-volume months. We have chosen to use newer frost-free freezers that are connected to the alarm system due to the cost of the reagents.

26 Streamlining: Running more isolates can lead to more efficiency. Isolates are currently batched, so MiSeq runs are intermittent. WSLH does not have a WGS rotation due to batching. It is easier for epis to follow up as well when information is timely. It could be possible to cross-train in the future. At WSLH both the bacteriology and virology units are performing WGS.

27 Summary: WGS has a variety of challenges to overcome, including space limitations, funding issues, new concepts, and communication. Different concepts and a different kind of workflow will need to be incorporated into laboratories for WGS. There are a variety of unknowns in the future of WGS, but data storage and data interpretation are going to be very important.

28 Acknowledgements: Wisconsin Department of Health: Rachel Klos, D.V.M., M.P.H.; Justin Kohl, M.P.H. MDH Public Health Laboratory. New York State Department of Health: William Wolfgang, Ph.D. PulseNet WGS staff: Eija Trees, Ph.D., D.V.M.; Heather Carleton, M.P.H., Ph.D.; Ashley Sabol, M.S.