Updates from CDC: Cluster Detection and Reporting Guidelines

1 National Center for Emerging and Zoonotic Infectious Diseases. Updates from CDC: Cluster Detection and Reporting Guidelines. Molly Leeper, Salmonella Database Manager. PulseNet Western Regional Meeting, February 2019.

2 Update to PulseNet's Transition to WGS for Foodborne Surveillance
- WGS is the standard subtyping method for Listeria.
- Campylobacter, Salmonella and STEC/Shigella surveillance will begin to use WGS as the standard subtyping method this year (expected timeline: March 2019). At that time, laboratories will be requested to transition to WGS.
- For other PulseNet organisms (Vibrio, Yersinia), laboratories may continue to pulse isolates (run PFGE) or perform WGS as funding allows.
- Laboratories are in the process of converting existing PFGE databases to BioNumerics 7.6. The expectation is that all labs will be converted by March.

3 BioNumerics v7.6 Conversion
- Labs are in the process of converting their PFGE databases to BioNumerics v7.6.
- Once labs are converted, they can request WGS analysis certification sets; to do this:
- The PulseNet team at CDC has posted training documents covering PFGE and WGS analysis and management of data in BioNumerics v7.6 to the PulseNet SharePoint site: Library of PulseNet Documents > WGS > PHL Upgrade to BioNumerics v7.6.
- WGS analysis in BioNumerics v7.6 is expected to be available to certified individuals in March.

5 Database Validation
- Outbreaks detected by PFGE with good epi data were compared using hqSNP, cgMLST and wgMLST analyses to determine which method worked best to separate outbreak from sporadic cases.
- [Figure: the same set of PNUSAS isolates compared by hqSNP, cgMLST and wgMLST, with differences shown as median [minimum, maximum]; isolate IDs and most node values are not recoverable.]
- hqSNP: cluster is 0-5 SNPs
- cgMLST: cluster is 0-4 alleles
- wgMLST: cluster is 0-4 alleles

6 Thresholds for Detecting WGS Clusters
- cgMLST will be used to detect clusters (targets the core genome).
- wgMLST may be used to further discriminate if necessary (targets the entire genome).
- Look for local clusters of sequences within 10 alleles by cgMLST, with at least two of those sequences being within 5 alleles.
- Labs may want to report historical sequences that are closely related to newly detected clusters.
- Allele differences within a cluster may be larger or smaller depending on the organism and epi data.
- There can be similar strains by WGS that may not be epidemiologically linked.
- More clonal species/serotypes may have smaller allele differences.
- Zoonotic outbreaks may have larger allele differences.
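The threshold rule on this slide can be sketched in a few lines of Python. This is only an illustration, not how BioNumerics works internally: it assumes pairwise cgMLST allele differences are already available, it groups isolates by single linkage at 10 alleles (the slide does not say which linkage to use), and the function name, isolate IDs and distances are all hypothetical.

```python
from itertools import combinations


def candidate_clusters(ids, dist, outer=10, inner=5):
    """Group isolates linked at <= `outer` alleles (single linkage, an
    assumption), then keep groups with at least one pair <= `inner` alleles."""
    parent = {i: i for i in ids}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(ids, 2):
        if dist[frozenset((a, b))] <= outer:
            parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)

    kept = []
    for members in groups.values():
        pairs = combinations(members, 2)
        if any(dist[frozenset(p)] <= inner for p in pairs):
            kept.append(sorted(members))
    return sorted(kept)


# Hypothetical isolates and pairwise cgMLST allele differences.
example_ids = ["A", "B", "C", "D", "E"]
example_dist = {
    frozenset(pair): value
    for pair, value in {
        ("A", "B"): 3, ("A", "C"): 8, ("A", "D"): 40, ("A", "E"): 42,
        ("B", "C"): 9, ("B", "D"): 41, ("B", "E"): 39,
        ("C", "D"): 38, ("C", "E"): 40, ("D", "E"): 7,
    }.items()
}

print(candidate_clusters(example_ids, example_dist))
```

In this made-up data, A, B and C are reported because they are linked within 10 alleles and A and B are within 5. D and E are within 10 alleles of each other but not within 5, so they are not reported.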

7 Cluster Detection Methods

| Tool | Listeria | Salmonella | Escherichia | Campylobacter |
|---|---|---|---|---|
| Core genome MLST (cgMLST) | Yes | Yes | Yes | Yes |
| Whole genome MLST (wgMLST): if further analysis needed | Yes | Yes | Yes | Yes |
| SNP analysis: if further analysis needed | Yes | Yes | Yes | Yes |
| Fast Character Matching | Yes | Yes | Yes | Yes |
| Allele code nomenclature | Yes | Available March | Available March | Available March |
| Find Clusters Tool | Available late February 2019 | Available March 2019 | Available March 2019 | Available March 2019 |

8 Cluster Detection Method: 60 or 120 Day Search
- Select entries uploaded in the past 60 or 120 days.
- Different allele schemes can be chosen (cgMLST, wgMLST, etc.).

9 Cluster Detection Method: 60 or 120 Day Dendrogram
- Allele differences can be viewed by right-clicking on nodes.
- The number of allele differences is shown as a median and range [minimum, maximum].
- The similarity matrix and the differences between clades can be exported into Excel.
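To make the "median [minimum, maximum]" notation concrete, here is a minimal Python sketch. It assumes the summary is taken over all pairwise allele differences within a group or between two groups; the slides do not state exactly how BioNumerics computes it, and the helper names and distances are hypothetical.

```python
from itertools import combinations, product
from statistics import median


def summarize(values):
    """Format allele differences as 'median [minimum, maximum]'."""
    return f"{median(values):.1f} [{min(values)}, {max(values)}]"


def within_group(members, dist):
    """Summarize all pairwise differences inside one group (needs >= 2 members)."""
    return summarize([dist[frozenset(p)] for p in combinations(members, 2)])


def between_groups(group_a, group_b, dist):
    """Summarize all pairwise differences between two groups."""
    return summarize([dist[frozenset(p)] for p in product(group_a, group_b)])


# Hypothetical pairwise allele differences for four isolates.
example_dist = {
    frozenset(pair): value
    for pair, value in {
        ("S1", "S2"): 0, ("S1", "S3"): 13, ("S1", "S4"): 17,
        ("S2", "S3"): 14, ("S2", "S4"): 16, ("S3", "S4"): 1,
    }.items()
}
cluster_1 = ["S1", "S2"]
cluster_2 = ["S3", "S4"]

print(within_group(cluster_1, example_dist))
print(between_groups(cluster_1, cluster_2, example_dist))
```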

10 Cluster Detection Method: wgSNP Analysis in BioNumerics
- Use for further analysis of clusters.
- First, create a new experiment/sequence type (this must be done for each cluster/analysis).
- Next, map to the reference strain (if using an existing de novo assembly, click on the de novo experiment green dot for the reference you want to use, or import a closed genome as a reference).
- Next, select entries for comparison and submit to the CE (calculation engine). When the analysis is finished, retrieve the jobs.
- Next, run the SNP analysis (choose Analysis > Sequence types > Start SNP Analysis). Apply customized SNP filters.
- Last, export the entries to a comparison and create a SNP tree.

11 Cluster Detection Method: wgSNP Analysis in BioNumerics
- Selecting a node in the tree gives you the SNP differences between cases.
- These strains differ by 24 SNPs based on the reference chosen.

12 Cluster Detection Method: Fast Character Matching (FCM)
- cgMLST is the default character view.
- The results can be restricted to only include entries in a specific date range or database field.
- Can search for a specific number of allele differences.
- Can choose how results are shown.
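The idea behind searching for entries within a given number of allele differences can be sketched as follows. This is not the BioNumerics FCM implementation: it assumes each entry is a mapping from locus name to allele number, and that loci missing in either entry are skipped, which is one common convention but is not stated on the slide. All names and profiles are hypothetical.

```python
def allele_differences(profile_a, profile_b):
    """Count loci called in both profiles whose allele numbers differ.
    Loci that are absent or None in either profile are ignored (assumption)."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def match_within(query_key, profiles, max_diff):
    """Return (key, differences) for entries within `max_diff` alleles of the query."""
    query = profiles[query_key]
    hits = []
    for key, profile in profiles.items():
        if key == query_key:
            continue
        n = allele_differences(query, profile)
        if n <= max_diff:
            hits.append((key, n))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


# Hypothetical four-locus profiles; real cgMLST schemes have thousands of loci.
example_profiles = {
    "Q": {"locus1": 1, "locus2": 1, "locus3": 1, "locus4": 1},
    "X": {"locus1": 1, "locus2": 1, "locus3": 2, "locus4": 1},
    "Y": {"locus1": 2, "locus2": 2, "locus3": 2, "locus4": 1},
    "Z": {"locus1": 1, "locus2": None, "locus3": 1, "locus4": 1},
}

print(match_within("Q", example_profiles, max_diff=2))
```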

13 Cluster Detection Method: Allele Codes
- Allele codes are built on percent similarity thresholds between core genomes (cgMLST) to form a stable allele code, similar to a pattern name.
- These names can be used for cluster detection, because the name indicates how closely related isolates are.
- Names can be complete or partial depending on how they relate on the tree from which the nomenclature was built.
- QC is built into the nomenclature, so strains will not be named if the core genome falls below 95% or the genome size is incorrect.
- Naming and thresholds of relatedness will vary by organism.

14 Cluster Detection Method: Allele Codes
- All uploads that pass quality checks will receive an allele code, which can then be downloaded into local databases.
- Poor-quality sequences will receive a failed allele code and should be resequenced. The failed codes are: FAILED QC: CORE; FAILED QC: LENGTH; FAILED QC: CORE, LENGTH.
- Compare entries within the past 60 or 120 days that share allele codes up to the cluster detection threshold (may vary depending on organism).
- Allele codes will be available in SEDRIC.
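A minimal sketch of how the three failed codes could arise from the two QC checks. The 95% core threshold and the label strings come from the slides; the function name, the parameters, and especially the genome length bounds are placeholders, since the slides do not give the accepted size range for any organism.

```python
def allele_code_qc(core_percent, genome_length, min_length, max_length,
                   min_core_percent=95.0):
    """Return a failed-QC label, or None if the sequence passes both checks.
    `min_length` and `max_length` are organism-specific and hypothetical here."""
    failures = []
    if core_percent < min_core_percent:
        failures.append("CORE")
    if not (min_length <= genome_length <= max_length):
        failures.append("LENGTH")
    if failures:
        return "FAILED QC: " + ", ".join(failures)
    return None


# Placeholder length bounds, for illustration only.
BOUNDS = {"min_length": 2_700_000, "max_length": 3_200_000}

print(allele_code_qc(98.5, 2_950_000, **BOUNDS))
print(allele_code_qc(93.0, 2_950_000, **BOUNDS))
print(allele_code_qc(90.0, 1_000_000, **BOUNDS))
```

A sequence that returns a label here would be one to resequence rather than to include in a cluster search.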

15 Cluster Detection Method: Find Clusters Tool
Note: the recommendations below are for Listeria.
1. Look at the allele code up to the 5th digit (~7 allele differences).
2. Use human entries only when looking for clusters of cases.
3. Cluster size depends on the lab: a) at the national level, we look at 3; b) at the local level, this may need to change to 2, for example.
4. Use the last 120 days for Listeria.
5. Found clusters are displayed below with the allele code and number of cases.
Note: defaults can be changed to match on any digit in the allele code (1st, 2nd, 3rd, 4th, 5th, or 6th), and to include historical cases, nonhuman entries, a different cluster size, and/or a different number of days to check.

16 Cluster Detection Method: Find Clusters Tool
6. Select clusters and choose OK to open a comparison of those entries in BioNumerics.

17 Cluster Detection Method: Find Clusters Tool
- Another option: search for the allele code (up to the 4th or 5th digit) of the identified cluster using the "find entries in list" option. Only include the numbers of the allele code.
- Select your identified cluster and create a dendrogram.
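The Find Clusters settings described on these slides (match digit, human entries only, minimum cluster size, lookback window) can be sketched in Python. This is not the BioNumerics tool itself. It assumes the numeric part of an allele code is a dot-separated string of digits, which the slides do not spell out, and the field names, keys and dates are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def code_prefix(allele_code, digits):
    """Return the first `digits` fields of a dot-separated code, or None if
    the code failed QC or is a partial name shorter than `digits`."""
    if allele_code.upper().startswith("FAILED"):
        return None
    parts = allele_code.split(".")
    if len(parts) < digits:
        return None
    return ".".join(parts[:digits])


def find_clusters(entries, today, digits=5, min_size=3, days=120, human_only=True):
    """Group recent entries that share an allele code prefix."""
    cutoff = today - timedelta(days=days)
    groups = defaultdict(list)
    for entry in entries:
        if human_only and entry["source_type"] != "Human":
            continue
        if entry["upload_date"] < cutoff:
            continue
        prefix = code_prefix(entry["allele_code"], digits)
        if prefix is not None:
            groups[prefix].append(entry["key"])
    return {p: keys for p, keys in groups.items() if len(keys) >= min_size}


# Hypothetical entries; only the numeric part of each allele code is used.
example_entries = [
    {"key": "K1", "allele_code": "1.1.2.3.4.1", "source_type": "Human", "upload_date": date(2019, 1, 10)},
    {"key": "K2", "allele_code": "1.1.2.3.4.2", "source_type": "Human", "upload_date": date(2019, 1, 15)},
    {"key": "K3", "allele_code": "1.1.2.3.4", "source_type": "Human", "upload_date": date(2018, 12, 20)},
    {"key": "K4", "allele_code": "1.1.2.3.4.1", "source_type": "Food", "upload_date": date(2019, 1, 12)},
    {"key": "K5", "allele_code": "1.1.2.3.4.1", "source_type": "Human", "upload_date": date(2018, 6, 1)},
    {"key": "K6", "allele_code": "FAILED QC: CORE", "source_type": "Human", "upload_date": date(2019, 1, 20)},
    {"key": "K7", "allele_code": "1.1.2.9.1.1", "source_type": "Human", "upload_date": date(2019, 1, 5)},
]

print(find_clusters(example_entries, today=date(2019, 2, 1)))
```

With the defaults shown (5th digit, 3 cases, 120 days, human only), K1, K2 and K3 form one cluster. K4 is excluded as nonhuman, K5 as too old, and K6 because it failed QC. A local lab could pass `min_size=2`, as the slide suggests.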

18 Now that I know how to find clusters, which method should I use?
This will probably vary by lab, but a combination of methods may be helpful.
FCM
- FCM all new entries.
- Use the FCM parameters to search within the past 60 or 120 days or within a certain database field.
- Note: it may be helpful to narrow down the search by species/serotype.
cgMLST Dendrogram
- Keep a saved comparison of the past 60 or 120 days.
- Add all new entries to the comparison and create a cgMLST dendrogram.
- Note: it may be helpful to save multiple comparisons based on species/serotype.
Allele Codes and Find Clusters Tool
- Wait for an allele code to be assigned to the uploaded entry.
- Download the allele code.
- Use the Find Clusters tool to perform a cluster search.
Allele Codes and FCM
- Wait for an allele code to be assigned to the uploaded entry.
- Download the allele code.
- Search for closely related entries uploaded in the past 60 or 120 days using the allele code or FCM.

19 Query the National Database for Closely Related Matches
- Query a field in the national database to temporarily download allele calls and metadata uploaded by labs other than your own.

20 Query the National Database for Closely Related Matches
- Fast Match Selection Against Complete Server can be used to find closely related matches to your entry.
- Note: if searching by wgMLST, allele differences may be higher.

21 Post to SharePoint
- Once a cluster has been detected, post the cluster to SharePoint.
- Include key numbers, allele code(s), collection dates, and epi information if available.
- Bundle files do not need to be posted, since all good-quality uploads will receive allele codes within 24 hours.
- CDC database managers will review postings and respond with a cluster code, line list and sequencing data.
- Cluster code example: 1806GAGX6-1WGS, read in order as Year (18), Month (06), LabID* (GA), Organism Code (GX6), Cluster # (1).
*ML is used for multi-state clusters.
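For labs that track cluster codes in their own scripts, the code layout shown on this slide can be split into its parts. The sketch below is built only from the two codes that appear in this deck (1806GAGX6-1WGS and 1810MLJKX-1). The field widths (two-letter lab ID, three-character organism code), the optional WGS suffix and the assumption that the two-digit year belongs to the 2000s are inferred from those examples and may not hold for every code.

```python
import re
from typing import NamedTuple, Optional


class ClusterCode(NamedTuple):
    year: int
    month: int
    lab_id: str
    organism_code: str
    cluster_number: int
    wgs_suffix: bool
    multi_state: bool


# Assumed layout: YYMM + 2-letter lab ID + 3-character organism code
# + "-" + cluster number + optional "WGS".
_PATTERN = re.compile(r"^(\d{2})(\d{2})([A-Z]{2})([A-Z0-9]{3})-(\d+)(WGS)?$")


def parse_cluster_code(code: str) -> Optional[ClusterCode]:
    """Split a cluster code into its fields, or return None if it does not fit."""
    match = _PATTERN.match(code.strip())
    if match is None:
        return None
    yy, mm, lab, organism, number, wgs = match.groups()
    if not 1 <= int(mm) <= 12:
        return None
    return ClusterCode(
        year=2000 + int(yy),          # assumes 21st-century codes
        month=int(mm),
        lab_id=lab,
        organism_code=organism,
        cluster_number=int(number),
        wgs_suffix=wgs is not None,
        multi_state=(lab == "ML"),    # per the slide, ML marks multi-state clusters
    )


print(parse_cluster_code("1806GAGX6-1WGS"))
print(parse_cluster_code("1810MLJKX-1"))
```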

22 What should I send to my epidemiologists?
Reports describing the clusters:
- Number of isolates included.
- Outbreak code (if available) and allele code(s) involved in the cluster. Both can be downloaded from the national database.
- Allele differences for the cluster.
- Information regarding any relevant historical matches (past outbreaks, non-human, etc.).
- Closely related sequences in other states.

23 What should I send to my epidemiologists?
- Line lists containing allele codes and relevant demographic information.
- Allele codes will also be available in SEDRIC.
- Example line list columns: Key, WGS_id, NCBI_ACCESSION, SRR_ID, Allele_code, Outbreak, SourceType, SourceSite, PatientAgeYears, PatientSex, SourceCounty, SourceCity, IsolatDate, ReceivedDate, PulseNet_UploadDate.
- [Table: three example rows of human isolates (source sites Blood, Blood, Abdominal Fluid) with an outbreak code ending in TXGX6-1; identifiers and dates are truncated in this transcription.]

24 What should I send to my epidemiologists?
- Dendrograms or similarity matrices exported into PowerPoint or Excel.
- Mark allele differences using BioNumerics.
- Use groups to highlight entries of interest.
- Include clade differences if two clusters are closely related or are being investigated together.
- [Figure: wgmlst_v3 (Core) dendrogram of PNUSAS isolates grouped as Cluster #1, Cluster #2 and Cluster #3; isolate IDs are not recoverable.]

Clade differences, shown as median [minimum, maximum]:

| | Cluster #1 | Cluster #2 | Cluster #3 |
|---|---|---|---|
| Cluster #1 | 0.0 [0, 0] | 14.0 [13, 17] | 22.0 [19, 26] |
| Cluster #2 | 14.0 [13, 17] | 0.0 [0, 1] | 27.0 [24, 30] |
| Cluster #3 | 22.0 [19, 26] | 27.0 [24, 30] | 1.0 [0, 3] |

25 What should I send to my epidemiologists?
- Notify epis when new isolates are included.
- Exporting weekly dendrograms may not be necessary.
- [Table: example line list of 15 stool isolates in multi-state cluster 1810MLJKX-1, with columns Allele Differences, WGS_id, Key, Outbreak, PFGE-XbaI-pattern, SourceSite, PatientSex and IsolatDate. The first row is labeled 0-3 alleles; most identifiers, allele difference values and dates are truncated in this transcription.]

26 Thank you. For more information, contact CDC. #PulseNet
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.