Fragment analysis: RFLP, VNTR, MLVA: looking for differences. Kristin Elwin Cryptosporidium Reference Unit, Wales, UK


1 Fragment analysis: RFLP, VNTR, MLVA: looking for differences Kristin Elwin Cryptosporidium Reference Unit, Wales, UK

2 What differences do we mean? Differences between parasite genera, e.g. Giardia and Cryptosporidium: symptoms, aetiology, traditional diagnosis, multiplex multi-organism screening PCR. Differences between parasite species, e.g. Cryptosporidium parvum and C. hominis: species-specific PCR or sequencing. Differences within parasite species, e.g. Giardia duodenalis assemblage A and assemblage B, or within C. parvum: assemblage-specific PCR, or single- or multilocus typing.

3 When would we look for differences? Prospectively, in routine food safety testing? During, or retrospectively after, an outbreak of illness, or in catchment studies to establish the genetic variation in potential sources such as drinking water or recreational waters.

4 Why would we look for differences? Investigation of clusters of disease: establishing relationships in potential outbreaks of disease, source identification and routes of transmission. E.g. 20 people present with an illness, and 18 have a common exposure, e.g. a restaurant; two do not. Typing can help support epidemiological analyses by eliminating those two cases from the investigation, helping to focus on a potential source.

5 Why would we look for differences? Investigation of clusters of disease: establishing relationships in potential outbreaks of disease, source identification and routes of transmission. Illness caused by parasite A is associated with restaurant X: cases ate a range of common and unique foods. Detection and typing of organisms in these foods can help to establish the source of infection.

6 Why would we look for differences? Investigation of clusters of disease: establishing relationships in potential outbreaks of disease, source identification and routes of transmission.

7 Looking for differences. Genetic polymorphisms are the basis of differences between species; sometimes they confer phenotypic differences, sometimes they do not (e.g. Cryptosporidium spp. SSU rDNA).

8 Differences between sequences. Single Nucleotide Polymorphisms (SNPs) have been used to differentiate between species and genotypes (and other taxonomic units). Nucleotide differences between two species at a particular position in a particular locus sometimes create or remove a Restriction Enzyme (RE) recognition site. Restriction enzymes are produced by bacteria as a cellular defence mechanism, but we use their specific endonuclease action (on a specific sequence of nucleotides). Thus RE digestion can be used to cleave DNA and, followed by gel electrophoresis, leaves fragments in a characteristic pattern: Restriction Fragment Length Polymorphism (RFLP).

9 Alignment and gel: RsaI digest of Cryptosporidium COWP gene. C. hominis has an enzyme recognition site at 129 bp, 412 bp and 519 bp. C. parvum has an enzyme recognition site at 412 bp and 519 bp. C. meleagridis has an enzyme recognition site at 147 bp and 519 bp. This results in RFLP, and fragments of: C. hominis 34, 106, 129, 284 bp; C. parvum 34, 106, 413 bp; C. meleagridis 34, 147, 372 bp. (Gel picture: lanes Ch, Cp, Cm.)
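
The RFLP read-out on this slide can be sketched in Python. This is only an illustration: the 553 bp amplicon length is inferred from the sum of the fragments listed above, the cut coordinates passed to fragments_from_cuts are counted as bases to the left of each cut (so they can differ by a base from the quoted recognition-site positions), and the 5 bp band-matching tolerance is a hypothetical gel-reading allowance, not a value from the presentation.

```python
# Expected RsaI fragment sizes (bp) for the COWP amplicon, as listed on the slide.
EXPECTED_PATTERNS = {
    "C. hominis": [34, 106, 129, 284],
    "C. parvum": [34, 106, 413],
    "C. meleagridis": [34, 147, 372],
}


def fragments_from_cuts(amplicon_length, cut_positions):
    """Return sorted fragment lengths from cutting a linear amplicon.

    Each cut position is the number of bases to the left of the cut.
    """
    edges = [0] + sorted(cut_positions) + [amplicon_length]
    return sorted(right - left for left, right in zip(edges, edges[1:]))


def match_species(observed, tolerance=5):
    """Return species whose expected pattern matches the observed bands.

    A pattern matches when the number of bands agrees and every band,
    compared in size order, lies within `tolerance` bp (assumed value).
    """
    observed = sorted(observed)
    hits = []
    for species, expected in EXPECTED_PATTERNS.items():
        if len(expected) == len(observed) and all(
            abs(o - e) <= tolerance for o, e in zip(observed, expected)
        ):
            hits.append(species)
    return hits


if __name__ == "__main__":
    print(fragments_from_cuts(553, [413, 519]))
    print(match_species([35, 105, 410]))
```

On a real gel the smallest band may be hard to see, so a practical matcher would probably need to allow for missing small fragments.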

10 Differences within species. During outbreaks of disease caused by a single species it is useful to be able to differentiate between cases and potential sources: microbiological (genetic) and epidemiological evidence of relatedness. Cryptosporidium, Giardia and Toxoplasma have been typed using RFLP, single- and multi-locus sequence typing (MLST), multilocus fragment typing (MLFT) and microsatellite analyses.

11 Single locus typing. Cryptosporidium typing has mostly (until very recently) relied upon single-locus typing (sequencing) of the polymorphic sporozoite surface glycoprotein (gp60) gene, and this is the basis of development work on MLFT.

12 Currently: gp60 typing. Gene encoding a highly variable surface antigen (60 kDa glycoprotein). Microsatellite region; 900 bases containing various SNPs. C. hominis IbA10G2 example: TTCTGTTGAGAGC TCATCATCATCATCATCATCATCGTCATCATCGTCA ACAAC. Nomenclature: species I; family b; TCA repeats A10; TCG repeats G2. Commonly used worldwide.
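
How the IbA10G2 name follows from the microsatellite can be sketched as below. This is a simplified illustration, not the reference unit's software: the function name gp60_subtype is hypothetical, the family ("Ib") is passed in because it is assigned from the rest of the sequence, and the "R" suffix seen in some C. parvum subtypes is not handled.

```python
def gp60_subtype(microsatellite, family="Ib"):
    """Build a gp60 subtype name from the trinucleotide repeat region.

    Reads the region in frame, counts TCA codons (the "A" number) and
    TCG codons (the "G" number), and appends them to the family label.
    """
    codons = [
        microsatellite[i:i + 3]
        for i in range(0, len(microsatellite) - 2, 3)
    ]
    tca = codons.count("TCA")
    tcg = codons.count("TCG")
    name = f"{family}A{tca}"
    if tcg:
        # The G term is omitted when there are no TCG codons.
        name += f"G{tcg}"
    return name


if __name__ == "__main__":
    # Repeat region from the C. hominis example on the slide.
    example = "TCATCATCATCATCATCATCATCGTCATCATCGTCA"
    print(gp60_subtype(example, family="Ib"))
```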

13 Single locus typing. Cryptosporidium typing has mostly (until very recently) relied upon single-locus typing (sequencing) of the polymorphic sporozoite surface glycoprotein (gp60) gene, and this is the basis of development work on MLFT. ADVANTAGES: Simple nomenclature to report and add new types to. Single method which is transferable. Can be highly variable in some species. Lends itself to international scheme adoption / acceptance.

14 Single locus typing. DISADVANTAGES: Problematic if the parasite undergoes sexual reproduction (recombination). A single allele can dominate (e.g. C. hominis IbA10G2) through selective pressure. The same allele emerging in different locations is not very helpful unless the cases are genuinely linked. Cannot act as a surrogate for a multi-locus scheme, which differentiates isolates differently (see later).

15 Multi locus typing. Successful schemes (which are managed, allowing new types to be added) are used for many disease-causing organisms, e.g. Mycobacterium, Campylobacter, Trypanosoma, Leishmania. Many others have been published, with: a variable number of markers; SNPs and VNTRs; different loci; different detection technologies (sequencing or fragment sizing). There is often a need for conformity and consistency, therefore consensus is required, otherwise comparison is impossible.

16 MLST: gold standard?

17 Will MLFT do as a cheaper and quicker alternative? MLST requires the facility and funding for sequencing. MLFT requires accurate and reproducible fragment sizing. The choice of markers is likely to differ: MLST can use SNPs outside of, or even without, a variable region, whereas in MLFT only the microsatellite or tandem repeat is variable, and this conveys the discrimination. MLST is the undisputed superior method in some circumstances, but much more costly per sample.

18 MLVA: Genetic loci which contain a variable number of tandem repeats (VNTR) can be used to differentiate between isolates. Multilocus VNTR analysis (MLVA) has greater discriminatory power than MLST for some organisms, e.g. Streptococcus pneumoniae. Assessment of this: Simpson's index of diversity (how different are the isolates, i.e. the level of discrimination) and the Wallace index (how do two methods [MLST and MLVA] compare?).
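
The two indices named on this slide can be sketched as follows. The formulas are the usual ones, Simpson's index of diversity as D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)) and the Wallace coefficient as the proportion of isolate pairs sharing a type by method A that also share a type by method B; the function names and any input data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def simpsons_diversity(types):
    """Simpson's index of diversity for a list of type labels.

    Probability that two isolates drawn at random (without replacement)
    have different types.
    """
    n = len(types)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(types)
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_pairs / (n * (n - 1))


def wallace(types_a, types_b):
    """Wallace coefficient W(A -> B).

    Of the isolate pairs that share a type by method A, the fraction
    that also share a type by method B.
    """
    if len(types_a) != len(types_b):
        raise ValueError("both methods must type the same isolates")
    same_a = 0
    same_both = 0
    for i, j in combinations(range(len(types_a)), 2):
        if types_a[i] == types_a[j]:
            same_a += 1
            if types_b[i] == types_b[j]:
                same_both += 1
    if same_a == 0:
        raise ValueError("method A puts no two isolates in the same type")
    return same_both / same_a
```

A low W(gp60 -> MLVA), for instance, would indicate that MLVA splits isolates that gp60 groups together.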

19 MLVA: Isolates with indistinguishable subtypes are more likely to have originated from a common source than those with different subtypes. Proposed: international consensus for the development, validation, nomenclature and quality control of MLVA. Problem: e.g. STEC O157 has 6 protocols for MLVA. Inconsistent loci choices, nomenclature, size [primers], platforms and interpretation of incomplete repeats all adversely affect inter-lab comparisons and surveillance.

20 MLVA: Multi-Locus Variable (number tandem repeat) Analysis. Published assays (if not schemes) for: Brucella, Streptococcus, Anaplasma, Mycobacterium, Clostridium, Listeria, Salmonella enterica (Typhimurium & Enteritidis).

21 Key performance indicators of MLVA. Typability: the ability to amplify each locus. Reproducibility: is it robust? Discriminatory power: does it reveal differences between isolates? Epidemiological concordance: very important; does it support what is already known about cases / isolates (validation)?

22 Other performance criteria: speed, throughput, cost, ease of use, objectivity, versatility, portability.

23 But... While MLVA scores well in some indicators (discriminatory power, robustness, portability, objectivity, throughput), it does less well in another: versatility (usually species / serotype specific).

24 By comparison with PFGE (bacterial foodborne pathogens): discriminatory power, versatility, portability, objectivity, throughput. So no approach is perfect!

25 How to choose markers (1). Generic selection criteria for markers have been described for bacterial pathogens but also apply to parasites: Nadon et al. (2013). Spread of markers across chromosomes, or distantly on the same chromosome, to exclude gene linkage. Smallest number possible that allows resolution and cost effectiveness (the ultimate scheme would require WGS, which is carried out for some bacterial pathogens).

26 How to choose markers (2). Criteria for selection: repeats of ≥6 bp; amplicon < 300 bp.

27 How to select and evaluate markers / repeats. >90% similarity amongst intra-isolate repeats. Tandem Repeats Finder software (Boston University).

28 MLVA - Cryptosporidium, e.g. Cryptosporidium parvum. Several markers have been published, but some are monoallelic in certain populations, are too short to differentiate, or are prone to slippage. Need whole genomes (until 2015 only 3 genomes). Survey (review) of current (published) markers (55 different loci). CRU has created and mined our own new genomes to develop new markers.

29 Our strategy to identify VNTR markers. Step 1: Cryptosporidium parvum Iowa II reference genome retrieved from the NCBI database and interrogated with Tandem Repeats Finder software (version 4.07b, Boston University). 2284 tandem repeat loci found; 2074 loci rejected (repeat < 6 bp, too small to differentiate, or repeat region < 90% homogeneous); 210 loci retained. Step 2: Identified these loci in our new C. parvum genomes. 182 loci rejected (no variation in copy number); 28 loci selected for further analysis.

30 Locus evaluation: Candidate loci from 7 genomes were aligned (BioEdit). Only loci which showed variation in the number of repeats in the 7 (8) genomes were chosen. Example repeat: AAAGAC.
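
The two filtering steps described on slides 29 and 30 can be sketched as below. This assumes the tandem repeat search output has already been parsed into simple records; the Locus class, its field names and the function names are hypothetical, while the thresholds (repeat unit of at least 6 bp, at least 90% similarity between repeats, variation in copy number between genomes) are those stated on the slides.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    name: str
    repeat_length: int     # bp per repeat unit
    percent_match: float   # similarity among repeats within the array
    copy_numbers: tuple    # repeat copies observed in each genome


def passes_step1(locus, min_repeat=6, min_match=90.0):
    """Step 1: keep repeats long enough to size and homogeneous enough."""
    return (
        locus.repeat_length >= min_repeat
        and locus.percent_match >= min_match
    )


def passes_step2(locus):
    """Step 2: keep loci whose copy number varies between genomes."""
    return len(set(locus.copy_numbers)) > 1


def select_candidates(loci):
    """Apply both steps in order and return the surviving loci."""
    return [
        locus for locus in loci
        if passes_step1(locus) and passes_step2(locus)
    ]
```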

31 Measurement and detection - MLVA. Size measurement rather than sequencing. Technology transfer: availability of platforms, adaptable approach. Capillary electrophoresis accuracy: ABI ± bp bp 5 kb ± 2-5 bp. Cost. Choice of platform is often decided by convenience: what you have in the lab or what you can borrow!
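
Because MLVA relies on size measurement rather than sequencing, each measured fragment has to be converted into an allele. A minimal sketch, assuming the flanking (non-repeat) length of each amplicon is known and that sizing error is smaller than the tolerance chosen; the function names, the flank length and the tolerance are all hypothetical and would depend on the locus and platform.

```python
def repeat_copies(fragment_size, flank_length, repeat_length):
    """Estimate repeat copy number from a measured fragment size.

    Subtracts the flanking sequence and rounds to the nearest whole
    number of repeat units.
    """
    return round((fragment_size - flank_length) / repeat_length)


def call_allele(measured_size, reference_alleles, tolerance):
    """Bin a measured size to the nearest reference allele size.

    Returns None when no reference allele lies within `tolerance` bp,
    so that ambiguous peaks are flagged rather than forced into a bin.
    """
    nearest = min(reference_alleles, key=lambda ref: abs(ref - measured_size))
    if abs(nearest - measured_size) <= tolerance:
        return nearest
    return None
```

With a 6 bp repeat, a tolerance well below 3 bp keeps adjacent alleles from overlapping, which is why platform accuracy matters for the choice of repeat length.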

32 Splitting up gp60 groups Earlier we looked at single locus typing

33 Single locus typing. Cryptosporidium typing has mostly (until very recently) relied upon single-locus typing (sequencing) of the polymorphic sporozoite surface glycoprotein (gp60) gene, and this is the basis of development work on MLFT. Example subtypes: IIaA15G2R1, IIaA18G1R1, IIcA5G3, IIaA17G1R1, IIdA15G1, IIdA24.

34 Splitting up gp60 subtype families / subtypes. Earlier we looked at single locus typing for Cryptosporidium parvum. Does MLVA do a better job?

35 Outbreak and sporadic cases. (Figure: isolates UKP2, UKP3, UKP5, UKP6, UKP7 and Iowa II, with gp60 subtypes and origins: IIaA15G2R1, national outbreak; IIaA18G2R1, sporadic case; IIaA17G1R1, NW outbreak; IIdA22G1, farm outbreak; IIaA19G1R2, sporadic case.)

36 Evaluation of Cryptosporidium parvum MLVA markers. Evaluation of markers: discrimination between geographically local and distant isolates.
Columns: Reference; Origin; gp60; 1_470_1429; 4_2350_796; MSF; 5_4490_2941; 6_4290_9811; 8_4440_NC_506; MM19; MLG
UKP90: Human outbreak A, IIaA18G3R
UKP91: Human sporadic, IIaA20G5R
UKP92: Human outbreak B, IIaA19G4R
UKP93: Lamb outbreak C, IIaA18G2R
UKP94: Human sample C, IIaA18G2R
UKP95: Human outbreak D, IIaA19G1R
UKP96: Human sporadic, IIdA15G, Neg
UKP97: Human sporadic, IIdA19G1, Neg ??
UKP98: Human sporadic, IIdA21G1, Neg
UKP99: Human sporadic, IIcA5G3a
UKP100: Human sporadic, IIcA5G3j, Neg
UKP101: Lamb outbreak E, IIaA15G2R
UKP102: Human outbreak E, IIaA15G2R
UKP103: Human outbreak E, IIaA15G2R
UKP104: Human outbreak F, IIaA15G2R
UKP105: Human outbreak F, IIaA15G2R
UKP106: Human outbreak F+G, IIaA15G1R
UKP107: Human outbreak F+G, IIaA17G1R
UKP108: Human outbreak H, IIdA24G
UKP109: Human outbreak H, IIdA24G

37 STSM (COST short term scientific mission) samples: method showing some promise. 20 samples from different outbreaks and sporadic cases; 17 multi-locus genotypes. Linked outbreak samples were identical, all others different. Separated two geographically-linked samples with the same gp60 during an outbreak; subsequent epidemiology also confirmed that the different type was not part of the outbreak.
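
Counting multi-locus genotypes (MLGs) as on this slide amounts to grouping samples whose allele calls are identical at every locus. A minimal sketch, with hypothetical names and the assumption that a failed amplification ("Neg") is recorded as None and leaves the sample without an MLG:

```python
def assign_mlgs(profiles):
    """Number multi-locus genotypes in order of first appearance.

    `profiles` maps a sample name to a tuple of allele calls, one per
    locus in a fixed order. Samples with any missing call (None) are
    given no MLG rather than being matched on partial data.
    """
    mlg_ids = {}
    assignment = {}
    for sample, profile in profiles.items():
        if any(allele is None for allele in profile):
            assignment[sample] = None
            continue
        if profile not in mlg_ids:
            mlg_ids[profile] = len(mlg_ids) + 1
        assignment[sample] = mlg_ids[profile]
    return assignment


def count_mlgs(profiles):
    """Number of distinct complete multi-locus genotypes."""
    return len({m for m in assign_mlgs(profiles).values() if m is not None})
```

Two samples then belong to the same MLG only if they agree at all loci, which is how linked outbreak samples come out identical while an unrelated sample with the same gp60 subtype can be separated.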

38 7 VNTR loci selected for in vitro evaluation.
Columns: Locus; Repeat sequence; Reference
cgd1_470_1429; TC(T/G)GAT; previously presented as TTCTGA (Herges et al., 2012)
cgd4_2350_796; CC(T/C)GGTATGGG(T/C)CC(A/G); Pérez-Cordón et al. (2016)
MSF; GCTCAGGAAGGA; Tanriverdi & Widmer (2006)
cgd5_4490_2941; CAGAGC; Pérez-Cordón et al. (2016)
cgd6_4290_9811; (TCT/TCC)*TCTTCTTCCTCCTCT(TCTTCTTCC/TCCTCCTCT *TCT/TCC; MSC6-5 in Xiao and Ryan (2006)
cgd8_nc_4440_506; TGAGC(C/T); Pérez-Cordón et al. (2016)
MM19; GGA(G/T)C(A/T); Morrison et al. (2008)
Workflow: direct stool DNA extraction; C. parvum identification; VNTR PCR; fragment sizing.

39 The whole package international consensus?

40 Successful method for looking for differences? 7 loci plus gp60 single locus sequence analysis. 3 laboratories, 3 platforms (QIAxcel, ABI3500XL, ABI3730+Genescan). 14 samples characterised initially by gp60. Products from each locus sized according to reference alleles. 7/7 loci concordant between the labs = reproducible.
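
The inter-laboratory comparison can be sketched as a check that every laboratory assigns the same allele to every sample at each locus. The data layout and function name are hypothetical; the sketch only illustrates what "concordant between the labs" means operationally.

```python
def concordant_loci(calls_by_lab):
    """Return the loci at which all laboratories agree for every sample.

    `calls_by_lab` maps lab name -> {locus: {sample: allele call}}.
    A locus is concordant only if each lab's calls for that locus are
    identical to the first lab's calls.
    """
    labs = list(calls_by_lab.values())
    first = labs[0]
    return [
        locus for locus in first
        if all(lab.get(locus) == first[locus] for lab in labs[1:])
    ]
```

A result listing all seven loci would correspond to the 7/7 concordance reported on the slide.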

41 Thank you! Any questions now or anytime this week?