Supplementary information

Appendix 1a. Microsatellite analysis of P1-hyg, P2-neo and their double drug resistant progeny (ABI 377).

Primer set a,b; Amplified cdna (base pairs): P1-hyg, P2-neo, All progeny

A A C MCLE01 c MCLF TCP E N MCLE SCLE L660 d

a Microsatellite primers have been described previously 13 or are given in Appendix 7.
b The following microsatellite loci had identical homozygous alleles in both P1-hyg and P2-neo: A831 (150bp), A831.3 (172bp), J356 (142bp), K368 (119bp), O860 (224bp), MCL03 (257bp), MCL05 (208bp), MCLG10 (153bp), SCLE10 (253bp).
c Note that towards saturation further polymorphisms may appear, for example 123 and 131bp for P1-hyg and 125 and 127bp for P2-neo; all these alleles are present in the progeny.
d Locus L660 represents allele loss.

Appendix 1b. Microsatellite electropherogram traces of the experimental cross for 3 out of 13 informative loci, demonstrating one example of each type of informative locus: polyploid double drug resistant progeny (a), progeny showing no allele segregation (b) and progeny showing allele loss (c).

[Electropherogram panels, each with traces for P1-hyg (Parent 1), P2-neo (Parent 2) and all double drug resistant progeny. a: TCP, peak labels 241bp, 243bp, 247bp, 249bp, 251bp. b: SCLE11, peak labels 139bp, 143bp. c: L660, peak labels 383bp, 377bp.]

Appendix 2. Putative recombination amongst cdna clones derived from double drug resistant hybrids of T. cruzi P1-hyg and P2-neo.

Locus; Genotype and nucleotide position (bp) d; No. parental clones of given genotype; No. progeny clones from 1C2, 1D12, 2A2, 2C1, 2D9 and 2F9 combined

gpi T T P1-hyg C T P2-neo 2 2 T C P1-hyg P2-neo 7 C a C P1-hyg & P2-neo 0 Putative rec.: 7 1C2, 2A2, 2C1, 2F9, 2D9
tcp e T C T T A G a P1-hyg 2 0 A G A A G C T T A P2-neo 7 19 W b S b T T R b G P1-hyg P2-neo 17 A G A A G G P1-hyg & P2-neo 0 Putative rec.: 5 1C2, 1D12, 2A2, 2D9, 2F9
pgm e A G T T G G T T A P1-hyg & P2-neo 0 Putative rec.: 2D T G G G A A T P1-hyg 2 4 T A A G A A A P2-neo 5 3 Y c A A A P1-hyg 5 36 P2-neo 4 T A A G A A T P1-hyg & P2-neo 0 Putative rec.: 2C1 1 C A A G A A A P1-hyg & P2-neo 0 Putative rec.: 2A2 1 C A A G A A T P1-hyg & P2-neo 0 Putative rec.: 2F9 1

a rec. = recombination; = putative site of recombination; - = nucleotide gap.
b 3 parental genotypes are represented by degenerate nucleotide codes (W = A or T; S = G or C; R = A or G).
c 2 genotypes are represented by degenerate nucleotide codes (Y = C or T).
d Position relative to T. cruzi gpi (GenBank accession, AC137988), or putative Leishmania pgm (GenBank protein accession, CAC14526).
e Locus amplified using Taq Extender™ (Stratagene).

Note: recombination frequencies could not be ascertained. GenBank accession numbers AY AY
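The degenerate codes defined in the Appendix 2 footnotes (W, S, R, Y) each stand for two possible bases, so a genotype written with them encodes several concrete sequences. The sketch below is only an illustration of that expansion; the function name `expand` and the lookup table are assumptions introduced here, and only the four codes defined in the footnotes are handled.

```python
from itertools import product

# Degenerate codes exactly as defined in the Appendix 2 footnotes.
# Any other character is passed through unchanged (an assumption of this sketch).
DEGENERATE = {"W": "AT", "S": "GC", "R": "AG", "Y": "CT"}


def expand(seq: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate sequence."""
    choices = [DEGENERATE.get(base, base) for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Three degenerate positions give 2 * 2 * 2 = 8 concrete sequences.
    for concrete in expand("WSR"):
        print(concrete)
```

A genotype with n degenerate positions expands to 2^n sequences, which is why a row with three such positions can stand for several parental clones at once.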

Appendix 3. Polyploidy in field isolates detected by microsatellite size length polymorphisms; 6 loci (isolate 83), not shown, did not show polyploidy (Methods).

Microsatellite locus; Field isolate; Allele size (bp)

A427
  SP13 (TCIIc) 181,
  SP14 (TCIIc) 181, 187,
  b (TCI) 170, 178, 186
SCLE11
  SP13 (TCIIc) 153, 155, 159, 161
  SP14 (TCIIc) 153, 155, 159, 161
  RMA34 (TCIIb) 158, 166, 168
  RMA134 (TCIIb) 158, 166,
  (TCI) 139, 145, 147
A833 a
  83 (TCI) 189, 201, 215
C875 a
  83 (TCI) 177, 179, 181
L660 a
  83 (TCI) 377, 385, 389
N
  (TCI) 224, 236, 240, 252

a Locus was not sampled for every isolate in the study.
b Locus E801 was not sampled for isolates 83, 303 and 304.

Microsatellite trace electropherogram of TCI field isolates for the A427 locus, bearing a striking resemblance (this locus only) to the hybrid progeny, genotype 170, 178 and 186bp, generated from P1-hyg and P2-neo (Fig. 1). Isolates 303 and 304 were not sympatric with 83.

[Electropherogram traces for locus A427: isolate 303, isolate 304 and a third isolate; peak labels 170bp, 178bp and 186bp.]
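Several rows of Appendix 3 list three or four allele sizes at a single locus. A minimal sketch of how such rows could be flagged is given below, under the assumption (not stated in the appendix itself) that a clonal diploid isolate carries at most two distinct alleles per locus. The allele sizes are copied from rows of Appendix 3 that are complete; the function names and the two-allele comparison case are hypothetical.

```python
# Allele sizes (bp) copied from complete rows of Appendix 3.
# Keys are (locus, isolate).
ALLELES = {
    ("SCLE11", "SP13"): [153, 155, 159, 161],
    ("SCLE11", "RMA34"): [158, 166, 168],
    ("A833", "83"): [189, 201, 215],
    ("L660", "83"): [377, 385, 389],
}


def exceeds_ploidy(alleles, ploidy=2):
    """True if the number of distinct alleles is above the assumed ploidy."""
    return len(set(alleles)) > ploidy


def flag_loci(table, ploidy=2):
    """Map each (locus, isolate) with too many alleles to its allele count."""
    return {
        key: len(set(sizes))
        for key, sizes in table.items()
        if exceeds_ploidy(sizes, ploidy)
    }


if __name__ == "__main__":
    for (locus, isolate), count in flag_loci(ALLELES).items():
        print(f"{locus} / {isolate}: {count} distinct alleles")
    # Hypothetical two-allele genotype, shown only for contrast.
    print(exceeds_ploidy([139, 143]))
```

Counting distinct sizes rather than peaks means a homozygous locus (one size) and a heterozygous locus (two sizes) are both left unflagged.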

Appendix 4a. Maximum likelihood breakpoint analysis of the gpi locus using putative parents and progeny identified from the analysis in Figure 2.

Putative cross (Parent, TCIIb; Parent, TCIIc; Progeny); Break-point (bp, amplicon) a; Likelihood ratio difference (LD); Significance

RMA134 X9/ cl P<0.01
RMA134 X9/3 CL Brener P<0.01
RMA134 X9/3 HC P<0.01
RMA134 X9/3 Para P<0.01

a complete triplet codons only

Appendix 4b. Phylogenetic support for recombination in tcp (excluding gaps, alignment available on request) demonstrated by incongruence between TCIIb (+d +e) and TCIIc (+d +e) or TCIIb (+d +e) and TCI lineages for putative recombinants, where = putative recombinant and = parents used for maximum likelihood breakpoint analysis. Asterisk (*) denotes sequences derived from published data 20. The locus was amplified using Taq Extender™ (Stratagene).

[Phylogenetic trees for tcp, panels "tcp 1-450bp" and "tcp bp"; labels: TCIIb, TCIIc, TCIIa, 88, TCI.]

Appendix 4c. Maximum likelihood breakpoint analysis of the tcp locus using putative parents and progeny identified from the analysis above, where g = genotype.

Putative cross (Parent (TCIIb); Parent; Progeny); Break-point (bp); Likelihood ratio difference (LD); Significance

RMA134 X10 (TCI) SC43 (g1) P<0.01
RMA134 CL Brener (TCIIc) SC43 (g2) P<0.01
RMA134 CL Brener (TCIIc) X9/ P<0.01
RMA134 CL Brener (TCIIc) Para P<0.01
RMA134 CL Brener (TCIIc) RMA34 (g1+g2) , 11.5 P<0.01

Appendix 5. Refined split decomposition analysis of DHFR-TS and TR for previously published T. cruzi strains 3, and of gpi for reference strains and field isolates, to illustrate mosaic gene recombination. This genetic distance method is easily distorted by rate variation between lineages and should therefore be viewed only as an illustration. For example, DHFR-TS Esmeraldo (Esm.) cl2 could have arisen from recombination, but a more robust example would depend on more informative sites. Numbers in brackets denote nucleotide position. Asterisk (*) denotes a putative recombinant and double asterisk (**) denotes genotype of all designated (bracketed) lineages. For gpi red font denotes putative TCIIb parents and blue font denotes putative TCIIc parents, denotes putative progeny.

[Split decomposition network for DHFR-TS; labels: TCIIb (+d + e) C (438) T (753) EPP, Tulahuen cl2, CL F11F5 (Brener) SO3 cl5 PSC-O MSC2 CBB cl3, TU18 cl2 C (438) C (753)** * T (438) Esm cl2 T (753) Esm cl3, MSC2 T (438) C (753) TCIIa CANIII EP 255 CM 17 TCIIc (+d + e) Tulahuen, Cl F11F5 M6241 cl6 X110 M5631 cl5, EPP PSC-O cl7 FLORIDA C16, Vin C6, TEH cl2, cl92 SC13 OPS21 cl11 CUICA cl1 P209 cl1,so34 cl4, Esquilo cl1, 85/818, TCI CUTIA cl1, 26 79, SABP3, A80, X10 cl]

[Split decomposition network for TR; labels: P209 cl1, SO34 cl4, Esquilo cl1, cl7, CUTIA cl1, SABP3 OPS21 cl11, Vin C6, Florida cl6, TEH cl2 CUICA cl1 SC13 85/818 FLORIDA C X10 cl1, A80, TCI CANIII cl1 TCIIa CM 17 TCIIc (+d + e) TCIIb (+d + e) M5631 cl5, M6241 cl6 PSC-O, SO3 cl5, X110, EPP, Tulahuen cl2, CL F11F5 (Brener) PSC-O, SO3 cl5, MSC2, CBB cl3, TU18 cl2, Esm. cl3, EPP, Tulahuen cl2, CL F11F5 (Brener)]

[Split decomposition network for gpi; labels: TCIIb (+d + e) putative parents RMA134 Esm. cl3 Esm. cl3 SC43 Para2 CL Brener RMA134 TCI Progeny TCIIa HC10 Para cl2 CANIII HCl3 Progeny CL Brener cl2 X9/3 Chaco2 RMA34 SP13 X10 HC9 Sp14 X109/2 Chaco2 M6241 SC43 TCIIc (+d + e) putative parents]

Appendix 6. Primers used for amplification and sequencing.

Locus; Primer name; Sequence (5' to 3')

Hygromycin phosphotransferase
  Hyg1(f) a  GTCCTGCGGGTAAATAGCTG
  Hyg2(r) a  GTCAGGCTCTCGCTGAATTC
Neomycin phosphotransferase
  Neo1(f)  TGAATGAACTGCAGGACGAG
  Neo2(r)  TGATGGATACTTTCTCGGCAG
tcp
  IG1(f) 20  GGT SGA CAT GCT CGG TGT GC
  IG2(r) 20  AAS CTT CAG TCC GCA CTC GTG
  TCP17.1(s) a  GAT GAT GAA GGC GTG GAA AT
  TCP17.2(s)  ATC GAT TTT GCA CCC TCA GT
gpi
  SO1(f)  GGC ATG TGA AGC TTT GAG GCC TTT TTC AG
  SO2(r)  TGT AAG GGC CCA GTG AGA GCG TTC GTT GAA TAG C
  GPI.1(s)  TGT GAA GCT TTG AAG CCT TT
  GPI.2(s)  GGT CAG GAG AGG TGA ATG GA
pgm
  PGM.for(f)  GGG CGG AAC TAC TAC TGT CG
  PGM.rev(r)  GGA GGG AGT AAA AGA AAG GAA AA
tpn1
  TRX1-1(f)  AGT AGA TCT GCC ACG CGT ACT TGG G AF
  TRX1-2(r)  TTG AAG CTT TCA CCG CCA GAA TTG ATA
coii-nd1
  ND1.3A(f) 3  GCT ACT ART TCA CTT TCA CAT TC
  COII.2A(r) 3, Mito.425(s)  GCA TAA ATC CAT GTA ACA CMC CAC ATG CCG TCT GTA ATA GGT GTC A
  Mito.850(s)  ATC CAC AAA TTT TGA TGA TAT A
  Mito.950(s)  CAA AAT TTA AAC AAC CGA AAT ATA

a f, forward primer; r, reverse primer; s, primer for sequencing only; all other primers, except hyg and neo, were used for both amplification and sequencing.

Appendix 7. Microsatellite primers based on dinucleotide repeats in GenBank and at the TIGR database (http// Primers were designed from CL Brener sequences at

Accession number(s) / locus; Primers, sequence (5' to 3'); Tandem repeats (CL Brener) [products, bp]

AF A427: ACG CGC GTT ACT TGT GGT AT / CCA AAT ATG CAT GTG TTT GGA; (AC)14 [189]
AF A831.1: CAT CCG TGT GTG GAT CTG TT / AFT GAC ACC GAG AGG GTG AC; (CT)5 CAGT (CT)15 [172]
AF A831.3: CCT GCA CTT ACT TCG CTT CC / GAG AAA TTG TGG AGG CAT GAA; (AT)19 (AC)4 [203]
AF A833: GTT GTT CTC GCA GAC GTC AA / GCT TCC TCT TCT CTC CCA CA; (TA)3 (CA)(TA)7 (CA) (TATG)2 (N)9 (GAG)4 [202]
AI C875: CCA TGT CGA CTC CAT GTC TC / TTG TTG CTG TTG TTG GCA AT; (AC)6 (AG)13
AI E801: TGT GTT TCA AGC TCC CGT GT / TCC CAA GCA CGA AAA CAA AT; (TG)14 [253]
AI J356: GGG GGT AAA CTG AAA GAA AAA GA / ATA AGA AGC AAG CGC CAA AA; (AC)6 (AG)16 [177]
AI K368: AGT TGA CAT CCC CAA GCA AG / CCC TGA TGC TGC AGA CTC TT; (AC)13 (GA)(CA)3 (AC)3 [169]
AW L660: GCG AAG GGA AAC AAA CAA TC / AGG GCA TTG TTC AAA TCT GC; (CA)8 (n)4 (GA)10 [393]
AW N060: TGT AGA GAT AGA ATG AAG CGC AAA / TGA GAA GAC GGG TGA GAG AAA; (TG)2 TC (TG)3 (TGTC)3 (TG)2 (TGTC)4 (TG)5 (TC) [270]
AI O860: CTT CTG CGC ACA CAT TCA TT / CCG TTC TTC ATC ACC ATC CT; (TA)26 [277]
AF TCP: GAT GAT GAA GGC GTR RA AAT / AF ATC GAT TTT GCA CCC TCA GT; (TG)3 TT (TG)3 CTGT [236]
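The tandem repeat column of Appendix 7 uses a compact notation in which a bracketed unit is followed by its repeat count, for example (AC) 14, with unbracketed bases between units. As an illustrative sketch only, the snippet below computes the length in bases of the repeat tract that such a string describes. The function name and the regular expression are assumptions introduced here; a bracketed unit with no count is treated as occurring once, and placeholders such as (N) or (n) are counted as one base per repeat.

```python
import re

# A bracketed unit with an optional count, or a run of unbracketed bases.
TOKEN = re.compile(r"\(([A-Za-z]+)\)\s*(\d*)|([A-Za-z]+)")


def tract_length(motif: str) -> int:
    """Length in bases of the repeat tract described by a motif string."""
    total = 0
    for unit, count, spacer in TOKEN.findall(motif):
        if unit:
            # Bracketed unit: length of the unit times its count (default 1).
            total += len(unit) * (int(count) if count else 1)
        else:
            # Unbracketed bases between units are counted literally.
            total += len(spacer)
    return total


if __name__ == "__main__":
    for motif in ["(AC)14", "(CT)5 CAGT(CT)15", "(CA)8(n)4(GA)10"]:
        print(motif, tract_length(motif))
```

The tract length is only part of each listed product size, since the amplicon also includes the primers and flanking sequence.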

Appendix 8. T. cruzi isolates used in this study.

Strain; Origin of strain; Lineage b

X10; Reference; TCI
HC9; Reference; TCI
83 c; Field isolate, Brazil a; TCI
303 c; Field isolate, Brazil; TCI
304 c; Field isolate, Brazil; TCI
CANIII; Reference; TCIIa
HCL3 d; Reference; TCIIa
Esmeraldo (Esm.) cl3 d; Reference; TCIIb
HC10 d; Reference; TCIIb
RMA34; Field isolate, Paraguay; TCIIb
RMA134; Field isolate, Paraguay; TCIIb
M6241; Reference; TCIIc
X109/2 d; Reference; TCIIc
X9/3; Reference; TCIIc
SP13 d; Field isolate, Paraguay; TCIIc
SP14 d; Field isolate, Paraguay; TCIIc
cl2; Reference; TCIId
Chaco2; Field isolate, Paraguay; TCIId
Para2; Field isolate, Paraguay; TCIId
SC43; Reference; TCIId
CL Brener; Reference; TCIIe

a All DNA stocks from field isolates were derived from a single trypanosome (biological clones).
b Lineage identification was determined using established methods 11.
c Indicates isolates used for microsatellite analysis only.
d Indicates isolates used for microsatellite and gpi sequence analysis only.