Understanding Schistosoma japonicum population structure and relatedness using reduced representation sequencing


1 Understanding Schistosoma japonicum population structure and relatedness using reduced representation sequencing. Jonathan Shortt, Global Health Symposium, Anschutz Medical Campus, Oct 20, 2017

2 Schistosomiasis: >200 M infections worldwide. Anemia, impaired growth and cognitive development, and other health problems.

3 Schistosomiasis persists in China despite treatment. S. japonicum infections in China: 12 M in the 1950s, ~100K currently. Control measures: snail control, praziquantel. [Figure: number of villages by infection category (no infections and two % prevalence bands), out of 36 villages tested]

4 Affordable and reproducible deep genotyping of schistosomes. Workflow: DNA excision, whole genome amplification (ng DNA to μg DNA), sequencing and variant identification. Successful amplification of samples up to 9 years old. [Figure: read depth by genome position, reads from Sample A and reads from Sample B] ~100K variants sequenced. Shortt et al. (2017) PLoS Neglected Tropical Diseases 11(1): e

5 Sampling locations in Sichuan, China [Map; scale bar: 1 km]

6 Genetic variation captures geographic differences

7 Genotype similarity decreases with distance. [Figure: genotype similarity against distance between samples (km), with distance bins running up to >10 km]
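A minimal sketch of how a similarity-versus-distance comparison like this could be computed. The slides do not say which similarity metric was used, so the identity-by-state score below is an assumption, as are the function names and the input layout (latitude, longitude, and a list of alternate-allele counts per sample).

```python
import math
from itertools import combinations

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def ibs_similarity(g1, g2):
    """Mean identity-by-state over sites called in both samples.

    Genotypes are alternate-allele counts (0, 1, 2) or None when missing.
    This metric is an assumption, not necessarily the one used in the talk.
    """
    shared = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not shared:
        return None
    return sum(1 - abs(a - b) / 2 for a, b in shared) / len(shared)

def similarity_by_distance(samples):
    """samples: dict name -> (lat, lon, genotypes). Returns [(km, similarity)]."""
    out = []
    for (_, s1), (_, s2) in combinations(samples.items(), 2):
        sim = ibs_similarity(s1[2], s2[2])
        if sim is not None:
            out.append((haversine_km(s1[0], s1[1], s2[0], s2[1]), sim))
    return out
```

The resulting (distance, similarity) pairs can then be grouped into distance bins and summarised per bin.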

8 Relatedness increases within villages and individuals. [Figure: probability density of genotype similarity for three groups: different villages; same village, different hosts (lots of cousins); same host (siblings)]

9 Other mammalian hosts may serve as reservoirs for infection

10 Miracidia relatedness within villages [Figure legend: sibling, half-sibling, cousin]
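One way to assign pairs to the three legend categories is to threshold a pairwise kinship coefficient. The sketch below is an assumption, not the method used in the talk: it takes the textbook expected kinship values for an outbred diploid population (0.25 for full siblings, 0.125 for half-siblings, 0.0625 for first cousins) and places the boundaries at the geometric midpoints between them. Inbreeding or strong population structure would shift these expectations.

```python
# Expected kinship coefficients under outbreeding (standard values).
EXPECTED_KINSHIP = {"sibling": 0.25, "half-sibling": 0.125, "cousin": 0.0625}

def classify_pair(kinship):
    """Map a kinship coefficient to a relationship class (hypothetical helper)."""
    # Geometric midpoints between successive expectations:
    # about 0.354, 0.177, 0.088, 0.044.
    bounds = [2 ** -(d + 1.5) for d in range(4)]
    if kinship > bounds[0]:
        return "identical genotype"
    if kinship > bounds[1]:
        return "sibling"
    if kinship > bounds[2]:
        return "half-sibling"
    if kinship > bounds[3]:
        return "cousin"
    return "unrelated or more distant"
```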

11 How many mating pairs are active in a host? [Figure legend: sibling, half-sibling, cousin]
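A rough sketch of how the question could be approached computationally, assuming pairwise full-sibling calls are already available for the miracidia sampled from one host. Full siblings share both parents, so each group of miracidia linked by sibling calls corresponds to one mating pair, and the number of groups is a lower bound on the pairs active in that host. The function name and inputs are hypothetical; the talk does not describe its counting procedure.

```python
def sibling_families(samples, sibling_pairs):
    """Group samples into full-sibling families with a union-find structure.

    samples: iterable of sample names from one host.
    sibling_pairs: iterable of (a, b) pairs called as full siblings.
    Returns a sorted list of sorted families.
    """
    parent = {s: s for s in samples}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in sibling_pairs:
        parent[find(a)] = find(b)

    families = {}
    for s in parent:
        families.setdefault(find(s), []).append(s)
    return sorted(sorted(f) for f in families.values())
```

For example, four miracidia with sibling calls linking three of them yield two families, which implies at least two mating pairs among the sampled progeny.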

12 Multiple generations of infection from same source [Figure legend: sibling, half-sibling, cousin]

13 Retained infections and cross-village transmission [Figure legend: sibling, half-sibling, cousin]

14 The Lab: David Pollock, Kenji Fukushima, Aaron Wacholder, Stephen Pollard, Kristen Wade, Hamish Pike, Jackie Billotte. Key Collaborators: Elizabeth Carlton, Todd Castoe, Drew Schield, Nicky Hales, Katerina Kechris, Liu Yang, Zhong Bo. Funding: NIH, NIAID

16 Adaptation and Selection to Anthelmintic? Other? Control regime is not limited to chemotherapy

17 Schistosomiasis Emergence in Sichuan. [Map legend: no transmission; < 0.2% prevalence; < 1% prevalence; endemic; emergent] S. japonicum infections in China: 12 M to 100K with praziquantel treatment. Controlled but not eliminated. Liang et al. Bull. World Health Organ. 84, (2006).

18 Efficient Recovery of Size-targeted Fragments. [Figure: fraction recovered against fragment length (bp), one curve per coverage depth: 1x, 10x, 20x, 50x, 100x]

21 Sampling progeny of active infections leads to identification of infection sources. [Diagram levels: infection sources, active infections, collected samples]

22 Genome Sampling with ddRADseq: no stutter (relevant for SSRs); reads contain phasing information; much cheaper than whole genome sequencing; information dense; possibility of discovering new alleles.

23 Figure: Number of ddRADseq loci mapped to a scaffold versus scaffold length (scaffold-length axis ticks include 22 Kbp, 163 Kbp, and 1.2 Mbp). The data are shown for a single miracidium. Loci were counted if the doubly digested restriction fragment was between 100 and 600 bp and high-quality sequence was obtained at least 20 times. Both axes are shown on natural logarithm scales. The blue line indicates a moving average of 30 data points. This version of the figure updated by JAS on 1/10/17
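The counting rule in this caption can be sketched as follows. The record layout (scaffold name, fragment length, depth) and function names are assumptions for illustration, and the 100 to 600 bp range is treated as inclusive, which the caption does not specify.

```python
import math
from collections import Counter

def count_loci(loci, min_len=100, max_len=600, min_depth=20):
    """Count loci per scaffold that pass the fragment-length and depth filters.

    loci: iterable of (scaffold, fragment_length_bp, depth) tuples (assumed layout).
    """
    counts = Counter()
    for scaffold, frag_len, depth in loci:
        if min_len <= frag_len <= max_len and depth >= min_depth:
            counts[scaffold] += 1
    return counts

def log_points(counts, scaffold_lengths):
    """Return sorted (ln scaffold length, ln locus count) pairs for plotting."""
    return sorted(
        (math.log(scaffold_lengths[s]), math.log(n)) for s, n in counts.items()
    )

def moving_average(values, window=30):
    """Simple moving average over consecutive points; window=30 as in the caption."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Applying `moving_average` to the y-values of the sorted points gives a smoothed trend comparable to the blue line described in the caption.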

24 Figure: Frequencies of the number of ddRADseq loci mapped to a scaffold. The data are shown for a single miracidium. Loci were counted if the doubly digested restriction fragment was between 100 and 600 bp and high-quality sequence was obtained at least 20 times. The locus counts were transformed by the natural logarithm function and grouped into half-unit ranges. The numbers of discrete counts grouped in the first 5 bins are small and are displayed above the histogram bars. This version of the figure updated by JAS on 1/10/17
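A short sketch of the binning described in this caption, assuming bins start at 0 and are half a log unit wide with inclusive lower edges. The function names are hypothetical. The second helper shows why the first bins hold so few distinct counts.

```python
import math
from collections import Counter

def half_unit_histogram(locus_counts, width=0.5):
    """Bin ln(count) into ranges of the given width, keyed by lower bin edge."""
    bins = Counter()
    for n in locus_counts:
        if n > 0:  # ln is undefined for scaffolds with no loci
            bins[math.floor(math.log(n) / width) * width] += 1
    return dict(sorted(bins.items()))

def counts_in_bin(lower, width=0.5):
    """List the integer locus counts whose natural log falls in [lower, lower + width)."""
    lo = math.ceil(math.exp(lower))
    hi = math.ceil(math.exp(lower + width))
    return list(range(lo, hi))
```

Under these assumptions the first five bins cover 1, 1, 2, 3, and 5 distinct integer counts respectively (for example, only counts of 3 and 4 fall in the bin starting at 1.0), which is consistent with the caption's note that these numbers are small.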

25 Sequencing Results and Sample Filtering. [Figure: millions of reads by sample index, with a threshold at 1 M reads separating samples to keep (with additional filters) from samples to discard (for now)]

26 Locus Depth in Filtered .vcf: nearly all variants used in this set are sequenced at high depth, which allows confident genotype calling.
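A depth check of this kind can be sketched by reading the per-sample DP values from the FORMAT columns of a VCF. This is a minimal illustration using plain text parsing; the threshold of 20 is borrowed from the locus-counting rule on the earlier figure slides and is an assumption here, as are the function names.

```python
def genotype_depths(vcf_lines):
    """Collect per-sample DP values from VCF data lines.

    Skips header lines, records without sample columns, records whose
    FORMAT field lacks DP, and missing values (".").
    """
    depths = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue
        keys = fields[8].split(":")
        if "DP" not in keys:
            continue
        i = keys.index("DP")
        for sample in fields[9:]:
            vals = sample.split(":")
            if i < len(vals) and vals[i].isdigit():
                depths.append(int(vals[i]))
    return depths

def fraction_at_depth(depths, min_depth=20):
    """Fraction of called genotypes at or above min_depth (assumed threshold)."""
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)
```

A fraction close to 1 would correspond to the slide's statement that nearly all variants are sequenced at high depth.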