Plasmodium vivax. (Guerra, 2006) (Winzeler, 2008)

Similar documents
Genetic Exercises. SNPs and Population Genetics

Sequence Assembly and Alignment. Jim Noonan Department of Genetics

Introduction to Bioinformatics and Gene Expression Technologies

Chapter 15 Gene Technologies and Human Applications

March 20-23, 2010 Sacramento, CA

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

Bioinformatics Advice on Experimental Design

Sequencing technologies. Jose Blanca COMAV institute bioinf.comav.upv.es

The long walk to a malaria-free world

Goals of pharmacogenomics

Biotechnology and Genomics in Public Health. Sharon S. Krag, PhD Johns Hopkins University

A PROTEIN INTERACTION NETWORK OF THE MALARIA PARASITE PLASMODIUM FALCIPARUM

SNP calling and VCF format

Getting high-quality cytogenetic data is a SNP.

Axiom mydesign Custom Array design guide for human genotyping applications

Introductie en Toepassingen van Next-Generation Sequencing in de Klinische Virologie. Sander van Boheemen Medical Microbiology

ON-CHIP AMPLIFICATION OF GENOMIC DNA WITH SHORT TANDEM REPEAT AND SINGLE NUCLEOTIDE POLYMORPHISM ANALYSIS

Genome Sequence Assembly

Targeted Sequencing in the NBS Laboratory

Welcome to the NGS webinar series

Microarray Gene Expression Analysis at CNIO

Illumina s GWAS Roadmap: next-generation genotyping studies in the post-1kgp era

Lecture #1. Introduction to microarray technology

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Cancer Genetics Solutions

SMARTer Ultra Low RNA Kit for Illumina Sequencing Two powerful technologies combine to enable sequencing with ultra-low levels of RNA

Validation Study of FUJIFILM QuickGene System for Affymetrix GeneChip

Alignment methods. Martijn Vermaat Department of Human Genetics Center for Human and Clinical Genetics

Illumina (Solexa) Throughput: 4 Tbp in one run (5 days) Cheapest sequencing technology. Mismatch errors dominate. Cost: ~$1000 per human genme

Data Analysis with CASAVA v1.8 and the MiSeq Reporter

High Cross-Platform Genotyping Concordance of Axiom High-Density Microarrays and Eureka Low-Density Targeted NGS Assays

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1

Why can GBS be complicated? Tools for filtering, error correction and imputation.

Introduction to Bioinformatics

Introduction to Bioinformatics and Gene Expression Technology

Péter Antal Ádám Arany Bence Bolgár András Gézsi Gergely Hajós Gábor Hullám Péter Marx András Millinghoffer László Poppe Péter Sárközy BIOINFORMATICS

RADSeq Data Analysis. Through STACKS on Galaxy. Yvan Le Bras Anthony Bretaudeau Cyril Monjeaud Gildas Le Corguillé

Global response plan for pfhrp2/3 deletions

Bio Rad PCR Song Lyrics

TECHNOLOGIES, PRODUCTS & SERVICES for MOLECULAR DIAGNOSTICS, MDx ABA 298

Text S1: Validation of Plasmodium vivax genotyping based on msp1f3 and MS16 as molecular markers

Bioinformatics and computational tools

Medicines for Malaria Venture (MMV) Strategic Consultation POSITION PAPER EXPLORING COMBINATIONS OF ANTIMALARIAL DRUGS

RNA-Sequencing analysis

GENE MAPPING. Genetica per Scienze Naturali a.a prof S. Presciuttini

Molecular Biology: DNA sequencing

Analysis of Differential Gene Expression in Cattle Using mrna-seq

Ion S5 and Ion S5 XL Systems

Complementary Technologies for Precision Genetic Analysis

Targeted Sequencing Using Droplet-Based Microfluidics. Keith Brown Director, Sales

2 Gene Technologies in Our Lives

Genome-Wide Association Studies (GWAS): Computational Them

E2ES to Accelerate Next-Generation Genome Analysis in Clinical Research

Introductory Next Gen Workshop

Theory and Application of Multiple Sequence Alignments

Introduction to BioMEMS & Medical Microdevices DNA Microarrays and Lab-on-a-Chip Methods

Read Mapping and Variant Calling. Johannes Starlinger

Genetics Lecture 21 Recombinant DNA

Analysing genomes and transcriptomes using Illumina sequencing

Variation detection based on second generation sequencing data. Xin LIU Department of Science and Technology, BGI

Vivax Working Group Workplan

CSC Assignment1SequencingReview- 1109_Su N_NEXT_GENERATION_SEQUENCING.docx By Anonymous. Similarity Index

De novo Genome Assembly

Gene Expression Profiling and Validation Using Agilent SurePrint G3 Gene Expression Arrays

Target Enrichment Strategies for Next Generation Sequencing

Designing the next generation of medicines for malaria control and eradication

Lecture 8: Sequencing and SNP. Sept 15, 2006

BST227 Introduction to Statistical Genetics. Lecture 8: Variant calling from high-throughput sequencing data

Please purchase PDFcamp Printer on to remove this watermark. DNA microarray

Polymerase Chain Reaction (PCR) and Its Applications

Bioinformatics Support of Genome Sequencing Projects. Seminar in biology

Modern Epigenomics. Histone Code

Drew et al. BMC Medicine (2016) 14:144 DOI /s

Variant Detection in Next Generation Sequencing Data. John Osborne Sept 14, 2012

Growing Together: How Viruses Have Shaped Human Evolution. Shirlee Wohl Katherine Wu

Chapter 20: Biotechnology

Gene Regulation Solutions. Microarrays and Next-Generation Sequencing

Personal Genomics Platform White Paper Last Updated November 15, Executive Summary

Viruses 11/30/2015. Chapter 19. Key Concepts in Chapter 19

Stefano Monti. Workshop Format

Next Generation Sequencing Technologies. Some slides are modified from Robi Mitra s lecture notes

Aristos Aristodimou, Athos Antoniades, Constantinos Pattichis University of Cyprus David Tian, Ann Gledson, John Keane University of Manchester

Interpreting RNA-seq data (Browser Exercise II)

3.1.4 DNA Microarray Technology

Genome Sequencing. I: Methods. MMG 835, SPRING 2016 Eukaryotic Molecular Genetics. George I. Mias

Principal Component Analysis in Genomic Data

Concepts of Genetics Ninth Edition Klug, Cummings, Spencer, Palladino

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

By Dr. Zainab khalid

Gap Filling for a Human MHC Haplotype Sequence

Genome Sequencing-- Strategies

Biochemistry 412. New Strategies, Technologies, & Applications For DNA Sequencing. 12 February 2008

Expression Array System

QTL ANALYSIS OF EMAMECTIN BENZOATE SUSCEPTIBILITY IN THE SALMON LOUSE Lepeophtheirus salmonis

Next Generation Sequencing Lecture Saarbrücken, 19. March Sequencing Platforms

DNA Microarray Technology

Gary Ketner, PhD Johns Hopkins University. Treatment of Infectious Disease: Drugs and Drug Resistance

COURSE OUTLINE. School of Engineering Technology and Applied Science. Biological Technology Industrial Microbiology. Advanced Biotechnology

Pathogenic organisms no thanks: Use of next generation sequencing techniques in risk assessment and HACCP

+ + + MALARIA HIV/AIDS TUBERCULOSIS LEISHMANIASIS CHAGAS DISEASE DENGUE EBOLA ZIKA

Transcription:

Plasmodium vivax Major cause of malaria outside Africa 25 40% of clinical cases worldwide Not amenable to in vitro culture Interesting biology Hypnozoites: dormant liver stage responsible for relapses Primaquine: only drug for radical cure Unknown determinants of chloroquine resistance (Guerra, 2006) (Winzeler, 2008)

P. vivax microarray analysis P. vivax infected patient blood samples from Iquitos in Peruvian Amazon, filtered to remove human white blood cells and gametocytes DNA was extracted, amplified and hybridized to a custom whole genome P. vivax tiling array

P. vivax whole genome tiling array Custom tiling array for P. vivax >4 million probes of ~25 nucleotides each Six base pair spacing based on sequence of Salvador I strain Probes alternating sense and antisense strands 1.8 million probes corresponding to 5305 P. vivax genes TTGAGCAATTTGTACAACGTGGTGA TTAAACATGTTGCACCACTACGCGG TACAACGTGGTGATGCGCCTGAATG CACCACTACGCGGACTTACGCCTCA ATGCGCCTGAATGCGGAGTGCATTT GACTTACGCCTCACGTAAAACTCAA GCGGAGTGCATTTTGAGTTGCTACT Similar P. falciparum tiling array detects >90% SNPs genome wide (Dharia, et al. 2009, Genome Biology)

P. vivax isolate Genetic Diversity Isolate PQSJ67 Gene Groups Polymorphic Probe Rate vir 50.9% msp3 26.3% msp7 15.5% GO:0008152 metabolism 0.7%

P. vivax Drug Resistance Genes

PQSJ67 Solexa Sequencing 3 lanes of Illumina Genome Analyzer sequencing of Whole genome amplified patient sample PQSJ67 28,958,672 forward; 28,958,672 reverse; 57,917,344 total reads Aligned reads to reference genome sequence using Bowtie software (Langmead et al, 2009) Genome P. vivax Genome PlasmoDB 5.5 total unique Human Genome total unique Reads 23,182,796 22,427,291 26,530,335 25,774,830 % of Total Reads 40.03% 38.72% 45.81% 44.50% Unaligned Variable P. vivax sequences? 8,959,718 15.47% sequenced by John Walker s Group

Human and Parasite DNA Cell DNA Content of Cell Reads per Mb Average Coverage Relative Number of Cells Relative Gene Copy Number Human nucleated cell 6.4 Gb (diploid) 4,027 0.16 1 1 P. vivax parasite 26.8 Mb (haploid) 836,839 33.47 208 104 Human erythrocytes with 0.1% parasitemia median(raw intensity) ± 25 percentiles 4500 4000 3500 3000 2500 2000 1500 1000 500 0 Affy bg probes Sal1 Sal1 PQSJ67 Human probes P. Vivax probes Bacterial probes none except in parasitized cells Affy bg probes Human probes P. vivax probes log 2 (intensity Sal1 ) 207,791 Reference strain Patient isolate log 2 (intensity PQSJ67 )

Pileup Alignment of P. vivax Reads Unique Reads Average Coverage 22,427,291 33.47

Low coverage of variable genes

Comparison of SNP Calls Used Maq software (Li, 2008) to find SNPs in Solexa alignment SNP Quality scores based on base call and read mapping 69,816 total calls 41,641 Mixed SNPs, Mostly sequencing errors 28,175 unmixed calls, 24,181 covered by 3 or more unique probes We detect over 82% of all Solexa SNPs at p value<1x10 2, 65% at <1x10 5 4500 4000 3500 3000 2500 2000 1500 1000 500 0 Missed by microarray Detected by microarray, p<1e-2 Detected by microarray, p<1e-5 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90 90-100 100-110 110-120 120-130 130-140 140-150 150-160 160-170 170-180 180-190 190-200 200-210 210-220 220-230 230-240 240-250 250-260 Maq Score

Gene SNP Rates and SFP Rates Positive Correlation between the two technologies Higher SFP rates, since one SNP will overlap multiple microarray probes, resulting in multiple SFPs

Confirmed SNPs in resistance genes

Conclusions New technologies for P. vivax will help us generate a more complete picture of genetic diversity. Whole genome data allows us to study to geographical parasite populations at an unprecedented level of detail. Highly variable genes under positive selection by the host immune system as potential vaccine candidate antigens. Will dramatically accelerate drug development and discovery of resistance mechanisms. Loss of heterozygosity, limited diversity, and linkage disequilibrium will indicate regions under selection in drug resistant samples. New SNP genotyping assays can be used for tracking the spread of drug resistance in populations.

Microarray vs Sequencing Microarray advantages Faster, 2 days for sample prep and all data analysis Reproducible, consistent, easy data analysis Cheaper per array, but greater cost investment to design and purchase the arrays in large quantities Microarray disadvantages Don t know the exact polymorphism, only differences from the reference genome sequence Have to re sequence polymorphisms in regions of interest Sequencing advantages Identify the exact position and base pair change Can identify new sequences, with de novo assembly Sequencing disadvantages Data analysis is difficult and time consuming with free software Not user friendly, need bioinformatics expertise Difficult to identify indels, highly variable regions have no coverage

Acknowledgements Joseph M. Vinetz UCSD Colleen McClean (Peru) Raul Chuquiyauri (Peru/UCSD) Anonymous malaria patients of Iquitos, Peru Elizabeth A. Winzeler The Scripps Research Inst. Neekesh V. Dharia Yingyao Zhou (GNF) Selina E.R. Bopp Stephan Meister Shailendra K. Sharma A. Taylor Bright Gonzalo E. González Páez Naval Medical Research Center Detachment (Peru) David J. Bacon Paul Graf Centers for Disease Control John Barnwell William Collins Steve Hoffman (Sanaria, Inc.)