NGS to address ncRNA and viruses


1 NGS to address ncRNA and viruses. Outline: introduction & TRON; next generation sequencing; transcriptomics; ncRNAs; vRNA. June 30, 2010. John Castle, Institute for Translational Oncology and Immunology (TRON), Mainz, Germany

2 Chris Raymond, Chris Armour, Matt Biery, Michael Koslowski, Martin Löwer, Bernhard Renard, Ludmila Schemarow, Marius Byl


5 Improve human life through personalized healthcare: R&D, translational science, products, healthcare, education. TRON: Institute for Translational Oncology and Immunology. CIMT: Association for Cancer Immunotherapy. Universität Mainz & Uni. Medizin. Ganymed AG: anti-tumor mAbs. BioNTech AG: cancer immunotherapies and biomarkers. CI3: Cluster für Individualisierte Immunointervention (Cluster for Individualized Immune Intervention).

6 Institute for Translational Oncology and Immunology (TRON). Directors: U. Sahin, Ö. Türeci, & C. Huber. Ideas; TRON incubator; products that impact patient healthcare (e.g., drugs, diagnostics, companies, information). Open positions (posted at Nature Jobs, Job Vector).

7 Goal: sequencing as clinical assay for immediate impact and long-term healthcare improvement. Clinical assay: immediate patient & healthcare gain for clinical decision making, including diagnosis and therapy selection. R&D development: drug and biomarker development for long-term patient & healthcare gain.

8 HiSeq instrument

9 Sample preparation, then sequencing, yielding billions of sequence reads (e.g., ACTCTACTACTACAACCCA, ATATCTAGCTAGCTACGTG, ACTGACTGATCGTGAACCC, GCTGCTAGCTAGCTGCTAG, CATGCTAGCTAGCTAGCAC, CATGCATCGTAGCTCGACC, ACGTACGCGACAGTTTCAC, CGCATGGTCGTAGCTACTA), then analysis and interpretation.

11 Analysis and interpretation (examples). Billions of sequence reads; bioinformatics & statistics; interpretation & application. Genome alignment: BWA, BOWTIE, MAQ, ELAND, MOSAIK. Genome assembly: VELVET, ALLPATHS, VAAL. Methylation. SNP calls, population SNPs, somatic mutations. DNA structural variation: CNV, insertions / deletions, breakpoints / connections. Transcriptome alignment: gene & isoform expression, allele-specific expression. ChIP-Seq & CLIP-Seq: DNA and RNA binding.

12 Whole transcriptome profiling. Better biomarkers (e.g., for patient stratification): e.g., which isoforms of APP, PSEN1/2, BACE1/2, GSK3B, and MAPT are Alzheimer's BMx? New drug targets: e.g., tumor-specific extracellular exons as mAb targets. New biological and disease pathways and networks. Transcriptome: protein-coding transcripts; non-coding RNAs (miRNA, other ncRNA families, antisense RNAs); degraded transcripts; alternative polyadenylation; alternative splicing.

13 Human ncRNA. Digital genome-wide ncRNA expression, including snoRNAs, across 11 human tissues using polyA-neutral amplification

14 Not-so-random primer technology: NSR (Raymond, Armour, Castle). Of 4096 hexamers, 2781 cytoplasmic rRNA hexamers and 566 mitochondrial rRNA hexamers are set aside, leaving 749 whole transcriptome 1st strand hexamers and 749 whole transcriptome 2nd strand hexamers. Steps: 1st strand cDNA, 2nd strand cDNA, PCR. Random-primed cDNA sequence: cytoplasmic rRNA 67%, non-rRNA 22%, mitochondrial rRNA 11%. NSR-primed cDNA sequence: non-rRNA 87%, cytoplasmic rRNA 11%, mitochondrial rRNA 2%.
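
The hexamer arithmetic on this slide (4096 - 2781 - 566 = 749) can be illustrated with a minimal Python sketch. It assumes, as an illustration only, that a hexamer is excluded when its reverse complement occurs anywhere in an rRNA sequence; the slide does not state the exact matching rule. The function names and the toy rRNA string are hypothetical, not the authors' code or data.

```python
from itertools import product


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def all_hexamers():
    """Enumerate all 4**6 = 4096 possible DNA hexamers."""
    return ["".join(p) for p in product("ACGT", repeat=6)]


def nsr_primers(rrna_sequences):
    """Keep hexamers with no perfect match to any supplied rRNA sequence.

    Assumption: a primer anneals to its reverse complement, so a hexamer is
    blocked when its reverse complement occurs in an rRNA sequence.
    """
    blocked = set()
    for rrna in rrna_sequences:
        for i in range(len(rrna) - 5):
            blocked.add(reverse_complement(rrna[i:i + 6]))
    return [h for h in all_hexamers() if h not in blocked]


# Hypothetical toy "rRNA" fragment, for illustration only.
toy_rrna = ["ACGTACGTAC"]
kept = nsr_primers(toy_rrna)
print(len(all_hexamers()), len(kept))
```

With real cytoplasmic and mitochondrial rRNA sequences as input, the same filter would shrink the primer pool far more than this toy fragment does; the counts on the slide come from the authors' own analysis.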

15 Whole transcriptome profiling (Raymond, Armour, Castle)

16 ncRNA expression compendium. Panels: HY3, snR39B, BC, TERC (telomerase RNA component); axes: RPKM, qPCR. RPKM = reads per thousand nt per million aligned reads, a measure of gene expression.
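
The RPKM definition on this slide translates directly into a short calculation. The sketch below is a minimal Python illustration; the function and parameter names are chosen here and do not come from the slides.

```python
def rpkm(read_count, transcript_length_nt, total_aligned_reads):
    """Reads per thousand nt of transcript per million aligned reads."""
    if transcript_length_nt <= 0 or total_aligned_reads <= 0:
        raise ValueError("length and total aligned reads must be positive")
    per_kilobase = read_count / (transcript_length_nt / 1e3)
    return per_kilobase / (total_aligned_reads / 1e6)


# 500 reads on a 2000-nt transcript in a run with 10 million aligned reads
example = rpkm(500, 2000, 10_000_000)
print(example)
```

Dividing by transcript length makes long and short transcripts comparable; dividing by the number of aligned reads makes runs of different depth comparable.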

17 Prader-Willi Syndrome (PWS), chr15q11-13. Tracks: known genes; read coverage from NGS, + strand; read coverage from NGS, - strand.

18 Prader-Willi Syndrome, chr15q11-13: mRNA gene expression of NDN and UBE3A across tissues (adipose, colon, heart, hypothalamus, kidney, liver, lung, ovary, skeletal muscle, spleen, testes).

19 Prader-Willi Syndrome, chr15q11-13: snoRNA HBII. Axes: qPCR, RPKM.

20 Prader-Willi Syndrome, chr15q11-13: snoRNA HBII-85, groups I, II, and III. Locus labels include HBII-438A, HBII-85, HBII-52, and HBII-438B. Panels: HBII-85 (group I) expression, HBII-85 (group II) expression, HBII-85 (group III) expression across tissues (adipose, colon, heart, hypothalamus, kidney, liver, lung, ovary, skeletal muscle, spleen, testes). Axes: qPCR, RPKM.

21 Prader-Willi Syndrome, chr15q11-13: novel transcription in the disease locus (reads per million aligned reads).

22 SARS: detect vRNA in host tissues. With Chris Raymond, Chris Armour, and the Katze lab, Univ. of Washington

23 NGS of infected samples (+ SARS): sequence RNA. With Katze lab, Univ. of Washington

24 NGS of infected samples (+ SARS): sequence RNA; align to Mm genome.

25 NGS of infected samples (+ SARS): sequence RNA; align to Mm genome; mouse gene expression. Steps: 1. Count reads that overlap each gene. 2. Estimate uncertainty. 3. Normalize. 4. Compare across samples, calculating differences, p-values, t-scores. Genes involved in the immune response to viral infection are expressed at higher levels in SARS-infected samples, e.g., Ifit1 (interferon-induced protein with tetratricopeptide repeats 1) and Mx1 (myxovirus (influenza virus) resistance 1). Table of RefSeq transcripts (NM_ and NR_ accessions) with columns: Transcript; - SARS counts and uncertainty (+/-); + SARS counts and uncertainty (+/-); difference T-score and P-value.
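
The four analysis steps on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method used for the slide: uncertainty is assumed to be Poisson counting error (square root of the count, with a floor of 1 so zero counts still carry an error), normalization is assumed to be reads per million aligned reads, and the p-value uses a two-sided normal approximation. All function names are hypothetical.

```python
import math


def count_overlapping_reads(reads, gene_start, gene_end):
    """Step 1: count reads (start, end) overlapping [gene_start, gene_end]."""
    return sum(1 for start, end in reads
               if start <= gene_end and end >= gene_start)


def compare_gene(count_a, total_a, count_b, total_b):
    """Steps 2-4 for one gene in sample A (e.g., - SARS) and B (e.g., + SARS)."""
    # Step 2 (assumption): Poisson counting error, floored at 1 for zero counts.
    err_a = math.sqrt(max(count_a, 1))
    err_b = math.sqrt(max(count_b, 1))
    # Step 3 (assumption): normalize to reads per million aligned reads.
    scale_a, scale_b = 1e6 / total_a, 1e6 / total_b
    norm_a, norm_b = count_a * scale_a, count_b * scale_b
    nerr_a, nerr_b = err_a * scale_a, err_b * scale_b
    # Step 4: difference, t-score, and two-sided p-value (normal approximation).
    diff = norm_b - norm_a
    t_score = diff / math.hypot(nerr_a, nerr_b)
    p_value = math.erfc(abs(t_score) / math.sqrt(2))
    return {"difference": diff, "t_score": t_score, "p_value": p_value}


n_reads = count_overlapping_reads([(1, 10), (20, 30), (28, 40)], 25, 35)
induced = compare_gene(100, 1_000_000, 400, 1_000_000)
unchanged = compare_gene(50, 1_000_000, 50, 1_000_000)
print(n_reads, induced, unchanged)
```

A production analysis would typically use replicate-aware statistics and multiple-testing correction; the sketch only shows how counts, errors, normalization, and a test statistic fit together.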

26 NGS of infected samples (+ SARS): sequence RNA; align to Mm genome; mouse gene expression. Transcripts with ZERO counts in uninfected samples, e.g., Gbp10 (guanylate-binding protein 10) and H2-Ea (histocompatibility 2, class II antigen E alpha). Table of RefSeq transcripts (NM_ and NR_ accessions) with columns: Transcript; Counts, w/o SARS; Error, w/o SARS; Counts, w/ SARS; Error, w/ SARS; T-score; P-value. With Katze lab, Univ. of Washington

27 NGS of infected samples (+ SARS): sequence RNA; align to Mm genome; mouse gene expression. With Katze lab, Univ. of Washington

28 NGS of infected samples (+ SARS): sequence RNA; align to Mm genome; mouse gene expression; align to viral genomes.
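
The idea of aligning reads to the host genome and to viral genomes can be illustrated with a toy Python sketch. It is a stand-in only: exact substring matching on either strand replaces a real aligner, and the choice to test viral genomes only for reads that do not match the host is an assumption, not something the slide states. The sequences and the name `virus_x` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def assign_reads(reads, host_genome, viral_genomes):
    """Toy read assignment by exact match on either strand.

    Reads matching the host are counted as host; remaining reads are
    tested against each viral genome in turn (assumed order of operations).
    """
    counts = {"host": 0, "unassigned": 0}
    counts.update({name: 0 for name in viral_genomes})
    for read in reads:
        candidates = (read, reverse_complement(read))
        if any(c in host_genome for c in candidates):
            counts["host"] += 1
            continue
        for name, genome in viral_genomes.items():
            if any(c in genome for c in candidates):
                counts[name] += 1
                break
        else:
            counts["unassigned"] += 1
    return counts


# Hypothetical toy sequences, for illustration only.
host = "ACGTACGTTTGACCA"
viruses = {"virus_x": "GGGCCCATATAGC"}
result = assign_reads(["ACGTT", "CCCATA", "GCTATAT", "TTTTTTT"], host, viruses)
print(result)
```

Real data would require a mismatch-tolerant aligner such as those named on slide 11, since reads carry sequencing errors and viral genomes vary between isolates.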


31 Data integration. Inputs: sequencer, oligo microarray, peptide array, ELISPOT, clinical data, external. Stages: measurement data (file system / database); integration & analysis; results (file system / database); interpretation, report, impact. Stakeholders and outputs: clinicians, patients, researchers, biomarkers, drugs.

32 Summary. Mainz: exciting genomic applications to translational medicine. Next generation sequencing: outstanding tool. Challenges remain to manage data, integrate clinical annotation, and move results (and platform?) into clinical use.