Bioinformatics: state-of-the-art tools for NGS immunogenetics


Nikos Darzentas, Ph.D.
CEITEC MU, Brno, Czech Republic
bat.infspire.org
nikos.darzentas@gmail.com
Ministry of Health of the Czech Republic, grant # 16 34272A
CEITEC MU; ESLHO::EuroClonality NGS; MetaCentrum Virtual Organization of CESNET

Novel Computational Tools for Identifying Stereotyped Subsets in Chronic Lymphocytic Leukemia
Nikos Darzentas, Computational Genomics Unit (CGU), Centre for Research and Technology Hellas (CERTH), Greece, ndarz@certh.gr
Educational Workshop on Immunoglobulin Gene Analysis in Chronic Lymphocytic Leukemia, June 14-15, Uppsala, Sweden

e.g. eventually link to: ARResT/AssignSubsets, other related tools, diverse datasets / projects, literature

unique challenges of NGS immunogenetics
* enormous inherent complexity, huge diversity and temporal variation of immune responses
* highly non-trivial annotation vs. multiple germline sequences (from IMGT!) and of the rearrangement junction
* wide variety of applications, many with their own needs: basic research; technology development, e.g. primers and assays; diagnostic / clinical; MRD and clonality assessment and monitoring; repertoire studies
* errors and biases of protocols and humans, in data and results

generic and specific challenges for bioinformatics
* modularity and flexibility, to support the many applications
* multiplexing, i.e. many receptors and chains and junction classes (e.g. incomplete)
* interpretation
  * clonotype definition, with implications for assessment of clonality
  * thresholds and cut-offs and normalisations, incl. what to consider for relative abundances
* visualisation and user interaction
* detailed logging and reporting, for development and troubleshooting, but also interpretation and record keeping
* efficiency, although this is not as big a challenge as with e.g. full genome NGS
* foolproofing, esp. challenging for very diverse applications, data, and users
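To illustrate why the clonotype definition matters for relative abundances, here is a minimal sketch. The reads, gene names, and the two candidate definitions (junction only vs. V gene + J gene + junction) are assumptions made for illustration; the slides do not state which definition the pipeline uses.

```python
from collections import Counter

# Hypothetical annotated reads: (V gene, J gene, junction nucleotide sequence).
reads = [
    ("IGHV3-23", "IGHJ4", "TGTGCGAAAGATTACTGG"),
    ("IGHV3-23", "IGHJ4", "TGTGCGAAAGATTACTGG"),
    ("IGHV3-23", "IGHJ6", "TGTGCGAAAGATTACTGG"),
    ("IGHV1-69", "IGHJ4", "TGTGCGAGAGGGTACTGG"),
]


def clonotype_abundances(reads, key):
    """Group reads by a clonotype key and return relative abundances."""
    counts = Counter(key(read) for read in reads)
    total = sum(counts.values())
    return {clonotype: n / total for clonotype, n in counts.items()}


# Definition 1 (assumption): a clonotype is the junction sequence alone.
by_junction = clonotype_abundances(reads, key=lambda r: r[2])

# Definition 2 (assumption): a clonotype is V gene + J gene + junction.
by_v_j_junction = clonotype_abundances(reads, key=lambda r: r)

print(max(by_junction.values()))      # top clonotype, definition 1
print(max(by_v_j_junction.values()))  # top clonotype, definition 2
```

The same data give a different top clonotype abundance under each definition, which is why any clonality threshold has to be stated together with the definition it was computed under.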

junction classes

normal
  IG VJ : Vh (Dh) Jh
  IG VJ : Vk Jk
  IG VJ : Vl Jl
  TR VJ : Va Ja
  TR VJ : Va Jd
  TR VJ : Vb (Db) Jb
  TR VJ : Vd (Dd) Ja
  TR VJ : Vd (Dd) Jd
  TR VJ : Vg Jg

incomplete
  IG DJ : Dh Jh
  TR DD : Dd2 Dd3
  TR DJ : Db Jb
  TR DJ : Dd2 Jd
  TR DJ : Dd Ja

special
  IG INTRON KDE
  IG Vk KDE
  TR VD : Vd Dd3
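The junction classes above can be encoded as a simple lookup, sketched here. The gene-type pairs are taken from the slide; the function name, the "unknown" fallback, and the choice to key on the 5' and 3' gene types only are assumptions for illustration.

```python
# (5' gene type, 3' gene type) -> junction class, as listed on the slide.
# D genes shown in brackets on the slide, e.g. "Vh (Dh) Jh", sit between the
# two flanking genes and are not needed for this lookup.
JUNCTION_CLASSES = {
    ("Vh", "Jh"): "normal",
    ("Vk", "Jk"): "normal",
    ("Vl", "Jl"): "normal",
    ("Va", "Ja"): "normal",
    ("Va", "Jd"): "normal",
    ("Vb", "Jb"): "normal",
    ("Vd", "Ja"): "normal",
    ("Vd", "Jd"): "normal",
    ("Vg", "Jg"): "normal",
    ("Dh", "Jh"): "incomplete",
    ("Dd2", "Dd3"): "incomplete",
    ("Db", "Jb"): "incomplete",
    ("Dd2", "Jd"): "incomplete",
    ("Dd", "Ja"): "incomplete",
    ("INTRON", "KDE"): "special",
    ("Vk", "KDE"): "special",
    ("Vd", "Dd3"): "special",
}


def classify_junction(five_prime, three_prime):
    """Return the junction class for a pair of gene types (hypothetical helper)."""
    return JUNCTION_CLASSES.get((five_prime, three_prime), "unknown")


print(classify_junction("Vh", "Jh"))
print(classify_junction("Dh", "Jh"))
print(classify_junction("Vk", "KDE"))
```

A pair that is not in the table, such as Vg with Jh, falls through to "unknown" rather than raising, which suits a pipeline that has to log unexpected combinations.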

bioinformatic platform
ARResT, or Antigen Receptors Research Tool: bat.infspire.org/arrest/
focused on low-throughput sequences, and CLL, mainly with IgCLL:
* ARResT/Teiresias: discovering new subsets of stereotyped sequences
* ARResT/SeqCure: curating antigen receptor sequences
* ARResT/AssignSubsets: assigning new members to existing subsets of stereotyped sequences
specifically for NGS, and within ESLHO's EuroClonality NGS consortium (coordinated by Ton):
* ARResT/Interrogate: web-accessible, interactive, and integrating a data-producing pipeline and a results browser

user experience
* user interactivity, esp. when users and questions can be diverse, as is the case here
* user messaging system, which will react to user actions and share info, advice, notes, tips, warnings, and errors
* user modes, e.g. simple, advanced, don't even bother, diagnostics, etc.
* application-specific modes, e.g. clonality and MRD

visualisations

a) heatmaps: sample dynamics [figure: sample vs. sample heatmap for diagnostic, prior to SCT, donor, and after SCT samples; then you can directly mix and match, e.g. sample vs. feature]
b) line chart: MRD kinetics after SCT [figure: log-scale axis; annotation: single read / NGS depth]
Graft versus leukemia effects in T prolymphocytic leukemia: evidence from MRD kinetics and TCR repertoire analyses. Sellner, Brüggemann, Schlitt, Knecht, Herrmann, Reigl, Krejci, Bystry, Darzentas et al. (submitted)

sequence forensics: sequence search + network of sequence differences
or: assessment, monitoring and quantification of rearrangements of interest
* sensitive (no heuristics)
* smart (rearrangement network aware distance calculation)
* NGS-enabled (normalisation based on experimental setup, incl. MRD spike-ins)
* adaptive (small/big, e.g. MRD/clonality, data)
* interactive (user control of final results)
[figure: reads vs. distance to target]

sequence forensics: user control
* change distance threshold, add more sequences to clone, get final %s

sequence forensics: with network-connected interactive multiple alignments, and differences highlighted

sequence forensics: and the ability to simplify the network and reduce the data to a manageable summary
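The idea of "reads vs. distance to target" with a user-controlled threshold can be sketched as follows. This is not the ARResT/Interrogate implementation: a plain Levenshtein edit distance stands in for the rearrangement network aware distance mentioned on the slide, and the sequences, counts, and function names are hypothetical.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (stand-in for the tool's own distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def clone_percentage(read_counts, target, max_distance):
    """Collect sequences within max_distance of target and their share of reads."""
    total = sum(read_counts.values())
    members = {seq: n for seq, n in read_counts.items()
               if edit_distance(seq, target) <= max_distance}
    return members, 100.0 * sum(members.values()) / total


# Hypothetical junction sequences with read counts.
target = "TGTGCGAAAGATTACTGG"
read_counts = {
    "TGTGCGAAAGATTACTGG": 90,   # the target itself
    "TGTGCGAAAGATTATTGG": 6,    # one substitution away
    "TGTGCGAAGATTACTGG": 2,     # one deletion away
    "TGTGCCAGGGGGCTTTGG": 102,  # unrelated rearrangement
}

# Changing the threshold changes which sequences join the clone, and the final %.
for threshold in (0, 1):
    members, pct = clone_percentage(read_counts, target, threshold)
    print(threshold, len(members), pct)
```

Comparing every sequence to the target without heuristics is affordable here because junctions are short; the threshold stays a user decision, as on the slide.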

primers
specific functionality of the pipeline to:
* identify primers, also taking into account expected coordinates
* report their frequencies and characteristics, i.e. score and position statistics
* trim sequenced reads to before or after the primer, i.e. leaving on or removing primer
  * leaving primer on, even if artificial, might help in identifying rearrangements
then, primer development:
[figure: IGHV1, IGHV2, IGHV3, IGHV4 under experimental conditions A1 and A2]

primers
also usable as controls for the health of an NGS run, compared to a gold-standard dataset [figure]
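A minimal sketch of primer identification and trimming as described on the primer slides, under stated assumptions: the primer sequence, the mismatch tolerance, the search window around the expected coordinate, and both function names are hypothetical, and a simple mismatch count stands in for whatever scoring the pipeline actually uses.

```python
def find_primer(read, primer, expected_start, window=3, max_mismatches=2):
    """Search near expected_start; return (start, mismatches) of the best hit or None."""
    best = None
    first = max(0, expected_start - window)
    last = min(len(read) - len(primer), expected_start + window)
    for start in range(first, last + 1):
        segment = read[start:start + len(primer)]
        mismatches = sum(a != b for a, b in zip(segment, primer))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


def trim_read(read, primer, hit, keep_primer=True):
    """Trim the read to start at the primer (kept) or right after it (removed)."""
    start, _ = hit
    return read[start:] if keep_primer else read[start + len(primer):]


# Hypothetical primer and read: two leading bases, primer with one error, then insert.
primer = "GGTCCCTGAGACTCTCC"
read = "AC" + "GGTCCCTGAGACTGTCC" + "TGTGCAGCC"

hit = find_primer(read, primer, expected_start=0)
print(hit)
print(trim_read(read, primer, hit, keep_primer=True))
print(trim_read(read, primer, hit, keep_primer=False))
```

Collecting the returned start positions and mismatch counts over all reads would give the per-primer frequency, score, and position statistics that the slide mentions, and those could then be compared against a reference run.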

data quality
our current strategy: keep as much as you can until the end
* e.g. paired-end joining, sequence length, sequence quality
PCR and NGS errors can hurt specific applications, e.g. SHM and evolution
* error correction is (arguably) a rather theoretical exercise unless helped by lab work (e.g. unique molecular identifiers / barcodes)
contamination, wet but also digital
* usual demultiplexing does not handle noise well, leading to assignment of reads to no or wrong samples
=> more statistics
* strength in numbers, and replicates, and experimental design in general
* normalise and/or filter on abundance with experimental information (spike-in controls, number of cells, or amount of DNA)
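One way to normalise abundance with experimental information is sketched below, assuming a spike-in control added at a known copy number and a known number of input cells. The linear reads-to-copies scaling, the numbers, and the function names are assumptions for illustration, not the pipeline's documented method.

```python
def estimate_copies(clone_reads, spike_reads, spike_copies):
    """Scale clone reads to copies using the spike-in reads-per-copy ratio."""
    if spike_reads <= 0:
        raise ValueError("spike-in was not detected; cannot normalise")
    return clone_reads * spike_copies / spike_reads


def abundance_per_cell(clone_reads, spike_reads, spike_copies, input_cells):
    """Estimated clone copies per input cell (assumes one rearrangement per cell)."""
    return estimate_copies(clone_reads, spike_reads, spike_copies) / input_cells


# Hypothetical run: 40 spike-in copies gave 2000 reads; the clone gave 500 reads.
copies = estimate_copies(clone_reads=500, spike_reads=2000, spike_copies=40)
level = abundance_per_cell(500, 2000, 40, input_cells=100_000)
print(copies, level)
```

Unlike a plain fraction of reads, this estimate does not shift when unrelated rearrangements take up more or less of the sequencing depth.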

EuroClonality NGS: standardisation, and SOPs (standard operating procedures)
this can involve, for example:
* predetermined computations with predetermined options, i.e. locked scenarios and even sample sheets with complete control of complicated runs
* centrally available, curated sequences
  * primers, for development work, and batch quality control
  * spike-ins + copy numbers, see MRD quantification

capture-based enrichment and NGS
conceptually elegant and practically useful, and can be used to create expert panels of genes
if probes (or primers) for the IG locus are designed, IG rearrangements could be sequenced as well, incl. incomplete ones
two main challenges (if applicable):
* with probe-based capture, fragments and thus reads are not centered around the same area, and thus reconstruction of the rearrangements might be needed
* with paired-end NGS, depending on read and fragment lengths, identifying a junction might be difficult, e.g. with non-overlapping reads, but still reporting the rearrangement as a translocation event may be useful
data seen so far show neither breadth nor depth of rearrangements, but major clones are usually found

access to ARResT/Interrogate
* manuscript for browser under revision, pipeline+browser to follow (he said)
* code on GitHub (not the whole platform yet, but soon)
* contact us: nikos.darzentas@gmail.com, bat@infspire.org, bat.infspire.org
* eventually, also shared as a EuroClonality NGS validated and standardised platform
for other ARResT tools, and an overview: bat.infspire.org/arrest

acknowledging
all people in my bioinformatics team: Vojta Bystry, Tomas Reigl, Adam Krejci, Andrea Grioni, and previously Bara and Martin
and all our friends, collaborators and colleagues, including many you probably already know: Lesley, Anastasia, Andreas, Vassilis, Panagiotis, et al.
and across many networks:
C. Belessi (Athens), F. Davi (Paris), P. Ghia (Milan), R. Rosenquist (Uppsala), K. Stamatopoulos (Thessaloniki), M-P. Lefranc, V. Giudicelli (Montpellier)
BIOMED II
A. Langerak, J. van Dongen (Rotterdam+), M. Brüggemann, C. Pott (Kiel), G. Cazzaniga (Monza), F. Davi (Paris), D. Gonzalez (London), P. Groenen (Nijmegen), M-P. Lefranc, V. Giudicelli (Montpellier), K. Stamatopoulos (Thessaloniki)