Detection of Fusion Genes by Targeted Roche 454 Sequencing

1 Detection of Fusion Genes by Targeted Roche 454 Sequencing
Hans-Ulrich Klein 1, Christoph Bartenhagen 1, Alexander Kohlmann 2, Vera Grossmann 2, Christian Ruckert 1, Torsten Haferlach 2, Martin Dugas 1
1 Department of Medical Informatics and Biomathematics, University of Münster
2 Munich Leukemia Laboratory
55th GMDS Annual Meeting (55. GMDS Jahrestagung), 07 September 2010

2 Motivation
Fusion genes caused by structural variants are an important characteristic for the prognosis and therapy of cancer.
Study at the Munich Leukemia Laboratory: evaluation of Roche 454 sequencing to detect structural variants
Implement a bioinformatics workflow for the detection of structural variants (the method should not make use of prior knowledge)
Assess the reliability of the technology
[Figure: Structural variants]

3 Experimental design
Targeted sequencing by capture arrays:
A1: RUNX1, CBFB, MLL + exons of 92 other genes (5 samples)
A2: MLL (10 samples)
A3: RUNX1 (5 samples)
A4: PDGFRB (2 samples)
reads per sample in median
325 bp median read length

4 Analysis work flow
1 Image and signal processing
2 Preprocessing
3 Sequence alignment
4 Filter chimeric reads
5 Detect putative breakpoints
6 Visualization of chromosomal aberrations

5 Sequence alignment
Common aligners for 454: SSAHA2, BLAT, BWA-SW
We used BWA-SW (H. Li, R. Durbin, Bioinformatics 2010)
Only the best sequence alignment is reported
Alignments are non-overlapping on the query sequence
[Table: Aligned Reads, On Target, Cvg., Chimeric. Surviving values: A (91.6%) 63.0%; A (90.8%) 5.1%]
Filter chimeric reads
1 No more than two local alignments
2 At least one segment must align to the target region
3 No linker sequence between local alignments
4 Local alignments must not be located too close to each other (distance > 1 kb)
5 Remove duplicated reads
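The five filter rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the R453Plus1Toolbox implementation: the `LocalAlignment` and `Read` types, the `linker` argument, the reading of rule 3 (the unaligned stretch between the two segments must not contain the linker) and the duplicate criterion (identical alignment coordinates) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LocalAlignment:
    chrom: str
    ref_start: int    # reference start, 0-based
    ref_end: int      # reference end, exclusive
    strand: str       # "+" or "-"
    query_start: int  # start of the aligned segment on the read
    query_end: int    # end of the aligned segment on the read, exclusive


@dataclass(frozen=True)
class Read:
    name: str
    sequence: str
    alignments: Tuple[LocalAlignment, ...]


Target = Tuple[str, int, int]  # (chrom, start, end) of a captured region


def on_target(aln: LocalAlignment, targets: List[Target]) -> bool:
    """True if the alignment overlaps any captured target region."""
    return any(aln.chrom == c and aln.ref_start < e and s < aln.ref_end
               for c, s, e in targets)


def is_candidate(read: Read, targets: List[Target],
                 linker: str, min_distance: int) -> bool:
    # Rule 1: a chimeric read has two local alignments; more are rejected.
    if len(read.alignments) != 2:
        return False
    a, b = sorted(read.alignments, key=lambda x: x.query_start)
    # Rule 2: at least one segment must align to the target region.
    if not (on_target(a, targets) or on_target(b, targets)):
        return False
    # Rule 3 (assumed reading): no linker in the stretch between segments.
    if linker and linker in read.sequence[a.query_end:b.query_start]:
        return False
    # Rule 4: segments on the same chromosome must be more than 1 kb apart.
    if a.chrom == b.chrom:
        gap = max(a.ref_start, b.ref_start) - min(a.ref_end, b.ref_end)
        if gap <= min_distance:
            return False
    return True


def filter_chimeric(reads: List[Read], targets: List[Target],
                    linker: str = "", min_distance: int = 1000) -> List[Read]:
    seen = set()
    kept = []
    for read in reads:
        if not is_candidate(read, targets, linker, min_distance):
            continue
        # Rule 5 (assumed criterion): identical coordinates mark a duplicate.
        key = tuple(sorted((x.chrom, x.ref_start, x.ref_end, x.strand)
                           for x in read.alignments))
        if key in seen:
            continue
        seen.add(key)
        kept.append(read)
    return kept
```

Calling `filter_chimeric(reads, targets)` returns the reads that pass all five rules in input order; the 1 kb threshold is exposed as `min_distance` so it can be varied.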

6 Cluster chimeric reads
Define the distance d between two chimeric reads x and y:
$$ d(x, y) = \begin{cases} \infty & \text{if chrom. or strand and orientation are not compatible,} \\ (x_A - y_A)^2 + (x_B - y_B)^2 & \text{else.} \end{cases} $$
Here x_A, y_A are the positions of reads x and y on Chr A, and x_B, y_B their positions on Chr B.
Hierarchical clustering, cut dendrogram at d = 100
Compute consensus breakpoint for each cluster
Merge breakpoints presumably caused by the same variant
[Figure: reads x and y aligned to Chr A and Chr B, with positions x_A, y_A, x_B, y_B marked]
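The clustering step can be sketched in Python as below. The slides do not state the linkage method or how the consensus breakpoint is computed, so this sketch assumes single linkage (which, cut at d = 100, is equivalent to taking connected components of read pairs with d <= 100) and uses the lower median of the positions as the consensus. The `ChimericRead` type is hypothetical, and compatibility is simplified to identical chromosomes and strands on both sides.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from statistics import median_low
from typing import List, Tuple


@dataclass(frozen=True)
class ChimericRead:
    chrom_a: str
    pos_a: int     # breakpoint position on Chr A
    strand_a: str
    chrom_b: str
    pos_b: int     # breakpoint position on Chr B
    strand_b: str


def distance(x: ChimericRead, y: ChimericRead) -> float:
    """Distance d from the slide; infinite for incompatible reads."""
    same_layout = ((x.chrom_a, x.strand_a, x.chrom_b, x.strand_b) ==
                   (y.chrom_a, y.strand_a, y.chrom_b, y.strand_b))
    if not same_layout:
        return math.inf
    return (x.pos_a - y.pos_a) ** 2 + (x.pos_b - y.pos_b) ** 2


def cluster_reads(reads: List[ChimericRead],
                  cutoff: float = 100) -> List[List[ChimericRead]]:
    """Single-linkage clusters at the given cutoff, largest first."""
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(reads)):
        for j in range(i + 1, len(reads)):
            if distance(reads[i], reads[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, read in enumerate(reads):
        clusters[find(i)].append(read)
    return sorted(clusters.values(), key=len, reverse=True)


def consensus_breakpoint(cluster: List[ChimericRead]) -> Tuple[str, int, str, int]:
    """Assumed consensus: lower median of the positions on each side."""
    first = cluster[0]
    return (first.chrom_a, median_low([r.pos_a for r in cluster]),
            first.chrom_b, median_low([r.pos_b for r in cluster]))
```

Because d is a squared distance, a cutoff of 100 groups reads whose breakpoints differ by at most about 10 bp in total. Clusters of size one can be discarded afterwards if a single chimeric read is considered insufficient support.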

7 Summary detected breakpoints
[Table: column headers Cluster Size, Cvg., Dominant Cluster, Bp 1, Bp 2; one row per sample: N01, N03, N04, N05, N06, N14, N16, N17, N20, N21, N38, N39, N40, N41, N42, N27, N28, N29, N30, N33, N36, N37]
Sample N01 - Visualization (1/2)
[Figure: chimeric reads spanning the MYH11/CBFB and CBFB/MYH11 breakpoints on both strands; legend: deletion, insertion, mismatch, breakpoint; coordinates shown: 67,121,088; 67,121,533; 15,815,687; 15,815,191; 15,815,189; 15,814,; 67,120,631; 67,121,086]

8 Sample N01 - Visualization (2/2)
[Figure: base-level view of the chimeric reads aligned across the MYH11/CBFB and CBFB/MYH11 breakpoints, shown for both strands]

9 Conclusions
Unsupervised approach to filter out interesting reads and to propose a few putative structural variants
One single chimeric read is not sufficient coverage
Implemented as R-package R453Plus1Toolbox
Future: Include biological knowledge about common variants and improve merging
Thank You!

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow

From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow From Variants to Pathways: Agilent GeneSpring GX s Variant Analysis Workflow Technical Overview Import VCF Introduction Next-generation sequencing (NGS) studies have created unanticipated challenges with

More information

C3BI. VARIANTS CALLING November Pierre Lechat Stéphane Descorps-Declère

C3BI. VARIANTS CALLING November Pierre Lechat Stéphane Descorps-Declère C3BI VARIANTS CALLING November 2016 Pierre Lechat Stéphane Descorps-Declère General Workflow (GATK) software websites software bwa picard samtools GATK IGV tablet vcftools website http://bio-bwa.sourceforge.net/

More information

Analysis of structural variation. Alistair Ward - Boston College

Analysis of structural variation. Alistair Ward - Boston College Analysis of structural variation Alistair Ward - Boston College What is structural variation? What differentiates SV from short variants? What are the major SV types? Summary of MEI detection What is an

More information

Nanopore long read sequencing for detection of point mutations and structural variants

Nanopore long read sequencing for detection of point mutations and structural variants Nanopore long read sequencing for detection of point mutations and structural variants Viapath Genetics, Guy s Hospital, London Presented by: Kezia Brown Nanopore for long read sequencing Why do we need

More information

Analysis of structural variation. Alistair Ward USTAR Center for Genetic Discovery University of Utah

Analysis of structural variation. Alistair Ward USTAR Center for Genetic Discovery University of Utah Analysis of structural variation Alistair Ward USTAR Center for Genetic Discovery University of Utah What is structural variation? What differentiates SV from short variants? What are the major SV types?

More information

Introduction to RNA-Seq in GeneSpring NGS Software

Introduction to RNA-Seq in GeneSpring NGS Software Introduction to RNA-Seq in GeneSpring NGS Software Dipa Roy Choudhury, Ph.D. Strand Scientific Intelligence and Agilent Technologies Learn more at www.genespring.com Introduction to RNA-Seq In a few years,

More information

Targeted Sequencing of Leukemia-Associated Genes Using 454 Sequencing Systems

Targeted Sequencing of Leukemia-Associated Genes Using 454 Sequencing Systems Sequencing Application Note March 2012 Targeted Sequencing of Leukemia-Associated Genes Using 454 Sequencing Systems GS GType TET2/CBL/KRAS and RUNX1 Primer Sets for the GS Junior and GS FLX Systems. Introduction

More information

Result Tables The Result Table, which indicates chromosomal positions and annotated gene names, promoter regions and CpG islands, is the best way for

Result Tables The Result Table, which indicates chromosomal positions and annotated gene names, promoter regions and CpG islands, is the best way for Result Tables The Result Table, which indicates chromosomal positions and annotated gene names, promoter regions and CpG islands, is the best way for you to discover methylation changes at specific genomic

More information

DNA concentration and purity were initially measured by NanoDrop 2000 and verified on Qubit 2.0 Fluorometer.

DNA concentration and purity were initially measured by NanoDrop 2000 and verified on Qubit 2.0 Fluorometer. DNA Preparation and QC Extraction DNA was extracted from whole blood or flash frozen post-mortem tissue using a DNA mini kit (QIAmp #51104 and QIAmp#51404, respectively) following the manufacturer s recommendations.

More information

Ad5 genome. Before filtering. After filtering. Suppl. File 4 PAGE mated reads. Ad5 DNA insertion site 5' 3' 293 unmated reads

Ad5 genome. Before filtering. After filtering. Suppl. File 4 PAGE mated reads. Ad5 DNA insertion site 5' 3' 293 unmated reads Before filtering 0 20 40 60 80 293 Ad5 DNA insertion site 5' 3' 0 20 60 100 48 000 000 48 100 000 48 200 000 48 300 000 48 400 000 48 500 000 After filtering 48 000 000 48 100 000 48 200 000 48 300 000

More information

Read Mapping and Variant Calling. Johannes Starlinger

Read Mapping and Variant Calling. Johannes Starlinger Read Mapping and Variant Calling Johannes Starlinger Application Scenario: Personalized Cancer Therapy Different mutations require different therapy Collins, Meredith A., and Marina Pasca di Magliano.

More information

CBFB-MYH11 REAL TIME QUANTITATIVE PCR DETECTION KIT ONKOTEST RQ

CBFB-MYH11 REAL TIME QUANTITATIVE PCR DETECTION KIT ONKOTEST RQ CBFB-MYH11 REAL TIME QUANTITATIVE PCR DETECTION KIT ONKOTEST RQ3021-20 Product Information Inv(16)(p13q22) or the variant t(16;16)(p13;q22) are frequent recurring chromosomal rearrangements reported to

More information

Haploid Assembly of Diploid Genomes

Haploid Assembly of Diploid Genomes Haploid Assembly of Diploid Genomes Challenges, Trials, Tribulations 13 October 2011 İnanç Birol Assembly By Short Sequencing IEEE InfoVis 2009 2 3 in Literature ~40 citations on tool comparisons ~20 citations

More information

The Human Genome and its upcoming Dynamics

The Human Genome and its upcoming Dynamics The Human Genome and its upcoming Dynamics Matthias Platzer Genome Analysis Leibniz Institute for Age Research - Fritz-Lipmann Institute (FLI) Sequencing of the Human Genome Publications 2004 2001 2001

More information

De novo meta-assembly of ultra-deep sequencing data

De novo meta-assembly of ultra-deep sequencing data De novo meta-assembly of ultra-deep sequencing data Hamid Mirebrahim 1, Timothy J. Close 2 and Stefano Lonardi 1 1 Department of Computer Science and Engineering 2 Department of Botany and Plant Sciences

More information

Gap Filling for a Human MHC Haplotype Sequence

Gap Filling for a Human MHC Haplotype Sequence American Journal of Life Sciences 2016; 4(6): 146-151 http://www.sciencepublishinggroup.com/j/ajls doi: 10.11648/j.ajls.20160406.12 ISSN: 2328-5702 (Print); ISSN: 2328-5737 (Online) Gap Filling for a Human

More information

Analysis of neo-antigens to identify T-cell neo-epitopes in human Head & Neck cancer. Project XX1001. Customer Detail

Analysis of neo-antigens to identify T-cell neo-epitopes in human Head & Neck cancer. Project XX1001. Customer Detail Analysis of neo-antigens to identify T-cell neo-epitopes in human Head & Neck cancer Project XX Customer Detail Table of Contents. Bioinformatics analysis pipeline...3.. Read quality check. 3.2. Read alignment...3.3.

More information

Challenging algorithms in bioinformatics

Challenging algorithms in bioinformatics Challenging algorithms in bioinformatics 11 October 2018 Torbjørn Rognes Department of Informatics, UiO torognes@ifi.uio.no What is bioinformatics? Definition: Bioinformatics is the development and use

More information

Variant calling workflow for the Oncomine Comprehensive Assay using Ion Reporter Software v4.4

Variant calling workflow for the Oncomine Comprehensive Assay using Ion Reporter Software v4.4 WHITE PAPER Oncomine Comprehensive Assay Variant calling workflow for the Oncomine Comprehensive Assay using Ion Reporter Software v4.4 Contents Scope and purpose of document...2 Content...2 How Torrent

More information

Structural Variant Detection in SMRT Link 5 with pbsv

Structural Variant Detection in SMRT Link 5 with pbsv Structural Variant Detection in SMRT Link 5 with pbsv Aaron Wenger 2017-06-27 For Research Use Only. Not for use in diagnostics procedures. Copyright 2017 by Pacific Biosciences of California, Inc. All

More information

Structural Variant Detection in SMRT Link 5 with pbsv

Structural Variant Detection in SMRT Link 5 with pbsv Structural Variant Detection in SMRT Link 5 with pbsv Aaron Wenger 2017-06-27 For Research Use Only. Not for use in diagnostics procedures. Copyright 2017 by Pacific Biosciences of California, Inc. All

More information

UHT Sequencing Course Large-scale genotyping. Christian Iseli January 2009

UHT Sequencing Course Large-scale genotyping. Christian Iseli January 2009 UHT Sequencing Course Large-scale genotyping Christian Iseli January 2009 Overview Introduction Examples Base calling method and parameters Reads filtering Reads classification Detailed alignment Alignments

More information

Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Supplementary Material

Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Supplementary Material Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions Joshua N. Burton 1, Andrew Adey 1, Rupali P. Patwardhan 1, Ruolan Qiu 1, Jacob O. Kitzman 1, Jay Shendure 1 1 Department

More information

Quantifying gene expression

Quantifying gene expression Quantifying gene expression Genome GTF (annotation)? Sequence reads FASTQ FASTQ (+reference transcriptome index) Quality control FASTQ Alignment to Genome: HISAT2, STAR (+reference genome index) (known

More information

CREST maps somatic structural variation in cancer genomes with base-pair resolution

CREST maps somatic structural variation in cancer genomes with base-pair resolution Nature Methods CREST maps somatic structural variation in cancer genomes with base-pair resolution Jianmin Wang, Charles G Mullighan, John Easton, Stefan Roberts, Jing Ma, Michael C Rusch, Ken Chen, Christopher

More information

COMPARISON OF GENE FUSION DETECTION TOOLS TO DETECT NOVEL GENE FUSIONS USING A CUSTOM ANNOTATION

COMPARISON OF GENE FUSION DETECTION TOOLS TO DETECT NOVEL GENE FUSIONS USING A CUSTOM ANNOTATION COMPARISON OF GENE FUSION DETECTION TOOLS TO DETECT NOVEL GENE FUSIONS USING A CUSTOM ANNOTATION - current state - 17.02.2017 Carolin Schimmelpfennig c.schimmelpfennig@izi.fraunhofer.de Fraunhofer What

More information

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014 Single Nucleotide Variant Analysis H3ABioNet May 14, 2014 Outline What are SNPs and SNVs? How do we identify them? How do we call them? SAMTools GATK VCF File Format Let s call variants! Single Nucleotide

More information

Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits

Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits Incorporating Molecular ID Technology Accel-NGS 2S MID Indexing Kits Molecular Identifiers (MIDs) MIDs are indices used to label unique library molecules MIDs can assess duplicate molecules in sequencing

More information

Barnacle: detecting and characterizing tandem duplications and fusions in transcriptome assemblies

Barnacle: detecting and characterizing tandem duplications and fusions in transcriptome assemblies Barnacle: detecting and characterizing tandem duplications and fusions in transcriptome assemblies The MIT Faculty has made this article openly available. Please share how this access benefits you. Your

More information

Figure S1. Unrearranged locus. Rearranged locus. Concordant read pairs. Region1. Region2. Cluster of discordant read pairs, bundle

Figure S1. Unrearranged locus. Rearranged locus. Concordant read pairs. Region1. Region2. Cluster of discordant read pairs, bundle Figure S1 a Unrearranged locus Rearranged locus Concordant read pairs Region1 Concordant read pairs Cluster of discordant read pairs, bundle Region2 Concordant read pairs b Physical coverage 5 4 3 2 1

More information

Bioinformatics in next generation sequencing projects

Bioinformatics in next generation sequencing projects Bioinformatics in next generation sequencing projects Rickard Sandberg Assistant Professor Department of Cell and Molecular Biology Karolinska Institutet May 2013 Standard sequence library generation Illumina

More information

Bioinformatics for High Throughput Sequencing

Bioinformatics for High Throughput Sequencing Bioinformatics for High Throughput Sequencing Eric Rivals LIRMM & IBC, Montpellier http://www.lirmm.fr/~rivals http://www.lirmm.fr/~rivals 1 / High Throughput Sequencing or Next Generation Sequencing High

More information

Bionano Access : Assembly Report Guidelines

Bionano Access : Assembly Report Guidelines Bionano Access : Assembly Report Guidelines Document Number: 30255 Document Revision: A For Research Use Only. Not for use in diagnostic procedures. Copyright 2018 Bionano Genomics Inc. All Rights Reserved

More information

SNP calling and VCF format

SNP calling and VCF format SNP calling and VCF format Laurent Falquet, Oct 12 SNP? What is this? A type of genetic variation, among others: Family of Single Nucleotide Aberrations Single Nucleotide Polymorphisms (SNPs) Single Nucleotide

More information

Analysis of large deletions in human-chimp genomic alignments. Erika Kvikstad BioInformatics I December 14, 2004

Analysis of large deletions in human-chimp genomic alignments. Erika Kvikstad BioInformatics I December 14, 2004 Analysis of large deletions in human-chimp genomic alignments Erika Kvikstad BioInformatics I December 14, 2004 Outline Mutations, mutations, mutations Project overview Strategy: finding, classifying indels

More information

BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology. Lecture 2: Microarray analysis

BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology. Lecture 2: Microarray analysis BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology Lecture 2: Microarray analysis Genome wide measurement of gene transcription using DNA microarray Bruce Alberts, et al., Molecular Biology

More information

Resolution of fine scale ribosomal DNA variation in Saccharomyces yeast

Resolution of fine scale ribosomal DNA variation in Saccharomyces yeast Resolution of fine scale ribosomal DNA variation in Saccharomyces yeast Rob Davey NCYC 2009 Introduction SGRP project Ribosomal DNA and variation Computational methods Preliminary Results Conclusions SGRP

More information

Supplemental Methods. Exome Enrichment and Sequencing

Supplemental Methods. Exome Enrichment and Sequencing Supplemental Methods Exome Enrichment and Sequencing Genomic libraries were prepared using the Illumina Paired End Sample Prep Kit following the manufacturer s instructions. Enrichment was performed as

More information

AN ALGORITHM FOR STRUCTURAL VARIANT DETECTION WITH THIRD GENERATION SEQUENCING HUI-JOU CHOU. A thesis submitted to the. Graduate School Camden

AN ALGORITHM FOR STRUCTURAL VARIANT DETECTION WITH THIRD GENERATION SEQUENCING HUI-JOU CHOU. A thesis submitted to the. Graduate School Camden AN ALGORITHM FOR STRUCTURAL VARIANT DETECTION WITH THIRD GENERATION SEQUENCING BY HUI-JOU CHOU A thesis submitted to the Graduate School Camden Rutgers, The State University of New Jersey in partial fulfillment

More information

Variant Detection in Next Generation Sequencing Data. John Osborne Sept 14, 2012

Variant Detection in Next Generation Sequencing Data. John Osborne Sept 14, 2012 + Variant Detection in Next Generation Sequencing Data John Osborne Sept 14, 2012 + Overview My Bias Talk slanted towards analyzing whole genomes using Illumina paired end reads with open source tools

More information

Gene Expression Data Analysis

Gene Expression Data Analysis Gene Expression Data Analysis Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu BMIF 310, Fall 2009 Gene expression technologies (summary) Hybridization-based

More information

Assay Validation Services

Assay Validation Services Overview PierianDx s assay validation services bring clinical genomic tests to market more rapidly through experimental design, sample requirements, analytical pipeline optimization, and criteria tuning.

More information

Introduction to Short Read Alignment. UCD Genome Center Bioinformatics Core Tuesday 14 June 2016

Introduction to Short Read Alignment. UCD Genome Center Bioinformatics Core Tuesday 14 June 2016 Introduction to Short Read Alignment UCD Genome Center Bioinformatics Core Tuesday 14 June 2016 From reads to molecules Why align? Individual A Individual B ATGATAGCATCGTCGGGTGTCTGCTCAATAATAGTGCCGTATCATGCTGGTGTTATAATCGCCGCATGACATGATCAATGG

More information

Structural variation analysis using NGS sequencing

Structural variation analysis using NGS sequencing Structural variation analysis using NGS sequencing Victor Guryev NBIC NGS taskforce meeting April 15th, 2011 Scale of genomic variants Scale 1 bp 10 bp 100 bp 1 kb 10 kb 100 kb 1 Mb Variants SNPs Short

More information

Comparing a few SNP calling algorithms using low-coverage sequencing data

Comparing a few SNP calling algorithms using low-coverage sequencing data Yu and Sun BMC Bioinformatics 2013, 14:274 RESEARCH ARTICLE Open Access Comparing a few SNP calling algorithms using low-coverage sequencing data Xiaoqing Yu 1 and Shuying Sun 1,2* Abstract Background:

More information

NGS in Pathology Webinar

NGS in Pathology Webinar NGS in Pathology Webinar NGS Data Analysis March 10 2016 1 Topics for today s presentation 2 Introduction Next Generation Sequencing (NGS) is becoming a common and versatile tool for biological and medical

More information

GENES AND CHROMOSOMES II

GENES AND CHROMOSOMES II 1 GENES AND CHROMOSOMES II Lecture 4 BIOL 266/2 2014-15 Dr. S. Azam Biology Department Concordia University 2 GENE AND THE GENOME The Structure of the Genome DNA fingerprinting 3 DNA fingerprinting: DNA-based

More information

Introduction to human genomics and genome informatics

Introduction to human genomics and genome informatics Introduction to human genomics and genome informatics Session 1 Prince of Wales Clinical School Dr Jason Wong ARC Future Fellow Head, Bioinformatics & Integrative Genomics Adult Cancer Program, Lowy Cancer

More information

NGS to address ncrna and viruses

NGS to address ncrna and viruses NGS to address ncrna and viruses Introduction & TRON Next generation sequencing transcriptomics ncrnas vrna June 30, 2010 John Castle Institute for Translational Oncology and Immunology (TRON) Mainz, Germany

More information

Next Genera*on Sequencing II: Personal Genomics. Jim Noonan Department of Gene*cs

Next Genera*on Sequencing II: Personal Genomics. Jim Noonan Department of Gene*cs Next Genera*on Sequencing II: Personal Genomics Jim Noonan Department of Gene*cs Personal genome sequencing Iden*fying the gene*c basis of phenotypic diversity among humans Gene*c risk factors for disease

More information

Genome 373: Mapping Short Sequence Reads II. Doug Fowler

Genome 373: Mapping Short Sequence Reads II. Doug Fowler Genome 373: Mapping Short Sequence Reads II Doug Fowler The final Will be in this room on June 6 th at 8:30a Will be focused on the second half of the course, but will include material from the first half

More information

CBFB-MYH11 REAL TIME PCR DETECTION KIT

CBFB-MYH11 REAL TIME PCR DETECTION KIT CBFB-MYH11 REAL TIME PCR DETECTION KIT ONKOTEST R3020-20 Keep the kit at -15 C to -25 C Rev. 1.4 Product Information Inv(16)(p13q22) or the variant t(16;16)(p13;q22) are frequent recurring chromosomal

More information

Fig Flowchart of bacterial evolutionary algorithm

Fig Flowchart of bacterial evolutionary algorithm Activity: Draw the flowchart of bacterial evolutionary algorithm Explain the two bacterial operators with your own words Draw the figure of the two bacterial operators The original Genetic Algorithm (GA)

More information

Bionano Solve Theory of Operation: Variant Annotation Pipeline

Bionano Solve Theory of Operation: Variant Annotation Pipeline Bionano Solve Theory of Operation: Variant Annotation Pipeline Document Number: 30190 Document Revision: B For Research Use Only. Not for use in diagnostic procedures. Copyright 2018 Bionano Genomics,

More information

LCR_Finder: A de novo low copy repeat finder for human genome

LCR_Finder: A de novo low copy repeat finder for human genome LCR_Finder: A de novo low copy repeat finder for human genome Xuan Liu, David Wai-lok Cheung, Hing-Fung Ting, Tak-Wah Lam *, and Siu-Ming Yiu * Department of Computer Science, The University of Hong Kong,

More information

Efficient. Fast. Extensive.

Efficient. Fast. Extensive. Discover why cytogeneticists are using NGS in tandem with FISH Lab directors are inundated with cases, so gaining efficiency is critical. Next-generation sequencing (NGS) can complement FISH, saving you

More information

MI615 Syllabus Illustrated Topics in Advanced Molecular Genetics Provisional Schedule Spring 2010: MN402 TR 9:30-10:50

MI615 Syllabus Illustrated Topics in Advanced Molecular Genetics Provisional Schedule Spring 2010: MN402 TR 9:30-10:50 MI615 Syllabus Illustrated Topics in Advanced Molecular Genetics Provisional Schedule Spring 2010: MN402 TR 9:30-10:50 DATE TITLE LECTURER Thu Jan 14 Introduction, Genomic low copy repeats Pierce Tue Jan

More information

Machine Learning. HMM applications in computational biology

Machine Learning. HMM applications in computational biology 10-601 Machine Learning HMM applications in computational biology Central dogma DNA CCTGAGCCAACTATTGATGAA transcription mrna CCUGAGCCAACUAUUGAUGAA translation Protein PEPTIDE 2 Biological data is rapidly

More information

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF GENOMICS Genomics is the study of genomes in their entirety Bioinformatics is the analysis of the information content of genomes - Genes, regulatory sequences,

More information

Why can GBS be complicated? Tools for filtering, error correction and imputation.

Why can GBS be complicated? Tools for filtering, error correction and imputation. Why can GBS be complicated? Tools for filtering, error correction and imputation. Edward Buckler USDA-ARS Cornell University http://www.maizegenetics.net Many Organisms Are Diverse Humans are at the lower

More information

ChIP-seq and RNA-seq

ChIP-seq and RNA-seq ChIP-seq and RNA-seq Biological Goals Learn how genomes encode the diverse patterns of gene expression that define each cell type and state. Protein-DNA interactions (ChIPchromatin immunoprecipitation)

More information

BENG 183 Trey Ideker. Genome Assembly and Physical Mapping

BENG 183 Trey Ideker. Genome Assembly and Physical Mapping BENG 183 Trey Ideker Genome Assembly and Physical Mapping Reasons for sequencing Complete genome sequencing!!! Resequencing (Confirmatory) E.g., short regions containing single nucleotide polymorphisms

More information

Next-Generation Sequencing. Technologies

Next-Generation Sequencing. Technologies Next-Generation Next-Generation Sequencing Technologies Sequencing Technologies Nicholas E. Navin, Ph.D. MD Anderson Cancer Center Dept. Genetics Dept. Bioinformatics Introduction to Bioinformatics GS011062

More information

CS273B: Deep learning for Genomics and Biomedicine

CS273B: Deep learning for Genomics and Biomedicine CS273B: Deep learning for Genomics and Biomedicine Lecture 2: Convolutional neural networks and applications to functional genomics 09/28/2016 Anshul Kundaje, James Zou, Serafim Batzoglou Outline Anatomy

More information

Short Read Alignment to a Reference Genome

Short Read Alignment to a Reference Genome Short Read Alignment to a Reference Genome Shamith Samarajiwa CRUK Summer School in Bioinformatics Cambridge, September 2018 Aligning to a reference genome BWA Bowtie2 STAR GEM Pseudo Aligners for RNA-seq

More information

mrna Sequencing Quality Control (V6)

mrna Sequencing Quality Control (V6) mrna Sequencing Quality Control (V6) Notes: the following analyses are based on 8 adult brains sequenced in USC and Yale 1. Error Rates The error rates of each sequencing cycle are reported for 120 tiles

More information

High-Throughput Bioinformatics: Re-sequencing and de novo assembly. Elena Czeizler

High-Throughput Bioinformatics: Re-sequencing and de novo assembly. Elena Czeizler High-Throughput Bioinformatics: Re-sequencing and de novo assembly Elena Czeizler 13.11.2015 Sequencing data Current sequencing technologies produce large amounts of data: short reads The outputted sequences

More information

02 Agenda Item 03 Agenda Item

02 Agenda Item 03 Agenda Item 01 Agenda Item 02 Agenda Item 03 Agenda Item SOLiD 3 System: Applications Overview April 12th, 2010 Jennifer Stover Field Application Specialist - SOLiD Applications Workflow for SOLiD Application Application

More information
