Calling DNA Variants Steve Laurie Centro Nacional de Analisis Genomico (CNAG-CRG), Barcelona
Transcription
1 Calling DNA Variants Steve Laurie Centro Nacional de Analisis Genomico (CNAG-CRG), Barcelona Variant Effect Predictor Training Course Heraklion, 31st October 2016
2 Calling DNA Variants - Overview
- What is a variant, and how many do we have?
- What do we do with our lab coats on?
- What do we do sitting at a computer?
- Some real world issues
- How well do we do in general?
3 What is a variant?
A variant is any position/region in our sample which differs from the haploid reference genome to which we are comparing it.
- Single Nucleotide Variants (SNVs), e.g. A → G; note that a diploid individual may be AA, AG, or GG
- Short insertions and deletions (InDels), e.g. TA → TATA (insertion of TA), e.g. CT → C (loss of the T at the third position)
- Copy Number Variants (CNVs): tandem duplication of longer regions (~1-100kb) that are typically polymorphic within the population, e.g. AMY1
- Structural Variants (SVs): larger still, and often more complex
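The small-variant categories above can be told apart from the reference and alternative alleles alone. The following is a minimal sketch, not part of the original slides; the function names `classify_variant` and `diploid_genotypes` are hypothetical, and CNVs and SVs are deliberately out of scope because they cannot be recognised from a short REF/ALT pair.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a REF/ALT allele pair by comparing allele lengths."""
    if ref == alt:
        return "no variant"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    # Same length, longer than one base: a multi-nucleotide or complex change.
    return "complex"


def diploid_genotypes(ref: str, alt: str) -> list:
    """List the three genotypes a diploid individual can carry at a biallelic site."""
    return [ref + ref, ref + alt, alt + alt]


# Examples taken from the slide.
print(classify_variant("A", "G"))      # SNV
print(classify_variant("TA", "TATA"))  # insertion
print(classify_variant("CT", "C"))     # deletion
print(diploid_genotypes("A", "G"))     # ['AA', 'AG', 'GG']
```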
5 How many variants do we have?
A variant is any position/region in our sample which differs from the haploid reference genome to which we are comparing it.
- Single Nucleotide Variants (SNVs): ~3,500,000 to 4,000,000 ( ~ 30, ,000 exomic )
- Short insertions and deletions (InDels): ~ 300, ,000
- Copy Number Variants (CNVs): ~5-10% of genome
- Structural Variants (larger still, and often more complex): ~13% of genome
6 NGS Workflow: what do we do with our lab coats on? [Figure: Kassahn, K. (2013)]
8 Targeted NGS: Fractionation and Capture
1 - fractionation, 2 - hybridisation, 3 - enrichment, 4 - amplification
Figure from Mardis, E.R. (2012)
12 Paired-end Reads [Figure: paired-end read layout; surviving labels: "Typically bp", "nt", "nt", "~50-100bp"]
15 Read-mapping [Figure: read pairs mapped to the reference; labels: ~50-400nt, linker, ~50-400nt]
18 Mapped WES reads viewed in IGV [Figure tracks: Coverage, Reads, Exons]
19 Variant Calling: ideal
20 Variant Calling: real world
21 Variant Calling Tools: SAMtools, GATK, freebayes, Platypus
25 Variant Calling Tools
Variant calling tools start by calling every potential variant they observe. This includes true variants, and false positives due to:
- Library preparation artefacts
- PCR artefacts
- Sequencing errors
- Mapping issues
- Algorithm issues
They subsequently apply a number of mechanisms to help separate the true positives from the false positives, and provide metrics.
Currently, you will always have some false positives, and some false negatives.
26 Variant Calling Data
27 Annotation of variant call at sample level
These fields are shown in the FORMAT field of the VCF, and there are values for every sample in a multi-sample VCF.
Example record:
G A 999 PASS DP=180;VDB= e-01;RPB= e01;AF1=0.5;AC1=4;DP4=43,49,46,37;MQ=35;FQ=999;PV4=0.29,1,1,1 GT:PL:DP:SP:GQ 0/0:0,135,255:45:0:99 1/1:255,102,0:34:0:99 0/1:255,0,255:46:6:99
Tag / Field / Definition:
- GQ / Genotype Quality / Phred-scaled confidence that the real genotype is that reported versus the next most likely genotype
- DP / Depth / Number of reliable base calls at this position
- PL / Probability Likelihood / The likelihood of the possible genotypes (order 0/0, 0/1, 1/1, 0/2, 1/2, 2/2), normalised such that the value for the reported GT is set to 0
- SP / Strand Bias p-value / Phred-scaled strand bias p-value for sample
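To make the FORMAT layout concrete, the sketch below splits the three sample columns from the example record into named fields and reads the genotype back from PL. It is an illustration only: `parse_sample`, `most_likely_genotype` and `second_best_gap` are hypothetical helper names, the genotype order covers only the biallelic case, and capping the gap at 99 is an assumption chosen because it matches the GQ values shown on the slide.

```python
GENOTYPE_ORDER = ["0/0", "0/1", "1/1"]  # biallelic sites only


def parse_sample(format_field: str, sample_field: str) -> dict:
    """Pair the colon-separated FORMAT keys with one sample's values."""
    record = dict(zip(format_field.split(":"), sample_field.split(":")))
    record["PL"] = [int(x) for x in record["PL"].split(",")]
    for key in ("DP", "SP", "GQ"):
        record[key] = int(record[key])
    return record


def most_likely_genotype(pl: list) -> str:
    """The reported genotype is the one whose normalised PL is 0 (the minimum)."""
    return GENOTYPE_ORDER[pl.index(min(pl))]


def second_best_gap(pl: list, cap: int = 99) -> int:
    """Phred-scaled gap between the best and next-best genotype (cap is assumed)."""
    best, runner_up = sorted(pl)[:2]
    return min(runner_up - best, cap)


fmt = "GT:PL:DP:SP:GQ"
samples = ["0/0:0,135,255:45:0:99", "1/1:255,102,0:34:0:99", "0/1:255,0,255:46:6:99"]
for field in samples:
    rec = parse_sample(fmt, field)
    print(rec["GT"], most_likely_genotype(rec["PL"]), rec["DP"], rec["SP"], rec["GQ"])
```

For all three samples in the example, the genotype recovered from PL agrees with the reported GT.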
28 Annotation of variant call at pedigree level
This information is taken from the FILTER/INFO fields of the VCF and indicates positions that are failing across samples.
Tag / Field / Definition:
- sb0.05 / sb0.001 / Strand Bias / Indicates that there was a significant bias towards variant calls only being observed on one strand across all samples, at <0.05 or <0.001 respectively
- tdb0.05 / Tail Distance Bias / Indicates that there was a significant positional bias within reads for variant calls across all samples at this position
- mrd10 / mrd15 / Minimum Read Depth / Indicates that at least one of the samples had coverage <10 or <15 at this position
- msb30 / Maximum Strand Bias / Indicates that at least one of the samples had a strand bias (SP) >30
- map / Mappability / Variant observed in a region in which we know we have problems aligning short reads
- SALX=Y / Samples At Least (covered) / SAL10=3 would mean that at least 3 of the samples in the VCF have a read depth of 10 at this position
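The depth and strand-bias tags above can be derived from the per-sample DP and SP values. The sketch below shows one way this might be done; `site_flags` is a hypothetical function, not the pipeline used for the slides, and reading "a read depth of 10" as "a depth of at least 10" is an assumption.

```python
def site_flags(depths: list, strand_bias: list, sal_threshold: int = 10):
    """Derive pedigree-level tags from per-sample DP and SP values at one site."""
    flags = []
    if min(depths) < 10:
        flags.append("mrd10")   # at least one sample has coverage < 10
    if min(depths) < 15:
        flags.append("mrd15")   # at least one sample has coverage < 15
    if max(strand_bias) > 30:
        flags.append("msb30")   # at least one sample has SP > 30
    covered = sum(1 for d in depths if d >= sal_threshold)
    return flags, f"SAL{sal_threshold}={covered}"


# DP and SP values of the three samples in the example record on the previous slide.
print(site_flags([45, 34, 46], [0, 0, 6]))
# A hypothetical poorly covered site.
print(site_flags([8, 12, 40], [0, 35, 2]))
```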
29 Indel identification [Figure panels: raw BWA mapped reads; following local realignment. DePristo, M. et al. (2011)]
31 Indel identification: where exactly?
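One common reason the position of an indel is ambiguous is that, inside a repeat, the same change can be written at several coordinates. The slide does not say which convention was used; the sketch below shows the widely used left-alignment approach as an assumption, with a hypothetical function `left_align` and 0-based positions.

```python
def left_align(reference: str, pos: int, ref: str, alt: str):
    """Shift an indel to its leftmost equivalent position.

    `pos` is the 0-based index in `reference` of the first base of `ref`.
    """
    # Trim identical trailing bases, extending to the left when an allele would vanish.
    while ref != alt and ref[-1] == alt[-1]:
        if min(len(ref), len(alt)) == 1:
            if pos == 0:
                break  # cannot extend beyond the start of the reference
            pos -= 1
            ref = reference[pos] + ref
            alt = reference[pos] + alt
        ref, alt = ref[:-1], alt[:-1]
    # Trim identical leading bases, keeping one base of context.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# A 'CA' deletion reported in the middle of a CA repeat moves to the repeat start.
print(left_align("GCACACAT", 4, "ACA", "A"))  # (0, 'GCA', 'G')
# A 'T' insertion reported at the end of a T homopolymer moves to its start.
print(left_align("CTTTTG", 4, "T", "TT"))     # (0, 'C', 'CT')
```

Normalising both call sets in this way before comparing them avoids counting the same indel as two different variants.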
32 Prioritize variants: advanced technical filtering
Tail distance bias/read position bias
[Figure: example pileups annotated with ReadPosRankSum values; one labelled "No reads spanning this region"]
- samtools: VDB field (proportion), PV4 field (p-value)
- GATK: ReadPosRankSum field (Z-score)
33 Prioritize variants: advanced technical filtering
Strand bias
[Figure: reads supporting the variant shown by strand]
- samtools: PV4 field (p-value)
- GATK: FS field (Phred-scaled p-value)
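A strand-bias p-value of this kind can be illustrated with a two-sided Fisher's exact test on the four DP4 counts (reference forward, reference reverse, alternative forward, alternative reverse). This is a sketch under that assumption, using only the standard library; it does not claim to reproduce the exact computation inside samtools or GATK, and the function names are hypothetical.

```python
from math import comb, log10


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def pmf(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    p_obs = pmf(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in (pmf(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def phred(p: float) -> float:
    """Phred-scale a p-value: larger values mean stronger evidence of bias."""
    return -10 * log10(p)


# DP4=43,49,46,37 from the example VCF record earlier in the talk.
p = fisher_exact_two_sided(43, 49, 46, 37)
print(round(p, 2), round(phred(p), 1))
```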
34 Benchmark pipeline
- Raw Data: Illumina Platinum 50x WGS NA12878; NimbleGen MedExome 90x WES NA12878
- Alignment: BWA-MEM v0.7.8; GEM v3.1
- Standardise Representation: Sort & Mark Duplicates (Picard) + Indel Realignment (GATK v3.3)
- Call Variants: FreeBayes v0.9; HaplotypeCaller v3.3 + GenotypeGVCF; SAMtools v1.2 in two modes, 1) fast (-Bug) and 2) slow (-ug)
- Minimal Filtering: QUAL >30 for every caller
- Result: 16 Final Call Set VCFs (8 Genome & 8 Exome)
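The 16 final call sets follow from crossing two datasets, two aligners and four caller configurations. The short sketch below enumerates them and applies the QUAL >30 rule; the variable names and `passes_minimal_filter` are illustrative, not taken from the original pipeline.

```python
from itertools import product

DATASETS = ["WGS (Illumina Platinum 50x)", "WES (NimbleGen MedExome 90x)"]
ALIGNERS = ["BWA-MEM v0.7.8", "GEM v3.1"]
CALLERS = [
    "FreeBayes v0.9",
    "HaplotypeCaller v3.3 + GenotypeGVCF",
    "SAMtools v1.2 fast (-Bug)",
    "SAMtools v1.2 slow (-ug)",
]

# Every dataset is aligned with every aligner and called with every caller.
call_sets = list(product(DATASETS, ALIGNERS, CALLERS))


def passes_minimal_filter(qual: float, threshold: float = 30.0) -> bool:
    """Minimal filtering from the slide: keep calls with QUAL > 30."""
    return qual > threshold


print(len(call_sets))  # 16
for dataset, aligner, caller in call_sets[:4]:
    print(dataset, "|", aligner, "|", caller)
```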
35 NIST provide calls for callable regions of the NA12878 genome, i.e. excluding simple repeats, CNVs and known segmental duplications in this sample: 2,191MB, 2,915,728 unique variant positions
36 Comparison with NIST set
- Intersected our VCFs with the same regions and compared positions in our VCFs with the NIST VCF for concordance at the level of Chr-Pos-Ref-Alt-GT
- Ignored positions that were multi-allelic for the alternative allele (~0.15% in NIST)
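Concordance at the level of Chr-Pos-Ref-Alt-GT amounts to comparing sets of five-part keys. The sketch below illustrates the idea with made-up calls; `drop_multiallelic` and `compare_calls` are hypothetical names, and treating any ALT containing a comma as multi-allelic is an assumption based on the usual VCF notation.

```python
def drop_multiallelic(calls: set) -> set:
    """Ignore positions whose ALT field lists more than one allele."""
    return {c for c in calls if "," not in c[3]}


def compare_calls(truth: set, calls: set):
    """Return (TP, FP, FN) sets for keys of the form (chrom, pos, ref, alt, gt)."""
    truth, calls = drop_multiallelic(truth), drop_multiallelic(calls)
    return calls & truth, calls - truth, truth - calls


# Illustrative data only, not taken from the benchmark.
truth = {
    ("1", 100, "A", "G", "0/1"),
    ("1", 200, "CT", "C", "1/1"),
    ("2", 300, "G", "A,T", "1/2"),
}
calls = {
    ("1", 100, "A", "G", "0/1"),   # identical call: TP
    ("1", 200, "CT", "C", "0/1"),  # same site, different genotype: FP (and an FN in truth)
    ("3", 400, "T", "C", "0/1"),   # not in truth: FP
}
tp, fp, fn = compare_calls(truth, calls)
print(len(tp), len(fp), len(fn))  # 1 2 1
```

Because the genotype is part of the key, a correct site with the wrong genotype counts as both a false positive and a false negative.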
37-39 Comparison with NIST: SNVs, Deletions, Insertions
Three tables with the same layout: Whole Genome SNVs, Whole Genome Deletions, Whole Genome Insertions.
Columns: Dataset, Total Calls, TP, FP, FN, Specificity, Sensitivity, F1 score
Rows: NIST v2.18 Gold Standard; BWA + FreeBayes; BWA + HaplotypeCaller; BWA + SAMtools fast; BWA + SAMtools normal; GEM3 + FreeBayes; GEM3 + HaplotypeCaller; GEM3 + SAMtools fast; GEM3 + SAMtools normal
- TP: call identical in NIST and CNAG
- FP: call in CNAG not found in NIST
- FN: call in NIST not found in CNAG
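The sensitivity and F1 columns can be computed from the TP, FP and FN counts using the standard definitions, sketched below. The slides do not state how the "Specificity" column was derived; since true negatives are not counted in this comparison, the sketch uses precision, TP / (TP + FP), in its place, which is an assumption. The counts in the example are illustrative.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of gold-standard calls that were recovered."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of our calls that are in the gold standard."""
    return tp / (tp + fp)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and sensitivity."""
    p, r = precision(tp, fp), sensitivity(tp, fn)
    return 2 * p * r / (p + r)


# Illustrative counts, not from the benchmark tables.
tp, fp, fn = 90, 10, 30
print(sensitivity(tp, fn), precision(tp, fp), round(f1_score(tp, fp, fn), 4))
```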
40 Consensus between callers NIST callable region (2,195Mb)
41 Consensus between callers non-callable but mappable (594Mb)
42 WES vs WGS
43 Summary Findings
- GEM3.1 is fast, and the resulting variant calls are similar to those for BWA-MEM
- All variant callers tested are fairly similar in SNV accuracy
- There is much more variety in indel calls
- There is not a lot of difference in accuracy between the two SAMtools modes
- FreeBayes is very fast, but perhaps not as accurate as HaplotypeCaller for indels
47 RD-Connect Genomics Platform Demos tomorrow at 14h and 16h D. Piscia, S. Laurie, S. Beltran
48 ISO 9001:2008
49 Acknowledgements
rd-connect.eu, platform.rd-connect.eu
- WP1: Coordination: Hanns Lochmüller (Newcastle and TREAT-NMD)
- WP2: Patient registries: Domenica Taruscio (ISS and EPIRARE)
- WP3: Biobanks: Lucia Monaco (Fondaz. Telethon & EuroBioBank)
- WP4: Bioinformatics: Christophe Béroud (INSERM Marseille)
- WP5: Unified platform: Ivo Gut (CNAG Barcelona)
- WP6: Ethical/legal/social: Mats Hansson (Uppsala)
- WP7: Impact/Innovation: Kate Bushby (Newcastle and EUCERD/EJARD)
- CNAG: I. Gut, S. Beltran, D. Piscia, S. Laurie, J. Protasio, A. Papakons., I. Martinez, R. Tonda, J.R. Trotta
- LUMC: P.B. t Hoen, M. Roos, R. Raliyaperumal, M. Thompson
- CNIO: A. Valencia, V. de la Torre, J.M. Fernández, A. Cañada
- U. Aveiro: J.L. Oliveira, P. Lopes, P. Sernaleda
- Murdoch U.: M. Bellgard
- MME: H. Rehm
- AMU: C. Béroud, D. Salgado, J.P. Desvignes
- Interactive BioSoftware: A. Blavier, S. Lair
- U. of Patras: G. Patrinos
- Genesis: S. Zuchner, M. Gonzalez, R. Acosta
- EGA: D. Spalding, J. Almeida-King, A. Navarro, J. Rambla
- Newcastle U.: H. Lochmüller, R. Thompson, J. Dawson, A. Topf, I. Zaharieva
- U. of Toronto: M. Brudno, M. Girdea, S. Dumitriu, O. Buske
More informationSequence assembly. Jose Blanca COMAV institute bioinf.comav.upv.es
Sequence assembly Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing project Unknown sequence { experimental evidence result read 1 read 4 read 2 read 5 read 3 read 6 read 7 Computational requirements
More informationMedSavant: An open source platform for personal genome interpretation
MedSavant: An open source platform for personal genome interpretation Marc Fiume 1, James Vlasblom 2, Ron Ammar 3, Orion Buske 1, Eric Smith 1, Andrew Brook 1, Sergiu Dumitriu 2, Christian R. Marshall
More informationVariant Calling CHRIS FIELDS MAYO-ILLINOIS COMPUTATIONAL GENOMICS WORKSHOP, JUNE 19, 2017
Variant Calling CHRIS FIELDS MAYO-ILLINOIS COMPUTATIONAL GENOMICS WORKSHOP, JUNE 19, 2017 Up-front acknowledgments Many figures/slides come from: GATK Workshop slides: http://www.broadinstitute.org/gatk/guide/events?id=2038
More informationIntroduc)on to NGS Variant Calling
Introduc)on to NGS Variant Calling Bioinforma)cs analysis and annota)on of variants in NGS data workshop Cape Town, 4 th to 6 th April 2016 Sumir Panji, Amel Ghouila, Gerrit Botha Types of variants Learning
More informationCS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016
CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 Topics Genetic variation Population structure Linkage disequilibrium Natural disease variants Genome Wide Association Studies Gene
More informationIon S5 and Ion S5 XL Systems
Ion S5 and Ion S5 XL Systems Targeted sequencing has never been simpler Explore the Ion S5 and Ion S5 XL Systems Adopting next-generation sequencing (NGS) in your lab is now simpler than ever The Ion S5
More informationRIPTIDE HIGH THROUGHPUT RAPID LIBRARY PREP (HT-RLP)
Application Note: RIPTIDE HIGH THROUGHPUT RAPID LIBRARY PREP (HT-RLP) Introduction: Innovations in DNA sequencing during the 21st century have revolutionized our ability to obtain nucleotide information
More informationImplementation of Ion AmpliSeq in molecular diagnostics
Implementation of Ion AmpliSeq in molecular diagnostics The Rotterdam Experience Ronald van Marion Deelnemersbijeenkomst SKML sectie Pathologie Amersfoort, 26 mei 2016 Molecular Diagnostics in Rotterdam
More informationValidation of Identity and Ancestry SNP Panels for the Ion PGM
Validation of Identity and Ancestry SNP Panels for the Ion PGM Christopher Phillips, Carla Santos, Maria de la Puente, Manuel Fondevila, Ángel Carracedo, Maviky Lareu Forensic Genetics Unit, University
More informationBest practices for Variant Calling with Pacific Biosciences data
Best practices for Variant Calling with Pacific Biosciences data Mauricio Carneiro, Ph.D. Mark DePristo, Ph.D. Genome Sequence and Analysis Medical and Population Genetics carneiro@broadinstitute.org 1
More informationSequencing, Assembling, and Correcting Draft Genomes Using Recombinant Populations
INVESTIGATION Sequencing, Assembling, and Correcting Draft Genomes Using Recombinant Populations Matthew W. Hahn,*,,1 Simo V. Zhang, and Leonie C. Moyle* *Department of Biology and School of Informatics
More informationwith drmid Dx for Illumina NGS systems
Performance Characteristics BRCA MASTR Dx with drmid Dx for Illumina NGS systems Manufacturer Multiplicom N.V. Galileïlaan 18 2845 Niel Belgium Revision date: July 27, 2017 Page 1 of 8 Table of Contents
More informationBioinformatics Advice on Experimental Design
Bioinformatics Advice on Experimental Design Where do I start? Please refer to the following guide to better plan your experiments for good statistical analysis, best suited for your research needs. Statistics
More informationComprehensive Analysis to Improve the Validation Rate for Single Nucleotide Variants Detected by Next- Generation Sequencing
Comprehensive Analysis to Improve the Validation Rate for Single Nucleotide Variants Detected by Next- Generation Sequencing Mi-Hyun Park 1., Hwanseok Rhee 2., Jung Hoon Park 2, Hae-Mi Woo 1, Byung-Ok
More informationTargeted resequencing
Targeted resequencing Sarah Calvo, Ph.D. Computational Biologist Vamsi Mootha laboratory Snapshots of Genome Wide Analysis in Human Disease (MPG), 4/20/2010 Vamsi Mootha, PI How can I assess a small genomic
More informationOutline. General principles of clonal sequencing Analysis principles Applications CNV analysis Genome architecture
The use of new sequencing technologies for genome analysis Chris Mattocks National Genetics Reference Laboratory (Wessex) NGRL (Wessex) 2008 Outline General principles of clonal sequencing Analysis principles
More informationNext Generation Sequence Analysis and Computational Genomics Using Graphical Pipeline Workflows
Genes 2012, 3, 545-575; doi:10.3390/genes3030545 Article OPEN ACCESS genes ISSN 2073-4425 www.mdpi.com/journal/genes Next Generation Sequence Analysis and Computational Genomics Using Graphical Pipeline
More information