Structural variation

1 Structural variation
Programming for Biology, CSH, October 2012
Tomas Marques-Bonet, ICREA Research Professor, Institut de Biologia Evolutiva

Forms of genetic variation: the continuum of genomic variation, from nucleotide to cytogenetics
Single base-pair changes: point mutations (1 per 800 bp)
Small insertions/deletions: frameshift, microsatellite, minisatellite
Mobile elements: retroelement insertions (300 bp - 10 kb)
Large-scale genomic copy number variation (>10 kb): large-scale deletions, segmental duplications
Local rearrangements
Chromosomal variation: translocation, inversion, fusion
Figure labels: Copy Number Variation; Structural Variants (SV)

2 Genomic Structural Variation
[Figure: human genetic variation plotted by frequency against size (1 bp to 1 chr), with SNPs, structural variation and cytogenetic variation along the size axis.]
Gene-altering, e.g. immune response, drug metabolism
Abundant: majority of human heterozygosity
Numerous plausible functional consequences
Types of Structural Variation (Hurles et al. 2008)

3 Why Study Structural Variation?
Common in normal human genomes: major cause of phenotypic variation
Common in certain diseases, particularly cancer
Now showing up in rare disease: autism, schizophrenia
17q21.31 deletion syndrome; 16p12.1 deletion syndrome
Zody et al., Nature Genetics (2008); Antonacci et al., Nature Genetics (2010)
Phenotypes listed: MR, global delay and congenital cardiac defects; childhood intellectual disability, developmental delay

4 Challenges of CNV studies
Often involves repeated regions
Rearrangements are complex
Can involve highly repetitive elements

Methods to Find SVs
Experimental approach: ArrayCGH (SNP based and genomic)
Sequence based: local and de novo assembly; read pair analysis; read depth analysis; split read analysis

5 METHOD 1: Copy Number Variation: Array Comparative Genomic Hybridization
Modified from Feuk et al., Nat Rev Genet 2006
Genome Tiling Arrays [figure labels: 800 bp, 25-36mer]

6 Typical Analysis Procedure
For each probe, calculate a log2 ratio of test/reference.
Log2 serves to center values around 0.
Hemizygous deletion in test: log2(test/reference) = log2(1/2) = -1
Duplication in test: log2(test/reference) = log2(3/2) ≈ 0.58
Homozygous duplication: log2(test/reference) = log2(4/2) = 1

Copy Number Variations in the Human Genome
[Figure: signal for Person 1 and signal for Person 2 against chromosome position, marking extra DNA and missing DNA.]
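The per-probe calculation above can be sketched in a few lines of Python. This is only an illustration under idealized assumptions: the function name `log2_ratio` is hypothetical, and the copy counts stand in for real, noisy probe intensities. It reproduces the slide's values (log2(3/2) is about 0.585).

```python
import math

def log2_ratio(test_signal, reference_signal):
    """Return log2(test/reference) for a single probe."""
    if test_signal <= 0 or reference_signal <= 0:
        raise ValueError("signals must be positive")
    return math.log2(test_signal / reference_signal)

# Idealized copy numbers in the test sample against a 2-copy reference.
examples = [
    ("hemizygous deletion", 1),
    ("normal", 2),
    ("duplication", 3),
    ("homozygous duplication", 4),
]
for label, copies in examples:
    print(f"{label}: log2({copies}/2) = {log2_ratio(copies, 2):.3f}")
```

Real array data would first need normalization between the two channels; that step is not shown here.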

7 3-State HMM
Segmentation using a 3-state HMM (Viterbi algorithm)
[Figure: log2 ratio track with state assignment: Gain, Normal, Loss.]

METHOD 2: Copy Number Variation: SNP genotyping Array
Steemers et al.
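A minimal sketch of Viterbi segmentation with three states (loss, normal, gain) follows. The slide does not give the model parameters, so the Gaussian emission means, the noise level `SIGMA` and the self-transition probability `STAY` are all assumptions chosen for illustration.

```python
import math

STATES = ("loss", "normal", "gain")
# Assumed emission means: idealized log2 ratios for 1, 2 and 3 copies.
MEANS = {"loss": -1.0, "normal": 0.0, "gain": math.log2(3 / 2)}
SIGMA = 0.25  # assumed probe noise (standard deviation)
STAY = 0.98   # assumed probability of remaining in the same state

def log_emission(x, state):
    """Gaussian log density up to a constant shared by all states."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z

def log_transition(prev, cur):
    return math.log(STAY) if prev == cur else math.log((1 - STAY) / 2)

def viterbi(log2_ratios):
    """Return the most likely state path for per-probe log2 ratios."""
    if not log2_ratios:
        return []
    # Uniform start probabilities add the same constant to every state.
    score = {s: log_emission(log2_ratios[0], s) for s in STATES}
    back = []
    for x in log2_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            pointers[cur] = best
            new_score[cur] = (score[best] + log_transition(best, cur)
                              + log_emission(x, cur))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

ratios = [0.0] * 5 + [-1.0] * 5 + [0.0] * 5 + [0.6] * 5 + [0.0] * 5
print(viterbi(ratios))
```

With these assumed parameters a state change costs roughly as much as two strongly deviating probes, so short runs of outliers tend to be absorbed into the surrounding segment.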

8 SNP Fluorescence-Based Deletion Discovery
[Figure: B-Allele Frequency (axis 0, 0.5, 1) and LogR (axis -1, 0, 1) panels for CopyNum=2 (genotype clusters AB, BB), CopyNum=1 (A-, B-), CopyNum=2 (AB, BB) and CopyNum=3 (AAB, ABB, BBB).]
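The genotype clusters in these panels follow from simple arithmetic, sketched below. The helper names `expected_baf` and `expected_logr` are hypothetical, and the values are idealized: measured LogR on real SNP arrays is noisier and usually compressed relative to the theoretical log2 ratio.

```python
import math

def expected_baf(genotype):
    """Fraction of B alleles among the copies present, e.g. 'AAB' -> 1/3.

    A '-' marks a deleted copy (as in 'A-') and is ignored.
    """
    alleles = [a for a in genotype if a in "AB"]
    if not alleles:
        raise ValueError("no copies present; BAF is undefined")
    return alleles.count("B") / len(alleles)

def expected_logr(copy_number, reference_copies=2):
    """Idealized LogR for a given copy number against a diploid reference."""
    return math.log2(copy_number / reference_copies)

panels = [
    (1, ["A-", "B-"]),
    (2, ["AA", "AB", "BB"]),
    (3, ["AAA", "AAB", "ABB", "BBB"]),
]
for copy_number, genotypes in panels:
    bafs = ", ".join(f"{g}={expected_baf(g):.2f}" for g in genotypes)
    print(f"CopyNum={copy_number}: LogR={expected_logr(copy_number):.2f}; BAF {bafs}")
```

A hemizygous deletion therefore shows up as a drop in LogR together with the loss of the heterozygous BAF cluster at 0.5.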

9 LogR and B-Allele Frequency
[Figure: LogR (axis -1 to 1) and B-allele frequency across a ~90 kbp region; x axis: human chromosome 2 position.]

Sequencing Methods
Going backwards: Sanger, 454, Illumina...
CNV and SV are hotspots of research, but the reality is:
Limitations of the methods
Indirect methods. ALL have problems!
What do we want?
Clone approaches
Finished sequence

10 De novo assemblies
Theory vs. reality
Most assemblies (even with Sanger technology) are collapsed.
Examples.
[Figure: distribution of total duplications, inter and intra, in bases (bp) by identity (91% to 99.5%). Panels: BONOBO, CHIMP (panTro2), HUMAN (hg18). Labels: WGS, 454; WGS, Sanger sequence; BAC hierarchical, Sanger sequence.]

11 Limitations of NGS assemblies
Alkan et al., Nature Methods 2010

Method 2: End-Sequence Pair (ESP) Analysis
[Figure: human genomic DNA (diploid sample) is cloned into a fosmid vector, and the end-sequence pairs are mapped to the reference. Insert size distribution: number of clones against apparent insert size; < 32 kb: putative insertion; > 48 kb: putative deletion.]
Tuzun et al. (2005)

12 Method 2: End-Sequence Pair (ESP) Analysis
[Figure: end-sequence pairs from human DNA in a fosmid vector are mapped to the reference. Insert size distribution: number of clones against apparent insert size; < 32 kb: putative insertion; > 48 kb: putative deletion. Pairs discordant by orientation are shown in yellow/gold and pairs discordant by size in red; reference against sample for deletion, insertion and inversion.]
Tuzun et al. (2005)

What can we find?
Structural variation detection: Alkan et al., Nature Reviews Genetics 2011
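The size and orientation rules for end-sequence pairs can be sketched as a small classifier. The 32 kb and 48 kb thresholds come from the slide. Everything else is an assumption made for illustration: the `EndPair` record, the convention that a concordant pair has its left end on "+" and its right end on "-", and the requirement that both ends map to the same chromosome.

```python
from dataclasses import dataclass

MIN_INSERT = 32_000  # below this apparent size: putative insertion (from the slide)
MAX_INSERT = 48_000  # above this apparent size: putative deletion (from the slide)

@dataclass
class EndPair:
    """Hypothetical record for one clone with both ends mapped to one chromosome."""
    left_pos: int       # reference coordinate of the leftmost end
    right_pos: int      # reference coordinate of the rightmost end
    left_strand: str    # "+" or "-"
    right_strand: str   # "+" or "-"

def classify(pair):
    """Label a pair as concordant or as one of the discordant classes."""
    if (pair.left_strand, pair.right_strand) != ("+", "-"):
        return "discordant by orientation (putative inversion)"
    apparent_size = pair.right_pos - pair.left_pos
    if apparent_size < MIN_INSERT:
        # The ends map closer together than the clone is long, so the
        # sample carries sequence that the reference lacks.
        return "discordant by size (putative insertion)"
    if apparent_size > MAX_INSERT:
        # The ends map farther apart than the clone is long, so the
        # sample lacks sequence that the reference has.
        return "discordant by size (putative deletion)"
    return "concordant"

pairs = [
    EndPair(100_000, 140_000, "+", "-"),
    EndPair(200_000, 225_000, "+", "-"),
    EndPair(300_000, 360_000, "+", "-"),
    EndPair(400_000, 440_000, "+", "+"),
]
for pair in pairs:
    print(pair.right_pos - pair.left_pos, classify(pair))
```

In practice a single discordant clone is weak evidence; calls are normally supported by several independent clones that agree on the same site.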

13 Map of Validated Variants
Genome-wide map of variants
Ability to resolve structure of individual haplotypes
[Figure: chr17 map for libraries ABC14 (CEPH), ABC13 (Yoruba), ABC12 (CEPH), ABC11 (China), ABC10 (Yoruba), ABC9 (Japan), ABC8 (Yoruba), ABC7 (Yoruba) and G248. Legend: Insertion (Fosmid), Deletion, Inversion, Gaps, Novel Sequence.]

14 Frequency of Validated Sites
[Figure: number of sites (deletions, insertions, inversions) against number of individuals (libraries) reporting the variant site; sites where the reference genome represents the minor allele are marked.]
261 (15%) sites where the reference genome represents a minor allele

Method 3: Sequence Read Depth Analysis
Individual sequence reads -> mapping to the reference genome -> counting mapped reads -> read depth signal (# reads) -> HMM calls

15 Method 3: Sequence Read Depth Analysis
Individual sequence reads -> mapping to the reference genome -> counting mapped reads -> read depth signal (# reads) -> HMM calls
Sequence coverage and detection power
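The counting step of read depth analysis can be sketched as follows. This is a simplification under stated assumptions: reads are reduced to their mapped start coordinates, windows are fixed-width and non-overlapping, and copy number is estimated by scaling to the median window (taken to be diploid) rather than by the HMM named on the slide. The function names are hypothetical.

```python
from statistics import median

def window_counts(read_starts, region_length, window):
    """Count mapped read starts in consecutive fixed-width windows."""
    n_windows = (region_length + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < region_length:
            counts[pos // window] += 1
    return counts

def copy_number_estimates(counts, ploidy=2):
    """Scale window counts so that the median window has `ploidy` copies."""
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return [ploidy * c / baseline for c in counts]

# Toy read depth signal for five windows: one window at half the typical
# depth and one at double the typical depth.
counts = [10, 10, 5, 20, 10]
print(copy_number_estimates(counts))
```

Real pipelines also correct for GC content and mappability before calling, which this sketch leaves out; low sequence coverage makes the per-window counts noisier and reduces detection power for small events.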

16 Validation of copy-number estimations
Alkan et al., Nature Genetics, 2009

Defensin gene cluster + FAM90A7
Associated with psoriasis and Crohn's disease
Alkan et al., Nature Genetics, 2009

17 Scaling up: 1000 Genomes and more
Individuals sequenced in Pilot 1
[Figure: histogram of Pilot 1 Illumina effective coverage; x axis: effective coverage (1 to 10); y axis: number of genomes (0 to 120); legend: Yoruba 56, CEPH 48, Japanese 27, Han 29.]
Individuals sequenced in Pilot 2
Other genomes
Sudmant, Kitzman, et al., Science, 2010

Copy number variation in human populations
Sudmant et al., Science 2010

18 Split-read Analysis
[Figure: deletion event: reference with the deletion marked and a read spanning the breakpoint. Insertion event: reference and a read containing the insertion.]
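A toy version of the deletion case is sketched below. It assumes error-free reads, exact matching, and that the position where the read starts on the reference (`anchor`) is already known, for example from a mapped mate. The function name `find_deletion` and the minimum suffix length are hypothetical, and insertions are not handled.

```python
def find_deletion(reference, read, anchor, min_suffix=5):
    """Locate a deletion spanned by `read`, whose first base aligns at `anchor`.

    Returns the deleted reference interval as (start, end), end exclusive,
    or None if the read matches in full or cannot be split.
    """
    # Extend the exact match of the read prefix along the reference.
    k = 0
    while (k < len(read) and anchor + k < len(reference)
           and reference[anchor + k] == read[k]):
        k += 1
    if k == len(read):
        return None  # the read aligns end to end; no split needed
    suffix = read[k:]
    if len(suffix) < min_suffix:
        return None  # too short to place reliably
    # Look for the rest of the read downstream of the breakpoint.
    hit = reference.find(suffix, anchor + k)
    if hit == -1:
        return None
    return (anchor + k, hit)

left, deleted, right = "GATTACAGGC", "TTTTT", "CCATGAGTCA"
reference = left + deleted + right
read = left[2:] + right[:8]  # sample read that skips the deleted bases
print(find_deletion(reference, read, anchor=2))
```

Because the prefix is extended greedily, repeated bases at the breakpoint (microhomology) shift the reported coordinates without changing the deletion length.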

19 Experimental Validation
[Figure panels: A) CGH; B) Fiber-FISH (for inversions), without inversion and with inversion; C) PCR gel with marker lanes (M) at 3000 bp, 1500 bp and 500 bp and sample lanes A to D; labels CGH and PEM.]

Methods to Find SVs
Experimental approach
  ArrayCGH (SNP based and genomic): based on ratios; saturates quite fast; poor breakpoint resolution
Sequence based
  Read pair analysis: deletions, small novel insertions, inversions, transposons; size and breakpoint resolution depend on insert size
  Read depth analysis: deletions and duplications; relatively poor breakpoint resolution
  Split read analysis: small novel insertions/deletions and mobile element insertions; 1 bp breakpoint resolution
  Local and de novo assembly: SV in unique segments; 1 bp breakpoint resolution


More information

High-throughput genome scaffolding from in vivo DNA interaction frequency

High-throughput genome scaffolding from in vivo DNA interaction frequency correction notice Nat. Biotechnol. 31, 1143 1147 (213) High-throughput genome scaffolding from in vivo DNA interaction frequency Noam Kaplan & Job Dekker In the version of this supplementary file originally

More information

Haplotypes, linkage disequilibrium, and the HapMap

Haplotypes, linkage disequilibrium, and the HapMap Haplotypes, linkage disequilibrium, and the HapMap Jeffrey Barrett Boulder, 2009 LD & HapMap Boulder, 2009 1 / 29 Outline 1 Haplotypes 2 Linkage disequilibrium 3 HapMap 4 Tag SNPs LD & HapMap Boulder,

More information

Agilent CytoGenomics 2.0

Agilent CytoGenomics 2.0 For Detection ti of CNC, LOH and UPD New algorithms for CGH+SNP analysis of hematological tumor and constitutional samples Arne IJpma, Ph.D. Product Manager CytoGenomics free trial @ https://earray.chem.agilent.com/earray/

More information

Sequence assembly. Jose Blanca COMAV institute bioinf.comav.upv.es

Sequence assembly. Jose Blanca COMAV institute bioinf.comav.upv.es Sequence assembly Jose Blanca COMAV institute bioinf.comav.upv.es Sequencing project Unknown sequence { experimental evidence result read 1 read 4 read 2 read 5 read 3 read 6 read 7 Computational requirements

More information

Variant Finding. UCD Genome Center Bioinformatics Core Wednesday 30 August 2016

Variant Finding. UCD Genome Center Bioinformatics Core Wednesday 30 August 2016 Variant Finding UCD Genome Center Bioinformatics Core Wednesday 30 August 2016 Types of Variants Adapted from Alkan et al, Nature Reviews Genetics 2011 Why Look For Variants? Genotyping Correlation with

More information

Workshop on. Genome analysis tools applied to forest tree breeding

Workshop on. Genome analysis tools applied to forest tree breeding Workshop on Genome analysis tools applied to forest tree breeding Vantaa (Finland), the 18 th October 2012 BOOK OF ABSTRACTS Introduction Giusi Zaina University of Udine, Udine, Italy Contact: giusi.zaina@uniud.it

More information

Petar Pajic 1 *, Yen Lung Lin 1 *, Duo Xu 1, Omer Gokcumen 1 Department of Biological Sciences, University at Buffalo, Buffalo, NY.

Petar Pajic 1 *, Yen Lung Lin 1 *, Duo Xu 1, Omer Gokcumen 1 Department of Biological Sciences, University at Buffalo, Buffalo, NY. The psoriasis associated deletion of late cornified envelope genes LCE3B and LCE3C has been maintained under balancing selection since Human Denisovan divergence Petar Pajic 1 *, Yen Lung Lin 1 *, Duo

More information

What is genetic variation?

What is genetic variation? enetic Variation Applied Computational enomics, Lecture 05 https://github.com/quinlan-lab/applied-computational-genomics Aaron Quinlan Departments of Human enetics and Biomedical Informatics USTAR Center

More information

REVIEWS. Structural variation in the human genome

REVIEWS. Structural variation in the human genome REVIEWS Structural variation in the human genome Lars Feuk, Andrew R. Carson and Stephen W. Scherer Abstract The first wave of information from the analysis of the human genome revealed SNPs to be the

More information

14 March, 2016: Introduction to Genomics

14 March, 2016: Introduction to Genomics 14 March, 2016: Introduction to Genomics Genome Genome within Ensembl browser http://www.ensembl.org/homo_sapiens/location/view?db=core;g=ensg00000139618;r=13:3231547432400266 Genome within Ensembl browser

More information

Development and application of CGHPRO, a novel software package for retrieving, handling and analysing array CGH data

Development and application of CGHPRO, a novel software package for retrieving, handling and analysing array CGH data Aus dem Max Planck Institut für molekulare Genetik DISSERTATION Development and application of CGHPRO, a novel software package for retrieving, handling and analysing array CGH data Zur Erlangung des akademischen

More information

Agilent NGS Solutions : Addressing Today s Challenges

Agilent NGS Solutions : Addressing Today s Challenges Agilent NGS Solutions : Addressing Today s Challenges Charmian Cher, Ph.D Director, Global Marketing Programs 1 10 years of Next-Gen Sequencing 2003 Completion of the Human Genome Project 2004 Pyrosequencing

More information

Chapter 14: Genes in Action

Chapter 14: Genes in Action Chapter 14: Genes in Action Section 1: Mutation and Genetic Change Mutation: Nondisjuction: a failure of homologous chromosomes to separate during meiosis I or the failure of sister chromatids to separate

More information

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1 Human SNP haplotypes Statistics 246, Spring 2002 Week 15, Lecture 1 Human single nucleotide polymorphisms The majority of human sequence variation is due to substitutions that have occurred once in the

More information

Map-Based Cloning of Qualitative Plant Genes

Map-Based Cloning of Qualitative Plant Genes Map-Based Cloning of Qualitative Plant Genes Map-based cloning using the genetic relationship between a gene and a marker as the basis for beginning a search for a gene Chromosome walking moving toward

More information

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls BMI/CS 776 www.biostat.wisc.edu/bmi776/ Colin Dewey cdewey@biostat.wisc.edu Spring 2012 1. Understanding Human Genetic Variation

More information

Decoding of Superimposed Traces Produced by Direct Sequencing of Heterozygous Indels Dmitriev, D.A. & Rakitov, R.A.

Decoding of Superimposed Traces Produced by Direct Sequencing of Heterozygous Indels Dmitriev, D.A. & Rakitov, R.A. Decoding of Superimposed Traces Produced by Direct Sequencing of Heterozygous Indels Dmitriev, D.A. & Rakitov, R.A. Illinois Natural History Survey, Institute of Natural Resource Sustainability, University

More information

Analysis of large deletions in human-chimp genomic alignments. Erika Kvikstad BioInformatics I December 14, 2004

Analysis of large deletions in human-chimp genomic alignments. Erika Kvikstad BioInformatics I December 14, 2004 Analysis of large deletions in human-chimp genomic alignments Erika Kvikstad BioInformatics I December 14, 2004 Outline Mutations, mutations, mutations Project overview Strategy: finding, classifying indels

More information

Genomic resources. for non-model systems

Genomic resources. for non-model systems Genomic resources for non-model systems 1 Genomic resources Whole genome sequencing reference genome sequence comparisons across species identify signatures of natural selection population-level resequencing

More information

Next Generation Genetics: Using deep sequencing to connect phenotype to genotype

Next Generation Genetics: Using deep sequencing to connect phenotype to genotype Next Generation Genetics: Using deep sequencing to connect phenotype to genotype http://1001genomes.org Korbinian Schneeberger Connecting Genotype and Phenotype Genotyping SNPs small Resequencing SVs*

More information

VCGDB: A Virtual and Dynamic Genome Database of the Chinese Population

VCGDB: A Virtual and Dynamic Genome Database of the Chinese Population VCGDB: A Virtual and Dynamic Genome Database of the Chinese Population Jiayan Wu Associate Professor Director of Science and Technology Department Director of Core Facility Beijing Institute of Genomics,

More information

Sundaram DGA43A19 Page 1. Finishing Drosophila grimshawi Fosmid: DGA43A19 Varun Sundaram 2/16/09

Sundaram DGA43A19 Page 1. Finishing Drosophila grimshawi Fosmid: DGA43A19 Varun Sundaram 2/16/09 Sundaram DGA43A19 Page 1 Finishing Drosophila grimshawi Fosmid: DGA43A19 Varun Sundaram 2/16/09 Sundaram DGA43A19 Page 2 Abstract My project focused on the fosmid clone DGA43A19. The main problems with

More information

This is a closed book, closed note exam. No calculators, phones or any electronic device are allowed.

This is a closed book, closed note exam. No calculators, phones or any electronic device are allowed. MCB 104 MIDTERM #2 October 23, 2013 ***IMPORTANT REMINDERS*** Print your name and ID# on every page of the exam. You will lose 0.5 point/page if you forget to do this. Name KEY If you need more space than

More information

Basic Concepts of Human Genetics

Basic Concepts of Human Genetics Basic Concepts of Human Genetics The genetic information of an individual is contained in 23 pairs of chromosomes. Every human cell contains the 23 pair of chromosomes. One pair is called sex chromosomes

More information

Mate-pair library data improves genome assembly

Mate-pair library data improves genome assembly De Novo Sequencing on the Ion Torrent PGM APPLICATION NOTE Mate-pair library data improves genome assembly Highly accurate PGM data allows for de Novo Sequencing and Assembly For a draft assembly, generate

More information

SEQUENCING. M Ataei, PhD. Feb 2016

SEQUENCING. M Ataei, PhD. Feb 2016 CLINICAL NEXT GENERATION SEQUENCING M Ataei, PhD Tehran Medical Genetics Laboratory Feb 2016 Overview 2 Background NGS in non-invasive prenatal diagnosis (NIPD) 3 Background Background 4 In the 1970s,

More information

Molecular Biology: DNA sequencing

Molecular Biology: DNA sequencing Molecular Biology: DNA sequencing Author: Prof Marinda Oosthuizen Licensed under a Creative Commons Attribution license. SEQUENCING OF LARGE TEMPLATES As we have seen, we can obtain up to 800 nucleotides

More information