Structural variation. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona


Genetic variation
How much genetic variation is there between individuals?
What types of variants exist and how are they generated?
What is the genetic basis of phenotypic traits?

Overview
1. Types of structural variants (SVs)
2. Methods for detecting SVs
3. Copy number variants (CNVs)
4. Indels and transposable element (TE) insertions
5. Inversions
6. Mechanisms of generation
7. Functional effects and examples

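Types of structural variants
[Figure: schematic of a reference segment a b c d e f and the rearrangements that alter it]
Indels: deletion (a b c d e f -> a b e f) and insertion of a new segment (z)
Inversion: a b c d e f -> a e d c b f
Duplication: a b c d e f -> a b c c d e f
Translocation: exchange of segments between a b c d e f and a second chromosome w x y z (for example, producing w x y c d e f)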
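The rearrangements in this slide can be mimicked on a toy list of letters. The following is a minimal sketch, not a model of real DNA: the function names, the cut positions and the position of the inserted segment z are assumptions chosen for illustration, and strand (reverse complementation in an inversion) is ignored.

```python
from typing import List, Tuple

Segment = List[str]

def delete(seq: Segment, start: int, end: int) -> Segment:
    """Remove the elements seq[start:end]."""
    return seq[:start] + seq[end:]

def insert(seq: Segment, pos: int, new: Segment) -> Segment:
    """Insert a new segment before index pos."""
    return seq[:pos] + new + seq[pos:]

def invert(seq: Segment, start: int, end: int) -> Segment:
    """Reverse the order of seq[start:end] (strand is ignored in this toy model)."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]

def duplicate(seq: Segment, start: int, end: int) -> Segment:
    """Tandem duplication: copy seq[start:end] right after itself."""
    return seq[:end] + seq[start:end] + seq[end:]

def translocate(a: Segment, b: Segment, cut_a: int, cut_b: int) -> Tuple[Segment, Segment]:
    """Reciprocal exchange of the segments to the right of each cut."""
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]

reference = list("abcdef")
other = list("wxyz")

deletion = "".join(delete(reference, 2, 4))
insertion = "".join(insert(reference, 3, ["z"]))  # insertion position is hypothetical
inversion = "".join(invert(reference, 1, 5))
duplication = "".join(duplicate(reference, 2, 3))
derived_a, derived_b = ("".join(s) for s in translocate(reference, other, 2, 3))

for name, result in [("deletion", deletion), ("insertion", insertion),
                     ("inversion", inversion), ("duplication", duplication),
                     ("translocation", derived_a + " / " + derived_b)]:
    print(f"{name}: {result}")
```

Note that only the deletion, insertion and duplication change the amount of sequence; the inversion and the reciprocal translocation rearrange it without gain or loss.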

Structural variation vs SNPs
Structural variants (SVs): genomic alterations that change the organization of the DNA molecule.
In comparison with SNPs, SVs represent a lower number of mutations but affect a higher number of nucleotides in the genome.
Comparison of the Venter genome with the reference genome:
  Variant type           Count        Bases affected
  Structural variants    808,346      48.8 Mb (1.5%)
    Indels               808,179      39.54 Mb (1.2%)
    Inversions           167          9.26 Mb (0.3%)
  SNPs                   3,213,401    3.2 Mb (0.1%)
Share of mutations: 80% SNPs, 20% SVs
Share of variable bases: 6.15% SNPs, 93.85% SVs
Source: Pang et al. (2010) Genome Biology 11: R52
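The two percentage breakdowns follow directly from the counts and the megabase totals on this slide. A short sketch of the arithmetic, using only the slide's numbers (the variable and function names are illustrative):

```python
# Counts and affected sequence (Mb) as given on the slide (Pang et al. 2010).
variants = {
    "SNPs": {"count": 3_213_401, "mb": 3.2},
    "SVs": {"count": 808_346, "mb": 48.8},
}

def percent_shares(values: dict) -> dict:
    """Return each entry as a percentage of the total."""
    total = sum(values.values())
    return {name: 100 * value / total for name, value in values.items()}

mutation_share = percent_shares({k: v["count"] for k, v in variants.items()})
base_share = percent_shares({k: v["mb"] for k, v in variants.items()})

for name in variants:
    print(f"{name}: {mutation_share[name]:.1f}% of mutations, "
          f"{base_share[name]:.2f}% of variable bases")
```

SNPs are about four times more numerous as events, but SVs account for almost all of the variable sequence.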

Methods for detection of SVs
Resolution and throughput increase from the first method to the last (- to +):
Cytogenetic techniques
Comparative genomic hybridization (CGH) arrays
Paired-end mapping (PEM)
Sequencing and de novo assembly of complete genomes

Cytogenetic techniques
Karyotyping
FISH
Chromosome painting
Fiber FISH
[Figure panels: examples of a deletion, a duplication, a translocation and a copy number variant detected with these techniques]

Fluorescence in situ hybridization (FISH)
[Figure: hybridization of a labelled probe and final result. Figure 3.32, Brown, Genomes 3, 3rd edition (2007)]

Inversion detection by FISH in interphase nuclei
[Figure panels: a fixed inversion between humans and chimpanzees; a polymorphic inversion in humans, with standard (STD) and inverted (INV) arrangements. Figure 3, Feuk et al. (2005) PLoS Genetics 1: e56]

Comparative genomic hybridization arrays (aCGH)
The ratio of the fluorescence intensities of the test and reference DNA indicates differences in copy number at a particular location in the genome.
[Figure 2, Feuk et al. (2006) Nature Reviews Genetics 7: 85-97]
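The intensity ratio is commonly expressed as a log2 ratio, so that equal copy number gives 0, a gain gives a positive value and a loss a negative one. The sketch below illustrates the idea under stated assumptions: the probe intensities and the calling threshold of 0.3 are hypothetical and not taken from the slide.

```python
import math

def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """log2 of test/reference fluorescence for one array probe."""
    return math.log2(test_intensity / reference_intensity)

def call_copy_number(ratio: float, threshold: float = 0.3) -> str:
    """Classify a probe; the threshold is an illustrative assumption."""
    if ratio >= threshold:
        return "gain"
    if ratio <= -threshold:
        return "loss"
    return "no change"

# Hypothetical (test, reference) intensities for three probes.
probes = {"probe_1": (1500.0, 1000.0),
          "probe_2": (500.0, 1000.0),
          "probe_3": (1000.0, 1000.0)}

calls = {name: call_copy_number(log2_ratio(test, ref))
         for name, (test, ref) in probes.items()}
print(calls)
```

In practice, calls are made on runs of neighbouring probes rather than on single probes, since individual intensities are noisy.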

Genomic DNA library
Fragmentation of genomic DNA: DNA digestion with restriction enzymes
Cloning of the fragments inside a vector
[Figure: fragments 1-5 of the genomic DNA cloned separately to form the genomic library]

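Paired-end mapping (PEM)
1. Construction of a DNA library of fragments of a defined size (40 kb in this example) from the DNA of interest (test DNA)
2. Sequencing of both ends of a large number of fragments
3. Mapping of both ends to a reference genome and prediction of SVs
Interpretation of the distance between the mapped ends of a 40-kb test fragment:
No variant: the ends map 40 kb apart on the reference DNA
Insertion (10 kb in the test DNA): the ends map closer together than expected (30 kb)
Deletion (20 kb in the test DNA): the ends map farther apart than expected (60 kb)
Inversion: end pairs spanning breakpoint 1 or breakpoint 2 map at an unexpected distance (X kb) and in an altered relative orientation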
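The prediction step of PEM can be sketched as a simple rule on each mapped end pair. This is a minimal illustration assuming a 40-kb library, as in the slide; the class name, the tolerance of 5 kb and the orientation flag are hypothetical choices, and real callers require several concordant pairs before predicting a variant.

```python
from dataclasses import dataclass

@dataclass
class EndPair:
    mapped_distance_kb: float   # distance between the two ends on the reference
    expected_orientation: bool  # False if the ends map in an altered orientation

def classify_pair(pair: EndPair, expected_kb: float = 40.0,
                  tolerance_kb: float = 5.0) -> str:
    """Predict the variant suggested by one end pair (tolerance is an assumption)."""
    if not pair.expected_orientation:
        return "inversion"
    difference = pair.mapped_distance_kb - expected_kb
    if difference > tolerance_kb:
        return "deletion"    # test DNA lacks sequence present in the reference
    if difference < -tolerance_kb:
        return "insertion"   # test DNA carries sequence absent from the reference
    return "no variant"

pairs = [EndPair(40.0, True), EndPair(30.0, True),
         EndPair(60.0, True), EndPair(35.0, False)]
predictions = [classify_pair(p) for p in pairs]
print(predictions)
```

The tolerance stands in for the natural spread of fragment sizes in the library: the tighter the size distribution, the smaller the indels that can be detected.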

Copy number variants (CNVs)
CNV: DNA segment present in a variable number of copies compared to a reference genome.
[Figure: Individual 1, 2 copies; Individual 2, 3 copies; Individual 3, 5 copies]
8599 validated CNVs spanning a total of 112.7 Mb (3.7% of the genome) were detected in 450 individuals of European, African and Asian ancestry.
Two genomes show different copy number in 1098 CNV regions.
Detected CNVs have sizes between 443 bp and 1.28 Mb (average size 2.9 kb).
CNVs can include genes.
Some CNVs do not seem to have any influence on phenotype, but others have been associated with diseases.
Source: Conrad et al. (2010) Nature 464: 704-712

CNVs and segmental duplications
CNV: DNA segment present in a variable number of copies compared to a reference genome (for example, Ind. 1: 1 copy; Ind. 2: 2 copies; Ind. 3: 5 copies).
SD: segment of DNA with very similar sequence present in more than one copy in the genome (for example, Ind. 1, Ind. 2 and Ind. 3: 2 copies each).
Source: Lesson 6. Structural variation, Mario Cáceres

Copy number variants (CNVs)
[Figure: chromosomal distribution of 1447 regions with CNVs]
24% of CNVs are associated with SDs.
58% of CNVs overlap known genes.
Source: Redon et al. (2006) Nature 444: 444-454

Short indels (<50 bp)
1.6 million indels from 179 individuals representing 3 diverse human populations.
Purifying selection acts against indels in functional regions.
43-48% of indels occur in 4% of the genome (indel hotspots), while in the remaining 96% their prevalence is 16 times lower than that of SNPs.
Polymerase slippage can explain 3/4 of all indels.
[Figure panels: indel density in 6 genic regions; coding indel lengths]
Source: Montgomery et al. (2013) Genome Res 23: 749-761

Large indels
Fosmid (40 kb) PEM in 8 humans found 747 deletions and 724 insertions >5 kb.
32-kb deletions between 12-kb direct SDs with 94% identity.
Identification of novel sequence not included in the reference genome.
Source: Kidd et al. (2008) Nature 453: 56-64

TE insertion polymorphisms
Insertion present in the reference and absent in the test genome: detected by read pairs that map with a longer distance between them.
Insertion in the test genome and absent in the reference: detected by read pairs mapping into the insertion and by reads containing part of the insertion.
[Figure panel: TE insertion confirmed by PCR]
Source: Stewart et al. (2011) PLoS Genetics 7: e1002236

Polymorphic TE insertions in humans
Data from the 1000 Genomes Project (185 individuals from 3 populations).
Active elements are Alu, L1 and SVA.
7380 polymorphic insertions detected.
Polymorphisms between two individuals:
  2 Europeans: 600
  2 Africans: 1400
  1 European and 1 African: 2000
De novo insertion frequency = 1 insertion per 20 births.
[Figure panels: size distribution of structural variants <1 kb in two sequenced genomes (labelled: Alu); polymorphic TE insertions within genes]
Sources: Stewart et al. (2011) PLoS Genetics 7: e1002236; Figure 1, Li et al. (2011) Nature Biotechnology 29: 723-730

Inversions
Change of orientation of a segment of DNA.
[Figure: types of inversions, showing standard (STD) and inverted (INV) arrangements, with the distal and proximal inverted regions relative to the centromere (Cen.); labels 2st and 2j]
Inversions have been associated with phenotypic traits.
The mechanisms by which inversions affect phenotype remain unknown.
Balanced events are difficult to study.
They can present repeats in opposite orientation at their breakpoints.

Effects and consequences of inversions: suppression of recombination
Recombination is suppressed within the inverted sequence in STD/INV heterozygotes.
Alleles found together within an inversion tend to be inherited together.
Source: http://mhanswers-auth.mhhe.com/biology/genetics/mcgraw-hill-answers-changes-chromosome-structure-and-number

Effects and consequences of inversions: position effects
Altered expression of adjacent genes caused by the mutational effects of inversion breakpoints (BPs).
  BP location                              Consequences                     Expression
  Between genes                            Change of positions              Normal expression
  Within genes                             Disrupted gene                   No expression
  Between regulatory elements and genes    Disrupted regulatory elements    Altered expression patterns
Source: http://mhanswers-auth.mhhe.com/biology/genetics/mcgraw-hill-answers-changes-chromosome-structure-and-number
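The three breakpoint locations on this slide can be turned into a small lookup against a gene annotation. The sketch below is only an illustration of that logic: the Gene record, its single regulatory element and all coordinates are hypothetical, and real position effects are not this clear-cut.

```python
from typing import List, NamedTuple, Tuple

class Gene(NamedTuple):
    name: str
    regulatory_pos: int  # position of one regulatory element (hypothetical)
    start: int
    end: int

def classify_breakpoint(bp: int, genes: List[Gene]) -> Tuple[str, str, str]:
    """Return (BP location, consequence, expected expression) for one breakpoint."""
    for gene in genes:
        if gene.start <= bp <= gene.end:
            return ("within genes", "disrupted gene", "no expression")
    for gene in genes:
        # The regulatory element may lie upstream or downstream of the gene.
        if gene.regulatory_pos < gene.start:
            low, high = gene.regulatory_pos, gene.start
        else:
            low, high = gene.end, gene.regulatory_pos
        if low < bp < high:
            return ("between regulatory elements and genes",
                    "disrupted regulatory elements",
                    "altered expression patterns")
    return ("between genes", "change of positions", "normal expression")

# Hypothetical annotation: coordinates are arbitrary.
annotation = [Gene("geneA", 100, 500, 900), Gene("geneB", 2000, 2500, 3000)]

for breakpoint in (700, 300, 1500):
    print(breakpoint, classify_breakpoint(breakpoint, annotation))
```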

Mechanisms of generation of SVs
SVs are typically generated during DNA break-induced repair, recombination or replication by different possible mechanisms:
Non-Allelic Homologous Recombination (NAHR): duplications, deletions, inversions, translocations
Non-Homologous End Joining (NHEJ): deletions, inversions
Transposition of transposable elements: insertions, deletions
Fork Stalling and Template Switching (FoSTeS): duplications, deletions, inversions, translocations

Non-Allelic Homologous Recombination (NAHR)
Intra- or interchromosomal recombination between copies of a sequence located at different genomic positions.
[Figure panels: duplications and deletions; translocations; inversions. Figure 4, Bailey and Eichler (2006) Nature Reviews Genetics 7: 552-564]
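Which rearrangement NAHR produces depends on where the two repeat copies lie and on their relative orientation. The rule is sketched below as a small function; the function and parameter names are illustrative, and the mapping reflects the standard textbook outcome rather than anything specific to this slide's figure.

```python
def nahr_outcome(same_chromosome: bool, same_orientation: bool) -> str:
    """Expected rearrangement from recombination between two repeat copies."""
    if not same_chromosome:
        # Copies on different chromosomes exchange chromosome arms.
        return "translocation"
    if same_orientation:
        # Direct repeats: the intervening segment is lost or gained.
        return "deletion or duplication"
    # Inverted repeats: the intervening segment is flipped.
    return "inversion"

for same_chr, same_ori in [(True, True), (True, False), (False, True)]:
    print(same_chr, same_ori, "->", nahr_outcome(same_chr, same_ori))
```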

Repeated sequences in the human genome
Segmental duplications (SDs)
Intra- or interchromosomal duplicated sequences with length ≥1 kb and identity ≥90%.
They represent 5.3% of the human genome.
[Figure legend: gaps, SDs. Figure 4, International Human Genome Sequencing Consortium (2004) Nature 431: 931-945]
Transposable elements
Almost 50% of the human genome consists of transposable elements.
High number of copies of each TE class:
  LINEs: 850,000
  SINEs: 1.5 million
  LTR elements: 450,000
  DNA transposons: 300,000
[Figure 1, Cordaux and Batzer (2009) Nature Reviews Genetics 10: 691-703]

Non-Homologous End Joining (NHEJ)
[Figure: original DNA molecules -> double-strand breaks -> repair -> repaired DNA molecules, illustrating the generation of an inversion and the generation of a translocation]

Fork Stalling and Template Switching (FoSTeS)
Replication-based mechanism.
Could be combined with microhomology.
Typically generates very complex rearrangements.
[Figure 5, Gu et al. (2008) PathoGenetics 1: 4]

Functional consequences of SVs
Altered gene dosage and expression (CNVs)
Disruption of genes or regulatory elements (insertions, deletions, inversions)
Gene fusion (deletions, inversions)
Change in the exon-intron structure (insertions, deletions, CNVs, inversions)
Modification of gene regulatory regions (insertions, deletions, CNVs, inversions)
Indirect effects through increased susceptibility to genomic rearrangements (CNVs, inversions)

SVs and disease
[Figure: Tuzun et al. (2005) Nature Genetics 37: 727-732]

CNVs and complex diseases
[Table: summary of common disorders for which associations to CNVs have been reported. Table 3, Estivill and Armengol (2007) PLoS Genetics 3: e190]

CNV example: the amylase gene
Amylase protein levels in saliva are proportional to the number of AMY1 gene copies.
Individuals from populations with high-starch diets have on average more AMY1 copies than those with traditionally low-starch diets.
[Figure examples: Japanese individual, high-starch diet, 14 copies; African individual, low-starch diet, 6 copies; chimpanzee, low-starch diet, 2 copies]
Source: Figures 1, 2 and 3, Perry et al. (2007) Nature Genetics 39: 1256-1260

CNV example: CCL3L1
A low copy number of the chemokine gene CCL3L1, relative to the individual's ethnic background, is associated with markedly enhanced susceptibility to HIV-1 (AIDS).
Source: González et al. (2005) Science 307: 1434-1440

Effects of TEs on genes
[Figure: Feschotte (2008) Nature Reviews Genetics 9: 397-405]

Chromosome 17 inversion in humans
900-kb polymorphic inversion originated by NAHR between 200-500 kb segmental duplications.
Detected mainly in European populations, where it has a 20% frequency.
This inversion may be positively selected, because it may be associated with increased fertility in female carriers.
Source: Stefansson et al. (2005) Nature Genetics 37: 129-137

Inversion in the plant Mimulus guttatus
Mimulus guttatus is a North American plant with two ecotypes: coastal perennial and inland annual.
A polymorphic inversion causes the differences between the annual and perennial forms, which are adapted to different environments.
It affects flowering time, causing reproductive isolation.
Source: Figures 1 and 2, Lowry and Willis (2010) PLoS Biology 8: e1000500

The 1000 Genomes Project (http://www.1000genomes.org)
Objective: identify all genetic variants with a frequency higher than 1% in the studied populations.
Experiments: sequencing, using next-generation techniques, of 2500 whole genomes from 25 worldwide populations with 4x redundancy.
Pilot phase: 179 individuals from 4 populations
  15 million SNPs
  1 million short insertions and deletions
  20,000 structural variants
  (>95% of variants with frequencies >5%)
Phase I: 1092 individuals from 14 populations
  38 million SNPs
  1.4 million short indels
  14,000 larger deletions
  (98% of SNPs with frequencies >1%)
Sources: The 1000 Genomes Project Consortium (2010) Nature 467: 1061-1073; The 1000 Genomes Project Consortium (2012) Nature 491: 56-65
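The link between sample size and the 1% frequency objective can be illustrated with a simple sampling calculation: the probability that a variant of population frequency f is carried by at least one of the 2n chromosomes of n diploid individuals is 1 - (1 - f)^(2n). This sketch assumes independent sampling of chromosomes and says nothing about whether a variant that is present would actually be detected at 4x redundancy; the function name is illustrative.

```python
def prob_variant_sampled(frequency: float, n_individuals: int) -> float:
    """Probability that at least one sampled chromosome carries the variant.

    Assumes diploid individuals and independent sampling of chromosomes.
    """
    n_chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - frequency) ** n_chromosomes

# Sample sizes taken from the slide: pilot phase, phase I and the full project.
for n in (179, 1092, 2500):
    print(f"n = {n}: P(variant at 1% is in the sample) = "
          f"{prob_variant_sampled(0.01, n):.6f}")
```

Presence in the sample is only a necessary condition: with low redundancy per genome, a variant usually has to be carried by several individuals before it can be called with confidence.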