The Human Genome and its upcoming Dynamics

Similar documents
Next Generation Genetics: Using deep sequencing to connect phenotype to genotype

Structural variation. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona

The Diploid Genome Sequence of an Individual Human

Sequence Assembly and Alignment. Jim Noonan Department of Genetics

Analysis of large deletions in human-chimp genomic alignments. Erika Kvikstad BioInformatics I December 14, 2004

NUCLEOTIDE RESOLUTION STRUCTURAL VARIATION DETECTION USING NEXT- GENERATION WHOLE GENOME RESEQUENCING

Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR

Towards Personal Genomics

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Structural(varia+on!

Analysis of structural variation. Alistair Ward USTAR Center for Genetic Discovery University of Utah

14 March, 2016: Introduction to Genomics

Chapter 5. Structural Genomics

Building a platinum human genome assembly from single haplotype human genomes. Karyn Meltz Steinberg PacBio UGM December,

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

Analysis of structural variation. Alistair Ward - Boston College

Lecture 2: Biology Basics Continued

Get to Know Your DNA. Every Single Fragment.

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Cancer Genetics Solutions

BENG 183 Trey Ideker. Genome Assembly and Physical Mapping

Structural variation analysis using NGS sequencing

Linking Genetic Variation to Important Phenotypes

Gap Filling for a Human MHC Haplotype Sequence

C3BI. VARIANTS CALLING November Pierre Lechat Stéphane Descorps-Declère

What is genetic variation?

Mate-pair library data improves genome assembly

Structure, Measurement & Analysis of Genetic Variation

Genomes contain all of the information needed for an organism to grow and survive.

SUPPLEMENTARY INFORMATION

Genome Projects. Part III. Assembly and sequencing of human genomes

Recombination, and haplotype structure

Introduction to the UCSC genome browser

Genome-wide association studies (GWAS) Part 1

SNP calling and VCF format

02 Agenda Item 03 Agenda Item

Supplementary Table 1. Summary of whole genome shotgun sequence used for genome assembly

GENES AND CHROMOSOMES II

Training materials.

Next Genera*on Sequencing II: Personal Genomics. Jim Noonan Department of Gene*cs

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012

De novo Genome Assembly

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Technologies, resources and tools for the exploitation of the sheep and goat genomes.

Sequence assembly. Jose Blanca COMAV institute bioinf.comav.upv.es

Functional Genomics and Single-Cell Genomics. Christine Queitsch Department of Genome Sciences

The Basics of Understanding Whole Genome Next Generation Sequence Data

Algorithms for Genetics: Introduction, and sources of variation

Capturing Complex Human Genetic Variations using the GS FLX+ System

Introduction to BIOINFORMATICS

UHT Sequencing Course Large-scale genotyping. Christian Iseli January 2009

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations

Crash-course in genomics

Nature Biotechnology: doi: /nbt Supplementary Figure 1. Number and length distributions of the inferred fosmids.

Variant Discovery. Jie (Jessie) Li PhD Bioinformatics Analyst Bioinformatics Core, UCD

Wu et al., Determination of genetic identity in therapeutic chimeric states. We used two approaches for identifying potentially suitable deletion loci

DNA. bioinformatics. genomics. personalized. variation NGS. trio. custom. assembly gene. tumor-normal. de novo. structural variation indel.

Identification of disease-related genes. Heymut Omran Department of Pediatrics and Adolescent Medicine; Freiburg; Germany

STAT 536: Genetic Statistics

Welcome to the NGS webinar series

Chapter 4 Gene Linkage and Genetic Mapping

Microsatellite markers

Ecological genomics and molecular adaptation: state of the Union and some research goals for the near future.

Chapter 14: Genes in Action

Training materials.

Estimating the rates and modes of creation of new genetic variation in plants using NGS technologies

Map-Based Cloning of Qualitative Plant Genes

The tomato genome re-seq project

MI615 Syllabus Illustrated Topics in Advanced Molecular Genetics Provisional Schedule Spring 2010: MN402 TR 9:30-10:50

POPULATION GENETICS studies the genetic. It includes the study of forces that induce evolution (the

NGS developments in tomato genome sequencing

Estimation problems in high throughput SNP platforms

B) You can conclude that A 1 is identical by descent. Notice that A2 had to come from the father (and therefore, A1 is maternal in both cases).

BIOINF/BENG/BIMM/CHEM/CSE 184: Computational Molecular Biology. Lecture 2: Microarray analysis

Supplementary Figures

INTRODUCTION TO MOLECULAR GENETICS. Andrew McQuillin Molecular Psychiatry Laboratory UCL Division of Psychiatry 22 Sept 2017

Lecture 2: Height in Plants, Animals, and Humans. Michael Gore lecture notes Tucson Winter Institute version 18 Jan 2013

The fourth chromosome: targeting heterochromatin formation in Drosophila. Drosophila melanogaster chromosomes

The University of California, Santa Cruz (UCSC) Genome Browser

DNA Studies (Genetic Studies) Disclosure Statement. Definition of Genomics. Outline. Immediate Present and Near Future

Genome Sequence Assembly

REVIEWS. Structural variation in the human genome

Molecular Biology (2)

Computational methods for discovering structural variation with next-generation sequencing

Drosophila White Paper 2003 August 13, 2003

Resolution of fine scale ribosomal DNA variation in Saccharomyces yeast

Agilent NGS Solutions : Addressing Today s Challenges

Course summary. Today. PCR Polymerase chain reaction. Obtaining molecular data. Sequencing. DNA sequencing. Genome Projects.

Detecting Structural Variants in PacBio Reads Tools and Applications

Analysis of genome-wide genotype data

Lecture 2: Biology Basics Continued. Fall 2018 August 23, 2018

Bioinformatics Advice on Experimental Design

Informatic Issues in Genomics

De novo assembly of human genomes with massively parallel short read sequencing. Mikk Eelmets Journal Club

Answers to additional linkage problems.

Introduction to Bioinformatics

DNA-Sequencing. Technologies & Devices. Matthias Platzer. Genome Analysis Leibniz Institute on Aging - Fritz Lipmann Institute (FLI)

This place covers: Methods or systems for genetic or protein-related data processing in computational molecular biology.

Lecture 12. Genomics. Mapping. Definition Species sequencing ESTs. Why? Types of mapping Markers p & Types

Transcription:

The Human Genome and its upcoming Dynamics Matthias Platzer Genome Analysis Leibniz Institute for Age Research - Fritz-Lipmann Institute (FLI)

Sequencing of the Human Genome Publications 2004 2001 2001 International academic consortium Private Enterprise Celera

International Human Genome Project Contributions of the Members May 2000 March 2005 January 2006 Human Sequence % 21 X 8 IHGSC (2001), Nature 409: 860

Human Genome Working Draft versions February 2001 Academic Private Initial Sequencing & Analysis The Sequence of 2.72 Gb Sequenced Bases 2.65 Gb 1,000 Clone gaps 54,000 146,000 Sequence gaps 116,000 147.000 Gaps 170,000 overall coverage: 94% quality of unfinished data : < 1 error/10kb in 91% human population heterogeneity: variation between two individuals: 1 SNP/kb 1 SNP/10kb

Human Genome Final version October 2004 Academic Private Initial Seq Finishing the euchromatic sequence The Sequence of 2.72 Gb 2.85 Gb Sequenced Bases 2.65 Gb 1,000 283 Clone gaps 54,000 146,000 58 Sequence gaps 116,000 147.000 341 Gaps 170,000 near-complete sequence: extremely high quality: 99% of euchromatin < 1 error/100kb

Segmental Duplications Problems of the human reference sequence ~50% of the 273 interior euchromatic gaps located in segmentally duplicated regions

Segmental Duplication Definition genomic regions >1kb with nt identity >90% Human genome 5.3% segmentally duplicated 87% of all segmental duplications >50 kb

Segmental Tandem Duplications Assembly problems allele 1 id. 99% locus A locus B id. 99% allele 2 locus A locus B alignment locus A

Segmental Tandem Duplications Assembly problems allele 1 id. 99% locus A locus B id. 99% allele 2 locus A locus B alignment locus A locus B

Segmental Tandem Duplications Assembly problems allele 1 id. 99% locus A locus B id. 99% allele 2 locus A locus B alignment locus A locus B gap

Segmental duplications Mechanisms Hurles, Plos Biology 2:900 (2004)

Segmental duplications Fate of duplicated genes Hurles, Plos Biology 2:900 (2004)

Structural Variations / Polymorphisms Types 255 large-scale copy number variations (>100 kb). Nat Genet 36:949 (2004) Intermediate size variations >8 kb: 139 insertions, 102 deletions, 56 inversion break points Nat Genet 37:727 (2005) 76 large-scale copy number polymorphisms; on average of 11 variations between two individuals with a median length of 222 kb. Science 305:525 (2004) 586 deletion polymorphisms covering 276 genes; typical individual is hemizygous for 30-50 deletions >5 kb, totaling 550-750 kb. Nat Genet 38:75 (2006) 541 deletion polymorphisms covering 1-745 kb; 278 in multiple, unrelated individuals; 120 in homozygous state. Nat Genet 38:86 (2006)

DEF cluster at 8p23.1 hg16: 6.3-8.3 Mb chr 8 Repeat clusters DEF a DEF b1 gap gap gap DEF b3 360 kb gap DEF b2 DEF b4 DEF b5 DEF b6 additional copies n n = 2-4 Taudien et al., BMC Genomics 5:92 (2004)

Genomic variability of 8p23.1 DEF locus Hypothetical organisation Taudien et al., BMC Genomics 5:92 (2004)

Defensins (DEF) Multiple roles Immunity & Cancer Yang et al., 2001. Cell. Mol. Life Sci. 58, 978-989

Complex phenotypes / diseases Structural variations Eichler et al., Nature 447:161 (2007)

Complex phenotypes / diseases Structural variations FCGR3 copy number & glomerulonephritis in humans and rats Nature 439:851 (2006) Strong association of de novo copy number mutations with autism Science 316: 445 (2007)

Segmental duplications Content of sequenced animal genomes Bailey & Eichler, Nat Rev Genet 7:552 (2006)

Segmental duplication content of hominoids Hyperexpansions in chimpanzee Bailey & Eichler, Nat Rev Genet 7:552 (2006)

Genomic variations Lexicon Scherer et al., Nat Genet Suppl 39:s7 (2007)

Copy number variation (CNV) Detection by DNA microarrays 0.5-2 Mio data points comparative hybridization vs. a reference Lee et al., Nat Genet Suppl 39:s48 (2007)

Genetic Variability Structural variations Chromosome A Chromosome B Inversion InDel Allele variation Combination

Conclusions Genomes of any two individuals in the human population differ more at the structural level than at the nucleotide sequence level. Differences between individuals CNV: >4 Mb >1/800 bp > 0.12 % SNP: 2.5 Mb 1/1,200 bp 0.08 % Sebat, Nat Gen Suppl 39:s3 (2007)

Conclusions Copy number polymorphism at orthologous regions of divers genomes suggests that genome plasticity is a more common cause of genetically complex phenotypes than has hitherto been observed. Aitman et al., Nature 439:851 (2006)

Completing the map of human genetic variation Mapping structural variations Eichler et al., Nature 447:161 (2007)

Completing the map of human genetic variation Sequencing structural variations Eichler et al., Nature 447:161 (2007)

Genome Dynamics Patchwork people? News Feature, Nature 437:1084 (2005)

genome.fli-leibniz.de Lectures