The Human Genome. The raw data. The repeat content. Composition of the human genome. A's, T's, C's and G's and N's.


The Human Genome

The raw data (bases)
GATCTGATAAGTCCCAGGACTTCAGAAGagctgtgagaccttggccaagt cacttcctccttcaggaacattgcagtgggcctaagtgcctcctctcggg ACTGGTATGGGGACGGTCATGCAATCTGGACAACATTCACCTTTAAAAGT
(the slide continues with further blocks of mixed-case sequence)

A's, T's, C's and G's and N's

Composition of the human genome
Nearly half the genome is repeats.
Only approximately 1.5% is known coding genes.
Unknown functional fraction?!

The repeat content
Jumping genes.
1. Transposition-derived repeats
2. Inactive retroposed cellular genes
3. Simple repeats (microsatellites)
4. Segmental duplications
5. Tandem repeats (telomere, centromere)
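A minimal sketch of how the composition of a stretch of raw sequence like the one above could be tallied. The function name `base_composition` and its return layout are hypothetical, chosen only for illustration. The lower-case fraction is reported because lower case is commonly used to mark repeat-masked sequence; whether the slide intended that convention is an assumption.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Tally A, C, G, T and N in a sequence, ignoring case and whitespace."""
    cleaned = "".join(seq.split())          # drop spaces and line breaks
    counts = Counter(cleaned.upper())
    called = sum(counts[b] for b in "ACGT")  # bases that are not N
    gc = (counts["G"] + counts["C"]) / called if called else 0.0
    lower = sum(1 for ch in cleaned if ch.islower())
    return {
        "counts": {b: counts[b] for b in "ACGTN"},
        "gc_fraction": gc,
        # Assumption: lower case marks soft-masked (repeat) sequence.
        "lowercase_fraction": lower / len(cleaned) if cleaned else 0.0,
    }


# First 50-base block of the sequence shown on the slide.
example = "GATCTGATAAGTCCCAGGACTTCAGAAGagctgtgagaccttggccaagt"
result = base_composition(example)
print(result["counts"], round(result["gc_fraction"], 3), result["lowercase_fraction"])
```

Run on a whole chromosome, the same tally would also expose the N's, which stand for bases that could not be called or for gaps in the assembly.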

Fewer genes than expected
GeneSweep: Ewan Birney (Wellcome Trust Sanger Institute).
The happy winner: Lee Rowen of the Institute for Systems Biology. 25,947 genes.

Genome complexity
Alternative splicing: 56% for humans, 22% for worms.
Regulatory elements: promoters, enhancers, repressors.
This is where it gets complicated.

Variation among chromosomes
Overall recombination rate depends on chromosome length.
Large variation in gene density between chromosomes.
Source: Initial sequencing and analysis of the human genome, International Human Genome Sequencing Consortium, Nature 409 (15 February 2001).

Variation within chromosomes
GC content, recombination, gene density.

Difference in organisation
The genome is non-random in its organisation.
Recombination: high at telomeres.
GC: variation at many scales (isochores).
Gene density: organisation by function.
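The point about GC content varying at many scales can be illustrated by computing the GC fraction in fixed-size windows along a sequence and then repeating the calculation with a different window size. This is only a sketch: `gc_windows` is a hypothetical helper, and real isochore analyses use windows of tens to hundreds of kilobases rather than the toy sizes below.

```python
import math
from typing import List


def gc_windows(seq: str, window: int) -> List[float]:
    """GC fraction of consecutive non-overlapping windows.

    A trailing partial window is dropped. Windows made up entirely of
    N's (or other uncalled characters) yield NaN.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        called = sum(chunk.count(b) for b in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        fractions.append(gc / called if called else math.nan)
    return fractions


toy = "GGCCAATTGCAT"
print(gc_windows(toy, 4))   # three windows of 4 bases
print(gc_windows(toy, 6))   # two windows of 6 bases
```

Comparing the output at several window sizes is one simple way to see whether compositional variation persists across scales.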

New observations (2001)
Variation at multiple scales within and between chromosomes.
Only twice as many genes as flies and worms, but more proteins; some have arrived from bacteria and transposable elements.
Transposons inactive, and LTR elements probably also (Alu's in GC-rich regions).
Most mutations occur in males (higher mutation rate).
GC-poor regions correspond to dark bands.
Recombination rates are higher at telomeres.
Lots of between-individual variation.

Completing the Human Genome
1990: Human Genome Project starts.
2001: Draft human genome completed.
2003: Gene-rich regions completed.
2006: Each chromosome compiled and annotated.
Fewer gaps 147,
More continuity: 81 kb, 38,500 kb.
Error rate of ~1 per 100,000 bases.
2.85 billion bases.
Covers ~99% of the euchromatic genome.
Go home?

Not quite finished
New builds:
Build 36, May 2006
Build 35, May 2004
Build 34, July 2003
Build 33, April 2003

Chromosome 1
Segmental duplications allow genes to diversify and acquire novel functions.
December NCBI 28; July NCBI 34.
Duplication of a gene from one to many positions on the chromosome.
A pericentric inversion follows a duplication of two genes.
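As a rough check on the finished-sequence figures quoted in the slides (about 1 error per 100,000 bases across 2.85 billion bases, and continuity figures of 81 kb and 38,500 kb), the arithmetic can be written out. Reading the two continuity figures as "before" and "after" values is an assumption, and the variable names are illustrative only.

```python
GENOME_BASES = 2_850_000_000   # 2.85 billion bases, from the slide
BASES_PER_ERROR = 100_000      # ~1 error per 100,000 bases, from the slide

# Expected number of residual base errors at the quoted error rate.
expected_errors = GENOME_BASES // BASES_PER_ERROR
print(expected_errors)         # 28500

# Assumption: 81 kb is the earlier continuity figure, 38,500 kb the later one.
contiguity_gain = 38_500 / 81
print(round(contiguity_gain))  # 475
```

Even at this accuracy, tens of thousands of base-level errors would be expected genome-wide, which is consistent with the slides' remark that the sequence is "not quite finished".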

Chromosomes 2 and 4; chromosome 3
Gene deserts: megabase-sized genomic segments containing no known coding genes (some show conservation). Role of these regions?
Lowest rate of segmental duplication.
Large inversion relative to the ancestor we share with chimpanzees.
Lowest recombination rates of all the autosomes.

Chromosome 7
Complex repeat patterns and fragile locations.
Williams-Beuren syndrome is associated with a large deletion (1.6 Mb).
Lots of repetitive and duplicated DNA. What is the true sequence?
The syndrome is characterized by a distinctive, "elfish" facial appearance, along with a low nasal bridge; an unusually cheerful demeanor and ease with strangers, coupled with unpredictably occurring negative outbursts; mental retardation coupled with an unusual facility with language; a love for music; and cardiovascular problems, such as supravalvular aortic stenosis and transient hypercalcemia.

Chromosome 10
Multi-species alignment of a gene involved in cancer.
Conservation indicates the location of functional elements.
Some are known genes. Others aren't: higher levels of conservation!

Chromosome 19
Very high gene density: 26 genes per megabase.
Increase in all classes of known genes.
What is special about this chromosome? It has a high recombination rate, and repeat density, and GC content.

Chromosomes 12 and 3
Recombination rate variation.
Knowing the physical positions of variants allows recombination rates to be estimated.
Male and female rates differ.
Fine-scale variation.

Where is the data available?
NCBI: part of the National Institutes of Health. Has a number of important associated projects. Mr NCBI: David Lipman.
Ensembl: a joint project between EMBL and the Sanger Institute. Primarily funded by the Wellcome Trust. Mr Ensembl: Ewan Birney.
UCSC (genome.ucsc.edu/cgi-bin/hggateway): based at the University of California, Santa Cruz. Largely funded by the NHGRI. Mr UCSC: David Haussler.

What data is available?
Compositional: base composition, insertions and deletions, segmental duplications, repeats, transposable elements.
Functional: regulatory elements, gene expression.
Evolutionary: species comparison, variation data, population genetic analysis.

[Screenshot: UCSC Genome Browser track controls. "Use drop down controls below and press refresh to alter tracks displayed. Tracks with lots of items will automatically be displayed in more compact modes."]
Track groups shown: Mapping and Sequencing Tracks; Phenotype and Disease Associations; Genes and Gene Prediction Tracks; mRNA and EST Tracks; Expression and Regulation.
Tracks listed: Base Position, Chromosome Band, STS Markers, Map Contigs, Fosmid End Pairs, RGD QTL, Known, Vega, N-SCAN, Augustus, sno/miRNA, Human mRNAs, Spliced ESTs, H-Inv, Assembly, GC Percent, Human Mutation, CCDS, Vega Pseudogenes, SGP, Retroposed, ExonWalk, TIGR Gene Index, Gap, WSSD Duplication, RefSeq, Ensembl, Geneid, Superfamily, Human ESTs, UniGene, Allen Brain, GNF Atlas 2, GNF Ratio, FISH Clones, Coverage, Short Match, Other RefSeq, AceView, Genscan, Yale Pseudo, Other mRNAs, Gene Bounds, Affy HuEx 1.0, Recomb Rate, BAC End Pairs, Restr Enzymes, MGC, ECgene, Exoniphy, EvoFold, Other ESTs, Alt-Splicing, Affy U133.

Orientation
Human chromosomes are numbered.
Arms are labelled p and q.
Regions are labelled ascending from the centromere.
Bases are numbered from the beginning of the small arm to the end of the long arm.

Annotation: repeats
Transposable elements: make up a large proportion of the genome.
Microsatellites and repeats: important in many common diseases; some of the most polymorphic loci.

Annotation: genes
Different levels of evidence for genes: based on homology, based on expression, based on prediction.
mRNA evidence.
Protein evidence.
EST evidence.
Gene prediction.
Predicted transcripts: known, novel.
Manually annotated genes.

Annotation: expression and regulation
Expression levels and tissues.
Regulatory elements.
Regulatory elements might be important in complex diseases.
Microarray technology is generating expression data on a large scale.
Expression varies in space and time.

Annotation: evolutionary
Cross-species (issues: alignment).
Within humans (issues: ascertainment).
Multiple sequence alignment.
Conservation vs constraint.
Variation is the most important feature of the genome!?

Encyclopedia of DNA Elements (ENCODE)
1% of the genome.
14 manually chosen regions (alpha and beta globin, HOXA, FOXP2 and CFTR), plus 30 random regions.
Variation group: SNPs, indels.
Function group: promoters, transcription and binding.
Chromatin group: chromatin modification, replication origins.
Aim: understand everything possible about these regions.

Human variation
SNPs are the most common variation in the human genome.
10 million common variants.
Synonymous and non-synonymous variation.
Information in the density of SNPs.
Information in the frequency of SNPs.
Information in the correlation between SNPs.

HapMap Project
2002: HapMap phase I begins.
Three populations:
(YRI) Yoruba in Ibadan, Nigeria: 90
(CEU) Utah, USA: 90
(CHB) Han Chinese in Beijing: 45
(JPT) Japanese in Tokyo: 44
Approximately 1 million SNPs.
Phase I complete, phase II begins: increase from 1 million to ~4.6 million SNPs.
2006: Phase II complete, phase III begins. Additional 6 populations: Kenya, African Americans, Mexican Americans, Italy, India.
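The distinction between synonymous and non-synonymous variation mentioned above can be sketched in a few lines: a coding SNP is synonymous if the altered codon still encodes the same amino acid. The sketch below assumes the standard genetic code and a codon given on the coding strand; `classify_coding_snp` is a hypothetical helper, not part of any library.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base change within one codon (position is 0, 1 or 2)."""
    codon, alt = codon.upper(), alt_base.upper()
    if (codon not in CODON_TABLE or position not in (0, 1, 2)
            or len(alt) != 1 or alt not in BASES):
        raise ValueError("need a 3-base codon, a position 0-2 and one A/C/G/T allele")
    mutated = codon[:position] + alt + codon[position + 1:]
    if mutated == codon:
        return "no change"
    same_aa = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same_aa else "non-synonymous"


print(classify_coding_snp("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
print(classify_coding_snp("GAG", 1, "T"))  # GAG (Glu) -> GTG (Val)
```

Changes that create or remove a stop codon are counted as non-synonymous here; a fuller annotation would report them separately.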

The International HapMap

Learning from studies of human variation
Can learn how genetic diversity is structured across the globe.
Identify regions which have been under recent positive selection.
Identify recombination hotspots.
Linkage disequilibrium information is an important tool.
Population genetic annotation is often sample-specific.

Hot topics
MicroRNAs: ~20-mers of RNA that perform a diversity of roles, e.g. regulating mRNA levels.
Sex chromosomes: chromosomes X and Y.
Structural variation: the genome is full of polymorphic insertions and deletions, from 1 kb to a megabase.
Genome-wide association studies: millions are being spent on scanning the genome for loci showing association with disease status.
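Linkage disequilibrium, the correlation between SNPs that the HapMap was built to measure, is often summarised by the statistics D and r². The sketch below assumes phased haplotypes with alleles coded 0/1 at two biallelic SNPs; the function name and input layout are hypothetical, and real analyses would normally rely on dedicated software rather than this toy calculation.

```python
from typing import List, Tuple


def ld_stats(haplotypes: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (D, r_squared) for two biallelic SNPs.

    Each haplotype is a pair (allele at SNP 1, allele at SNP 2), coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n           # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n           # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # D = P(AB) - P(A)P(B)
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d, d * d / denom


# Two SNPs always inherited together versus two independent SNPs.
print(ld_stats([(1, 1), (1, 1), (0, 0), (0, 0)]))
print(ld_stats([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

An r² near 1 means one SNP is a good proxy for the other, while values near 0 indicate that the two sites vary independently in the sample.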

The University of California, Santa Cruz (UCSC) Genome Browser

The University of California, Santa Cruz (UCSC) Genome Browser The University of California, Santa Cruz (UCSC) Genome Browser There are hundreds of available userselected tracks in categories such as mapping and sequencing, phenotype and disease associations, genes,

More information

The Diploid Genome Sequence of an Individual Human

The Diploid Genome Sequence of an Individual Human The Diploid Genome Sequence of an Individual Human Maido Remm Journal Club 12.02.2008 Outline Background (history, assembling strategies) Who was sequenced in previous projects Genome variations in J.

More information

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF GENOMICS Genomics is the study of genomes in their entirety Bioinformatics is the analysis of the information content of genomes - Genes, regulatory sequences,

More information

CHAPTER 21 GENOMES AND THEIR EVOLUTION

CHAPTER 21 GENOMES AND THEIR EVOLUTION GENETICS DATE CHAPTER 21 GENOMES AND THEIR EVOLUTION COURSE 213 AP BIOLOGY 1 Comparisons of genomes provide information about the evolutionary history of genes and taxonomic groups Genomics - study of

More information

Introduction to Plant Genomics and Online Resources. Manish Raizada University of Guelph

Introduction to Plant Genomics and Online Resources. Manish Raizada University of Guelph Introduction to Plant Genomics and Online Resources Manish Raizada University of Guelph Genomics Glossary http://www.genomenewsnetwork.org/articles/06_00/sequence_primer.shtml Annotation Adding pertinent

More information

3. human genomics clone genes associated with genetic disorders. 4. many projects generate ordered clones that cover genome

3. human genomics clone genes associated with genetic disorders. 4. many projects generate ordered clones that cover genome Lectures 30 and 31 Genome analysis I. Genome analysis A. two general areas 1. structural 2. functional B. genome projects a status report 1. 1 st sequenced: several viral genomes 2. mitochondria and chloroplasts

More information

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012 Lecture 23: Causes and Consequences of Linkage Disequilibrium November 16, 2012 Last Time Signatures of selection based on synonymous and nonsynonymous substitutions Multiple loci and independent segregation

More information

Structural variation. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona

Structural variation. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona Structural variation Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona Genetic variation How much genetic variation is there between individuals? What type of variants

More information

AP Biology. The BIG Questions. Chapter 19. Prokaryote vs. eukaryote genome. Prokaryote vs. eukaryote genome. Why turn genes on & off?

AP Biology. The BIG Questions. Chapter 19. Prokaryote vs. eukaryote genome. Prokaryote vs. eukaryote genome. Why turn genes on & off? The BIG Questions Chapter 19. Control of Eukaryotic Genome How are genes turned on & off in eukaryotes? How do cells with the same genes differentiate to perform completely different, specialized functions?

More information

user s guide Question 1

user s guide Question 1 Question 1 How does one find a gene of interest and determine that gene s structure? Once the gene has been located on the map, how does one easily examine other genes in that same region? doi:10.1038/ng966

More information

The Ensembl Database. Dott.ssa Inga Prokopenko. Corso di Genomica

The Ensembl Database. Dott.ssa Inga Prokopenko. Corso di Genomica The Ensembl Database Dott.ssa Inga Prokopenko Corso di Genomica 1 www.ensembl.org Lecture 7.1 2 What is Ensembl? Public annotation of mammalian and other genomes Open source software Relational database

More information

CHAPTER 21 LECTURE SLIDES

CHAPTER 21 LECTURE SLIDES CHAPTER 21 LECTURE SLIDES Prepared by Brenda Leady University of Toledo To run the animations you must be in Slideshow View. Use the buttons on the animation to play, pause, and turn audio/text on or off.

More information

Chapter 20: The human genome

Chapter 20: The human genome Chapter 20: The human genome Jonathan Pevsner, Ph.D. pevsner@kennedykrieger.org Bioinformatics and Functional Genomics (Wiley-Liss, 3 rd edition, 2015) You may use this PowerPoint for teaching purposes

More information

Molecular Genetics of Disease and the Human Genome Project

Molecular Genetics of Disease and the Human Genome Project 9 Molecular Genetics of Disease and the Human Genome Project Fig. 1. The 23 chromosomes in the human genome. There are 22 autosomes (chromosomes 1 to 22) and two sex chromosomes (X and Y). Females inherit

More information

Haplotypes, linkage disequilibrium, and the HapMap

Haplotypes, linkage disequilibrium, and the HapMap Haplotypes, linkage disequilibrium, and the HapMap Jeffrey Barrett Boulder, 2009 LD & HapMap Boulder, 2009 1 / 29 Outline 1 Haplotypes 2 Linkage disequilibrium 3 HapMap 4 Tag SNPs LD & HapMap Boulder,

More information

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics

S G. Design and Analysis of Genetic Association Studies. ection. tatistical. enetics S G ection ON tatistical enetics Design and Analysis of Genetic Association Studies Hemant K Tiwari, Ph.D. Professor & Head Section on Statistical Genetics Department of Biostatistics School of Public

More information

Genomes summary. Bacterial genome sizes

Genomes summary. Bacterial genome sizes Genomes summary 1. >930 bacterial genomes sequenced. 2. Circular. Genes densely packed. 3. 2-10 Mbases, 470-7,000 genes 4. Genomes of >200 eukaryotes (45 higher ) sequenced. 5. Linear chromosomes 6. On

More information

The Human Genome Project

The Human Genome Project The Human Genome Project The Human Genome Project Began in 1990 The Mission of the HGP: The quest to understand the human genome and the role it plays in both health and disease. The true payoff from the

More information

ab initio and Evidence-Based Gene Finding

ab initio and Evidence-Based Gene Finding ab initio and Evidence-Based Gene Finding A basic introduction to annotation Outline What is annotation? ab initio gene finding Genome databases on the web Basics of the UCSC browser Evidence-based gene

More information

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls BMI/CS 776 www.biostat.wisc.edu/bmi776/ Colin Dewey cdewey@biostat.wisc.edu Spring 2012 1. Understanding Human Genetic Variation

More information

Question 2: There are 5 retroelements (2 LINEs and 3 LTRs), 6 unclassified elements (XDMR and XDMR_DM), and 7 satellite sequences.

Question 2: There are 5 retroelements (2 LINEs and 3 LTRs), 6 unclassified elements (XDMR and XDMR_DM), and 7 satellite sequences. Bio4342 Exercise 1 Answers: Detecting and Interpreting Genetic Homology (Answers prepared by Wilson Leung) Question 1: Low complexity DNA can be described as sequences that consist primarily of one or

More information

Genome annotation & EST

Genome annotation & EST Genome annotation & EST What is genome annotation? The process of taking the raw DNA sequence produced by the genome sequence projects and adding the layers of analysis and interpretation necessary

More information

POPULATION GENETICS studies the genetic. It includes the study of forces that induce evolution (the

POPULATION GENETICS studies the genetic. It includes the study of forces that induce evolution (the POPULATION GENETICS POPULATION GENETICS studies the genetic composition of populations and how it changes with time. It includes the study of forces that induce evolution (the change of the genetic constitution)

More information

Chapter 19. Control of Eukaryotic Genome. AP Biology

Chapter 19. Control of Eukaryotic Genome. AP Biology Chapter 19. Control of Eukaryotic Genome The BIG Questions How are genes turned on & off in eukaryotes? How do cells with the same genes differentiate to perform completely different, specialized functions?

More information

Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR

Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR Human Genetic Variation Ricardo Lebrón rlebron@ugr.es Dpto. Genética UGR What is Genetic Variation? Origins of Genetic Variation Genetic Variation is the difference in DNA sequences between individuals.

More information

CHAPTERS 16 & 17: DNA Technology

CHAPTERS 16 & 17: DNA Technology CHAPTERS 16 & 17: DNA Technology 1. What is the function of restriction enzymes in bacteria? 2. How do bacteria protect their DNA from the effects of the restriction enzymes? 3. How do biologists make

More information

The Human Genome and its upcoming Dynamics

The Human Genome and its upcoming Dynamics The Human Genome and its upcoming Dynamics Matthias Platzer Genome Analysis Leibniz Institute for Age Research - Fritz-Lipmann Institute (FLI) Sequencing of the Human Genome Publications 2004 2001 2001

More information

Genotyping Technology How to Analyze Your Own Genome Fall 2013

Genotyping Technology How to Analyze Your Own Genome Fall 2013 Genotyping Technology 02-223 How to nalyze Your Own Genome Fall 2013 HapMap Project Phase 1 Phase 2 Phase 3 Samples & POP panels Genotyping centers Unique QC+ SNPs 269 samples (4 populations) HapMap International

More information

Crash-course in genomics

Crash-course in genomics Crash-course in genomics Molecular biology : How does the genome code for function? Genetics: How is the genome passed on from parent to child? Genetic variation: How does the genome change when it is

More information

7.03, 2005, Lecture 20 EUKARYOTIC GENES AND GENOMES I

7.03, 2005, Lecture 20 EUKARYOTIC GENES AND GENOMES I 7.03, 2005, Lecture 20 EUKARYOTIC GENES AND GENOMES I For the last several lectures we have been looking at how one can manipulate prokaryotic genomes and how prokaryotic genes are regulated. In the next

More information

Genome research in eukaryotes

Genome research in eukaryotes Functional Genomics Genome and EST sequencing can tell us how many POTENTIAL genes are present in the genome Proteomics can tell us about proteins and their interactions The goal of functional genomics

More information

The HapMap Project and Haploview

The HapMap Project and Haploview The HapMap Project and Haploview David Evans Ben Neale University of Oxford Wellcome Trust Centre for Human Genetics Human Haplotype Map General Idea: Characterize the distribution of Linkage Disequilibrium

More information

How does the human genome stack up? Genomic Size. Genome Size. Number of Genes. Eukaryotic genomes are generally larger.

How does the human genome stack up? Genomic Size. Genome Size. Number of Genes. Eukaryotic genomes are generally larger. How does the human genome stack up? Organism Human (Homo sapiens) Laboratory mouse (M. musculus) Mustard weed (A. thaliana) Roundworm (C. elegans) Fruit fly (D. melanogaster) Yeast (S. cerevisiae) Bacterium

More information

De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse

De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse SUPPLEMENTARY INFORMATION De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse populations Wong et al. The Supplementary Information contains 4 Supplementary Figures, 3

More information

BENG 183 Trey Ideker. Genome Assembly and Physical Mapping

BENG 183 Trey Ideker. Genome Assembly and Physical Mapping BENG 183 Trey Ideker Genome Assembly and Physical Mapping Reasons for sequencing Complete genome sequencing!!! Resequencing (Confirmatory) E.g., short regions containing single nucleotide polymorphisms

More information

Whole genome sequencing in the UK Biobank

Whole genome sequencing in the UK Biobank Whole genome sequencing in the UK Biobank Part of the UK Government s Industrial Strategy Challenge Fund (ISCF) for the Data to Early Diagnosis and Precision Medicine initiative Aim to produce deep characterisation

More information

3I03 - Eukaryotic Genetics Repetitive DNA

3I03 - Eukaryotic Genetics Repetitive DNA Repetitive DNA Satellite DNA Minisatellite DNA Microsatellite DNA Transposable elements LINES, SINES and other retrosequences High copy number genes (e.g. ribosomal genes, histone genes) Multifamily member

More information

Analysing Alu inserts detected from high-throughput sequencing data

Analysing Alu inserts detected from high-throughput sequencing data Analysing Alu inserts detected from high-throughput sequencing data Harun Mustafa Mentor: Matei David Supervisor: Michael Brudno July 3, 2013 Before we begin... Even though I'll only present the minimal

More information

GENES AND CHROMOSOMES II

GENES AND CHROMOSOMES II 1 GENES AND CHROMOSOMES II Lecture 4 BIOL 266/2 2014-15 Dr. S. Azam Biology Department Concordia University 2 GENE AND THE GENOME The Structure of the Genome DNA fingerprinting 3 DNA fingerprinting: DNA-based

More information

UCSC Genome Browser. Introduction to ab initio and evidence-based gene finding

UCSC Genome Browser. Introduction to ab initio and evidence-based gene finding UCSC Genome Browser Introduction to ab initio and evidence-based gene finding Wilson Leung 06/2006 Outline Introduction to annotation ab initio gene finding Basics of the UCSC Browser Evidence-based gene

More information

Linking Genetic Variation to Important Phenotypes

Linking Genetic Variation to Important Phenotypes Linking Genetic Variation to Important Phenotypes BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under

More information

Enzyme that uses RNA as a template to synthesize a complementary DNA

Enzyme that uses RNA as a template to synthesize a complementary DNA Biology 105: Introduction to Genetics PRACTICE FINAL EXAM 2006 Part I: Definitions Homology: Comparison of two or more protein or DNA sequence to ascertain similarities in sequences. If two genes have

More information

Training materials.

Training materials. Training materials - Ensembl training materials are protected by a CC BY license - http://creativecommons.org/licenses/by/4.0/ - If you wish to re-use these materials, please credit Ensembl for their creation

More information

Genome Sequencing-- Strategies

Genome Sequencing-- Strategies Genome Sequencing-- Strategies Bio 4342 Spring 04 What is a genome? A genome can be defined as the entire DNA content of each nucleated cell in an organism Each organism has one or more chromosomes that

More information

Analysis of structural variation. Alistair Ward USTAR Center for Genetic Discovery University of Utah

Analysis of structural variation. Alistair Ward USTAR Center for Genetic Discovery University of Utah Analysis of structural variation Alistair Ward USTAR Center for Genetic Discovery University of Utah What is structural variation? What differentiates SV from short variants? What are the major SV types?

More information

user s guide Question 3

user s guide Question 3 Question 3 During a positional cloning project aimed at finding a human disease gene, linkage data have been obtained suggesting that the gene of interest lies between two sequence-tagged site markers.

More information

The Whole Genome TagSNP Selection and Transferability Among HapMap Populations. Reedik Magi, Lauris Kaplinski, and Maido Remm

The Whole Genome TagSNP Selection and Transferability Among HapMap Populations. Reedik Magi, Lauris Kaplinski, and Maido Remm The Whole Genome TagSNP Selection and Transferability Among HapMap Populations Reedik Magi, Lauris Kaplinski, and Maido Remm Pacific Symposium on Biocomputing 11:535-543(2006) THE WHOLE GENOME TAGSNP SELECTION

More information

Applied Bioinformatics

Applied Bioinformatics Applied Bioinformatics In silico and In clinico characterization of genetic variations Assistant Professor Department of Biomedical Informatics Center for Human Genetics Research ATCAAAATTATGGAAGAA ATCAAAATCATGGAAGAA

More information

Resources at HapMap.Org

Resources at HapMap.Org Resources at HapMap.Org HapMap Phase II Dataset Release #21a, January 2007 (NCBI build 35) 3.8 M genotyped SNPs => 1 SNP/700 bp # polymorphic SNPs/kb in consensus dataset International HapMap Consortium

More information

Lecture 20: Drosophila melanogaster

Lecture 20: Drosophila melanogaster Lecture 20: Drosophila melanogaster Model organisms Polytene chromosome Life cycle P elements and transformation Embryogenesis Read textbook: 732-744; Fig. 20.4; 20.10; 20.15-26 www.mhhe.com/hartwell3

More information

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls BMI/CS 776 www.biostat.wisc.edu/bmi776/ Mark Craven craven@biostat.wisc.edu Spring 2011 1. Understanding Human Genetic Variation!

More information

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu. handouts, papers, datasets

Ensembl workshop. Thomas Randall, PhD bioinformatics.unc.edu.   handouts, papers, datasets Ensembl workshop Thomas Randall, PhD tarandal@email.unc.edu bioinformatics.unc.edu www.unc.edu/~tarandal/ensembl handouts, papers, datasets Ensembl is a joint project between EMBL - EBI and the Sanger

More information

Annotating 7G24-63 Justin Richner May 4, Figure 1: Map of my sequence

Annotating 7G24-63 Justin Richner May 4, Figure 1: Map of my sequence Annotating 7G24-63 Justin Richner May 4, 2005 Zfh2 exons Thd1 exons Pur-alpha exons 0 40 kb 8 = 1 kb = LINE, Penelope = DNA/Transib, Transib1 = DINE = Novel Repeat = LTR/PAO, Diver2 I = LTR/Gypsy, Invader

More information

Why learn linkage analysis?

Why learn linkage analysis? Why learn linkage analysis? - and some basic genetics Kaja Selmer 2013 Outline What is linkage analysis and why learn it? An example of a successful linkage analysis story Basic genetics DNA content and

More information

BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers

BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers Web resources: NCBI database: http://www.ncbi.nlm.nih.gov/ Ensembl database: http://useast.ensembl.org/index.html UCSC

More information

Aaditya Khatri. Abstract

Aaditya Khatri. Abstract Abstract In this project, Chimp-chunk 2-7 was annotated. Chimp-chunk 2-7 is an 80 kb region on chromosome 5 of the chimpanzee genome. Analysis with the Mapviewer function using the NCBI non-redundant database

More information

Authors: Vivek Sharma and Ram Kunwar

Authors: Vivek Sharma and Ram Kunwar Molecular markers types and applications A genetic marker is a gene or known DNA sequence on a chromosome that can be used to identify individuals or species. Why we need Molecular Markers There will be

More information

Genome Projects. Part III. Assembly and sequencing of human genomes

Genome Projects. Part III. Assembly and sequencing of human genomes Genome Projects Part III Assembly and sequencing of human genomes All current genome sequencing strategies are clone-based. 1. ordered clone sequencing e.g., C. elegans well suited for repetitive sequences

More information

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics

Genetic Variation and Genome- Wide Association Studies. Keyan Salari, MD/PhD Candidate Department of Genetics Genetic Variation and Genome- Wide Association Studies Keyan Salari, MD/PhD Candidate Department of Genetics How many of you did the readings before class? A. Yes, of course! B. Started, but didn t get

More information

Introduction to human genomics and genome informatics

Introduction to human genomics and genome informatics Introduction to human genomics and genome informatics Session 1 Prince of Wales Clinical School Dr Jason Wong ARC Future Fellow Head, Bioinformatics & Integrative Genomics Adult Cancer Program, Lowy Cancer

Technologies, resources and tools for the exploitation of the sheep and goat genomes.

B. P. Dalrymple, G. Tosser-Klopp, N. Cockett, A. Archibald, W. Zhang and J. Kijas. The plan: The current state of the

Relationship of Gene's Types and Introns

Chi To, BME 230 Final Project. Abstract: The relationship in gene ontology classification and the modification of the length of introns throughout the evolution

Structure, Measurement & Analysis of Genetic Variation

Sven Cichon, PhD, Professor of Medical Genetics, Director, Division of Medical Genetics, University of Basel; Institute of Neuroscience and Medicine

Week 1 BCHM 6280 Tutorial: Gene specific information using NCBI, Ensembl and genome viewers

Web resources: NCBI database: http://www.ncbi.nlm.nih.gov/ Ensembl database: http://useast.ensembl.org/index.html

User's guide: Question 3

During a positional cloning project aimed at finding a human disease gene, linkage data have been obtained suggesting that the gene of interest lies between two sequence-tagged site markers.

Studying the Human Genome. Lesson Overview

14.3 Studying the Human Genome. THINK ABOUT IT: Just a few decades ago, computers were gigantic machines found only in laboratories and universities. Today, many of us carry small, powerful

Motivation From Protein to Gene

MOLECULAR BIOLOGY 2003-4, Topic B: Recombinant DNA, principles and tools. Construct a library: what for, how. Major techniques and principles. Bioinformatics, in brief. Chapter 7 (MCB).

Chimp Sequence Annotation: Region 2_3

Jeff Howenstein, March 30, 2007, BIO434W Genomics. Introduction: We received region 2_3 of the ChimpChunk sequence, and the first step we performed was to run RepeatMasker

Array-Ready Oligo Set for the Rat Genome Version 3.0

We are pleased to announce Version 3.0 of the Rat Genome Oligo Set containing 26,962 longmer probes representing 22,012 genes and 27,044 gene transcripts.

CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes

Coalescence. Scribe: Alex Wells, 2/18/16. Whenever you observe two sequences that are similar, there is actually a single individual

Bioinformatic Analysis of SNP Data for Genetic Association Studies EPI573

Mark J. Rieder, Department of Genome Sciences, mrieder@u.washington.edu. Epidemiology Studies: Cohort Outcome Model to fit/explain

Genomic resources for non-model systems

Genomic resources: Whole genome sequencing reference genome sequence comparisons across species identify signatures of natural selection population-level resequencing

FINDING THE PAIN GENE How do geneticists connect a specific gene with a specific phenotype?

Linkage & Recombination. HUH? What? Why? Who cares? How? Multiple choice question. Each colored line represents

Finishing of DEUG4927002

Harley Greene, Bio434W, Elgin. Abstract: The entire genome of Drosophila eugracilis has recently been sequenced using Roche 454 pyrosequencing and Illumina paired-end reads

Chimp Chunk 3-14 Annotation by Matthew Kwong, Ruth Howe, and Hao Yang

Ruth Howe, Bio 434W, April 1, 2010. INTRODUCTION: De novo annotation is the process by which a finished genomic sequence is searched for

Guided tour to Ensembl

Introduction: Introduction to the Ensembl project; Walk-through of the browser; Variations and Functional Genomics; Comparative Genomics; BioMart. Ensembl Genome browser: http://www.ensembl.org

Course Overview. Objectives

Current Topics in Course Overview. Objectives: Introduce the frontiers in genomics and epigenomic research, including the new concepts in the related fields and the new computational and experimental techniques.

Genomes and Their Evolution

CAMPBELL BIOLOGY IN FOCUS, SECOND EDITION (Urry, Cain, Wasserman, Minorsky, Reece), 18: Genomes and Their Evolution. Lecture Presentations by Kathleen Fitzpatrick and Nicole Tunbridge, Simon Fraser University. Overview:

The Human Genome Project has always been something of a misnomer, implying the existence of a single human genome

Of course, every person on the planet, with the exception of identical twins, has a unique

Course Information. Introduction to Algorithms in Computational Biology Lecture 1. Relations to Some Other Courses

Meetings: Lecture, by Dan Geiger: Mondays 16:30-18:30, Taub 4. Tutorial, by Ydo Wexler: Tuesdays 10:30-11:30, Taub 2. Grade:

Human genetic variation

CHEW Fook Tim. Human Genetic Variation: Variants contribute to rare and common diseases. Variants can be used to trace human origins. Human Genetic Variation: What types of variants

INTRODUCTION TO MOLECULAR GENETICS. Andrew McQuillin Molecular Psychiatry Laboratory UCL Division of Psychiatry 22 Sept 2017

Learning Objectives. Understand: The distinction between Quantitative Genetic

Computational Biology I LSM5191 (2003/4)

Aylwin Ng, D.Phil. Lecture Notes: Features of the Human Genome. Reading List: International Human Genome Sequencing Consortium (2001). Initial sequencing and analysis

Analysis of structural variation. Alistair Ward - Boston College

What is structural variation? What differentiates SV from short variants? What are the major SV types? Summary of MEI detection. What is an

Recombination, and haplotype structure

Simon Myers, Gil McVean, Department of Statistics, Oxford. The starting point: We have a genome's worth of data on genetic variation. We wish to understand why the

Pharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001

I. What is pharmacogenetics? It is the study of how genetic variation affects drug response

Chapter 2: Access to Information

Outline: Introduction to biological databases. Centralized databases store DNA sequences. Contents of DNA, RNA, and protein databases. Central bioinformatics resources: NCBI

DNA Evolution of knowledge about gene. Contains information about RNAs and proteins. Polynucleotide chains; Double stranded molecule;

Evolution of knowledge about gene: G. Mendel; Hereditary factors; W. Johannsen, 1909; G. W. Beadle, E. L. Tatum, 1945; Ingram, 1957; Actual concepts. The gene: hereditary unit located in chromosomes. Hypotheses: One

Practical Of Genetics

1. Students will be able to demonstrate a microtechnique for reliable chromosomal analysis of leucocytes obtained from peripheral blood. 2. Students will be able to prepare a karyotype

Next Generation Genetics: Using deep sequencing to connect phenotype to genotype

http://1001genomes.org Korbinian Schneeberger. Connecting Genotype and Phenotype: Genotyping SNPs small Resequencing SVs*

Introduction to Algorithms in Computational Biology Lecture 1

Background Readings: The first three chapters (pages 1-31) in Genetics in Medicine, Nussbaum et al., 2001. This class has been edited from

Human Chromosomes Section 14.1

In today's class we will look at: Human chromosomes and karyotypes; Autosomal and sex chromosomes; How human traits are transmitted; How traits can be traced through entire families

Axiom mydesign Custom Array design guide for human genotyping applications

TECHNICAL NOTE: Axiom mydesign Custom Genotyping Arrays. Overview: In the past, custom genotyping arrays were expensive, required

Lac Operon contains three structural genes and is controlled by the lac repressor: (1) LacY protein transports lactose into the cell.

Regulation of gene expression: a. Expression of most genes can be turned off and on, usually by controlling the initiation of transcription. b. Lactose degradation in E. coli (Negative Control): Lac Operon

BIOL 1030 Introduction to Biology: Organismal Biology. Fall 2009 Sections B & D. Steve Thompson:

Steve Thompson: stthompson@valdosta.edu http://www.bioinfo4u.net. DNA transcription and regulation: We've seen how the principles

Happy Monday! Have out: 15.1 Notes (due today) Pen or pencil. Upcoming: 15.1 Quiz on block day 15.2 Notes due Friday (2/1)

Plan for today: Check 15.1 Notes; Go over 15.1 Practice problems; 15.1: Human Chromosomes

Overview of Human Genetics

1. Structure and function of nucleic acids. 2. Structure and composition of the human genome. 3. Mendelian genetics. Lander et al. (Nature, 2001). MAT 394 (ASU) Human Genetics, Spring

Biol 478/595 Intro to Bioinformatics

September: M 1 Labor Day; 4 W 3 MG Database Searching Ch. 6; 5 F 5 MG Database Searching Hw1; 6 M 8 MG Scoring Matrices Ch 3 and Ch 4; 7 W 10 MG Pairwise Alignment; 8 F 12

Lecture 2: Biology Basics Continued

Central Dogma. DNA: The Code of Life. The structure and the four genomic letters code for all living organisms: Adenine, Guanine, Thymine, and Cytosine, which pair A-T and
