Linking Genetic Variation to Important Phenotypes

Similar documents
Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

SNPs - GWAS - eqtls. Sebastian Schmeier

Applied Bioinformatics

Pharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001

Training materials.

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Algorithms for Genetics: Introduction, and sources of variation

1b. How do people differ genetically?

Epigenetics and DNase-Seq

Lecture 2: Biology Basics Continued

Next Generation Genetics: Using deep sequencing to connect phenotype to genotype

Analysis of genome-wide genotype data

Course: IT and Health

Molecular Genetics of Disease and the Human Genome Project

Crash-course in genomics

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work

Genome 373: Mapping Short Sequence Reads II. Doug Fowler

Human SNP haplotypes. Statistics 246, Spring 2002 Week 15, Lecture 1

Genomes contain all of the information needed for an organism to grow and survive.

AS91159 Demonstrate understanding of gene expression

Potential of human genome sequencing. Paul Pharoah Reader in Cancer Epidemiology University of Cambridge

Genome-wide association studies (GWAS) Part 1

SNP calling and VCF format

Authors: Vivek Sharma and Ram Kunwar

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011

INTRODUCTION TO MOLECULAR GENETICS. Andrew McQuillin Molecular Psychiatry Laboratory UCL Division of Psychiatry 22 Sept 2017

Lab 1: A review of linear models

14 March, 2016: Introduction to Genomics

Applicazioni biotecnologiche

Computational Workflows for Genome-Wide Association Study: I

What is genetic variation?

Genome-Wide Association Studies (GWAS): Computational Them

Chapter 14: Genes in Action

BIOLOGY - CLUTCH CH.20 - BIOTECHNOLOGY.

Genome-Wide Association Studies. Ryan Collins, Gerissa Fowler, Sean Gamberg, Josselyn Hudasek & Victoria Mackey

Familial Breast Cancer

Bioinformatic Analysis of SNP Data for Genetic Association Studies EPI573

What determines if a mutation is deleterious, neutral, or beneficial?

CMSC423: Bioinformatic Algorithms, Databases and Tools. Some Genetics

Trudy F C Mackay, Department of Genetics, North Carolina State University, Raleigh NC , USA.

CUMACH - A Fast GPU-based Genotype Imputation Tool. Agatha Hu

Why do we need statistics to study genetics and evolution?

Happy Monday! Have out: 15.1 Notes (due today) Pen or pencil. Upcoming: 15.1 Quiz on block day 15.2 Notes due Friday (2/1)

The Human Genome Project has always been something of a misnomer, implying the existence of a single human genome

Deleterious mutations

Structure, Measurement & Analysis of Genetic Variation

BST227 Introduction to Statistical Genetics

Estimation problems in high throughput SNP platforms

Structural variation. Marta Puig Institut de Biotecnologia i Biomedicina Universitat Autònoma de Barcelona

Global Screening Array (GSA)

Association Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010

Statistical Methods for Quantitative Trait Loci (QTL) Mapping

Exome Sequencing Exome sequencing is a technique that is used to examine all of the protein-coding regions of the genome.

GENETICS - CLUTCH CH.15 GENOMES AND GENOMICS.

Annotating your variants: Ensembl Variant Effect Predictor (VEP) Helen Sparrow Ensembl EMBL-EBI 2nd November 2016

GENE MAPPING. Genetica per Scienze Naturali a.a prof S. Presciuttini

Axiom Biobank Genotyping Solution

Biomedical Big Data and Precision Medicine

Association studies (Linkage disequilibrium)

Whole Genome Sequencing. Biostatistics 666

THE HEALTH AND RETIREMENT STUDY: GENETIC DATA UPDATE

Genome Wide Association Studies

Nutrigenomics and nutrigenetics are they the keys for healthy nutrition?

The Human Genome and its upcoming Dynamics

CS 262 Lecture 14 Notes Human Genome Diversity, Coalescence and Haplotypes

Single Nucleotide Polymorphisms (SNPs)

Gene Mapping in Natural Plant Populations Guilt by Association

Biology 105: Introduction to Genetics PRACTICE FINAL EXAM Part I: Definitions. Homology: Reverse transcriptase. Allostery: cdna library

Introduction to Add Health GWAS Data Part I. Christy Avery Department of Epidemiology University of North Carolina at Chapel Hill

Advanced Bioinformatics Biostatistics & Medical Informatics 776 Computer Sciences 776 Spring 2018

Lesson Overview. Studying the Human Genome. Lesson Overview Studying the Human Genome

Mutations during meiosis and germ line division lead to genetic variation between individuals

For more information about how to cite these materials visit

A brief introduction to Marker-Assisted Breeding. a BASF Plant Science Company

MAS refers to the use of DNA markers that are tightly-linked to target loci as a substitute for or to assist phenotypic screening.

Genetics and Biotechnology. Section 1. Applied Genetics

Bio 311 Learning Objectives

Agilent NGS Solutions : Addressing Today s Challenges

Next Generation Sequencing Technologies. Rob Mitra 1/30/17

Introduction to Genetics and Pharmacogenomics

Wake Acceleration Academy - Biology Note Guide Unit 5: Molecular Genetics

SENIOR BIOLOGY. Blueprint of life and Genetics: the Code Broken? INTRODUCTORY NOTES NAME SCHOOL / ORGANISATION DATE. Bay 12, 1417.

So, just exactly when will genetics revolutionise medicine? - Part 1

Why learn linkage analysis?

Measuring transcriptomes with RNA-Seq

Identifying Signaling Pathways. BMI/CS 776 Spring 2016 Anthony Gitter

Finding Enrichments of Functional Annotations for Disease- Associated Single-Nucleotide Polymorphisms

This is a closed book, closed note exam. No calculators, phones or any electronic device are allowed.

POPULATION GENETICS studies the genetic. It includes the study of forces that induce evolution (the

B) You can conclude that A 1 is identical by descent. Notice that A2 had to come from the father (and therefore, A1 is maternal in both cases).

Bull Selection Strategies Using Genomic Estimated Breeding Values

BTRY 7210: Topics in Quantitative Genomics and Genetics

Enzyme that uses RNA as a template to synthesize a complementary DNA

From Genotype to Phenotype

POLYMORPHISM AND VARIANT ANALYSIS. Matt Hudson Crop Sciences NCSA HPCBio IGB University of Illinois

The Diploid Genome Sequence of an Individual Human

Mutations, Meioses, and Maps

Transcription:

Linking Genetic Variation to Important Phenotypes BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2018 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by Mark Craven, Colin Dewey, and Anthony Gitter

Outline How does the genome vary between individuals? How do we identify associations between genetic variations and simple phenotypes/diseases? How do we identify associations between genetic variations and complex phenotypes/diseases? 2

Understanding Human Genetic Variation The human genome was determined by sequencing DNA from a small number of individuals (2001) The HapMap project (initiated in 2002) looked at polymorphisms in 270 individuals (Affymetrix GeneChip) The 1000 Genomes project (initiated in 2008) sequenced the genomes of 2500 individuals from diverse populations 23andMe genotyped its 1 millionth customer in 2015 Genomics England plans to sequence 100k whole genomes and link with medical records, 49k so far 3

Classes of Variants Single Nucleotide Polymorphisms (SNPs) Indels (insertions/deletions) Structural variants Formal definitions: https://www.snpedia.com/index.php/glossary 4

Single Nucleotide Polymorphisms (SNPs) One nucleotide changes Variation occurs with some minimal frequency in a population Pronounced snip www.mdpi.com 5

Insertions and Deletions Black box: DNA template strand White box: newly replicated DNA Insertion: slippage inserts extra nucleotides Forster et al. Proc. R. Soc. B 2015 Deletion: slippage excludes template nucleotides 6

Structural Variants Copy number variants (CNVs) Gain or loss or large genomic regions, even entire chromosomes Inversions DNA subsequence is reversed Translocations DNA subsequence is moved to a different chromosome 7

Genetic Recombination 8

Recombination Errors Lead to Copy Number Variants (CNVs) 9

1000 Genomes Project Project goal: produce a catalog of human variation down to variants that occur at >= 1% frequency over the genome 10

Understanding Associations Between Genetic Variation and Disease Genome-wide association study (GWAS) Gather some population of individuals Genotype each individual at polymorphic markers (usually SNPs) Test association between state at marker and some variable of interest (say disease) Adjust for multiple comparisons Phenotypes: observable traits 11

p = E-5 p = E-3 12

Wellcome Trust GWAS 13

Morning Person GWAS P = 5.0 10 8 Hu et al. Nature Communications 2016 14

Understanding Associations Between Genetic Variation and Disease International Cancer Genome Consortium Includes NIH s The Cancer Genome Atlas Sequencing DNA from 500 tumor samples for each of 50 different cancers Goal is to distinguish drivers (mutations that cause and accelerate cancers) from passengers (mutations that are byproducts of cancer s growth) 15

A Circos Plot 16

Some Cancer Genomes 17

Understanding Associations Between Genetic Variation and Complex Phenotypes Quantitative trait loci (QTL) mapping Gather some population of individuals Genotype each individual at polymorphic markers Map quantitative trait(s) of interest to chromosomal locations that seem to explain variation in trait 18

QTL Mapping Example 19

QTL Mapping Example QTL mapping of mouse blood pressure, heart rate [Sugiyama et al., Broman et al.] Logarithm of Odds LOD(q) = log 10 P(q QTL at m) P(q no QTL at m) quantitative trait position in the genome 20

QTL Example: Genotype-Tissue Expression Project (GTEx) Expression QTL (eqtl): traits are expression levels of various genes Map genotype to gene expression in different human tissues 21

QTL Example: GTEx https://www.genome.gov/27543767/ 22

GWAS Versus QTL Both associate genotype with phenotype GWAS pertains to discrete phenotypes For example, disease status is binary QTL pertains to quantitative (continuous) phenotypes Height Gene expression Splicing events Metabolite abundance 23

Determining Association is Not Enough A simple case: CFTR (Cystic Fibrosis Transmembrane Conductance Regulator) 24

Many Measured SNPs Not in Coding Regions Genes encoding CD40 and CD40L with relative positions of the SNPs studied Chadha et al. Eur J Hum Genet 2005 25

Computational Problems Assembly and alignment of thousands of genomes Data structures to capture extensive variation Identifying functional roles of markers of interest (which genes/pathways does a mutation affect and how?) Identifying interactions in multi-allelic diseases (which combinations of mutations lead to a disease state?) Identifying genetic/environmental interactions that lead to disease Inferring network models that exploit all sources of evidence: genotype, expression, metabolic, etc. Detecting large structural variants 26