Lecture: Genetic Basis of Complex Phenotypes Advanced Topics in Computational Genomics

1 Lecture: Genetic Basis of Complex Phenotypes Advanced Topics in Computational Genomics

2 Genome Polymorphisms

3 A Human Genealogy TCGAGGTATTAAC The ancestral chromosome

4 From SNPs
TCGAGGTATTAAC
TCTAGGTATTAAC
TCGAGGCATTAAC
TCTAGGTGTTAAC
TCGAGGTATTAGC
TCTAGGTATCAAC
  *   ** * *

5 To Haplotypes A disease mutation

6 Population-Based Association Study
- Case/control data are collected from unrelated individuals
- All individuals are related if we go back far enough in the ancestry
(Balding, Nature Reviews Genetics, 2006)

7 Type of Polymorphisms
- Each variant is called an allele
- Almost always bi-allelic
- Account for most of the genetic diversity among different (normal) individuals, e.g. drug response, disease susceptibility

8 Advantages of SNPs in Genetic Analysis of Complex Traits
- Abundance: high frequency on the genome
- Position: throughout the genome (coding region, intron region, promoter site)
- Ease of genotyping
- Less mutable than other forms of polymorphisms
- SNPs account for around 90% of human genomic variation
- About 10 million SNPs exist in human populations
- Most SNPs are outside of the protein coding regions
- 1 SNP every 600 base pairs
- More than 5 million common SNPs, each with frequency 10-50%, account for the bulk of human DNA sequence difference
- It is estimated that ~60,000 SNPs occur within exons; 85% of exons are within 5 kb of the nearest SNP

9 Causal Mutations and Genetic Markers
[Figure: a causal mutation and a nearby SNP marker in linkage disequilibrium]
- The SNP marker serves only as a marker for the causal mutation
- In order to find the causal mutation, fine mapping (sequencing the SNP region) is required

10 Linkage Analysis vs. Association Analysis (Strachan & Read, Human Molecular Genetics, 2001)

11 Overview
- Single SNP association test
  - Discrete-valued phenotype: case/control study
  - Continuous-valued phenotype: quantitative traits
- Correcting for multiple testing
- Leveraging linkage disequilibrium
- Multi-marker association test
- Genotype imputation method

12 Single SNP Association Analysis: Case/Control Study
For each marker locus, find the 3x2 contingency table containing the counts of the three genotypes

Genotype   Case          Control
AA         N_case,AA     N_control,AA
Aa         N_case,Aa     N_control,Aa
aa         N_case,aa     N_control,aa
Total      N_case        N_control

χ² test with 2 df, or Fisher's exact test, under the null hypothesis of no association
Genotype score = the number of minor alleles
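A minimal sketch of the 2-df genotype test on the 3x2 table, using only the Python standard library. The counts are hypothetical, the function names are illustrative, and the sketch assumes every row and column total is positive. For 2 degrees of freedom the upper-tail χ² probability has the closed form exp(-x/2), which is what the p-value function uses.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def chi_square_pvalue_2df(stat):
    """Upper-tail probability of a chi-square variable with 2 df: exp(-x/2)."""
    return math.exp(-stat / 2.0)

# Hypothetical counts: rows are genotypes AA, Aa, aa; columns are case, control.
genotype_table = [[50, 80], [100, 90], [50, 30]]
stat = chi_square_statistic(genotype_table)
p_value = chi_square_pvalue_2df(stat)
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
```

For small cell counts Fisher's exact test is the usual alternative; it is not implemented in this sketch.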

13 Single SNP Association Analysis: Case/Control Study
Alternatively, assume an additive model, where the heterozygote risk is approximately between that of the two homozygotes
Form a 2x2 contingency table. Each individual contributes twice, once from each of the two chromosomes.

Allele     Case          Control
A          G_case,A      G_control,A
a          G_case,a      G_control,a
Total      2 x N_case    2 x N_control

χ² test with 1 df
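The allele-count version can be sketched the same way. The genotype counts below are hypothetical and the function names are illustrative. Each individual contributes two alleles, and for 1 degree of freedom the upper-tail χ² probability equals erfc(sqrt(x/2)).

```python
import math

def allele_counts(n_AA, n_Aa, n_aa):
    """Convert genotype counts to (A, a) allele counts; each person contributes two alleles."""
    return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa

def allelic_chi_square(case_genotypes, control_genotypes):
    """1-df chi-square test on the 2x2 allele-by-status table."""
    a, c = allele_counts(*case_genotypes)      # cases: A count, a count
    b, d = allele_counts(*control_genotypes)   # controls: A count, a count
    n = a + b + c + d
    # Shortcut formula for a 2x2 table [[a, b], [c, d]].
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical genotype counts in the order (AA, Aa, aa).
allelic_stat, allelic_p = allelic_chi_square((50, 100, 50), (80, 90, 30))
print(f"chi2 = {allelic_stat:.3f}, p = {allelic_p:.5f}")
```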

14 Single SNP Association Analysis: Continuous-valued Traits
- Continuous-valued traits
  - Also called quantitative traits
  - Cholesterol level, blood pressure, etc.
- For each locus, fit a linear regression using the individual's number of minor alleles at the given locus as the covariate
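A minimal sketch of the per-locus regression, written with the standard library only. The data are made up, the function name is illustrative, and the sketch assumes more than two individuals and a SNP that varies in the sample. It returns the slope, the intercept, and the t-statistic of the slope; turning the t-statistic into a p-value needs a t-distribution routine, which is left out here.

```python
import math

def single_snp_regression(minor_allele_counts, trait_values):
    """Ordinary least squares of trait on minor-allele count (0, 1, or 2)."""
    n = len(trait_values)
    mean_x = sum(minor_allele_counts) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in minor_allele_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(minor_allele_counts, trait_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and standard error of the slope (n - 2 df).
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(minor_allele_counts, trait_values))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else math.inf
    return slope, intercept, t_stat

# Hypothetical data: six individuals, genotype score and trait value.
genotypes = [0, 0, 1, 1, 2, 2]
trait = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, intercept, t_stat = single_snp_regression(genotypes, trait)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, t = {t_stat:.2f}")
```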

15 Genetic Model for Association
Additive effect
- Major allele homozygote: 0
- Heterozygote: a + a × k
- Minor allele homozygote: 2a
Dominance parameter k
- k = 1: dominant effect of the minor allele (heterozygote value 2a)
- k = 0: no dominance (heterozygote value a)
- k = -1: dominant effect of the major allele (heterozygote value 0)
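Evaluating the heterozygote value a + a × k for the three values of k shows which homozygote it matches. The function name below is illustrative, not from the lecture.

```python
def genotype_effect(minor_allele_count, a, k):
    """Genotypic value under the model: 0, a + a*k, 2a for 0, 1, 2 minor alleles."""
    if minor_allele_count == 0:
        return 0.0
    if minor_allele_count == 1:
        return a + a * k
    if minor_allele_count == 2:
        return 2.0 * a
    raise ValueError("minor_allele_count must be 0, 1, or 2")

a = 1.5  # hypothetical additive effect
for k in (1, 0, -1):
    het = genotype_effect(1, a, k)
    print(f"k = {k:2d}: heterozygote value = {het}")
# k = 1 gives 2a (same as the minor allele homozygote),
# k = 0 gives a (midway between the homozygotes),
# k = -1 gives 0 (same as the major allele homozygote).
```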

16 Penetrance
- Proportion of individuals carrying a particular allele who possess an associated trait
- Alleles with high penetrance are easier to detect in association analysis

17 Correcting for Multiple Testing
- What happens when we scan a genome of 1 million markers for association with α = 0.05?
- 50,000 (= 1 million × 0.05) SNPs are expected to be found significant just by chance
- We need to be more conservative when we decide that a given marker is significantly associated with the trait
- Correction methods
  - Bonferroni correction
  - Permutation test

18 Bonferroni Correction
- If N markers are tested, we correct the significance level as α' = α/N
- Treats the N tests as if they were independent, although this is not true because of linkage disequilibrium
- Overly conservative for tightly linked markers
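The arithmetic behind the multiple-testing problem and the Bonferroni threshold can be sketched as follows. The function names and the example p-values are illustrative.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing level alpha when no marker is associated."""
    return n_tests * alpha

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level alpha' = alpha / N."""
    return alpha / n_tests

def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-corrected level."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]

print(expected_false_positives(1_000_000, 0.05))   # about 50,000 by chance alone
print(bonferroni_threshold(0.05, 1_000_000))       # per-marker level of 5e-8
print(bonferroni_significant([0.001, 0.02, 0.04])) # threshold here is 0.05 / 3
```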

19 Permutation Procedure
- Step 1: Compute the test statistic T using the original dataset
- Step 2: Set Nsig = 0
- Step 3: Repeat 1:Nperm
  - Step 3a: Randomly permute the individuals in the phenotype data to generate datasets with no association (retain the original genotype)
  - Step 3b: Find the test statistic Tperm of SNPs using the permuted dataset
  - Step 3c: If T > Tperm, Nsig = Nsig + 1
- Step 4: Compute the p-value as (1 - Nsig/Nperm)
This approach is computationally demanding because often a large Nperm is required.
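The steps above can be sketched for a single SNP as follows. The test statistic (absolute difference in mean minor-allele count between cases and controls) is an assumption chosen for brevity; the slide does not fix a particular statistic. The function names, the seed, and the data are illustrative.

```python
import random

def mean_difference(genotypes, phenotypes):
    """Test statistic: |mean minor-allele count in cases - mean in controls|."""
    cases = [g for g, y in zip(genotypes, phenotypes) if y == 1]
    controls = [g for g, y in zip(genotypes, phenotypes) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

def permutation_pvalue(genotypes, phenotypes, n_perm=1000, seed=0):
    rng = random.Random(seed)                           # fixed seed for reproducibility
    t_obs = mean_difference(genotypes, phenotypes)      # Step 1
    n_sig = 0                                           # Step 2
    permuted = list(phenotypes)
    for _ in range(n_perm):                             # Step 3
        rng.shuffle(permuted)                           # Step 3a: genotypes stay fixed
        t_perm = mean_difference(genotypes, permuted)   # Step 3b
        if t_obs > t_perm:                              # Step 3c
            n_sig += 1
    return 1.0 - n_sig / n_perm                         # Step 4

# Hypothetical data: 10 controls (0) and 10 cases (1).
phenotypes = [0] * 10 + [1] * 10
associated = [0] * 10 + [2] * 10    # genotype tracks case status perfectly
uninformative = [1] * 20            # same genotype for everyone
print(permutation_pvalue(associated, phenotypes))
print(permutation_pvalue(uninformative, phenotypes))
```

In a genome-wide scan, each permutation would be scored across all markers, which is where the computational cost comes from.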

20 Multi-marker Association Test
- Idea: a haplotype of multiple SNPs is a better proxy for a true causal SNP than a single SNP
- Exploit the linkage disequilibrium structure in the genome
- Form a new allele by combining multiple SNPs for a haplotype
[Figure: SNP A and SNP B as auxiliary markers for haplotypes]
- Test the haplotype allele for association

21 Multi-marker Association Test
- The multi-marker approach can capture dependencies across multiple markers
- SNPs in LD form a haplotype that can be tested as a single allele
- Can achieve the same power with data collected for fewer samples
- Challenge as the size of the haplotype increases
  - A haplotype of K SNPs results in 2^K different haplotypes, but the number of samples corresponding to each haplotype decreases quickly as we increase K
  - Large K requires a large sample size
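A small sketch of why large K is a problem: the number of possible haplotypes grows as 2^K while the samples are spread over them. The phased haplotypes below are hypothetical strings of 0/1 alleles, and the function name is illustrative.

```python
from collections import Counter

def haplotype_summary(haplotypes):
    """Return (number of possible haplotypes, observed counts) for bi-allelic SNPs."""
    k = len(haplotypes[0])
    if any(len(h) != k for h in haplotypes):
        raise ValueError("all haplotypes must span the same K SNPs")
    return 2 ** k, Counter(haplotypes)

# Hypothetical phased haplotypes over K = 2 SNPs (0 = major, 1 = minor allele).
observed = ["00", "01", "00", "11", "00", "01"]
n_possible, counts = haplotype_summary(observed)
print(n_possible, dict(counts))

# With the sample size held fixed, the average count per possible haplotype
# shrinks by half each time K grows by one.
for k in (2, 5, 10):
    print(k, 2 ** k, len(observed) / 2 ** k)
```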

22 Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks Nature Genetics (J. Zhu et al.)

23 Yeast Genomic Datasets (Zhu et al.)
Yeast genomic datasets
- Genotypes from 112 segregants from a yeast cross between the BY and RM strains
- Microarray gene-expression data
- Transcription factor binding site data
- Protein-protein interaction data

24 Analysis Procedure (Zhu et al.)
- Gene expression data analysis to infer a gene coexpression network
- eQTL (expression quantitative trait locus) analysis
  - Gene expression data as phenotype data
  - Can we identify the genetic locus that controls the expression of genes?
- Learning a predictive model for the yeast gene network
  - Integrate multiple genomic data to infer the gene network: gene expression/eQTL/TFBS/PPI data

25 Gene Coexpression Network
- Hierarchical clustering of genes
- Identified gene modules
- How to validate the gene modules? GO enrichment analysis as a proxy

26 Gene Set Enrichment Analysis
- Given a subset of genes, we would like to test whether these genes share a common function.
- The KEGG pathway and Gene Ontology (GO) databases provide information on known gene function

27 Gene Set Enrichment Test for Computational Validation of Gene Clusters
Suppose we have generated k clusters (sets of gene profiles) C_1, ..., C_k. How do we assess the significance of their relation to m known (potentially overlapping) categories G_1, ..., G_m (e.g., GO categories)?
Let's start by comparing a single cluster C_i with a single category G_j. The p-value for such a match is based on the hypergeometric distribution. This is the probability that |C_i| elements chosen at random out of N would have m elements in common with G_j:

$$P(m) = \frac{\binom{|G_j|}{m}\,\binom{N - |G_j|}{|C_i| - m}}{\binom{N}{|C_i|}}$$

In this formula m is the total number of genes in C_i that overlap with G_j (not the number of categories).

28 [Figure: N genes in total, the genes in cluster C_i, the genes in the given GO category G_j, and their overlap of m genes]

$$P(m) = \frac{\binom{|G_j|}{m}\,\binom{N - |G_j|}{|C_i| - m}}{\binom{N}{|C_i|}}$$
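A minimal sketch of the hypergeometric calculation with the standard library. The point probability follows the formula above. The second function sums the upper tail (overlap of at least m genes), which is a common way to turn the distribution into an enrichment p-value but is an assumption here, since the slide states only the point probability. Function and argument names are illustrative.

```python
from math import comb

def hypergeom_pmf(m, N, G, C):
    """P(overlap = m) for a cluster of size C and a category of size G among N genes."""
    return comb(G, m) * comb(N - G, C - m) / comb(N, C)

def enrichment_pvalue(m, N, G, C):
    """Upper tail: probability of an overlap of at least m genes by chance."""
    upper = min(G, C)
    return sum(hypergeom_pmf(x, N, G, C) for x in range(m, upper + 1))

# Hypothetical numbers: 6000 genes, a 100-gene category, a 50-gene cluster, 10 shared.
print(hypergeom_pmf(10, 6000, 100, 50))
print(enrichment_pvalue(10, 6000, 100, 50))
```

With k clusters and many categories, the resulting p-values are themselves subject to the multiple-testing issue discussed earlier.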

29 Network Modules, GO Enrichment, eQTL Hotspots

30 eQTL Hotspots
- eQTL hotspots: pleiotropic control of multiple genes by a common genomic locus
- cis eQTL: affected genes are physically located in cis to the genomic locus
- trans eQTL: affected genes are located distantly from the eQTL

31 Network Modules, GO Enrichment, eQTL Hotspots

32 eQTL Hotspots
- No ground truth for eQTLs. How to validate the results?
- Use results from knockout experiments and TFBS experiments as a proxy
- Again, gene set enrichment analysis

33 TFBS Target Enrichment, Knock-Out Signature Enrichment

34 Learning Bayesian Networks: Integrating Different Genomic Data
Incorporating more genomic data into network learning can increase the predictive power for regulators
- Bayesian network I (BN_raw): derived from gene expression data
- Bayesian network II (BN_qtl): derived from gene expression and eQTL data
- Bayesian network III (BN_full): derived from gene expression, eQTL, TFBS (ChIP-chip experiments), and PPI data

35 Incorporating eQTLs in Network Learning
A two-step analysis:
- First perform eQTL analysis
- Incorporate the identified eQTLs in the network learning process
  - For a given eQTL, genes with cis eQTLs can be parents of genes with trans eQTLs
  - For a given eQTL, genes with trans eQTLs are not allowed to be parents of genes with cis eQTLs
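The parent/child restriction can be sketched as an edge filter applied before or during structure search. The data layout (two dictionaries keyed by eQTL locus), the function name, and the gene and locus names are all hypothetical; the lecture does not describe how the constraint is implemented.

```python
def edge_allowed(parent, child, cis_genes, trans_genes):
    """Return False if, for some eQTL, parent is a trans target and child is a cis target.

    cis_genes and trans_genes map each eQTL locus to the set of genes
    with a cis or trans eQTL at that locus.
    """
    for locus, cis_set in cis_genes.items():
        trans_set = trans_genes.get(locus, set())
        if parent in trans_set and child in cis_set:
            return False
    return True

# Hypothetical eQTL results for one locus.
cis_genes = {"locus1": {"GENE_A"}}
trans_genes = {"locus1": {"GENE_B", "GENE_C"}}

print(edge_allowed("GENE_A", "GENE_B", cis_genes, trans_genes))  # cis -> trans: allowed
print(edge_allowed("GENE_B", "GENE_A", cis_genes, trans_genes))  # trans -> cis: forbidden
print(edge_allowed("GENE_B", "GENE_C", cis_genes, trans_genes))  # trans -> trans: not restricted here
```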

36 Computationally Identified Causal Regulators

Linkage Analysis Computa.onal Genomics Seyoung Kim

Linkage Analysis Computa.onal Genomics Seyoung Kim Linkage Analysis 02-710 Computa.onal Genomics Seyoung Kim Genome Polymorphisms Gene.c Varia.on Phenotypic Varia.on A Human Genealogy TCGAGGTATTAAC The ancestral chromosome SNPs and Human Genealogy A->G

More information

Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies

Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies Introduc)on to Sta)s)cal Gene)cs: emphasis on Gene)c Associa)on Studies Lisa J. Strug, PhD Guest Lecturer Biosta)s)cs Laboratory Course (CHL5207/8) March 5, 2015 Gene Mapping in the News Study Finds Gene

More information

Gene Regulatory Networks Computa.onal Genomics Seyoung Kim

Gene Regulatory Networks Computa.onal Genomics Seyoung Kim Gene Regulatory Networks 02-710 Computa.onal Genomics Seyoung Kim Transcrip6on Factor Binding Transcrip6on Control Gene transcrip.on is influenced by Transcrip.on factor binding affinity for the regulatory

More information

Computational Genomics

Computational Genomics Computational Genomics 10-810/02 810/02-710, Spring 2009 Quantitative Trait Locus (QTL) Mapping Eric Xing Lecture 23, April 13, 2009 Reading: DTW book, Chap 13 Eric Xing @ CMU, 2005-2009 1 Phenotypical

More information

Genome-Wide Associa/on Studies: History, Current Approaches, and Future Opportuni/es. Addie Thompson Genomics,

Genome-Wide Associa/on Studies: History, Current Approaches, and Future Opportuni/es. Addie Thompson Genomics, Genome-Wide Associa/on Studies: History, Current Approaches, and Future Opportuni/es Addie Thompson Genomics, 11-15-2016 Outline History and terminology Sta5s5cs and breeding Linkage and associa5on analysis,

More information

By the end of this lecture you should be able to explain: Some of the principles underlying the statistical analysis of QTLs

By the end of this lecture you should be able to explain: Some of the principles underlying the statistical analysis of QTLs (3) QTL and GWAS methods By the end of this lecture you should be able to explain: Some of the principles underlying the statistical analysis of QTLs Under what conditions particular methods are suitable

More information

Structure, Measurement & Analysis of Genetic Variation

Structure, Measurement & Analysis of Genetic Variation Structure, Measurement & Analysis of Genetic Variation Sven Cichon, PhD Professor of Medical Genetics, Director, Division of Medcial Genetics, University of Basel Institute of Neuroscience and Medicine

More information

Natural Selection Advanced Topics in Computa8onal Genomics

Natural Selection Advanced Topics in Computa8onal Genomics Natural Selection 02-715 Advanced Topics in Computa8onal Genomics Natural Selection Compara8ve studies across species O=en focus on protein- coding regions Genes under selec8ve pressure Immune- related

More information

Understanding genetic association studies. Peter Kamerman

Understanding genetic association studies. Peter Kamerman Understanding genetic association studies Peter Kamerman Outline CONCEPTS UNDERLYING GENETIC ASSOCIATION STUDIES Genetic concepts: - Underlying principals - Genetic variants - Linkage disequilibrium -

More information

BTRY 7210: Topics in Quantitative Genomics and Genetics

BTRY 7210: Topics in Quantitative Genomics and Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics Jason Mezey Biological Statistics and Computational Biology (BSCB) Department of Genetic Medicine jgm45@cornell.edu January 29, 2015 Why you re here

More information

Decoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics

Decoding Chromatin States with Epigenome Data Advanced Topics in Computa8onal Genomics Decoding Chromatin States with Epigenome Data 02-715 Advanced Topics in Computa8onal Genomics HMMs for Decoding Chromatin States Epigene8c modifica8ons of the genome have been associated with Establishing

More information

Popula'on Gene'cs I: Gene'c Polymorphisms, Haplotype Inference, Recombina'on Computa.onal Genomics Seyoung Kim

Popula'on Gene'cs I: Gene'c Polymorphisms, Haplotype Inference, Recombina'on Computa.onal Genomics Seyoung Kim Popula'on Gene'cs I: Gene'c Polymorphisms, Haplotype Inference, Recombina'on 02-710 Computa.onal Genomics Seyoung Kim Overview Two fundamental forces that shape genome sequences Recombina.on Muta.on, gene.c

More information

Introduction to Quantitative Genomics / Genetics

Introduction to Quantitative Genomics / Genetics Introduction to Quantitative Genomics / Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics September 10, 2008 Jason G. Mezey Outline History and Intuition. Statistical Framework. Current

More information

BTRY 7210: Topics in Quantitative Genomics and Genetics

BTRY 7210: Topics in Quantitative Genomics and Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics Jason Mezey Biological Statistics and Computational Biology (BSCB) Department of Genetic Medicine jgm45@cornell.edu Spring 2015, Thurs.,12:20-1:10

More information

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011 EPIB 668 Genetic association studies Aurélie LABBE - Winter 2011 1 / 71 OUTLINE Linkage vs association Linkage disequilibrium Case control studies Family-based association 2 / 71 RECAP ON GENETIC VARIANTS

More information

SNPs - GWAS - eqtls. Sebastian Schmeier

SNPs - GWAS - eqtls. Sebastian Schmeier SNPs - GWAS - eqtls s.schmeier@gmail.com http://sschmeier.github.io/bioinf-workshop/ 17.08.2015 Overview Single nucleotide polymorphism (refresh) SNPs effect on genes (refresh) Genome-wide association

More information

Popula'on Structure Computa.onal Genomics Seyoung Kim

Popula'on Structure Computa.onal Genomics Seyoung Kim Popula'on Structure 02-710 Computa.onal Genomics Seyoung Kim What is Popula'on Structure? Popula.on Structure A set of individuals characterized by some measure of gene.c dis.nc.on A popula.on is usually

More information

Forensics and DNA Sta1s1cs. Harry R Erwin, PhD CIS308 Faculty of Applied Sciences University of Sunderland

Forensics and DNA Sta1s1cs. Harry R Erwin, PhD CIS308 Faculty of Applied Sciences University of Sunderland Forensics and DNA Sta1s1cs Harry R Erwin, PhD CIS308 Faculty of Applied Sciences University of Sunderland References Goodwin, Linacre, and Hadi (2007) An Introduc+on to Forensic Gene+cs, Wiley. Butler

More information

Analysis of genome-wide genotype data

Analysis of genome-wide genotype data Analysis of genome-wide genotype data Acknowledgement: Several slides based on a lecture course given by Jonathan Marchini & Chris Spencer, Cape Town 2007 Introduction & definitions - Allele: A version

More information

Downstream analysis of transcriptomic data

Downstream analysis of transcriptomic data Downstream analysis of transcriptomic data Shamith Samarajiwa CRUK Bioinforma3cs Summer School July 2015 General Methods Dimensionality reduc3on methods (clustering, PCA, MDS) Visualizing PaKerns (heatmaps,

More information

Lecture 2: Population Structure Advanced Topics in Computa8onal Genomics

Lecture 2: Population Structure Advanced Topics in Computa8onal Genomics Lecture 2: Population Structure 02-715 Advanced Topics in Computa8onal Genomics 1 What is population structure? Popula8on Structure A set of individuals characterized by some measure of gene8c dis8nc8on

More information

Association Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010

Association Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010 Association Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010 Traditional QTL approach Uses standard bi-parental mapping populations o F2 or RI These have a limited number of

More information

POLYMORPHISM AND VARIANT ANALYSIS. Matt Hudson Crop Sciences NCSA HPCBio IGB University of Illinois

POLYMORPHISM AND VARIANT ANALYSIS. Matt Hudson Crop Sciences NCSA HPCBio IGB University of Illinois POLYMORPHISM AND VARIANT ANALYSIS Matt Hudson Crop Sciences NCSA HPCBio IGB University of Illinois Outline How do we predict molecular or genetic functions using variants?! Predicting when a coding SNP

More information

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012 Lecture 23: Causes and Consequences of Linkage Disequilibrium November 16, 2012 Last Time Signatures of selection based on synonymous and nonsynonymous substitutions Multiple loci and independent segregation

More information

Genome-Wide Association Studies (GWAS): Computational Them

Genome-Wide Association Studies (GWAS): Computational Them Genome-Wide Association Studies (GWAS): Computational Themes and Caveats October 14, 2014 Many issues in Genomewide Association Studies We show that even for the simplest analysis, there is little consensus

More information

Introduction to Add Health GWAS Data Part I. Christy Avery Department of Epidemiology University of North Carolina at Chapel Hill

Introduction to Add Health GWAS Data Part I. Christy Avery Department of Epidemiology University of North Carolina at Chapel Hill Introduction to Add Health GWAS Data Part I Christy Avery Department of Epidemiology University of North Carolina at Chapel Hill Outline Introduction to genome-wide association studies (GWAS) Research

More information

Association studies (Linkage disequilibrium)

Association studies (Linkage disequilibrium) Positional cloning: statistical approaches to gene mapping, i.e. locating genes on the genome Linkage analysis Association studies (Linkage disequilibrium) Linkage analysis Uses a genetic marker map (a

More information

RNA sequencing Integra1ve Genomics module

RNA sequencing Integra1ve Genomics module RNA sequencing Integra1ve Genomics module Michael Inouye Centre for Systems Genomics University of Melbourne, Australia Summer Ins@tute in Sta@s@cal Gene@cs 2016 SeaBle, USA @minouye271 inouyelab.org This

More information

Variant Simulation Tools

Variant Simulation Tools Variant Simulation Tools Bo Peng Sep 25, 2014 Genetic Simulations Why perform simulations? To get data that match these (unrealis+c) assump+ons of our methods Validate sta+s+cal methods using simulated

More information

A very brief introduc0on to bioinforma0cs. Mikhail Spivakov, PhD European Bioinforma0cs Ins0tute

A very brief introduc0on to bioinforma0cs. Mikhail Spivakov, PhD European Bioinforma0cs Ins0tute A very brief introduc0on to bioinforma0cs Mikhail Spivakov, PhD European Bioinforma0cs Ins0tute What bioinforma0cs does? Cataloguing Mining Modelling For lab biologists to look at favourite genes etc.

More information

Human Genetics and Gene Mapping of Complex Traits

Human Genetics and Gene Mapping of Complex Traits Human Genetics and Gene Mapping of Complex Traits Advanced Genetics, Spring 2018 Human Genetics Series Thursday 4/5/18 Nancy L. Saccone, Ph.D. Dept of Genetics nlims@genetics.wustl.edu / 314-747-3263 What

More information

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work Genome 371, 1 March 2010, Lecture 13 Association Mapping Mendelian versus Complex Phenotypes How to Perform an Association Study Why Association Studies (Can) Work Introduction to LOD score analysis Common

More information

Sta$s$cs for Genomics ( )

Sta$s$cs for Genomics ( ) Sta$s$cs for Genomics (140.688) Instructor: Jeff Leek Website: http://www.biostat.jhsph.edu/~jleek/teaching/2011/genomics/ Class Times: MW, 10:30AM-11:50AM + R Lab TBA Grading: 20% Reading Assignments,

More information

Why do we need statistics to study genetics and evolution?

Why do we need statistics to study genetics and evolution? Why do we need statistics to study genetics and evolution? 1. Mapping traits to the genome [Linkage maps (incl. QTLs), LOD] 2. Quantifying genetic basis of complex traits [Concordance, heritability] 3.

More information

Crash-course in genomics

Crash-course in genomics Crash-course in genomics Molecular biology : How does the genome code for function? Genetics: How is the genome passed on from parent to child? Genetic variation: How does the genome change when it is

More information

Applied Bioinformatics

Applied Bioinformatics Applied Bioinformatics In silico and In clinico characterization of genetic variations Assistant Professor Department of Biomedical Informatics Center for Human Genetics Research ATCAAAATTATGGAAGAA ATCAAAATCATGGAAGAA

More information

Computational Workflows for Genome-Wide Association Study: I

Computational Workflows for Genome-Wide Association Study: I Computational Workflows for Genome-Wide Association Study: I Department of Computer Science Brown University, Providence sorin@cs.brown.edu October 16, 2014 Outline 1 Outline 2 3 Monogenic Mendelian Diseases

More information

Trudy F C Mackay, Department of Genetics, North Carolina State University, Raleigh NC , USA.

Trudy F C Mackay, Department of Genetics, North Carolina State University, Raleigh NC , USA. Question & Answer Q&A: Genetic analysis of quantitative traits Trudy FC Mackay What are quantitative traits? Quantitative, or complex, traits are traits for which phenotypic variation is continuously distributed

More information

QTL Mapping, MAS, and Genomic Selection

QTL Mapping, MAS, and Genomic Selection QTL Mapping, MAS, and Genomic Selection Dr. Ben Hayes Department of Primary Industries Victoria, Australia A short-course organized by Animal Breeding & Genetics Department of Animal Science Iowa State

More information

Genetic Variation, Biological Pathways, and Networks. Sarah Pendergrass Center for Systems Genomics

Genetic Variation, Biological Pathways, and Networks. Sarah Pendergrass Center for Systems Genomics Genetic Variation, Biological Pathways, and Networks Sarah Pendergrass Center for Systems Genomics Outline What are networks and why are they important in biology? Biological Pathways Why do we care about

More information

Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR

Human Genetic Variation. Ricardo Lebrón Dpto. Genética UGR Human Genetic Variation Ricardo Lebrón rlebron@ugr.es Dpto. Genética UGR What is Genetic Variation? Origins of Genetic Variation Genetic Variation is the difference in DNA sequences between individuals.

More information

SNP Matching Guide, BF McAllister

SNP Matching Guide, BF McAllister Informa(on in this guide is prepared and presented by Bryant McAllister, Associate Professor of Biology at The University of Iowa. This and other resources for understanding the interpreta(ons and uses

More information

Statistical Methods for Network Analysis of Biological Data

Statistical Methods for Network Analysis of Biological Data The Protein Interaction Workshop, 8 12 June 2015, IMS Statistical Methods for Network Analysis of Biological Data Minghua Deng, dengmh@pku.edu.cn School of Mathematical Sciences Center for Quantitative

More information

Midterm 1 Results. Midterm 1 Akey/ Fields Median Number of Students. Exam Score

Midterm 1 Results. Midterm 1 Akey/ Fields Median Number of Students. Exam Score Midterm 1 Results 10 Midterm 1 Akey/ Fields Median - 69 8 Number of Students 6 4 2 0 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 101 Exam Score Quick review of where we left off Parental type: the

More information

Next Genera*on Sequencing II: Personal Genomics. Jim Noonan Department of Gene*cs

Next Genera*on Sequencing II: Personal Genomics. Jim Noonan Department of Gene*cs Next Genera*on Sequencing II: Personal Genomics Jim Noonan Department of Gene*cs Personal genome sequencing Iden*fying the gene*c basis of phenotypic diversity among humans Gene*c risk factors for disease

More information

AN EVALUATION OF POWER TO DETECT LOW-FREQUENCY VARIANT ASSOCIATIONS USING ALLELE-MATCHING TESTS THAT ACCOUNT FOR UNCERTAINTY

AN EVALUATION OF POWER TO DETECT LOW-FREQUENCY VARIANT ASSOCIATIONS USING ALLELE-MATCHING TESTS THAT ACCOUNT FOR UNCERTAINTY AN EVALUATION OF POWER TO DETECT LOW-FREQUENCY VARIANT ASSOCIATIONS USING ALLELE-MATCHING TESTS THAT ACCOUNT FOR UNCERTAINTY E. ZEGGINI and J.L. ASIMIT Wellcome Trust Sanger Institute, Hinxton, CB10 1HH,

More information

Pathway Analysis Adding Func2onal Context to High- Throughput Results

Pathway Analysis Adding Func2onal Context to High- Throughput Results Pathway Analysis Adding Func2onal Context to High- Throughput Results Stephen D. Turner, Ph.D. Bioinforma2cs Core Director bioinforma2cs@virginia.edu Outline Bioinforma2cs & the Bioinforma2cs Core Service

More information

HISTORICAL LINGUISTICS AND MOLECULAR ANTHROPOLOGY

HISTORICAL LINGUISTICS AND MOLECULAR ANTHROPOLOGY Third Pavia International Summer School for Indo-European Linguistics, 7-12 September 2015 HISTORICAL LINGUISTICS AND MOLECULAR ANTHROPOLOGY Brigitte Pakendorf, Dynamique du Langage, CNRS & Université

More information

Introduction to BIOINFORMATICS

Introduction to BIOINFORMATICS COURSE OF BIOINFORMATICS a.a. 2016-2017 Introduction to BIOINFORMATICS What is Bioinformatics? (I) The sinergy between biology and informatics What is Bioinformatics? (II) From: http://www.bioteach.ubc.ca/bioinfo2010/

More information

Mapping and Mapping Populations

Mapping and Mapping Populations Mapping and Mapping Populations Types of mapping populations F 2 o Two F 1 individuals are intermated Backcross o Cross of a recurrent parent to a F 1 Recombinant Inbred Lines (RILs; F 2 -derived lines)

More information

Linkage Disequilibrium

Linkage Disequilibrium Linkage Disequilibrium Why do we care about linkage disequilibrium? Determines the extent to which association mapping can be used in a species o Long distance LD Mapping at the tens of kilobase level

More information

Lecture 2: Height in Plants, Animals, and Humans. Michael Gore lecture notes Tucson Winter Institute version 18 Jan 2013

Lecture 2: Height in Plants, Animals, and Humans. Michael Gore lecture notes Tucson Winter Institute version 18 Jan 2013 Lecture 2: Height in Plants, Animals, and Humans Michael Gore lecture notes Tucson Winter Institute version 18 Jan 2013 Is height a polygenic trait? http://en.wikipedia.org/wiki/gregor_mendel Case Study

More information

Topics in Statistical Genetics

Topics in Statistical Genetics Topics in Statistical Genetics INSIGHT Bioinformatics Webinar 2 August 22 nd 2018 Presented by Cavan Reilly, Ph.D. & Brad Sherman, M.S. 1 Recap of webinar 1 concepts DNA is used to make proteins and proteins

More information

From reads to results: differen1al expression analysis with RNA seq. Alicia Oshlack Bioinforma1cs Division Walter and Eliza Hall Ins1tute

From reads to results: differen1al expression analysis with RNA seq. Alicia Oshlack Bioinforma1cs Division Walter and Eliza Hall Ins1tute From reads to results: differen1al expression analysis with RNA seq Alicia Oshlack Bioinforma1cs Division Walter and Eliza Hall Ins1tute Purported benefits and opportuni1es of RNA seq All transcripts are

More information

Multi-SNP Models for Fine-Mapping Studies: Application to an. Kallikrein Region and Prostate Cancer

Multi-SNP Models for Fine-Mapping Studies: Application to an. Kallikrein Region and Prostate Cancer Multi-SNP Models for Fine-Mapping Studies: Application to an association study of the Kallikrein Region and Prostate Cancer November 11, 2014 Contents Background 1 Background 2 3 4 5 6 Study Motivation

More information

Prostate Cancer Genetics: Today and tomorrow

Prostate Cancer Genetics: Today and tomorrow Prostate Cancer Genetics: Today and tomorrow Henrik Grönberg Professor Cancer Epidemiology, Deputy Chair Department of Medical Epidemiology and Biostatistics ( MEB) Karolinska Institutet, Stockholm IMPACT-Atanta

More information

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016 Topics Genetic variation Population structure Linkage disequilibrium Natural disease variants Genome Wide Association Studies Gene

More information

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros

DNA Collection. Data Quality Control. Whole Genome Amplification. Whole Genome Amplification. Measure DNA concentrations. Pros DNA Collection Data Quality Control Suzanne M. Leal Baylor College of Medicine sleal@bcm.edu Copyrighted S.M. Leal 2016 Blood samples For unlimited supply of DNA Transformed cell lines Buccal Swabs Small

More information

Genome-wide association studies (GWAS) Part 1

Genome-wide association studies (GWAS) Part 1 Genome-wide association studies (GWAS) Part 1 Matti Pirinen FIMM, University of Helsinki 03.12.2013, Kumpula Campus FIMM - Institiute for Molecular Medicine Finland www.fimm.fi Published Genome-Wide Associations

More information
