MAPPING GENES TO TRAITS IN DOGS USING SNPs

Similar documents
THE TEOSINTE HYPOTHESIS

LACTASE PERSISTENCE: EVIDENCE FOR SELECTION

Stalking the Genetic Basis of a Trait

DNA. Grade Level: 5-6

The Making of the Fittest: Natural Selection in Humans

Content Objectives Write these down!

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

Course: IT and Health

Understanding Variation in Human Skin Color

Crash-course in genomics

UNDERSTANDING VARIATION IN HUMAN SKIN COLOR. Student Handout

1b. How do people differ genetically?

DNA Structure, Function and Replication Teacher Notes 1

Genetics and Heredity Power Point Questions

Algorithms for Genetics: Introduction, and sources of variation

CUMACH - A Fast GPU-based Genotype Imputation Tool. Agatha Hu

Applied Bioinformatics

High-density SNP Genotyping Analysis of Broiler Breeding Lines

Identifying Genes Underlying QTLs

BTRY 7210: Topics in Quantitative Genomics and Genetics

EPIB 668 Genetic association studies. Aurélie LABBE - Winter 2011

Association Mapping. Mendelian versus Complex Phenotypes. How to Perform an Association Study. Why Association Studies (Can) Work

Danika Bannasch DVM PhD. School of Veterinary Medicine University of California Davis

Lecture 2: Biology Basics Continued. Fall 2018 August 23, 2018

Mutation and sexual reproduction produce the genetic variation that makes evolution possible. [2]

A brief introduction to Marker-Assisted Breeding. a BASF Plant Science Company

DNA Structure & the Genome. Bio160 General Biology

Linking Genetic Variation to Important Phenotypes: SNPs, CNVs, GWAS, and eqtls

What is Genetics? Genetics The study of how heredity information is passed from parents to offspring. The Modern Theory of Evolution =

Answers to additional linkage problems.

The Evolution of Populations

COMPUTER SIMULATIONS AND PROBLEMS

THE STUDY OF GENETICS is extremely

SNPs - GWAS - eqtls. Sebastian Schmeier

Lecture 2: Height in Plants, Animals, and Humans. Michael Gore lecture notes Tucson Winter Institute version 18 Jan 2013

The Evolution of Populations

Lecture 2: Biology Basics Continued

Biology 142 Advanced Topics in Genetics and Molecular Biology Course Syllabus Spring 2006

Genome-wide association studies (GWAS) Part 1

CS273B: Deep Learning in Genomics and Biomedicine. Recitation 1 30/9/2016

5/18/2017. Genotypic, phenotypic or allelic frequencies each sum to 1. Changes in allele frequencies determine gene pool composition over generations

Linking Genetic Variation to Important Phenotypes

Reading: 1) Chapter 14 in Biology for a Changing World, by Shuster, Vigna, Sinha, and Tontonoz (2012)

PUBH 8445: Lecture 1. Saonli Basu, Ph.D. Division of Biostatistics School of Public Health University of Minnesota

BUILDING ECOLOGICAL PYRAMIDS WORKSHEET

LD Mapping and the Coalescent

Biology 423L Syllabus 2016 Laboratory Experiments in Genetics

BST227 Introduction to Statistical Genetics. Lecture 3: Introduction to population genetics

Disease-ology: Transforming the teaching of biology using a disease model: Evolution. Jeff Higginbotham. Rockledge High School.

Lecture 23: Causes and Consequences of Linkage Disequilibrium. November 16, 2012

Observing Patterns in Inherited Traits. Chapter 11

Gene Mapping in Natural Plant Populations Guilt by Association

Association Mapping in Plants PLSC 731 Plant Molecular Genetics Phil McClean April, 2010

Genetics of Beef Cattle: Moving to the genomics era Matt Spangler, Assistant Professor, Animal Science, University of Nebraska-Lincoln

Computational Workflows for Genome-Wide Association Study: I

Analysis of genome-wide genotype data

BST227 Introduction to Statistical Genetics. Lecture 3: Introduction to population genetics

Course Justifications for ANSC 650 (DNA and Genetic Analysis) 1. What is the new course?

Part I: Predicting Genetic Outcomes

Anthro 101: Human Biological Evolution. Lecture 3: Genetics & Inheritance. Prof. Kenneth Feldmeier feldmekj.weebly.

CMSC423: Bioinformatic Algorithms, Databases and Tools. Some Genetics

mrna for protein translation

GENIQUEST Guide Part II

Chapter 4.!Extensions to Mendelian Genetics.! Gene Interactions

b. (3 points) The expected frequencies of each blood type in the deme if mating is random with respect to variation at this locus.

GENETICS - CLUTCH CH.20 QUANTITATIVE GENETICS.

Module 20: Population Genetics, Student Learning Guide

Midterm#1 comments#2. Overview- chapter 6. Crossing-over

Single Nucleotide Variant Analysis. H3ABioNet May 14, 2014

Association studies (Linkage disequilibrium)

LAB ACTIVITY ONE POPULATION GENETICS AND EVOLUTION 2017


Controlling DNA. Ethical guidelines for the use of DNA technology. Module Type: Discussion, literature review, and debate

Chapter 23: The Evolution of Populations

AP BIOLOGY Population Genetics and Evolution Lab

Human Genetics and Gene Mapping of Complex Traits

Whole Genome-Assisted Selection. Alison Van Eenennaam, Ph.D. Cooperative Extension Specialist

I.1 The Principle: Identification and Application of Molecular Markers

Variation Chapter 9 10/6/2014. Some terms. Variation in phenotype can be due to genes AND environment: Is variation genetic, environmental, or both?

LECTURE 1 : GENETICS

MSc specialization Animal Breeding and Genetics at Wageningen University

Evolutionary Genetics: Part 1 Polymorphism in DNA

University of York Department of Biology B. Sc Stage 2 Degree Examinations

Pharmacogenetics: A SNPshot of the Future. Ani Khondkaryan Genomics, Bioinformatics, and Medicine Spring 2001

WARM UP. 1. Take out your laptop and Chapter 12 Notes 2. Log in to Google Classroom 3. Wait for me to post the quick quiz!

of heritable factor ). 1. The alternative versions of genes are called alleles. Chapter 9 Patterns of Inheritance

This is DUE: Tuesday, March 1, 2011 Come prepared to share your findings with your group.

Statistical Methods for Quantitative Trait Loci (QTL) Mapping

Tracing Your Matrilineal Ancestry: Mitochondrial DNA PCR and Sequencing

Interest Grabber Notebook #1

INTRODUCTION TO MOLECULAR GENETICS. Andrew McQuillin Molecular Psychiatry Laboratory UCL Division of Psychiatry 22 Sept 2017

Introduction to Add Health GWAS Data Part I. Christy Avery Department of Epidemiology University of North Carolina at Chapel Hill

Chapter 4 (Pp ) Heredity and Evolution

Dna Provides Instructions For Making Which >>>CLICK HERE<<<

Result Tables The Result Table, which indicates chromosomal positions and annotated gene names, promoter regions and CpG islands, is the best way for

Genes and Gene Technology

The joining of a sperm and an egg

Lecture 21: Association Studies and Signatures of Selection. November 6, 2006

Block: Science 10 Biology Biology Review Package. 1. What is the full name for DNA? Label the following diagram on the right. Deoxyriboucleic Acid

MAS refers to the use of DNA markers that are tightly-linked to target loci as a substitute for or to assist phenotypic screening.

Transcription:

MAPPING GENES TO TRAITS IN DOGS USING SNPs OVERVIEW This instructor guide provides support for a genetic mapping activity, based on the dog genome project data. The activity complements a 29-minute lecture by Dr. Elinor Karlsson of the Broad Institute in Cambridge, MA (http://www.hhmi.org/biointeractive/dog-genomics-and-dogs-modelorganisms), in which she discusses genome-wide association studies (GWAS). The activity was created using actual DNA data from dog saliva that was collected and submitted by students who attended HHMI s 2013 Holiday Lectures on Science. The data were obtained and analyzed by Dr. Karlsson and colleagues at the Broad. This activity consists of: A before-class assignment in which students read a press release from the National Institutes of Health (NIH) and answer questions on their student handout. An optional video lecture by Dr. Elinor Karlsson, explaining GWAS. An in-class component that explores how GWAS work by comparing single-nucleotide polymorphisms (SNPs) of dogs with different phenotypes and identifying associations. This part of the activity has a hands-on component that uses SNP Cards. An optional statistical analysis activity that uses chi-square analysis. KEY CONCEPTS AND LEARNING OBJECTIVES Comparing DNA sequences of many individuals reveals common variations across the genome. Some DNA variations may occur more frequently in individuals with one form of a trait than another. Variations associated with a trait can point to the location of the gene (or genes) responsible for that trait. After participating in this activity, students will be able to: Describe the key concepts in the article (and the video) in their own words. Connect the information in the article to the real data contained in the SNP cards. Identify patterns in genomic data. Participate in a discussion of their interpretation of the genomic data. CURRICULUM CONNECTIONS Curriculum Standards NGSS (April 2013) HS-LS-1, HS-LS3-3, Science and Engineering Practices 2 and 4, Crosscutting Concepts 1, 2, and AP (2012 13) Big Idea 3, EU 3.A.3, 3.C.1, and SP 2, 5, and IB (201) 3.2, 3.5, Aims 1, 4, and 5

KEY TERMS nucleotide, base pair, allele, single-nucleotide polymorphism (SNP), genotype, phenotype, genome sequence, genome-wide association study (GWAS) TIME REQUIREMENTS The introductory section on the student worksheet, the NIH press release, and the six readingcomprehension questions that follow can be assigned as homework and should require about 30 to 40 minutes to complete, depending on the student s reading ability. The optional Karlsson video can also be assigned as homework to supplement and enhance the introduction to the topic (key segment is 12 minutes). Analyzing the SNP data will require roughly 20 minutes of in-class time, depending on the amount of class discussion. The optional chi-square statistical analysis will require an additional 20 minutes. SUGGESTED AUDIENCE This activity was designed for all levels of high school biology and biology-related electives (i.e., genetics and biotechnology). It is a great resource for connecting students with the current state of genetic research and could lead to a study of evolution as well, since dogs are an excellent example of artificial selection. In a college classroom, this activity would be appropriate for an introductory biology or genetics course. PRIOR KNOWLEDGE In order to get the most out of this activity, students should know what a gene is and should be familiar with the relationship between genotype and phenotype. It would be helpful for students to understand the relationship between a single gene and the genome. The activity includes an optional chi-square analysis for students with some background or who need further practice with statistical methods. MATERIALS Variants in Three Genes Account for Most Dog Coat Differences NIH press release. http://www.nih.gov/news/health/aug2009/nhgri-27.htm : Dog Genomics and Dogs as Model Organisms, video lecture with Elinor Karlsson (http://www.hhmi.org/biointeractive/dog-genomics-and-dogs-modelorganisms). The key segment is from 1:5 to 12:15 minutes of the lecture. A set of SNP Cards displaying SNP alleles on chromosome 32 in dogs with short and long hair. A set of SNP Cards displaying SNP alleles on chromosome 27 in dogs with curly and noncurly hair.

PROCEDURE AND ANSWERS TO QUESTIONS Part 1: Read and Answer Questions Distribute the student worksheet to students. Ask students to read the NIH press release, Variants in Three Genes Account for Most Dog Coat Differences, and answer the associated questions before class. (Students will return to this article at the end of the activity and analyze some of the data reported in the article.) 1. How many genes account for the wide variety of coat textures in dogs? Three. 2. In two or three sentences, describe how researchers went about identifying these genes. Answers will vary. Researchers conducted genome-wide scans of specific SNPs in thousands of dogs and a variety of breeds. When they analyzed the results, they found that just three genetic variants accounted for the majority of dog coat types. 3. In this reading, why are SNPs referred to as genetic markers or signposts? A singlenucleotide polymorphism (SNP) is a DNA sequence variation that occurs commonly within a population. For example, a SNP may replace the nucleotide adenine (A) with the nucleotide cytosine (C) in a particular stretch of DNA. Researchers know the precise location of that SNP in the genome. Once a SNP is associated with a particular trait (meaning that it occurs more frequently in individuals with that trait), researchers focus on that region of the genome and sequence it. In most cases, the SNP itself does not contribute to the trait, but it is close to the trait-causing stretch of DNA, serving as a marker or signpost for the location. 4. Why do you think it is important to analyze the DNA of many dogs when doing this research? An individual dog will have many variations compared to any other dog. Researchers analyze the DNA of many dogs to find variations that are common in one group compared to another. A variation is not a useful marker if it only occurs in one or a few dogs. Students may also mention that researchers need to look at many dogs to find SNPs associated with one trait and not another. 5. Do humans have SNPs? Yes.. How might the dog genome project benefit humans? Student answers will vary. The most likely answer will indicate that the human genome is very similar to the dog genome. If we can find the genes responsible for all coat types in dogs, we can use the same approach to identify genes involved in complex diseases in humans. Part 2: Introduction to a GWAS Discuss the article and address student questions or points of confusion. Suggested discussion questions: What is the relationship between a complete genome, a chromosome, and a gene? How are SNPs used in a GWAS? How do GWAS allow researchers to find genes that cause disease? What is the difference between a SNP and a mutation?

This part of the activity begins with a simple table exercise to introduce students to a GWAS. In the student worksheet, students examine Table 1, which displays SNPs at 17 different positions in the genome sequences of eight different dogs (four with white fur and four with black fur). Each SNP locus has two nucleotides, one from each chromosome. To determine whether the SNPs in the table are associated with white fur, students compare the SNPs of the black and white dogs and search for patterns in the data. They should answer questions 1-4 in Part 2. Table 1. Nucleotides at 17 Different Loci in Two Groups of Dogs. 1 2 3 4 5 7 8 9 10 11 12 13 14 15 1 17 C C A T C C G G A A T C T T C C G G A A T T G T A G A A C C G G A T C C A T A C G G G G T T T T C C G G A G T T G G A G A G C C G G A T C C A A A C C G G G T T T T C T G G A A T T G T A G A G C C T T A T C C A A A C G G A G T T T T C T G G G G T T G G A G A G C C G T A T C C A T C C C G A A T T A A C T G G A A T T G T A A A A C C G T A T C C A T C C G G A A T T A A T C G G A A T T G G A G A A C C G G T T C C A T C C G G A A T T A A C C G G A A T T G G A A A A C C G T A T C C A A C C G G A G T T A A T C G G G G T T G G A A A A C C T T A T 1. What is the locus number (or numbers) for the 100% associated SNP(s)? 7 2. What is the locus number (or numbers) for the somewhat associated SNP(s)? The somewhat associated SNP are at locus 3, 13, or 14. 3. What is the locus number (or numbers) for the not associated SNP(s)? Any of the loci not mentioned above. 4. What are two possible explanations for why a SNP is associated with a phenotype like coat color? A SNP can be associated with a phenotype because it causes that phenotype (e.g., white fur) or because it is linked to, or located close to, the allele that causes the phenotype.

Part 3: Identify Associations Using Real Data Following that warm-up, students proceed to the card activity to identify the most associated SNPs using two different sets of SNP Cards: one set for coat length (short/long) and one for coat type (curly/straight). Students can work individually or in groups. Distribute the coat-length cards. Give a set of coat-length SNP cards to each student or group. The data displayed on the coat-length card shows SNPs at seven loci on chromosome 32 in 12 dogs with either short or long hair. Note: The chromosome number (32), followed by a seven-digit nucleotide number, indicates the SNP loci. Ask students to shuffle the cards around and then organize them to find associations between certain SNPs and coat-length trait. Give students enough time to examine the cards and write down their answers. Students then follow the directions in their student documents to identify SNPs associated with coat length. Their data should look as follows: Chr32 7420804 T 4 4 0 C 8 8 0 0 Chr32 747220 A 9 12 3 G 3 0 3 Chr32 7473337 T 3 12 9 G 9 0 9 18 Chr32 7479580 T 9 12 3 C 3 0 3 Chr32 748287 A 5 0 5 G 7 12 5 10

Chr32 7490570 T 11 5 C 1 5 10 Chr32 749234 C 3 1 2 G 9 11 2 4 1. Which SNP (or SNPs) is most strongly associated with coat length? Chr32 7473337 2. Which SNP (or SNPs) is least associated with coat length? Chr32 7420804 Distribute the coat-texture cards. Give a set of coat-texture SNP Cards to each student or group. The data displayed on the coattexture card shows SNPs at six loci on chromosome 27 in 10 dogs with either curly or straight hair. Students follow the directions in their Student Worksheet, record their tallies in the charts provided, and determine the strongest association to coat type. Their results should look as shown in the following tables. Chr27 5525002 C 4 5 1 T 5 1 2

Chr27 5541113 C 7 4 3 T 3 3 Chr27 554280 C 4 10 T 0 12 Chr27 5545082 A 4 8 4 G 2 4 8 Chr27 5557298 C 4 2 2 T 8 2 4 Chr27 558947 C 5 1 T 5 4 1 2

3. Which SNP (or SNPs) shows the strongest association to coat texture? Chr27 554280 4. If you were to look for a gene involved in coat length, on which region of the genome would you focus your analysis? What about a gene for coat texture? Write your answer below. Students should indicate that they would look for a gene involved in coat length on chromosome 32, somewhere around nucleotide 7473337. For the gene involved in coat texture, they would look on chromosome 27 near nucleotide 554280. Part 4: Chi-Square Test of Independence Analysis (Optional) It is possible that the frequency of alleles between the two groups of dogs could be different just by chance. On the other hand, if these did not occur by chance, the associations might indicate that the SNPs mark the positions of genes involved in coat length or texture. To determine whether the associations in this activity are statistically significant, students conduct a chi-square analysis. This part of the activity is optional and requires students to be familiar with the chi-square test of independence test. 1. The results are as follows: SNP locus: Chr32 7473337 G alleles = 9 Expected number of G alleles = 4.5 T alleles = 15 Expected number of T alleles = 7.5 2. Complete the table below for the SNP with the strongest association for coat length. Allele Short Coat Observed Long Coat Observed Expected G 9 0 4.5 T 3 12 7.5 3. Chi-square value calculation: (4.5) 2 /4.5 + (4.5) 2 /7.5 + (4.5) 2 /4.5 + (4.5) 2 /7.5 = 4.5 + 2.7 + 4.5 + 2.7 = 14.4 4. Chi-square value 14.4 5. Record the p-value for the SNP with the strongest association: 0.001. What does this p-value tell you? The association is statistically significant. 7. Which SNP has the strongest association for curly versus non-curly coat? Chr27 554280 8. Is this association statistically significant? Yes 9. What is the evidence for your choice? The chi-square value for this locus is 8.5, which is much larger than 3.84.

Part 5: What Are the Genes? Students examine an additional table summarizing data collected on three genes responsible for seven major coat types of purebred dogs. Using the table, students answer questions that guide interpretation of the data. This research example relates to the article that students read at the beginning of the activity. 1. What phenotype is the ancestral allele of the KRT71 gene associated with in dogs? The ancestral allele of KRT71 is associated with the straight phenotype because the dogs with the - allele all have straight fur and with the + allele curly fur. 2. Which breed of dog shown above exhibits the phenotype for a variant allele for FGF5 but an ancestral allele for KRT71? The Golden Retriever and Bearded Collie have a variant FGF5 allele, different from wolves, and the ancestral allele of the KRT71 gene. TEACHING TIPS The lecture by Dr. Karlsson (http://www.hhmi.org/biointeractive/dog-genomics-and-dogs- modelorganisms) is an excellent introduction to GWAS in dogs. You could assign a segment of it as homework or consider showing chapters 2 to 7 (time 1:5 to 12:15 min.) in class, prior to the SNP activity. Note that in the lecture Dr. Karlsson uses the term correlated to discuss the relationship found in the GWAS. This is a colloquial use of the term correlated. In strict statistical sense, correlations describe relationship between two quantitative variables. Here, because the variables are categories, and not numbers, the term that should be used is associated. The Chi-Square Test of Independence is used to test the strength of associations between variables. You may want to clarify this statistical language to your students. Students can work individually or in groups. You will need to print a complete set of SNP cards for each student or small group of students. You could laminate the cards for repeat use. Consider showing students how GWAS data is plotted; an example can be found at http://www.broadinstitute.org/igv/viewing_gwas. GWAS data is often shown as a Manhattan plot with SNPs represented as dots, color-coded by chromosome. The x-axis shows the locations of the SNPs in the genome. The larger the dot and the higher it is on the scale, the more significant the association is with a trait. You may want to discuss with students that, although the human genome was obtained by pooling sequence data from several individuals, the reference dog genome consists of data from just one dog (Tasha). Scientists also sequenced dogs from other breeds and put all the they found compared to Tasha s genome in the public database called dbsnp. [http://www.ncbi.nlm.nih.gov/snp/] Although this activity only shows a few SNPs on two chromosomes, a GWAS uses millions of SNPs across the entire genome and a computer sifts through the data to find associations. This activity uses simple examples that can be done without a computer to demonstrate the concepts involved. When describing the difference between a SNP that causes a trait (causative SNP) and an associated SNP that does not cause a trait, take the opportunity to discuss genetic linkage, which is typically taught during the Mendelian genetics unit. Even if a SNP in a GWAS shows

strong association with a trait, it takes further investigation to see if it is a causative SNP (i.e., in the noncoding or coding region of a gene) or if it s simply closely linked to the causative SNP. After introducing GWAS, you could share scientific papers with students that describe using GWAS in a human genomics study to identify genes that cause disease. You can also point out additional areas of research in which GWAS are used, such as in agriculture. This activity could serve as a lead-in to evolution since dogs are an excellent example of artificial selection. SOCIAL MEDIA SUGGESTIONS Students could write an analysis of the activity as a blog post (on a class blog or individual student blogs). Once posted, the comments feature could be used to facilitate peer review and discussion. As an alternative optional extension project, students could create a Dog Genotype and Phenotype infographic, intended for a veterinarian s office. Free, online infographic creation tools (Easelly or Piktochart) make this sort of brochure or handout easy to create. AUTHORS This activity was written by Melissa Csikari, Colonial Forge High School, Stafford, VA; Elinor Karlsson, Ph.D., Broad Institute of MIT, Cambridge, MA; Laura Bonetta, Ph.D., and Eriko Clements, Ph.D., HHMI. It was edited by Laura Bonetta, Ph.D., HHMI and Robin Heyden, consultant; copyedited by Linda Felaco. Reviewers: David McDonald, Ph.D., North Carolina Central University; Vince Buonaccorsi, Ph.D., Juniata College; Elinor Karlsson, Ph.D., Broad Institute.