Genome-wide association study and gene network analysis of fertility, retained placenta, and metritis in US Holstein cattle

J.B. Cole 1, K.L. Parker Gaddis 2, D.J. Null 1, C. Maltecca 3, and J.S. Clay 4

1 Animal Genomics and Improvement Laboratory, ARS, USDA, Bldg 005, BARC-West, 10300 Baltimore Avenue, Beltsville, MD 20705
2 Council on Dairy Cattle Breeding, 4201 Northview Drive, Bowie, MD 20716, USA
3 Department of Animal Science, College of Agriculture and Life Sciences, North Carolina State University, Campus Box 7621, Raleigh, NC 27695
4 Dairy Records Management Systems, 313 Chapanoke Road, Suite 100, Raleigh, NC 27603

Summary

The objectives of this research were to identify genes, genomic regions, and gene networks associated with three measures of fertility (daughter pregnancy rate, DPR; heifer conception rate, HCR; and cow conception rate, CCR) and two measures of reproductive health (metritis, METR; and retained placenta, RETP) in US Holsteins using producer-reported data. A five-trait mixed model analysis was used to perform a genome-wide association study (GWAS) to identify significant SNP located within 25 kbp of genes in bull and cow predictor populations. Gene ontology (GO) and medical subject heading (MeSH) analyses were used to identify pathways and processes over-represented compared to a background set of all annotated Bos taurus genes. An association weight matrix was used to identify significant associations among genes. GWAS results identified different sets of SNP in the two predictor populations, with SNP affecting protein processing, cell-cell signaling, sex differentiation, and embryonic development. Significant GO and MeSH terms also differed between predictor populations, but terms associated with reproductive processes were identified in both cases.
The degree of nodes in the network analysis did not deviate from expectations, but fertility-related terms were identified, and several of the most-connected genes were associated with male or female fertility and embryo size and morphology in mice or humans, most notably ITPR, SETB, LMNB, NEO, and DGKA. None of the 100 SNP explaining the most variance in the GWAS were among the connected genes in the networks. While this study identified genes, and interactions among them, that are clearly related to fertility, no obvious associations with peripartum reproductive health were found. A more powerful experimental design, such as a case-control study, may be needed to identify relationships between fertility and reproductive tract health.

Keywords: dairy cattle, fertility, health, reproduction

Introduction

Parker Gaddis et al. (2016) recently used single- and multiple-trait genome-wide association studies (GWAS) in all-bull, all-cow, and mixed predictor populations to dissect three fertility traits. Their results showed that gene network analysis was able to identify several important genes that were not identified by ordinary GWAS. The US will soon introduce genetic evaluations for six health traits in Holstein cattle, including retained placenta and metritis as measures of reproductive health (Parker Gaddis et al., 2017). Cows beginning a lactation with retained placenta or metritis have more days open and lower conception rates than cows that do not (e.g., Fourichon et al., 2000). However, it is not known to what degree susceptibility to reproductive tract diseases and cow fertility are influenced by common sets of genes. The objectives of this research were to identify genes, genomic regions, and gene networks associated with three measures of fertility (daughter pregnancy rate, DPR; heifer conception rate, HCR; and cow conception rate, CCR) and two measures of reproductive health (metritis, METR; and retained placenta, RETP) in US Holstein cows.

Materials and methods

Phenotypic and genotypic data

Genomic evaluations for DPR, HCR, and CCR from the December 2016 proofs calculated by the Council on Dairy Cattle Breeding (CDCB; Bowie, MD, USA) were combined with evaluations of METR and RETP calculated from on-farm health event data provided by Dairy Records Management Systems (Raleigh, NC, USA) as described in Parker Gaddis et al. (2014, 2017). Genotypes included 60,671 SNP used in the routine U.S. evaluations. Holstein bull and cow predictor populations were formed by selecting animals whose reliability of predicted transmitting ability (PTA) for lifetime net merit was greater than the reliability of their parent average. All 35,724 bulls in the predictor set were retained, and a random sample of 35,000 cows was drawn from the 2,895 cows in the predictor population. Only animals with PTA for all traits were included in the analysis.

Genome-wide association studies

The five-trait multivariate genome-wide association study (GWAS) used the model:

$$Y = 1_n \mu^\top + x \beta^\top + U + E,$$

where $Y$ is an $n \times 5$ matrix of the five traits for $n$ individuals, $1_n$ is an $n$-vector of ones, $\mu$ is the vector of trait intercepts, $x$ is an $n$-vector of marker genotypes, $\beta$ is a vector of marker effect sizes for the 5 traits, $U$ is an $n \times 5$ matrix of random effects, and $E$ is an $n \times 5$ matrix of errors.
The random effects matrix, $U$, was distributed as $U \sim \mathrm{MN}_{n \times 5}(0, K, V_g)$, where $\mathrm{MN}$ denotes the matrix normal distribution, $K$ is a known relatedness matrix, and $V_g$ is a symmetric matrix of genetic variance components. The error matrix, $E$, was distributed as $E \sim \mathrm{MN}_{n \times 5}(0, I, V_e)$, where $I$ is an identity matrix and $V_e$ is a symmetric matrix of residual variance components (Zhou and Stephens, 2014).

SNP and enrichment analyses

Each autosomal marker was assigned to the closest gene within 25,000 bp using BEDTools version 2.2.0 (Quinlan and Hall, 2010). Gene information was taken from the Bovine UMD3.1 genome assembly (Zimin et al., 2009). After merging with the annotated gene data, 36,435 markers were available for subsequent analysis. SNP whose Wald-test P-value in the five-trait multivariate analysis was smaller than the threshold of $5 \times 10^{-8}$ were selected for further analysis, and gene function was determined by a review of the literature. Gene ontology (GO; Ashburner et al., 2000) and medical subject heading (MeSH; Morota et al., 2015) enrichment analyses were used to compare all SNP with P-values less than 0.05 against a background of all annotated genes in the bovine genome. GO and MeSH term analyses were carried out in R v. 3.4.0 using the GOstats v..5.3 and meshr v..2.0 packages as distributed in Bioconductor v. 3.5.

Gene network construction

An association weight matrix (AWM) was constructed following the procedures previously implemented by Fortes et al. Construction of the AWM started with the selection of relevant SNP from those identified as significant in the association analyses. Each column in the AWM corresponded to a trait, and each row corresponded to a SNP. Each cell in the matrix held the z-score normalized effect size of the SNP for that trait. When more than one SNP was mapped to the same gene, the most significant SNP was retained and the others were dropped. Row-wise partial correlations were computed on the AWM using the PCIT algorithm in R, which produced an $m \times m$ symmetric adjacency matrix. Each cell in the adjacency matrix corresponded to a partial correlation between gene i and gene j. When a partial correlation was not significant, the value in the cell was set to 0. The significant correlations can be interpreted as significant gene-gene interactions. These interactions were used to construct bull (Figure 1) and cow gene networks. To avoid spurious connections, the bull and cow networks were reduced to sub-networks including only connections meeting a partial correlation threshold of 0.98.
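The AWM steps described above can be sketched in a few lines of Python. This is a minimal illustration on made-up numbers, not the authors' pipeline: the record layout, the function names (`build_awm`, `pearson`, `partial_corr`, `adjacency`), the gene labels, and the toy effects are all assumptions. Plain Pearson correlation with a fixed cut-off stands in for the PCIT algorithm, which additionally uses first-order partial correlations and an information-theoretic tolerance to decide which edges to keep; the partial correlation formula is included only to show the quantity involved.

```python
import math

def zscore_columns(matrix):
    """Standardize each column (trait) to mean 0 and sample SD 1.
    Assumes no column is constant (a constant column would divide by zero)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        col = [row[j] for row in matrix]
        mean = sum(col) / n_rows
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n_rows - 1))
        for i in range(n_rows):
            out[i][j] = (matrix[i][j] - mean) / sd
    return out

def build_awm(snp_records):
    """snp_records: iterable of (gene, p_value, effects per trait).
    Keeps the most significant SNP per gene; rows are genes, columns are traits."""
    best = {}
    for gene, p_value, effects in snp_records:
        if gene not in best or p_value < best[gene][0]:
            best[gene] = (p_value, list(effects))
    genes = sorted(best)
    return genes, zscore_columns([best[g][1] for g in genes])

def pearson(a, b):
    """Pearson correlation between two equal-length rows."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def adjacency(awm, threshold=0.98):
    """Symmetric gene-by-gene matrix; correlations below the cut-off are set to 0."""
    m = len(awm)
    adj = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            r = pearson(awm[i], awm[j])
            if abs(r) >= threshold:
                adj[i][j] = adj[j][i] = r
    return adj

# Hypothetical records: (gene, P-value, effects on the five traits).
records = [
    ("GENE_A", 1e-9, [0.5, 0.1, -0.3, 0.2, 0.0]),
    ("GENE_A", 1e-4, [9.0, 9.0, 9.0, 9.0, 9.0]),  # less significant, dropped
    ("GENE_B", 1e-8, [1.0, 0.2, -0.6, 0.4, 0.1]),
    ("GENE_C", 1e-7, [-0.2, 0.3, 0.4, -0.1, 0.5]),
]
genes, awm = build_awm(records)
adj = adjacency(awm)
```

With only five traits each row-wise correlation rests on five values, which is one reason a strict cut-off such as 0.98 is a sensible guard against spurious edges.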
Correlation networks were visualized using Cytoscape version 3.2. (Shannon et al., 2003).

Results and discussion

Genome-wide association studies

There were 43 significant SNP in the bull predictor population, and in the cow population. The five SNP with the largest effects in each population are described in Table 1. There was no clear pattern among gene functions, but developmental, cell-signaling, and protein modification processes were represented in both populations. The top SNP in the bull and cow populations did not overlap.

Table 1. The five SNP with the largest effects in a multivariate analysis using bull and cow genotypes.

| Group | SNP | Chrome | Location | Gene | Function | -log10(P) |
|---|---|---|---|---|---|---|
| Bulls | BTB-0079045 | 20 | 57,373,60 | FBXL7 | Ubiquitination | 44.67 |
| | ARS-BFGL-NGS-6445 | 8 | 48,486,442 | ECH | Fatty acid degradation | 4.43 |
| | ARS-BFGL-NGS-72630 | 6 | 8,87,663 | SORCS2 | Nervous system development | 2.88 |
| | BTB-00259343 | 6 | 62,642,435 | BEND4 | Longevity | 5.02 |
| | Hapmap55409-rs29022997 | 4 | 33,236,485 | CROT | Lipid metabolism | 2.77 |
| Cows | ARS-BFGL-NGS-23066 | 6 | 92,53,394 | CDKL2 | Sex differentiation | 3.26 |
| | BTB-0006275 | | 35,269,426 | EPHB | Cell signaling | 9.08 |
| | BTB-0076697 | 4 | 40,934,520 | SEMA3C | Embryonic development | 8.02 |
| | ARS-BFGL-NGS-33 | 4 | 9,34,42 | UBE3C | Ubiquitination | 7.96 |
| | ARS-BFGL-NGS-36082 | 7 | 55,96,203 | KDM2B | Ubiquitination | 7.45 |

Chrome = chromosome number.

GO and MeSH term enrichment analyses

Significantly enriched GO and MeSH terms for the bull and cow populations are presented in Table 2. Gene ontology terms were taken from the Biological Processes category and identify pathways that involve the activities of many gene products. Bulls were enriched for processes including spermatogenesis and DNA processing, while cows were enriched for a broad array of pathways including embryonic development and gene expression. Medical subject heading terms identify enriched processes based on literature reports. In contrast to the GO terms, several different MeSH terms were identified in bulls, while cows had only two significant terms.

Table 2. Gene ontology (GO) and medical subject heading (MeSH) terms significantly enriched in a multivariate analysis using bull and cow genotypes.

| Group | GO ID (1) | GO term | P-value | MeSH ID (2, 3) | MeSH term | P-value |
|---|---|---|---|---|---|---|
| Bulls | 0006270 | DNA replication initiation | 0.005 | D002970 | Cleavage stage, ovum | 0.004 |
| | 0007288 | sperm axoneme assembly | 0.04 | D003599 | Cytoskeleton | 0.032 |
| | 00566 | maintenance of centrosome location | 0.04 | D00920 | Myofibrils | 0.035 |
| | 902979 | mitotic DNA replication termination | 0.04 | D036 | Spinal cord | 0.035 |
| | 0007283 | spermatogenesis | 0.036 | D04254 | Intracellular space | 0.036 |
| Cows | 2000738 | positive regulation of stem cell differentiation | 0.06 | D002823 | Chorion | 0.034 |
| | 007026 | mitochondrial translational termination | 0.024 | D009092 | Mucous membrane | 0.043 |
| | 2000637 | positive regulation of gene silencing by miRNA | 0.024 | | | |
| | 004870 | embryonic cranial skeleton morphogenesis | 0.039 | | | |
| | 006047 | regulation of posttranscriptional gene silencing | 0.039 | | | |

1 Biological processes (BP) category.
2 Anatomy (A) category.
3 Only two MeSH terms were significantly enriched in the cow population.

Gene networks

Sex-specific gene networks included 824 genes in bulls and 856 genes in cows. Their number of connections (the degree of the vertices induced by the PCIT algorithm) ranged between and,049 in bulls, and and,240 in cows. The two networks shared 39 genes. The number of connections between nodes in biological networks usually follows a power-law distribution. We used a Kolmogorov-Smirnov test to check this assumption, and the null hypothesis that the networks were drawn from a power-law distribution was not rejected. Several of the most-connected genes in the networks were associated with either male or female fertility and embryo size and morphology in mice or humans, most notably ITPR, SETB, LMNB, NEO, and DGKA. None of the 100 SNP explaining the largest amount of variance in the GWAS were among the most connected genes in the networks.

Figure 1. Gene network based on the bull predictor population, constructed using edges meeting the partial correlation threshold of 0.98. Node sizes are proportional to their degree.
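The Kolmogorov-Smirnov check of the power-law assumption mentioned above can be illustrated with a short sketch. The paper does not say how the test was implemented, so everything here is an assumption: the function names, the toy edge list, the choice of a discrete power law truncated to the observed degree range, and the grid-search fit of the exponent. The code computes node degrees and the KS distance only; a P-value would normally come from a parametric bootstrap, which is omitted.

```python
import math
from collections import Counter

def node_degrees(edges):
    """Count connections per node from an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)

def power_law_pmf(alpha, k_min, k_max):
    """Discrete power law, p(k) proportional to k**-alpha on [k_min, k_max]."""
    weights = {k: k ** -alpha for k in range(k_min, k_max + 1)}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}

def fit_alpha(degrees, grid=None):
    """Grid-search maximum-likelihood exponent for the truncated power law."""
    k_min, k_max = min(degrees), max(degrees)
    grid = grid or [1.0 + 0.01 * i for i in range(1, 401)]
    sum_log = sum(math.log(k) for k in degrees)

    def loglik(alpha):
        z = sum(k ** -alpha for k in range(k_min, k_max + 1))
        return -alpha * sum_log - len(degrees) * math.log(z)

    return max(grid, key=loglik)

def ks_distance(degrees, alpha):
    """Largest gap between the empirical and fitted cumulative distributions."""
    k_min, k_max = min(degrees), max(degrees)
    pmf = power_law_pmf(alpha, k_min, k_max)
    counts = Counter(degrees)
    n = len(degrees)
    emp_cdf = model_cdf = d = 0.0
    for k in range(k_min, k_max + 1):
        emp_cdf += counts.get(k, 0) / n
        model_cdf += pmf[k]
        d = max(d, abs(emp_cdf - model_cdf))
    return d

# Hypothetical network: one hub gene plus a single extra edge.
edges = [("hub", f"g{i}") for i in range(1, 7)] + [("g1", "g2")]
deg = node_degrees(edges)
degree_values = sorted(deg.values())
alpha = fit_alpha(degree_values)
d = ks_distance(degree_values, alpha)
```

A small distance means the fitted power law tracks the observed degrees closely, which is the situation in which the null hypothesis would not be rejected.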

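The GO and MeSH over-representation analyses reported earlier rest on a hypergeometric test of whether a term appears among the selected genes more often than expected from the background. The sketch below shows that calculation in isolation. The function name, argument names, and example counts are hypothetical, and it makes no attempt to reproduce options of the GOstats or meshr packages such as conditional testing.

```python
from math import comb

def enrichment_p_value(n_background, n_annotated, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background set
    n_annotated:  background genes carrying the term
    n_selected:   genes in the selected (significant) set
    n_overlap:    selected genes carrying the term
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 background genes, 40 with the term,
# 500 selected genes, 5 of which carry the term.
p = enrichment_p_value(20000, 40, 500, 5)
```

Because many terms are tested at once, P-values from such a test are usually interpreted with some form of multiple-testing control.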
Conclusions

As expected, these analyses identified individual SNP associated with fertility, and enriched pathways also included some fertility terms. Bull- and cow-specific gene networks similarly included genes with known effects on fertility. However, no significant loci had any obvious associations with reproductive tract health as measured by METR and RETP. This may be due to the. A case-control study using paired animals could provide greater power for identifying SNP and coexpression networks associated with both reproductive health and fertility.

Acknowledgments

The authors thank Dairy Records Management Systems (Raleigh, NC) for providing health data. The Council on Dairy Cattle Breeding (Bowie, MD) provided fertility evaluations, and the Cooperative Dairy DNA Repository (Madison, WI) provided genotypes.

List of References

Ashburner, M., C.A. Ball, J.A. Blake, D. Botstein, H. Butler, et al., 2000. Gene Ontology: tool for the unification of biology. Nature Genet. 25:25–29.
Fourichon, C., H. Seegers, and X. Malher, 2000. Effect of disease on reproduction in the dairy cow: a meta-analysis. Theriogenol. 53:1729–1759.
Garrick, D.J., J.F. Taylor, and R.L. Fernando, 2009. Deregressing estimated breeding values and weighting information for genomic regression analyses. Genet. Sel. Evol. 41:55.
Morota, G., F. Peñagaricano, J.L. Petersen, D.C. Ciobanu, K. Tsuyuzaki, et al., 2015. An application of MeSH enrichment analysis in livestock. Anim. Genet. 46(4):381–387.
Parker Gaddis, K.L., D.J. Null, and J.B. Cole, 2016. Explorations in genome-wide association studies and network analyses with dairy cattle fertility traits. J. Dairy Sci. 99(8):6420–6435.
Parker Gaddis, K.L., J.B. Cole, J.S. Clay, and C. Maltecca, 2014. Genomic selection for producer-recorded health event data in US dairy cattle. J. Dairy Sci. 97(5):3190–3199.
Parker Gaddis, K.L., M.E. Tooker, J.R. Wright, J.H. Megonigal, J.S. Clay, et al., 2017. Development of genomic evaluations for direct measures of health in U.S. Holsteins and their correlations with fitness traits. J. Dairy Sci. 100(Suppl. 2):378 (abstr. 377).
Quinlan, A.R., and I.M. Hall, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26(6):841–842.
Servin, B., and M. Stephens, 2007. Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet. 3:e114.
Shannon, P., A. Markiel, O. Ozier, N.S. Baliga, J.T. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker, 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504.
Zhou, X., P. Carbonetto, and M. Stephens, 2013. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet. 9:e1003264.
Zhou, X., and M. Stephens, 2014. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat. Methods 11(4):407–409.
Zimin, A.V., A.L. Delcher, L. Florea, D.R. Kelley, M.C. Schatz, et al., 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10:R42.