Guilherme J. M. Rosa, Natalia de Leon and Artur J. M. Rosa


Guilherme J. M. Rosa, Natalia de Leon and Artur J. M. Rosa. Physiol Genomics 28:15-23. First published Sep 19, 2006; doi: /physiolgenomics

Physiological Genomics publishes results of a wide variety of studies from human and from informative model systems with techniques linking genes and pathways to physiology, from prokaryotes to eukaryotes. It is published by the American Physiological Society.

Physiol Genomics 28: 15-23. First published September 19, 2006; doi: /physiolgenomics

Invited Review

CALL FOR PAPERS: 2nd International Symposium on Animal Functional Genomics

Review of microarray experimental design strategies for genetical genomics studies

Guilherme J. M. Rosa,1 Natalia de Leon,2 and Artur J. M. Rosa3

Departments of 1Dairy Science and 2Agronomy, University of Wisconsin, Madison, Wisconsin; and 3Department of Animal & Range Sciences, South Dakota University, Brookings, South Dakota

Rosa GM, de Leon N, Rosa AJM. Review of microarray experimental design strategies for genetical genomics studies. Physiol Genomics 28: 15-23. First published September 19, 2006; doi: /physiolgenomics

Genetical genomics approaches provide a powerful tool for studying the genetic mechanisms governing variation in complex traits. By combining information on phenotypic traits, pedigree structure, molecular markers, and gene expression, such studies can be used for estimating heritability of mRNA transcript abundances, for mapping expression quantitative trait loci (eQTL), and for inferring regulatory gene networks. Microarray experiments, however, can be extremely costly and time consuming, which may limit sample sizes and statistical power. Thus it is crucial to optimize experimental designs by carefully choosing the subjects to be assayed, within a selective profiling approach, and by cautiously controlling systematic factors affecting the system. Also, a rigorous strategy should be used for allocating mRNA samples across assay batches, slides, and dye labeling, so that effects of interest are not confounded with nuisance factors. In this presentation, we review some selective profiling strategies for genetical genomics studies, including the selection of individuals for increased genetic dissimilarity and for a higher number of recombination events.
Efficient designs for studying epistasis are also discussed, as well as experiments for inferring heritability of transcriptional levels. It is shown that solving an optimal design problem generally requires a numerical implementation and that the optimality criteria should be intimately related to the goals of the experiment, such as the estimation of additive, dominance, and interacting effects, localizing putative eQTL, or inferring genetic and environmental variance components associated with transcriptional abundances.

optimal design; selective phenotyping; transcriptional profiling; gene expression; expression quantitative trait loci

MODERN TECHNIQUES BEING USED to unravel the genetic mechanisms governing variation in complex traits combine information on phenotypic traits, family (or pedigree) structure, molecular markers, and gene expression, and are generally referred to as genetical genomics, or quantitative genomics approaches (11, 22, 31, 42). For example, the transcriptional activity of genes, assessed by microarray experiments in genotyped individuals, has been treated as multiple phenotypic traits such that traditional quantitative trait locus (QTL) analysis has been used to search for polymorphisms associated with gene expression variability (the so-called expression QTL, or eQTL). Applications of such methodology can be found, for example, in Brem et al. (4), Hubner et al. (20), Morley et al. (38), Schadt et al. (45), and Yvert et al. (54).

Results of eQTL studies can be represented as in Fig. 1, in which a fictitious chromosome is depicted on both axes, with the molecular marker locations (10 markers) placed on the horizontal axis and the genes probed in the microarray slide (20 genes) placed on the vertical axis. For each gene, a genome scan is performed, and the significant QTL are represented by a dot (location point estimate) and a horizontal segment (location confidence interval).
In genetical genomics studies hundreds of markers (in multiple chromosomes) and thousands of genes are generally considered in a single experiment. For an example of presentation of eQTL mapping results see Bing and Hoeschele (3) and Lan et al. (32), among others.¹

Article published online before print. See web site for date of publication.

Address for reprint requests and other correspondence: G. J. M. Rosa, 460 Animal Science Bldg., 1675 Observatory Dr., Univ. of Wisconsin - Madison, Madison, WI (grosa@wisc.edu).

¹ The 2nd International Symposium on Animal Functional Genomics was held May 16-19, 2006 at Michigan State University in East Lansing, MI, and was organized by Jeanne Burton of Michigan State University and Guilherme J. M. Rosa of University of Wisconsin-Madison (see meeting report by Drs. Burton and Rosa, Physiol Genomics 28: 1-4, 2006).

Copyright 2006 the American Physiological Society

Fig. 1. Representation of results of a fictitious genetical genomics study. A single chromosome is depicted, having chromosomal locations of molecular markers and microarray genes represented on the x- and y-axes, respectively.

In the illustrative example of Fig. 1, five genes (denoted $g_1$ to $g_5$) present at least one significant QTL affecting their transcriptional levels. Some eQTL locations coincide with the region where the gene whose expression is being studied resides, such as the eQTL found for gene 1. Gene 1 as well as its eQTL are located around marker 3. This indicates that the transcriptional activity of gene 1 may be partially modulated by polymorphisms on gene 1. This process is referred to as cis-acting. Some other eQTL, however, are located elsewhere in the genome, denoting polymorphisms on specific loci contributing to variation in the expression of genes in different regions of the genome. This process, referred to as trans-acting, can be studied to understand how genes interact and how they cluster in gene networks.

Throughout this paper (to facilitate the discussion on the advantages and disadvantages of alternative experimental strategies) the terms epistasis and trans-acting effects are used to represent two variants of gene interaction. Epistasis refers to the classical definition of the joint effect of alleles in two or more segregating loci (i.e., how the combined effect of genotypes in multiple loci differs from the sum of each genotype effect alone) and how it contributes to variation in phenotypes (which may also represent transcriptional activity of genes). Epistasis is then defined similarly to the statistical interaction among two or more factors (8), but with factors and their levels represented by loci and their genotypes, respectively.
Epistasis involving two biallelic loci can then be factored into specific components such as additive × additive, additive × dominance, dominance × additive, and dominance × dominance interactions; higher order terms can also be studied if more than two loci are considered. Trans-acting effects, on the other hand, represent the effect of a specific polymorphism on the transcriptional abundance of another gene, which may not even be polymorphic. The trans-acting effect of a biallelic locus [e.g., single nucleotide polymorphism (SNP)] on the expression of a specific gene can be factored into additive and dominance trans-acting components, similarly to any quantitative phenotype.

Genetical genomics studies provide valuable information regarding gene interactions (both epistatic effects and trans-acting factors), gene allelic variants responsible for their own accentuated or attenuated transcriptional activity (cis-acting factors), and eQTL hot-spots (chromosomal regions affecting expression of multiple genes, within a pleiotropic context, e.g., the region between markers 3 and 4 in Fig. 1, which is found to be associated with the transcriptional activity of genes 1, 2, 3, and 5). This information can be combined with QTL analysis of phenotypic traits (such as economically important agricultural traits or human disease-related traits). Such information helps us further our understanding of the genetic complexity underlying variation of such traits, and is useful for the generation of candidate genes for target pharmaceutical drug development and for the selection of molecular markers to be used in marker-assisted breeding programs in agriculture. In addition to eQTL mapping, genetical genomics approaches have also been used to study whole genome trans-acting effects of specific loci, such as candidate genes or transgenes (40), to estimate heritability of mRNA transcript abundances (13, 18, 37) using information on related subjects, and to infer regulatory gene networks (3, 6, 9, 35, 55).
For a review of available methods for the statistical analysis of genetical genomics data see, for example, Alberts et al. (1), Carlborg et al. (7), Kadarmideen et al. (25), Kendziorski and Wang (28), and Rosa et al. (44).

As genetical genomics studies involve expensive and labor-intensive throughput laboratory techniques (50), such as SNP scoring and gene expression profiling using microarray and/or quantitative reverse transcription polymerase chain reaction (qRT-PCR), careful experimental design of such trials is critical for their success. In the following sections, we discuss some design strategies for genetical genomics studies, using simple language and avoiding excessive mathematical formalism, and provide some general design guidelines related to different experimental goals, such as the comparison of expression levels of different genotypic groups for candidate genes, eQTL mapping, and the estimation of heritabilities of mRNA transcript abundances.

DESIGN OF GENETICAL GENOMICS STUDIES

The planning of a genetical genomics study entails a variety of aspects. As in any QTL mapping experiment it requires, for example, the choice of breeds or lines to be used, as well as the experimental design, such as backcross (BC), F2, granddaughter design, etc. In addition, gene expression studies involve careful thought regarding the cell type(s), tissue(s), and the developmental stage(s) to be assayed. Specifically with respect to gene expression microarray experiments, researchers are also faced with the questions of which microarray platform best suits their specific experiment goals and how many slides should be considered for the desired experiment efficiency.
In addition, after the choice of a microarray platform and the number of slides to be used, researchers should still decide the subjects to be assayed, as well as how to pair samples within slides and how to assign dye labeling (e.g., Cy3 and Cy5 dyes), in the case of two-color technologies, and how to organize the hybridizations across assay batches.

The choice of microarray platform is generally guided by the availability of alternative technologies for the organism being studied. For example, cDNA microarrays are available only for species from which expressed sequence tags (ESTs) were obtained from cDNA libraries. Likewise, slides using oligonucleotide probes can be generated only if DNA sequence is available, and an optimal probe set can be designed only for those species with completely sequenced genomes. Under these circumstances, a broader range of commercial and homemade array platforms is generally available for experiments involving human subjects or model organisms, as opposed to most livestock or wildlife species. In addition, given the current costs associated with microarray hybridization experiments, the size of the experiments (i.e., number of slides considered) is most often dictated by a budget constraint. Therefore, in this paper we focus our discussion on two specific statistical issues of the experimental design of genetical genomics studies: first, the subset selection of individuals for gene expression assaying (also known as selective phenotyping, or selective profiling), and second, the microarray experimental set-up, especially when making use of two-color systems, such as cDNA or long oligo arrays.

Selective Profiling

In the past, genotyping costs used to limit the sample sizes of gene mapping studies. To overcome this problem and increase the power of such studies for a fixed number of experimental units, selective genotyping approaches have been considered (2, 12, 33, 34). In genetical genomics studies, phenotyping can also be extremely expensive because of the high costs of gene expression profiling via microarrays. Thus measuring gene expression for only a subset of available individuals is a natural strategy for reducing the cost of eQTL mapping experiments.
In the next subsections we discuss alternative selective profiling approaches for different experimental goals of genetical genomics trials.

Genetic dissimilarity. A proposed strategy for selective phenotyping uses marker information for subset selection for increased genetic dissimilarity (23), with the goal of maximizing the power for QTL detection. The procedure compares the marker genotypes of all subjects available (the so-called full mapping panel) and uses an algorithm based on the experimental design concept of minimum moment aberration (MMA) to select a subsample of individuals to be phenotyped. The procedure may either consider information on all available markers or alternatively target specific regions of the genome thought to be important for the trait(s) of interest. Given m markers, MMA measures similarity for a subsample of n individuals as the average of all pairwise similarities, given by:

$$K = \frac{1}{p} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} s_{ij},$$

where $s_{ij}$ is the similarity measurement between individuals i and j, and $p = n(n-1)/2$ is the total number of pairs of individuals. Jin and colleagues (23) used the number of alleles two individuals share (0, 1, or 2) as a measure of similarity, so that $s_{ij}$ is the sum of the number of alleles in common over all markers considered. To allow comparisons across experiments of different sizes, the authors considered a standardized version of the similarity measure, called score ($S_n$), given by:

$$S_n = \frac{M - K}{R},$$

where M is the maximum possible value of K, and R is the difference between the maximum and the minimum possible values. Jin and collaborators (23) used a simulation study to illustrate the increase in the efficiency of QTL detection when using this selective phenotyping approach, compared with a random sample from the mapping panel.
An F2 experiment was considered, with a single chromosome and evenly spaced (10 cM) markers, and a single QTL at several heritability levels (including $h^2 = 0.50$). Varying mapping panel sizes (N), sample sizes (n), and proportions of individuals selected (10-90%) were assessed as well. Their results show that, for a fixed subsample size (n = 50), the sensitivity (i.e., the percentage of simulation runs in which the QTL was detected) increased with mapping panel size, leveling off when the proportion of selected subjects reached 50% (for a specific value of $h^2$). In a situation with a fixed mapping panel (N = 100), as the proportion of subjects selected increased, the sensitivity improved much faster when using the selective phenotyping approach than with a random sample from the mapping panel. But again there was not much improvement in sensitivity beyond 50% subsampling, especially for higher heritability scenarios, suggesting that most of the information needed for QTL detection is retained with 50% selective phenotyping.

As discussed by these authors, their genetic dissimilarity criterion tends to select individuals that are predominantly homozygous for different alleles. For example, in an F2 mapping panel population originated from inbred lines A and B, a 1:2:1 ratio of A:H:B genotypes is expected. Their genetic dissimilarity approach samples from the mapping panel such that a 1:1 ratio between homozygous individuals in the subsample is favored. This procedure, however, is recommended only if the focus of the experiment is on additive effects. While additive effects are usually considered the most important and most prevalent of all (23), they are also the easiest ones to detect. For estimating more complex gene action effects, other genotypes are required in the subsample, so alternative selection criteria should be considered instead. For example, heterozygous individuals are required in the subsample if one wants to infer dominance effects as well.
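As a minimal sketch of the similarity measure described above, the following Python snippet computes the average pairwise similarity K and the standardized score for a candidate subsample. The genotype coding (number of A alleles per marker: 2, 1, or 0), the function names, and the brute-force search over subsets are assumptions made for illustration only; they are not the algorithm of Jin and colleagues (23), and the values of M and R have to be supplied by the user.

```python
from itertools import combinations


def pair_similarity(g1, g2):
    """s_ij: alleles shared by two individuals, summed over markers.

    Genotypes are coded as the number of A alleles (AA = 2, H = 1, BB = 0),
    so the number of shared alleles at one marker is 2 - |g1 - g2|.
    """
    return sum(2 - abs(a - b) for a, b in zip(g1, g2))


def mean_similarity(genotypes):
    """K: average of s_ij over all p = n(n - 1)/2 pairs of individuals."""
    pairs = list(combinations(genotypes, 2))
    return sum(pair_similarity(g1, g2) for g1, g2 in pairs) / len(pairs)


def score(k, k_max, k_range):
    """Standardized score (M - K) / R, with M and R supplied by the caller."""
    return (k_max - k) / k_range


def select_subsample(panel, n):
    """Exhaustive search for the n individuals with the smallest K.

    Only feasible for tiny panels; used here purely for illustration.
    """
    return min(combinations(panel, n), key=mean_similarity)


# Hypothetical mapping panel: four F2 individuals genotyped at three markers.
panel = [(2, 2, 2), (1, 1, 1), (0, 0, 0), (2, 1, 1)]
chosen = select_subsample(panel, 2)
k_chosen = mean_similarity(chosen)

# For a pair of individuals and three markers, K ranges from 0 to 6,
# so M = 6 and R = 6 in this toy case.
s_chosen = score(k_chosen, k_max=6, k_range=6)
print(chosen, k_chosen, s_chosen)
```

In this toy panel the search picks the two opposite homozygotes, which agrees with the tendency of the dissimilarity criterion to favor individuals that are homozygous for different alleles.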
Jin and colleagues (23) suggested that similarity could be defined as 1 for the same genotype and 0 for different genotypes at each marker if interest refers to general QTL effects. Moreover, the authors indicated that their MMA criterion, which corresponds to the first moment (or mean) of the similarity measure across the individuals, optimizes selection for nonepistatic effects. The second moment (or variance) would further optimize for epistatic QTL. The MMA criterion is conceptually simple and easy to implement, but its current theory relies on complete data and independent factors. For dealing with missing genotypes, Jin and collaborators (23) used a data imputation approach using the Haldane mapping function and the information on flanking markers. To minimize the correlation from genetic linkage, the authors suggest the selection of widely spaced markers for computing similarity measures.

It is important to mention that classical interval mapping approaches may produce biased estimates for QTL effects when selective genotyping is considered (33). By contrast, as demonstrated by Jin and colleagues (23), interval mapping is robust against selective phenotyping, meaning that inference obtained by analyzing only the selected subjects is representative of the whole population. Finally, the authors suggest that a two-stage selective phenotyping could considerably reduce the cost and increase the power of large experiments: a first stage of genome-wide selection could identify promising genomic regions, which would then be used for marker-based selective phenotyping in a second stage. Applications of the selective phenotyping approach proposed by Jin and collaborators (23), in the context of genetical genomics, can be found for example in Lan and colleagues (32).

Genetic complementarity. As discussed above, increasing genetic dissimilarity maximizes the power only for detecting additive effects. Important nonadditive effects, which are generally more difficult to estimate, may be missed if not specifically targeted when designing the experiment. To illustrate this concept, consider a situation (Fig. 2) with a single locus and three possible genotypes (homozygotes A and B and the heterozygote H), for which the expected phenotypic values are represented, respectively, by $\mu_A$, $\mu_B$, and $\mu_H$. The additive effect is defined as the difference between $\mu_A$ or $\mu_B$ and the average ($\mu$) between $\mu_A$ and $\mu_B$ or, equivalently, as half the difference between $\mu_A$ and $\mu_B$. The dominance effect is defined as the difference between $\mu_H$ and $\mu$. So, it is clear that with information on both homozygous groups one can calculate the additive effect, but the dominance effect can be computed only with information on the three genotypic groups.
In practice, however, the phenotypic means ($\mu_A$, $\mu_B$, and $\mu_H$) are unknown so they need to be inferred from experimental data. A linear model for the analysis of such data can be expressed as:

$$y_{ij} = \mu_i + e_{ij} = \mu + G_i + e_{ij},$$

where $y_{ij}$ represents the phenotypic observation on replication j of genotype i (with $j = 1, \ldots, n_i$; i = A, B, and H; and $n_i$ being the sample size for genotype i), $\mu$ is a general constant (defined here as the average between the expected phenotypic values of the homozygous genotypes), $G_i$ is the effect of genotype i, and $e_{ij}$ is a residual term associated with the observation $y_{ij}$. The residuals are generally assumed normally distributed with mean 0 and variance $\sigma^2$, representing polygenic and environmental factors affecting the phenotype. Note that in genetical genomics the phenotype can be the transcriptional abundance (often log transformed) relative to either the polymorphic locus A/B or any other locus of interest. An analysis of variance could be considered to study the effect of genotype, as well as to estimate the residual variance $\sigma^2$. An estimate of the additive effect can be obtained by a linear contrast involving the averages of the homozygous groups, i.e., $\hat{\alpha} = (\hat{\mu}_A - \hat{\mu}_B)/2 = (\bar{y}_A - \bar{y}_B)/2$. For estimating the dominance effect the contrast must also involve the heterozygous group average, as $\hat{\delta} = \hat{\mu}_H - (\hat{\mu}_A + \hat{\mu}_B)/2 = \bar{y}_H - (\bar{y}_A + \bar{y}_B)/2$.

Fig. 2. Expected phenotypic value ($\mu_1$, $\mu_2$, and $\mu_3$) given the genotype (A, H, and B) at a specific locus. Additive and dominance effects are indicated by the Greek letters $\alpha$ and $\delta$, respectively.

Fig. 3. Variance of estimates of additive (solid line) and dominance (dashed line) effects with varying proportions of heterozygous individuals (A) and with varying proportion of each homozygous group (B). In A, a similar proportion of each homozygous group is considered, and in B the proportion of heterozygous individuals is held constant at 0.5.
It can be shown that the variances of $\hat{\alpha}$ and $\hat{\delta}$ are, respectively, $(1/r_A + 1/r_B)\,\sigma^2/(4N)$ and $(4/r_H + 1/r_A + 1/r_B)\,\sigma^2/(4N)$, where $r_i$ is the proportion of individuals with genotype i and $N = n_A + n_B + n_H$. Figure 3A depicts the variance of the estimates of the additive and dominance effects as a function of the proportion of heterozygous individuals ($r_H$) in the sample. The same proportion of each homozygous group (A and B) is considered, i.e., $r_A = r_B = (1 - r_H)/2$. It is seen that the variance for additive effects is always smaller than that for the dominance effects. Also, the variance for additive effects is minimized when $r_H$ is small, as proposed by the selection criterion discussed by Jin and colleagues (23). On the other hand, the variance for the dominance effects increases sharply as the values of $r_H$ decrease, and it goes to infinity as $r_H$ approaches zero (i.e., the dominance effects simply cannot be estimated if there are no H individuals in the sample). Note also that the variances of both the additive and the dominance effects increase sharply as $r_H$ approaches 1, indicating the obvious conclusion that neither additive nor dominance effects can be estimated if only H individuals are represented in the sample.

In Fig. 3B, the proportion of H individuals is fixed to its optimal value $r_H = 0.5$ (i.e., the proportion that minimizes the variance for dominance effects), and the proportion of each homozygous group is changed. It is seen, as expected, that the best scenario refers to a situation with a balance between the homozygous groups, i.e., $r_A = r_B = 0.25$. But more importantly, note that the variance for the additive effects is always smaller than that for the dominance effects, even when there is strong imbalance between the A and B groups.

With special interest in dominance effects, Keller and collaborators (26) and Piepho (41) discussed an alternative selective profiling criterion that favors a 1:2:1 ratio of A:H:B genotypes in the subsample. It is shown that even with an A:H:B ratio of 1:2:1 (i.e., the ratio that maximizes the precision of dominance effects estimates), the variance of the additive effect estimate for a specific locus is still half the variance of its dominance effects estimate (Fig. 3A). The authors, however, considered a situation with two inbred lines and their hybrids, such that only two haplotype configurations are possible for each chromosome.
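The variance expressions above are easy to explore numerically. The following Python sketch evaluates them over a grid of heterozygote proportions, assuming balanced homozygous groups, $\sigma^2 = 1$, and N = 1 (arbitrary scaling choices that do not affect where the minima fall). The function names are hypothetical, and the unweighted sum of the two variances is used only as one illustrative way of weighting additive and dominance effects equally.

```python
def var_additive(r_a, r_b, sigma2=1.0, n_total=1):
    """Variance of the additive-effect estimate: (1/r_A + 1/r_B) * sigma^2 / (4N)."""
    return (1 / r_a + 1 / r_b) * sigma2 / (4 * n_total)


def var_dominance(r_a, r_h, r_b, sigma2=1.0, n_total=1):
    """Variance of the dominance-effect estimate: (4/r_H + 1/r_A + 1/r_B) * sigma^2 / (4N)."""
    return (4 / r_h + 1 / r_a + 1 / r_b) * sigma2 / (4 * n_total)


def balanced(r_h):
    """Proportions (r_A, r_H, r_B) with equal homozygous groups."""
    r_hom = (1 - r_h) / 2
    return r_hom, r_h, r_hom


def dominance_variance(r_h):
    r_a, r_h, r_b = balanced(r_h)
    return var_dominance(r_a, r_h, r_b)


def summed_variance(r_h):
    # Illustrative combined criterion (an assumption): additive plus dominance variance.
    r_a, r_h, r_b = balanced(r_h)
    return var_additive(r_a, r_b) + var_dominance(r_a, r_h, r_b)


# Grid of heterozygote proportions strictly between 0 and 1.
grid = [i / 1000 for i in range(1, 1000)]

best_for_dominance = min(grid, key=dominance_variance)  # 0.5
best_for_sum = min(grid, key=summed_variance)           # 0.414

# Ratio of additive to dominance variance under a 1:2:1 sample.
r_a, r_h, r_b = balanced(0.5)
ratio_121 = var_additive(r_a, r_b) / var_dominance(r_a, r_h, r_b)  # 0.5
print(best_for_dominance, best_for_sum, ratio_121)
```

With these assumptions the grid search recovers $r_H = 0.5$ as the proportion that minimizes the dominance variance, and the 1:2:1 sample gives an additive variance equal to half the dominance variance, as stated in the text.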
Under these circumstances, it is not possible to determine if a differential gene expression observed across the three genotypic groups for any specific gene is due to the allelic variation in that gene (cis-acting) or to allelic variation at other loci (trans-acting) of the genome or to a combination of such effects. For example, consider an experiment to compare the transcriptional activity of three genes (1, 2, and 3) between two lines (A and B) with genotypes $A_1A_1A_2A_2A_3A_3$ and $B_1B_1B_2B_2B_3B_3$ and their hybrid $A_1B_1A_2B_2A_3B_3$. If a higher transcriptional abundance of gene 1 is observed for individuals with genotype $A_1B_1$ compared with the average of the two homozygous genotypes ($A_1A_1$ and $B_1B_1$), it may be due to a cis-acting dominance effect on gene 1, as well as to trans-acting dominance effects of genes 2 or 3, such as transcriptional factors or regulatory effects. Keller and collaborators (26) and Piepho (41) use the terms heterosis and dominance interchangeably when referring to the overexpression of genes in H individuals compared with the average of the parent lines A and B. We consider heterosis the more appropriate term in this case because, as discussed above, one cannot ensure whether the overexpression of a specific gene is due to any specific locus (or small set of loci) or to polygenic effects. The only way an experiment could provide information to disentangle these effects would be by allowing loci to recombine, such as by carrying the crosses to at least an F2 generation. More generations may be necessary to increase the probability of recombination among closely linked loci. In any event, a selective profiling criterion may be used to select individuals carrying desired allelic combinations across target loci. A general approach in this regard was proposed by Bueno and colleagues (5).
Their selective phenotyping criterion is based on what is termed here genetic complementarity, in which the subset selection of subjects depends on the goal of the experiment. For example, if a central goal of an F2 line cross experiment is to infer trans-acting dominance effects of a candidate gene, the selection criterion will tend to sample (similarly to the above) a subset of subjects for a 1:2:1 ratio of A:H:B genotypes. Conversely, if both additive and dominance effects are equally sought, then the subset selection will tend toward a 0.293:0.414:0.293 ratio of A:H:B genotypes. It is important to notice that ideally (but not necessarily) the candidate gene(s) should be in linkage equilibrium with other genes probed in the microarray slide. Complete linkage disequilibrium, on the other hand, leads the effects to be confounded, as discussed above for the case with inbred lines.

Bueno and colleagues (5) also discussed situations with multiple loci, including epistatic effects, i.e., the combined effect of alleles in two or more loci on the expression of another locus. It is shown that if only additive and additive × additive effects are to be estimated, an experiment involving K biallelic loci will correspond to a factorial of the series $2^K$. If dominance effects and interactions (epistasis) involving dominance effects are sought as well, a $3^K$ factorial experiment should be considered. Simpler experiment layouts can be utilized if epistatic effects are not of interest, but more complicated experimental scenarios (such as fractional factorial structures) may be necessary depending on the number of loci and the number of microarray slides considered, as well as the genetic material available (e.g., some specific allelic combinations across multiple loci may be absent due to rare allelic frequencies or to low recombination rates between closely linked loci).

Recombination rates. The genetic complementarity approaches discussed above consider situations with candidate genes (5), or when subjects belong to a few possible genotypic groups, such as inbred lines and F1 (26, 41). In either case, there is no need to estimate the location of eQTL. Many genetical genomics experiments, however, relate to eQTL mapping studies, such as the example depicted in Fig. 1. In such situations, interest refers not only to detection of eQTL, but also to localizing those putative eQTL, as well as to estimating their cis- or trans-acting effects. The genetic dissimilarity methodology suggested by Jin and collaborators (23) for eQTL mapping maximizes the power of detection of eQTL with additive effects but may not be optimal for inferring the location of those eQTL. For example, consider a doubled haploid (DH) experiment with three linked, ordered loci. In such a situation, a pair of nonrecombinant individuals (i.e., individuals with genotypes $A_1A_1A_2A_2A_3A_3$ and $B_1B_1B_2B_2B_3B_3$) has the same genetic dissimilarity value as a pair of double recombinant individuals (i.e., individuals with genotypes $A_1A_1B_2B_2A_3A_3$ and $B_1B_1A_2A_2B_3B_3$). While these two pairs of individuals have the same amount of information regarding additive effects of putative eQTL within the chromosomal segment between loci 1 and 3, only the second pair has information regarding the number of eQTL (one vs. two), as well as the position of such eQTL.

With the goal of maximizing the efficiency of localizing eQTL, de Leon and Rosa (15) proposed a selective phenotyping criterion to maximize the number of recombination events in the subset sample. In a simulation study involving a DH experiment with 10 evenly spaced markers and a single QTL, and with different mapping panel sizes and subsampling rates, the authors concluded that selective profiling based on recombination rates substantially improved the precision of the QTL position estimates (even compared with a genetic dissimilarity selective criterion focused on markers near the QTL), with no sizeable detrimental effect on either the detection power or the precision of inferences regarding the QTL effect. Similar results were presented by Jannink (21) and Xu and colleagues (52), who performed even broader simulations, with varying marker spacing, map length, and number of QTL.

More general methodologies of selective phenotyping based on recombination rates were proposed by Jannink (21) and Xu and collaborators (52). Their approaches favor not only an increased number of recombination events in the subsample, but also an even distribution of recombinations across the genome. With use of such procedures, a pair of individuals with genotypes $A_1A_1B_2B_2B_3B_3$ and $B_1B_1B_2B_2A_3A_3$ (i.e., two individuals showing recombinations in different chromosomal regions) would be preferred over a pair of individuals with genotypes $A_1A_1B_2B_2B_3B_3$ and $B_1B_1A_2A_2A_3A_3$ (i.e., individuals with recombinations observed only for one of the chromosomal segments).
Xu and colleagues (52) used as the objective function the so-called sum of squares of bin lengths (SSBL), where a bin is defined on a sample of individuals as an interval along the linkage group within which there are no crossovers in any sampled individual, bounded on each side either by a crossover in at least one individual or by the end of the linkage group. Minimizing the SSBL yields a sample of individuals in which crossovers are more frequent and the distances between them are less variable. Alternatively, Jannink (21) proposed a selective profiling criterion (coined unirec) based on the overall sum of $d_{ij} = c_{ij}/m_i$ (across subjects and marker intervals), where $c_{ij} = 1$ if progeny $j$ is recombinant in marker interval $i$ (and $c_{ij} = 0$ otherwise), and $m_i$ is the map distance (cM) between the markers flanking interval $i$. In addition, the author considered a selection criterion (called maxrec) based on the number of recombinations only (i.e., the overall sum of the $c_{ij}$ values, without weighting them by map distances), like the one discussed by de Leon and Rosa (15). Both Xu and collaborators (52) and Jannink (21) concluded that selective profiling based on the number (and distribution) of recombinations significantly increased the accuracy of QTL position estimates, especially for smaller genetic map lengths.

The selective profiling strategies discussed above improve eQTL mapping efficiency in different ways compared with a random sample from the full mapping panel. The approaches of Bueno et al. (5), Jin et al. (23), Keller et al. (26), and Piepho (41) maximize the power of detection of specific QTL effects, while the procedures suggested by de Leon and Rosa (15), Jannink (21), and Xu et al. (52) improve the QTL mapping resolution.
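The recombination-based criteria described above can be sketched for a small DH sample. This is a minimal illustration under stated assumptions, not the authors' implementations: genotypes are strings with one letter per ordered marker, marker positions are given in cM, a crossover observed in a marker interval is placed at the interval midpoint when forming bins (an assumption made here only to obtain bin boundaries), and the unirec-like score is simply the sum of c_ij/m_i as described in the text. All function names are hypothetical.

```python
def crossover_matrix(sample):
    """c[i][j] = 1 if individual j is recombinant in marker interval i, else 0."""
    n_intervals = len(sample[0]) - 1
    return [[int(g[i] != g[i + 1]) for g in sample] for i in range(n_intervals)]


def maxrec_score(sample):
    """Total number of recombinations (sum of c_ij, unweighted)."""
    return sum(sum(row) for row in crossover_matrix(sample))


def unirec_like_score(sample, positions):
    """Sum of c_ij / m_i, with m_i the length (cM) of marker interval i."""
    total = 0.0
    for i, row in enumerate(crossover_matrix(sample)):
        m_i = positions[i + 1] - positions[i]
        total += sum(row) / m_i
    return total


def ssbl(sample, positions):
    """Sum of squares of bin lengths; crossovers assumed at interval midpoints."""
    breaks = [positions[0]]
    for i, row in enumerate(crossover_matrix(sample)):
        if any(row):  # at least one sampled individual recombined here
            breaks.append((positions[i] + positions[i + 1]) / 2)
    breaks.append(positions[-1])
    return sum((b - a) ** 2 for a, b in zip(breaks, breaks[1:]))


positions = [0.0, 10.0, 20.0]          # three markers, 10 cM apart (toy map)
different_segments = ["ABB", "BBA"]    # recombinations in different intervals
same_segment = ["ABB", "BAA"]          # both recombinations in the first interval

for label, sample in [("different segments", different_segments),
                      ("same segment", same_segment)]:
    print(label, maxrec_score(sample),
          unirec_like_score(sample, positions), ssbl(sample, positions))
```

In this toy case with equally spaced markers, the two recombination sums tie for both pairs, while the SSBL is smaller (150 vs. 250 cM²) for the pair whose recombinations fall in different segments, which is the pair the text says should be preferred.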
No research, however, has been published on ways to combine both strategies, i.e., to increase the number and homogeneity of recombinations in the subsample while favoring specific genotypic proportions across the selected individuals. Additional research and simulations in this area would be welcome to further improve the benefits and flexibility of selective phenotyping approaches in genomics research, especially for situations in which a large genotyped population is available [such as with recombinant inbred lines (RIL)] and the phenotypic assays are onerous and costly (such as with transcriptional and translational output assays).

Covariance among subjects. A completely distinct research goal that has been considered in the genetical genomics literature relates to inferring variance components and heritability of transcriptional activity (18, 37). In these circumstances, treatments (which may refer either to family structures or to related individuals in a complex pedigree) are considered as random effects. A selective phenotyping criterion for these cases should take into account the genetic relatedness among the available subjects to maximize the precision of the variance component or heritability estimates. Bueno and collaborators (5) discussed an algorithm for a subselection procedure with random treatments and presented some examples involving half sibs, full sibs, and complex pedigrees. The best designs are shown to be very specific to each pedigree and estimation objective. An R function was developed for finding optimal designs with any covariance structure among treatments.

Microarray Experiment Layout

After a set of subjects has been selected for gene expression profiling, another important step in the experimental design is required, especially for two-color microarray platforms.
Specifically, a microarray experiment layout should be optimized regarding the allocation of mRNA samples within assay batches, slides, dye labeling, and other local control factors (5, 51).

Reference and loop designs. The most widely used experimental layouts for two-color microarray experiments are the so-called reference and loop structures (30, 53). In the reference design (Fig. 4A), a single sample (the reference sample) is hybridized with every sample from each of the treatments or experimental groups. Dye-swap is sometimes considered, but it is not mandatory. The main advantage of the reference design is its simplicity; it is straightforward to conduct in the lab. Its disadvantage, however, is that half of the observations refer to the reference sample, which is not of primary interest. Conversely, the loop design (Fig. 4B) is a more complex structure in which each sample must be labeled with both dyes and cohybridized with samples from alternating experimental groups, with alternating dye labeling. Nonetheless, the loop structure is shown to be generally more efficient than the reference design (29, 30, 48, 53).

Fig. 4. Graphic illustration of a reference (A) and a loop (B) design for 2-color microarray experiments. Rectangles represent slides, and shading indicates the dye labeling (Cy3 and Cy5). Letters represent mRNA samples from each treatment (A, B, C, D, and E) or the reference sample (R).

Some alternatives to the classical reference and loop layouts are discussed in Rosa et al. (43), Steibel and Rosa (46), and Tempelman (47), who compare the efficiency and robustness of designs combining different levels of biological replication (i.e., subjects within the experimental groups) and technical replication (e.g., replicated arrays for each biological sample). The reference design depicted in Fig. 4A presents a single replication for each experimental group, so multiple replications of such an experiment would be necessary for statistical inference purposes. Moreover, in Fig. 4B, the two samples of each group (A to E) may refer either to two aliquots of the same mRNA sample from each group or to two independent samples (two subjects) from each group. In the first case there would be only technical replication for each mRNA sample, and the five slides in Fig. 4B would represent a single biological replication of the experiment. The second scenario would represent a minimally replicated balanced loop design with five experimental groups. More biological replications are generally needed for reasonable experimental precision and efficiency. For a discussion of technical vs. biological replication, see, for example, Churchill (10), Rosa et al. (43), and Tempelman (47). Most previous papers on the design of two-color microarray experiments, however, considered situations in which there was no genetic component distinguishing the experimental groups, so their results are not directly applicable to genetical genomics studies.
Consequently, possibly because better design alternatives are not widely known, genetical genomics experiments utilizing two-color microarrays are generally conducted using reference designs with dye-swap (4, 45, 54). Recently, however, some studies have proposed more efficient experimental setups for genetical genomics, which are discussed below.

Distant pairing. In the genetical genomics context, Fu and Jansen (16) proposed a design strategy for allocating pairs of RIL samples to two-color microarray slides. Their approach, called the distant pair design, is based on two basic principles. First, for a given number of slides, it is generally more efficient to increase biological replication rather than technical replication; and second, samples should be paired such that an increased ratio of within- over between-slides genotypic dissimilarity is obtained. For example, consider an experiment with two slides to study the effects of allelic variation at three loci (1, 2, and 3) on gene expression. Consider also that four samples are available with the following genotypes: A1A1B2B2B3B3, A1A1A2A2B3B3, B1B1A2A2A3A3, and B1B1B2B2A3A3. In this case, a more efficient estimation of genetic effects is obtained by pairing samples 1 and 3 on one slide, and samples 2 and 4 on the other. The distant pairing approach presented by Fu and Jansen (16) can also be combined with a selective profiling step whenever the mapping panel is bigger than the number of samples intended to be used in the microarray experiment. The selective phenotyping can be performed using either the genotypic dissimilarity approach proposed by Jin et al. (23) or the recombination-based approaches of Jannink (21) or Xu et al. (52).

Efficient designs to estimate dominance effects.
The distant pair design approach is recommended only when interest lies exclusively in additive effects or when only two genotypes are possible for each locus, such as with RIL, BC, or DH populations. If other effects are of interest, an alternative pairing strategy may be necessary. For instance, Piepho (41) discussed efficient designs for two-color microarray experiments when the main interest is the estimation of dominance effects. It is shown that in such cases slides cohybridizing samples from heterozygous against homozygous individuals are more informative than slides comparing subjects homozygous for different alleles. As an example, consider a single-locus situation with three possible genotypes: A, B, and H. Efficient designs for inferring dominance effects should pair samples A with H, and B with H. Such experiments should include multiple slides of each pair comparison (using independent samples from each genotypic group, i.e., biological replication), with alternating dye labeling.

General approach. As discussed previously, the approach proposed by Piepho (41), either for selecting subjects for microarray screening or for pairing samples within slides, applies only to inbred lines and their hybrids or to situations targeting a single biallelic locus. A more general approach for searching for optimal genetical genomics designs (including an optimal dye assignment and pairing of samples across slides) was proposed by Bueno and colleagues (5). Similarly to their selective phenotyping approach based on genetic complementarity, samples are allocated to slides and dyes favoring hybridizations that provide more informative contrasts relative to the genetic parameters of interest. The results presented using examples with multiple loci generalize the distant pair concept for inferring additive effects, as well as Piepho's design strategy for inferring dominance effects.
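Both special cases named above can be illustrated with a small sketch. It is not the search algorithm of Bueno and colleagues or the procedure of Fu and Jansen; it only enumerates the pairings of the four three-locus samples from the distant pairing example and scores each layout by total within-slide dissimilarity (assumed here to be the number of differing loci). For the single-locus case it lists the coefficients of additive and dominance effects in a within-slide difference, assuming the common coding A = -1, H = 0, B = +1 for the additive effect and H = 1, homozygotes = 0 for dominance. All names are hypothetical.

```python
def dissimilarity(g1, g2):
    """Number of loci at which two genotypes differ (assumed measure)."""
    return sum(a != b for a, b in zip(g1, g2))


def pairings(items):
    """Yield every way of splitting an even-sized list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# Samples 1 to 4 from the text, one letter per locus (homozygous lines).
samples = {1: "ABB", 2: "AAB", 3: "BAA", 4: "BBA"}


def within_slide_dissimilarity(layout):
    """Total dissimilarity between the two samples sharing each slide."""
    return sum(dissimilarity(samples[i], samples[j]) for i, j in layout)


best_layout = max(pairings(list(samples)), key=within_slide_dissimilarity)
print(best_layout)  # [(1, 3), (2, 4)]

# Single locus with genotypes A, B, and heterozygote H (assumed effect coding).
ADDITIVE = {"A": -1, "H": 0, "B": 1}
DOMINANCE = {"A": 0, "H": 1, "B": 0}


def slide_contrast(g1, g2):
    """(additive, dominance) coefficients in the difference g1 - g2 on one slide."""
    return (ADDITIVE[g1] - ADDITIVE[g2], DOMINANCE[g1] - DOMINANCE[g2])


for pair in [("A", "B"), ("A", "H"), ("B", "H")]:
    print(pair, slide_contrast(*pair))
```

Under these assumptions the highest-scoring layout pairs samples 1 with 3 and 2 with 4, and the A vs. B slide has a zero dominance coefficient, whereas the A vs. H and B vs. H slides both carry dominance information.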
It is shown that the optimal design depends on the effect(s) of interest and on how they are weighted in the optimality criterion. For example, if inferences are focused on additive effects, the optimal design will resemble a distant pairing strategy. Likewise, if dominance effects are the main effects of interest, the resulting design will favor slides comparing heterozygous vs. homozygous subjects. However, if interaction terms (epistatic effects) are also considered, the design structures become more complex. In such situations, the allocation of samples within slides should take into account their combination of genotypes across multiple loci. For example, a distant pair design considering two loci in a DH population would tend to include only slides comparing samples A1A1A2A2 vs. B1B1B2B2, and A1A1B2B2 vs. B1B1A2A2. Nonetheless, additional genotypes should be paired if additive × additive epistatic effects are of interest as well. For instance, a pair of slides comparing the genotype A1A1A2A2 against A1A1B2B2 and the genotype B1B1A2A2 against B1B1B2B2 provides more precise information regarding how the additive effect of locus 2 changes according to the genotype at locus 1, i.e., A2A2 vs. B2B2 when the locus 1 genotype is A1A1, and A2A2 vs. B2B2 when the locus 1 genotype is B1B1, respectively.

Another generalization of the distant pairing concept proposed by Fu and Jansen (16) for finite loci was also presented by Bueno and colleagues (5) for experiments aiming at the estimation of variance components and heritability of gene expression. It is shown that, given a sample of subjects with a certain relatedness structure (pedigree), the search algorithm tends to pair less related individuals in each competitive hybridization.

CONCLUDING REMARKS

This paper discusses the design of microarray experiments for different goals of genetical genomics studies, such as the comparison of expression levels of different genotypic groups, eQTL mapping studies, or estimation of heritabilities of mRNA transcript abundances. Choosing a good microarray design for a genetical genomics study consists of two steps: selective profiling (or selective phenotyping), i.e., the selection of subjects to be assayed; and the microarray experiment layout, which refers to the allocation of pairs of samples to slides and the assignment of dye labeling (red and green). These two steps are also referred to in the statistical literature as treatment choice and treatment-to-unit allocation, respectively. The selective phenotyping step depends on the genetic or biological material available and on the goal of the experiment, so it is similar for any microarray platform being used, such as single-channel high-density oligonucleotide technology (e.g., Affymetrix) or two-color spotted slides (using either cDNA or long oligonucleotide probes). The microarray experiment layout, however, depends also on the microarray technology considered. In the case of single-channel platforms, as each sample is assayed on an independent slide, the experiment setup is straightforward. Conversely, with two-color microarrays, there are always numerous ways of pairing samples and assigning dyes. The simplest alternative in this regard is the reference design, which resembles a single-channel experiment layout.
However, two samples can be assayed on each slide, and this can be exploited by searching for more general structures that provide increased efficiency and precision. In general (and especially from the statistical point of view), biological replication should be preferred over technical replication. For example, if a reference design with 2n microarray slides is considered, better statistical precision is obtained if 2n subjects are assayed instead of a dye-swap structure (i.e., reverse labeling of each biological sample and the reference) with n subjects on two slides each. As discussed in this paper, even more efficient experiments may be sought in the context of row-column (slides and dyes) structures, by searching for optimal (or near-optimal) designs for specific goals of the experiment (5, 51). Dye-swap (or any other level of technical replication) may be considered if a limited number of biological samples is available; however, significance tests should take into account such a hierarchical replication structure (i.e., subjects and slides within subjects) (43). Another common strategy in microarray experiments is to pool samples in an attempt to reduce biological variability (10, 27). In genetical genomics studies, however, pooling is generally not advised, except in a few cases such as with RIL (28) or other experiments involving genetically identical individuals. Another interesting issue with microarray experiments, which is not addressed in this paper but which certainly relates to genetical genomics studies as well, is that of power and sample size calculation. Because of the high-dimensional nature of microarray assays, power calculation should be based on the false discovery rate (FDR) concept, as proposed by Dobbin and Simon (14), Gadbury et al. (17), Hu et al. (19), Jung (24), and Muller et al. (39).
However, genetical genomics studies usually involve multiple, hierarchical sources of variation within a mixed effects model context (43). Extensions of the FDR-based power calculations for genetical genomics studies are not yet available but would be extremely useful.

In this paper we discuss selective phenotyping based on information on genetic markers or relatedness among individuals. Alternatively, selective transcriptional profiling may be based on traditional phenotypes (36) or on combinations of trait and marker data (49) or trait and family structures (44). The suitability of each approach will depend on the goal of the experiment and on the assumptions of the model. Generally, however, the subsample selection mechanism should be taken into account when analyzing the observed data (49).

While for small or simple experiments an analytical solution for the optimality problem may be possible, more complex scenarios (such as situations with multiple allelic and interacting loci or with complex pedigrees) require a numerical solution using a search algorithm. An optimal design is guaranteed only with a full search (i.e., by comparing all possible designs), which is generally infeasible for larger design spaces. Search algorithms (such as simulated annealing or genetic algorithms) can be used instead to find optimal (or near-optimal) designs, but they may be computationally demanding and time consuming. Common sense may then be used either to constrain the design space to be searched, by eliminating design structures that are clearly inadequate (such as designs with strong imbalance in dye labeling), or to come up with a reasonable starting point for the search algorithm.

GRANTS

This work was supported by United States Department of Agriculture Grant to G. J. M. Rosa.

REFERENCES

1. Alberts R, Fu J, Swertz MA, Lubbers LA, Alberts CJ, Jansen RC. Combining microarrays and genetic analysis. Briefings in Bioinformatics 6: ,
2. Allison DB, Heo M, Schork NJ, Wong SL, Elston RC. Extreme selection strategies in gene mapping studies of oligogenic quantitative traits do not always increase power. Hum Hered 48: ,
3. Bing N, Hoeschele I. Genetical genomics analysis of a yeast segregant population for transcription network inference. Genetics 170: ,
4. Brem RB, Yvert G, Clinton R, Kruglyak L. Genetic dissection of transcriptional regulation in budding yeast. Science 296: ,
5. Bueno JSD, Gilmour SG, Rosa GJM. Design of microarray experiments for genetical genomics studies. Genetics 174: ,
6. Bystrykh L, Weersing E, Dontje B, Sutton S, Pletcher MT, Wiltshire T, Su AI, Vellenga E, Wang J, Manly KF, Lu L, Chesler EJ, Alberts R, Jansen RC, Williams RW, Cooke MP, de Haan G. Uncovering regulatory pathways affecting hematopoietic stem cell function using genetical genomics. Nat Genet 37: ,
7. Carlborg O, De Koning DJ, Manly KF, Chesler E, Williams RW, Haley CS. Methodological aspects of the genetic dissection of gene expression. Bioinformatics 21: , 2005.

Review of microarray experimental design strategies for genetical genomics studies. Physiol Genomics 28: 15-23, 2006. First published September 19, 2006; doi:10.1152/physiolgenomics.00106.2006.


Crash-course in genomics Crash-course in genomics Molecular biology : How does the genome code for function? Genetics: How is the genome passed on from parent to child? Genetic variation: How does the genome change when it is

More information
