Lab Rotation Report. Re-analysis of Molecular Features in Predicting Survival in Follicular Lymphoma

Ray S. Lin
Biomedical Informatics Training Program, Stanford
June 24, 2006

ABSTRACT

The findings in the article "Prediction of survival in follicular lymphoma based on molecular features of tumor infiltrating cells" by Dave and Wright et al., published in NEJM in 2004 [1], have been found to be fragile. This project investigated whether Wright's findings can be reproduced in three experiments: 1) using the original training and test sets, 2) swapping the training and test sets, and 3) randomly dividing the data into training and test sets 100 times. Different experimental settings were examined, but we were not able to reproduce Wright's findings in any of them.

In the first experiment, we found some pairs of gene clusters that were significant in both the training and test sets; however, the most significant pair in the training set was not significant in the test set. In the second experiment, none of the pairs consisting of one poor gene cluster and one good gene cluster was significant in the test set. In the third experiment, the empirical distribution of p values in the test sets was generated from 100 trials on randomly split training and test sets. The distribution is close to uniform, which suggests that the significant results Wright obtained were likely due to chance.

In summary, based on Wright's dataset and analytical procedure, it is unlikely that there exist gene clusters (or pairs of gene clusters) that strongly predict survival in follicular lymphoma patients.

1. INTRODUCTION

The finding in the article "Prediction of survival in follicular lymphoma based on molecular features of tumor infiltrating cells" by Dave and Wright et al., published in NEJM in 2004 [1], has been found to be fragile. Robert Tibshirani reported that the finding could not be reproduced with the same model-building procedure once the training and test sets were swapped [2]. This project investigated whether Wright's finding can be reproduced in three experiments: 1) using the original training and test sets, 2) swapping the training and test sets, and 3) randomly dividing the data into training and test sets 100 times.

2. METHODS

The original dataset was obtained from Wright. It contains gene expression measurements of 44,928 genes on 191 patients. Four of the 191 patients did not have survival data. We performed the following three experiments:

1) reproducing Wright's results based on the original training and test sets;
2) a new analysis after swapping the training and test sets;
3) 100 new analyses, each of which randomly divides the whole dataset into a training and a test set, selects the most significant pair of gene clusters found in the training set, and computes its p value in the test set.

2.1. Reproducing Wright's results

The analysis consists of the following steps, according to the description in Dave's article and the supplementary appendix.

(a) Dividing patients into training and test sets

The whole dataset was divided into training and test sets based on the split performed by George Wright. There were 95 patients (93 with survival data) in the training set and 96 (94 with survival data) in the test set.

(b) Filtering the genes

The 44,928 genes were filtered based on two criteria computed in the training set: 1) Wald p < 0.1 in a univariate Cox model and 2) median expression > 6. The remaining genes were categorized into poor genes that predict poor survival (Wald score > 0) and good genes that predict good survival (Wald score < 0). It was not clear whether the filtering was based on the data of all patients (i.e., n=95) or only of patients with survival data (i.e., n=93). Therefore, both scenarios were examined.

(c) Clustering the genes

The poor genes and the good genes were clustered separately with the software XCluster [3]. The clustering result was further filtered by the joining correlation (> 0.5) and the cluster size (> 24 and < 51).

Four datasets (G1 to G4) were derived for further analysis. G1 and G2 were derived from the list of genes that appear in the .gtr and .ctr files (the output of the clustering software) provided by Wright. These files contain 1,569 poor genes; the files containing good genes are not available. The poor genes were identified in the original dataset. Dataset G1 contains these 1,569 genes for all 95 patients, whereas G2 contains the same set of genes for the patients with survival data only (n=93). Datasets G3 and G4 were derived from the original dataset by performing the filtering procedure listed above. G3 was derived based on all 95 patients, whereas G4 was based on the 93 patients with survival data.

(d) Fitting the gene clusters to Cox models

The gene clusters obtained from G3 and G4 were further analyzed with Cox models. For each gene cluster, the expression of the cluster was computed as the mean expression of its constituent genes. Cox models were used to analyze the expression of the individual gene clusters and of pairs of these clusters. For each pair, the p values of the Wald tests in the training and test sets were reported. The clusters obtained from G1 and G2 could not be analyzed because they did not contain the clusters for good genes.

2.2. Swapping training and test sets

In this analysis, the training and test sets were swapped. There were 96 patients (94 with survival data) in the new training set and 95 patients (93 with survival data) in the test set. The analysis was performed following steps (b) to (d) described in Sec. 2.1. This analysis was based on dataset G4. In other words, it includes patients without survival data.
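The filtering rules in steps (b) and (c) of Sec. 2.1 can be illustrated with a minimal Python sketch. It assumes the Wald z scores from the univariate Cox models have already been computed elsewhere; the function names, the dictionary layout, and the example gene values are hypothetical and are not taken from the original R script.

```python
from statistics import NormalDist, median

# A two-sided Wald p < 0.1 corresponds to |z| above the 95th percentile
# of the standard normal distribution (about 1.645).
Z_CUTOFF = NormalDist().inv_cdf(1 - 0.1 / 2)


def filter_genes(wald_z, expression, z_cutoff=Z_CUTOFF, min_median=6.0):
    """Apply both gene filters, then split into poor and good genes.

    wald_z:     dict gene -> Wald z score (univariate Cox, training set)
    expression: dict gene -> list of training-set expression values
    """
    poor, good = [], []
    for gene, z in wald_z.items():
        if abs(z) <= z_cutoff:
            continue  # fails the Wald p < 0.1 filter
        if median(expression[gene]) <= min_median:
            continue  # fails the median expression > 6 filter
        (poor if z > 0 else good).append(gene)
    return poor, good


def keep_cluster(joining_correlation, size):
    """Cluster filter: joining correlation > 0.5 and 24 < size < 51."""
    return joining_correlation > 0.5 and 24 < size < 51


# Hypothetical toy values, for illustration only.
toy_z = {"geneA": 2.1, "geneB": -1.9, "geneC": 1.2, "geneD": -2.5}
toy_expr = {
    "geneA": [7.0, 8.0, 6.5],
    "geneB": [6.2, 7.1, 9.0],
    "geneC": [8.0, 8.0, 8.0],
    "geneD": [5.0, 5.5, 7.0],
}
poor_genes, good_genes = filter_genes(toy_z, toy_expr)
print(poor_genes, good_genes)
print(keep_cluster(0.62, 30), keep_cluster(0.62, 24))
```

In the toy input, geneC is dropped by the Wald filter and geneD by the median expression filter. Both thresholds are keyword arguments so that the two training-set scenarios (n=95 and n=93) can reuse the same function with different inputs.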

2.3. Randomly dividing training and test sets

The whole dataset was randomly divided into training and test sets of roughly equal size 100 times. In each random split, steps (b) to (d) described in Sec. 2.1 were performed. The pair of gene clusters with the most significant Wald test in the training set was selected, and its p value in the test set was computed. The empirical distribution of these test-set p values was examined. This analysis was based on dataset G4, which includes patients without survival data.

3. RESULTS

The following subsections summarize the results of the analyses in the three experiments.

3.1. Reproducing Wright's results

Table 1 summarizes the number of poor and good genes after gene filtering. In Wright's analysis, it was not clear whether the filtering was based on the data of all patients (i.e., n=95) or only of patients with survival data (i.e., n=93). Both scenarios were examined in this project, and the first scenario (i.e., n=95) produced the same number of genes as the supplementary appendix published on the NEJM website [4]. However, the data provided by Wright differed from their appendix, and neither scenario in our analysis could reproduce Wright's result. This finding is the same as the one obtained by Rob Tibshirani [2].

Four datasets (G1 to G4) were created for analysis as described in Sec. 2.1. Their characteristics are summarized in Table 2. Table 3 summarizes the number of clusters obtained in the different datasets with different centering and scaling methods. None of the four datasets generated the same number of clusters as Wright's analysis.

Figure 1 shows the p values in the training and test sets based on dataset G3. Panels A and B show the p values of pairs of gene clusters in two-variable Cox models. Some pairs reached significant p values at the 0.05 level in both the training and test sets. However, the most significant ones in the training set were not significant in the test set. Panels C and D show the p values of poor and good gene clusters in univariate Cox models. Only one poor gene cluster and none of the good gene clusters were significant in both the training and test sets.

3.2. Swapping training and test sets

Figure 2 shows the Cox model p values in the training and test sets based on dataset G4. Several pairs reached significant p values at the 0.05 level in both the training and test sets (Panel A). However, when considering only the pairs consisting of one poor gene cluster and one good gene cluster, none of the pairs was significant in the test set (Panel B). Similarly, in univariate Cox models, none of the poor gene clusters (Panel C) and none of the good gene clusters (Panel D) was significant in the test set, regardless of their p values in the training set.

3.3. Randomly dividing training and test sets

Table 4 summarizes the empirical distribution of Cox model p values in the training and test sets, and Figure 3 shows the cumulative distribution of the p values in the test set. The distribution of p values in the test set is close to uniform. Only 6% of the p values in the test set were less than 0.05, although all the p values were very significant in the training set.

4. DISCUSSION

The first goal of this project was to reproduce Wright's results. However, as presented in Sec. 3, none of our analyses produced the same result as Wright's. It would be helpful if we could obtain their .cdt (not .ctr) files, which contain the gene clusters with the expression measurements for all the genes on all the patients. With those files, we could compare our .cdt files against theirs and might be able to identify the cause of the different results.

The results in Sec. 3.1 show that if we follow Wright's procedure and pick the pair of gene clusters with the most significant p value in the training set, this pair would not be significant in the test set. Even though some pairs are significant in both the training and test sets, we would not be able to identify them just by looking at their p values in the training set.

After swapping the training and test sets, none of the pairs consisting of one poor gene cluster and one good gene cluster was significant in the test set. This finding is the same as Rob's result [2]. Note that the plots are not exactly the same. This may be because Rob's analysis included only the patients with survival data (n=187), whereas this project included all the patients (n=191).

In the third experiment, the empirical distribution of p values in the test set is close to uniform, although the mean and median are around 0.42 (not 0.5). The p values in the test sets were significant (< 0.05) in only 6 of the 100 trials. This shows that if we follow Wright's procedure, the significant results obtained in the test set were likely due to chance alone. It is not likely that there are pairs of gene clusters that strongly predict survival.

In summary, it is unfortunate that we could not reproduce Wright's findings. This could be due to subtle differences in the analysis parameters. We experimented with different parameters, but none produced exactly the same results as Wright's. In the first experiment, we found some pairs of gene clusters that were significant in both the training and test sets; however, the most significant pair in the training set was not significant in the test set. In the second experiment, none of the pairs consisting of one poor gene cluster and one good gene cluster was significant in the test set. In the third experiment, the empirical distribution of p values in the test sets was generated from 100 trials on randomly split training and test sets. The distribution is close to uniform, which suggests that the significant results Wright obtained were likely due to chance. Based on Wright's dataset and analytical procedure, we were not able to find significant gene clusters (or pairs of gene clusters) that predict survival in follicular lymphoma patients.

ACKNOWLEDGEMENTS

I would like to thank Dr. Trevor Hastie and Dr. Rob Tibshirani for their guidance.

REFERENCES

1. Dave, et al., Prediction of survival in follicular lymphoma based on molecular features of tumor infiltrating cells. NEJM, (21).
2. Tibshirani, R., Re-analysis of Dave et al, NEJM Nov 18,
3. Sherlock, G., XCluster
4. Dave, et al., Supplementary appendix to prediction of survival in follicular lymphoma based on molecular features of tumor-infiltrating immune cells. NEJM,

TABLES

Table 1. The numbers of poor and good genes after gene filtering in different datasets.

Patients in the tr. set            #poor genes   #good genes   Reported by
?                                  1,568         1,731         Supp appendix
?                                  1,569         -*            Wright
all patients (n=95)                1,568         1,731         Ray
patients with surv. data (n=93)    1,565         1,730         Ray

*: Data in this cell were not available.
?: In Wright's analysis, it was not clear whether patients without survival data were included or not.

Table 2. Characteristics of the four datasets (G1 to G4).

Dataset   #patients   #poor genes   #good genes   Derived from
G1        95          1,569         -*            Wright's clusters
G2        93          1,569         -*            Wright's clusters
G3        95          1,568         1,731         original dataset
G4        93          1,565         1,730         original dataset

*: Data in this cell were not available.

Table 3. The numbers of clusters obtained in different datasets with different centering and scaling methods.

Dataset    centering   scaling   #clusters (poor genes)   #clusters (good genes)
G1         mean        none      92                       -*
G1         mean        var       96                       -*
G1         median      none      84                       -*
G1         median      var       79                       -*
G2         mean        none      78                       -*
G2         mean        var       87                       -*
G2         median      none      83                       -*
G2         median      var       76                       -*
G3         mean        none
G3         mean        var
G3         median      none
G3         median      var
G4         mean        none
G4         mean        var
G4         median      none
G4         median      var
Wright's   median      ?         71                       -*

*: Data in these cells were not available.
?: Wright did not report scaling.

Table 4. The empirical distribution of Cox model p values in training and test sets in two-variable Cox models.

               Min   1st Qu.   Median   Mean   3rd Qu.   Max
Training set
Test set
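The repeated random splitting of Sec. 2.3 and the six-number summary used as the header of Table 4 can be sketched in Python as follows. This is only an outline under stated assumptions: `analyse` stands in for the whole filter, cluster, and Cox pipeline and is a hypothetical callable, and the placeholder below returns random numbers rather than results from the report. The `inclusive` quantile method is used because it matches the default quantile definition of R's `summary`.

```python
import random
from statistics import mean, median, quantiles


def random_split(patient_ids, rng):
    """Shuffle the patients and cut the list into two roughly equal halves."""
    ids = list(patient_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


def six_number_summary(values):
    """Min, 1st Qu., Median, Mean, 3rd Qu., Max of a list of p values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"Min": min(values), "1st Qu.": q1, "Median": median(values),
            "Mean": mean(values), "3rd Qu.": q3, "Max": max(values)}


def repeat_splits(patient_ids, analyse, n_trials=100, seed=0):
    """Call analyse(train_ids, test_ids) on n_trials random splits.

    analyse is a hypothetical callable that returns the pair
    (training-set p value, test-set p value) of the selected cluster pair.
    """
    rng = random.Random(seed)
    results = [analyse(*random_split(patient_ids, rng)) for _ in range(n_trials)]
    train_p = [r[0] for r in results]
    test_p = [r[1] for r in results]
    return six_number_summary(train_p), six_number_summary(test_p)


# Placeholder analysis: random numbers only, NOT results from the report.
_placeholder_rng = random.Random(1)


def placeholder_analyse(train_ids, test_ids):
    return _placeholder_rng.random() * 0.01, _placeholder_rng.random()


train_summary, test_summary = repeat_splits(range(191), placeholder_analyse)
print(train_summary)
print(test_summary)
```

With 191 patients, `random_split` yields halves of 95 and 96 patients, the same sizes as the original training and test sets.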

FIGURES

Figure 1. Cox model p values in training and test sets based on dataset G3. (Panels A, B, C, and D; panel label: "One poor + one good".)

Figure 2. Cox model p values in training and test sets based on dataset G4. (Panels A, B, C, and D; panel label: "One poor + one good".)

Figure 3. The empirical cumulative distribution function of p values in two-variable Cox models in the test set. (Plot title: "Empirical CDF of test set p-value"; x-axis: P value; y-axis: Empirical CDF.)
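The comparison behind Figure 3 can be made concrete with a short Python sketch: an empirical CDF, its largest gap from the Uniform(0, 1) CDF, and the probability of seeing at least 6 p values below 0.05 in 100 trials when every test-set p value is truly uniform. The function names are illustrative, and the evenly spaced grid is a synthetic stand-in for the test-set p values, which are not reproduced here.

```python
import math


def ecdf(values):
    """Return the sorted values and the empirical CDF at each of them."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(k + 1) / n for k in range(n)]


def max_gap_from_uniform(values):
    """Kolmogorov-Smirnov distance between the sample and Uniform(0, 1)."""
    xs = sorted(values)
    n = len(xs)
    return max(max((k + 1) / n - x, x - k / n) for k, x in enumerate(xs))


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Synthetic, perfectly even p values: a stand-in, NOT data from the report.
grid = [(k + 0.5) / 100 for k in range(100)]
xs, cdf = ecdf(grid)
print(max_gap_from_uniform(grid))

# Chance of at least 6 "significant" results in 100 trials under the null.
tail = binom_upper_tail(6, 100, 0.05)
print(round(tail, 3))
```

Under the null hypothesis, 5 of 100 trials are expected to fall below 0.05, and 6 or more occur with probability of roughly 0.38, so the observed 6 of 100 is unremarkable.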

APPENDIX

Pseudo-code of R script

1. Read in the data from files
2. Combine the clinical data and gene expression data
3. Divide the whole dataset into training and test sets
4. Filter the genes
   i. Compute Wald tests (univariate Cox) for each gene based on the training set
   ii. Filter genes based on Wald test p < 0.1 (the absolute value of the Wald score > 1.645)
   iii. Filter genes based on the median expression in the training set
5. Identify poor genes (Wald score > 0) and good genes (Wald score < 0)
6. Cluster the poor genes
   i. Transform the dataset into the format for XCluster
   ii. Call XCluster (parameters = centered by median, no scaling)
   iii. Filter the clusters based on joining correlation cutoff > 0.5
   iv. Filter the clusters based on cluster size (> 24 and < 51)
   v. Map XCluster's gene id back to the id in the original dataset
7. Cluster the good genes
   i. Transform the dataset into the format for XCluster
   ii. Call XCluster (parameters = centered by median, no scaling)
   iii. Filter the clusters based on joining correlation cutoff > 0.5
   iv. Filter the clusters based on cluster size (> 24 and < 51)
   v. Map XCluster's gene id back to the id in the original dataset
8. For each patient and each gene cluster, compute its expression as the mean expression of its constituent genes
9. Fit a two-variable Cox model to every pair of one poor-gene cluster and one good-gene cluster
10. Plot the p values of Wald tests in training vs. test sets
11. In the training set, select the most significant pair of one poor-gene cluster and one good-gene cluster
12. Report the p value of this pair in the test set
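Steps 8, 9, 11, and 12 of the outline above can be sketched in plain Python. This is not the original R script. It assumes that the p value of a pair is the overall 2-df Wald test of the two-variable Cox model (the report does not say which Wald test was used), fits the model by Newton-Raphson with Breslow handling of ties, and runs on synthetic cohorts with no real signal. All names are hypothetical.

```python
import math
import random


def score_and_information(beta, times, events, x1, x2):
    """Score vector and observed information of the Cox partial likelihood."""
    b1, b2 = beta
    u1 = u2 = i11 = i12 = i22 = 0.0
    for ti, died, a, b in zip(times, events, x1, x2):
        if not died:
            continue  # censored patients contribute only through risk sets
        s0 = sa = sb = saa = sab = sbb = 0.0
        for tj, aj, bj in zip(times, x1, x2):
            if tj >= ti:  # patient j is still at risk at time ti
                w = math.exp(b1 * aj + b2 * bj)
                s0 += w
                sa += w * aj
                sb += w * bj
                saa += w * aj * aj
                sab += w * aj * bj
                sbb += w * bj * bj
        ma, mb = sa / s0, sb / s0
        u1 += a - ma
        u2 += b - mb
        i11 += saa / s0 - ma * ma
        i12 += sab / s0 - ma * mb
        i22 += sbb / s0 - mb * mb
    return (u1, u2), (i11, i12, i22)


def pair_wald_p(times, events, x1, x2, max_iter=50, tol=1e-9):
    """Overall 2-df Wald p value (assumes x1 and x2 are not collinear)."""
    b1 = b2 = 0.0
    for _ in range(max_iter):
        (u1, u2), (i11, i12, i22) = score_and_information(
            (b1, b2), times, events, x1, x2)
        det = i11 * i22 - i12 * i12
        d1 = (i22 * u1 - i12 * u2) / det  # Newton step: I^{-1} U
        d2 = (i11 * u2 - i12 * u1) / det
        b1, b2 = b1 + d1, b2 + d2
        if abs(d1) + abs(d2) < tol:
            break
    _, (i11, i12, i22) = score_and_information((b1, b2), times, events, x1, x2)
    wald = b1 * b1 * i11 + 2 * b1 * b2 * i12 + b2 * b2 * i22
    return math.exp(-wald / 2)  # chi-square survival function with 2 df


def cluster_mean(expr, gene_ids):
    """Step 8: per-patient mean expression over the genes of one cluster."""
    n = len(expr[gene_ids[0]])
    return [sum(expr[g][k] for g in gene_ids) / len(gene_ids) for k in range(n)]


def pair_p(cohort, pair):
    """Step 9: p value of one (poor cluster, good cluster) pair in a cohort."""
    poor_name, good_name = pair
    return pair_wald_p(cohort["time"], cohort["event"],
                       cohort["poor"][poor_name], cohort["good"][good_name])


def select_and_validate(train, test):
    """Steps 11 and 12: best pair in training, then its p value in test."""
    pairs = [(p, g) for p in train["poor"] for g in train["good"]]
    best = min(pairs, key=lambda pair: pair_p(train, pair))
    return best, pair_p(train, best), pair_p(test, best)


rng = random.Random(0)


def fake_cohort(n):
    """Synthetic cohort without any real signal; NOT data from the report."""
    return {
        "time": [rng.expovariate(1.0) for _ in range(n)],
        "event": [rng.random() < 0.7 for _ in range(n)],
        "poor": {f"P{k}": [rng.gauss(0, 1) for _ in range(n)] for k in range(3)},
        "good": {f"G{k}": [rng.gauss(0, 1) for _ in range(n)] for k in range(3)},
    }


train, test = fake_cohort(40), fake_cohort(40)
best, p_train, p_test = select_and_validate(train, test)
print(best, p_train, p_test)
```

Because the pair is chosen as the minimum over all training-set p values, its training p value is biased downward by construction, while its test-set p value is not; this is the selection effect the three experiments probe. For real analyses an established Cox implementation would be preferable to this hand-written fit.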


More information

Random forest for gene selection and microarray data classification

Random forest for gene selection and microarray data classification www.bioinformation.net Hypothesis Volume 7(3) Random forest for gene selection and microarray data classification Kohbalan Moorthy & Mohd Saberi Mohamad* Artificial Intelligence & Bioinformatics Research

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Brawijaya Professional Statistical Analysis BPSA MALANG Jl. Kertoasri 66 Malang (0341) 580342 Exploratory Data Analysis Exploring data can help to determine whether the statistical

More information

Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.

Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding. Supplementary Data for DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding. Wenxiu Ma 1, Lin Yang 2, Remo Rohs 2, and William Stafford Noble 3 1 Department of Statistics,

More information

AcaStat How To Guide. AcaStat. Software. Copyright 2016, AcaStat Software. All rights Reserved.

AcaStat How To Guide. AcaStat. Software. Copyright 2016, AcaStat Software. All rights Reserved. AcaStat How To Guide AcaStat Software Copyright 2016, AcaStat Software. All rights Reserved. http://www.acastat.com Table of Contents Frequencies... 3 List Variables... 4 Descriptives... 5 Explore Means...

More information

Weka Evaluation: Assessing the performance

Weka Evaluation: Assessing the performance Weka Evaluation: Assessing the performance Lab3 (in- class): 21 NOV 2016, 13:00-15:00, CHOMSKY ACKNOWLEDGEMENTS: INFORMATION, EXAMPLES AND TASKS IN THIS LAB COME FROM SEVERAL WEB SOURCES. Learning objectives

More information

Chapter 3. RESEARCH METHODOLOGY

Chapter 3. RESEARCH METHODOLOGY Chapter 3. RESEARCH METHODOLOGY In order to discover the final conclusion of this study, there were several steps to be conducted. This chapter contains the detailed steps of what has been done in this

More information

WORKING WITH TEST DOCUMENTATION

WORKING WITH TEST DOCUMENTATION WORKING WITH TEST DOCUMENTATION CONTENTS II. III. Planning Your Test Effort 2. The Goal of Test Planning 3. Test Planning Topics: b) High Level Expectations c) People, Places and Things d) Definitions

More information

Research Article Information in Repeated Ultimatum Game with Unknown Pie Size

Research Article Information in Repeated Ultimatum Game with Unknown Pie Size Economics Research International Volume 2013, Article ID 470412, 8 pages http://dx.doi.org/10.1155/2013/470412 Research Article Information in Repeated Ultimatum Game with Unknown Pie Size ChingChyiLeeandWilliamK.Lau

More information

RNA-Seq analysis using R: Differential expression and transcriptome assembly

RNA-Seq analysis using R: Differential expression and transcriptome assembly RNA-Seq analysis using R: Differential expression and transcriptome assembly Beibei Chen Ph.D BICF 12/7/2016 Agenda Brief about RNA-seq and experiment design Gene oriented analysis Gene quantification

More information

Differences Between High-, Medium-, and Low-Profit Cow-Calf Producers: An Analysis of Kansas Farm Management Association Cow-Calf Enterprise

Differences Between High-, Medium-, and Low-Profit Cow-Calf Producers: An Analysis of Kansas Farm Management Association Cow-Calf Enterprise Differences Between High-, Medium-, and Low-Profit Cow-Calf Producers: An Analysis of 2010-2014 Kansas Farm Management Association Cow-Calf Enterprise Dustin L. Pendell (dpendell@ksu.edu), Youngjune Kim

More information

Genetic Algorithms in Matrix Representation and Its Application in Synthetic Data

Genetic Algorithms in Matrix Representation and Its Application in Synthetic Data Genetic Algorithms in Matrix Representation and Its Application in Synthetic Data Yingrui Chen *, Mark Elliot ** and Joe Sakshaug *** * ** University of Manchester, yingrui.chen@manchester.ac.uk University

More information

DSC 201: Data Analysis & Visualization

DSC 201: Data Analysis & Visualization DSC 201: Data Analysis & Visualization Aggregation Dr. David Koop Selection & Highlighting Selection: a user action (mouse, keyboard) on items, links, etc. Selection types: single vs. multiple, contiguous

More information

CONTINUOUS RESERVOIR SIMULATION INCORPORATING UNCERTAINTY QUANTIFICATION AND REAL-TIME DATA. A Thesis JAY CUTHBERT HOLMES

CONTINUOUS RESERVOIR SIMULATION INCORPORATING UNCERTAINTY QUANTIFICATION AND REAL-TIME DATA. A Thesis JAY CUTHBERT HOLMES CONTINUOUS RESERVOIR SIMULATION INCORPORATING UNCERTAINTY QUANTIFICATION AND REAL-TIME DATA A Thesis by JAY CUTHBERT HOLMES Submitted to the Office of Graduate Studies of Texas A&M University in partial

More information

Informatics of Clinical Genomics

Informatics of Clinical Genomics Informatics of Clinical Genomics Lynn Bry, MD, PhD. Director, Center for Clinical and Translational Metagenomics Associate Pathologist, Center for Advanced Molecular Diagnostics Dept. Pathology, Brigham

More information

On Optimal Tiered Structures for Network Service Bundles

On Optimal Tiered Structures for Network Service Bundles On Tiered Structures for Network Service Bundles Qian Lv, George N. Rouskas Department of Computer Science, North Carolina State University, Raleigh, NC 7695-86, USA Abstract Network operators offer a

More information

Application Note. NGS Analysis of B-Cell Receptors & Antibodies by AptaAnalyzer -BCR

Application Note. NGS Analysis of B-Cell Receptors & Antibodies by AptaAnalyzer -BCR Reduce to the Best Application Note NGS Analysis of B-Cell Receptors & Antibodies by AptaAnalyzer -BCR The software AptaAnalyzer harnesses next generation sequencing (NGS) data to monitor the immune response

More information

Application of neural network to classify profitable customers for recommending services in u-commerce

Application of neural network to classify profitable customers for recommending services in u-commerce Application of neural network to classify profitable customers for recommending services in u-commerce Young Sung Cho 1, Song Chul Moon 2, and Keun Ho Ryu 1 1. Database and Bioinformatics Laboratory, Computer

More information

INTERNAL PILOT DESIGNS FOR CLUSTER SAMPLES

INTERNAL PILOT DESIGNS FOR CLUSTER SAMPLES INTERNAL PILOT DESIGNS FOR CLUSTER SAMPLES CHRISTOPHER S. COFFEY University of Alabama at Birmingham email: ccoffey@uab.edu website: www.soph.uab.edu/coffey MATTHEW J. GURKA University of Virginia KEITH

More information

GOAL STATEMENT: Students will simulate the effects of pesticides on an insect population and observe how the population changes over time.

GOAL STATEMENT: Students will simulate the effects of pesticides on an insect population and observe how the population changes over time. STATE SCIENCE STANDARDS: 6 th, 7 th, 8 th Grade Skills and Processes: 1.0.A.1.h Use mathematics to interpret and communicate data. 1.0.B.1 Review data from a simple experiment, summarize the data, and

More information

University of Groningen. The value of haplotypes Vries, Anne René de

University of Groningen. The value of haplotypes Vries, Anne René de University of Groningen The value of haplotypes Vries, Anne René de IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Next Generation Sequencing Technologies. Rob Mitra 1/30/17

Next Generation Sequencing Technologies. Rob Mitra 1/30/17 Next Generation Sequencing Technologies Rob Mitra 1/30/17 Outline Overview of next-generation sequencing How does it work? What technologies are being used? How would one use it in practice? Math basic

More information

Gene Set Enrichment Analysis! Robert Gentleman!

Gene Set Enrichment Analysis! Robert Gentleman! Gene Set Enrichment Analysis! Robert Gentleman! Outline! Description of the experimental setting! Defining gene sets! Description of the original GSEA algorithm! proposed by Mootha et al (2003)! Our approach

More information

7 Steps to Data Blending for Predictive Analytics

7 Steps to Data Blending for Predictive Analytics 7 Steps to Data Blending for Predictive Analytics Evolution of Analytics Just as the volume and variety of data has grown significantly in recent years, so too has the expectations for analytics. No longer

More information

Quality of Cancer RNA Samples Is Essential for Molecular Classification Based on Microarray Results Application

Quality of Cancer RNA Samples Is Essential for Molecular Classification Based on Microarray Results Application Quality of Cancer RNA Samples Is Essential for Molecular Classification Based on Microarray Results Application Pathology Author Jenny Xiao Agilent Technologies, Inc. Deer Creek Road MC U-7 Palo Alto,

More information

CHAPTER 5: DISCRETE PROBABILITY DISTRIBUTIONS

CHAPTER 5: DISCRETE PROBABILITY DISTRIBUTIONS Discrete Probability Distributions 5-1 CHAPTER 5: DISCRETE PROBABILITY DISTRIBUTIONS 1. Thirty-six of the staff of 80 teachers at a local intermediate school are certified in Cardio- Pulmonary Resuscitation

More information

Enabling Systems Biology Driven Proteome Wide Quantitation of Mycobacterium Tuberculosis

Enabling Systems Biology Driven Proteome Wide Quantitation of Mycobacterium Tuberculosis Enabling Systems Biology Driven Proteome Wide Quantitation of Mycobacterium Tuberculosis SWATH Acquisition on the TripleTOF 5600+ System Samuel L. Bader, Robert L. Moritz Institute of Systems Biology,

More information

Ask the Expert Model Selection Techniques in SAS Enterprise Guide and SAS Enterprise Miner

Ask the Expert Model Selection Techniques in SAS Enterprise Guide and SAS Enterprise Miner Ask the Expert Model Selection Techniques in SAS Enterprise Guide and SAS Enterprise Miner SAS Ask the Expert Model Selection Techniques in SAS Enterprise Guide and SAS Enterprise Miner Melodie Rush Principal

More information

What is Evolutionary Computation? Genetic Algorithms. Components of Evolutionary Computing. The Argument. When changes occur...

What is Evolutionary Computation? Genetic Algorithms. Components of Evolutionary Computing. The Argument. When changes occur... What is Evolutionary Computation? Genetic Algorithms Russell & Norvig, Cha. 4.3 An abstraction from the theory of biological evolution that is used to create optimization procedures or methodologies, usually

More information

HAN Phase 3 Impact and Process Evaluation Report

HAN Phase 3 Impact and Process Evaluation Report REPORT HAN Phase 3 Impact and Process Evaluation Report December 2014 Submitted by Nexant Candice Churchwell, Senior Consultant Michael Sullivan, Ph.D., Senior Vice President Dan Thompson, Analyst II Jeeheh

More information

Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits

Incorporating Molecular ID Technology. Accel-NGS 2S MID Indexing Kits Incorporating Molecular ID Technology Accel-NGS 2S MID Indexing Kits Molecular Identifiers (MIDs) MIDs are indices used to label unique library molecules MIDs can assess duplicate molecules in sequencing

More information

U n iversity o f H ei delberg. From Imitation to Collusion A Comment Jörg Oechssler, Alex Roomets, and Stefan Roth

U n iversity o f H ei delberg. From Imitation to Collusion A Comment Jörg Oechssler, Alex Roomets, and Stefan Roth U n iversity o f H ei delberg Department of Economics Discussion Paper Series No. 588 482482 From Imitation to Collusion A Comment Jörg Oechssler, Alex Roomets, and Stefan Roth March 2015 From Imitation

More information

MAYO CLINIC CENTER FOR BIOMEDICAL DISCOVERY EXCEPTIONAL RESEARCH LEADS TO EXCEPTIONAL PATIENT CARE

MAYO CLINIC CENTER FOR BIOMEDICAL DISCOVERY EXCEPTIONAL RESEARCH LEADS TO EXCEPTIONAL PATIENT CARE MAYO CLINIC CENTER FOR BIOMEDICAL DISCOVERY EXCEPTIONAL RESEARCH LEADS TO EXCEPTIONAL PATIENT CARE THE RESEARCH WE DO TODAY WILL DETERMINE THE TYPE OF MEDICAL AND SURGICAL PRACTICE WE CARRY ON AT THE CLINIC

More information

Supplementary Fig. 1 related to Fig. 1 Clinical relevance of lncrna candidate

Supplementary Fig. 1 related to Fig. 1 Clinical relevance of lncrna candidate Supplementary Figure Legends Supplementary Fig. 1 related to Fig. 1 Clinical relevance of lncrna candidate BC041951 in gastric cancer. (A) The flow chart for selected candidate lncrnas in 660 up-regulated

More information

U.S. EPA s Vapor Intrusion Database: Preliminary Evaluation of Attenuation Factors

U.S. EPA s Vapor Intrusion Database: Preliminary Evaluation of Attenuation Factors March 4, 2008 U.S. EPA s Vapor Intrusion Database: Preliminary Evaluation of Attenuation Factors Office of Solid Waste U.S. Environmental Protection Agency Washington, DC 20460 [This page intentionally

More information

The Metaphor. Individuals living in that environment Individual s degree of adaptation to its surrounding environment

The Metaphor. Individuals living in that environment Individual s degree of adaptation to its surrounding environment Genetic Algorithms Sesi 14 Optimization Techniques Mathematical Programming Network Analysis Branch & Bound Simulated Annealing Tabu Search Classes of Search Techniques Calculus Base Techniqes Fibonacci

More information

Logistics. Final exam date. Project Presentation. Plan for this week. Evolutionary Algorithms. Crossover and Mutation

Logistics. Final exam date. Project Presentation. Plan for this week. Evolutionary Algorithms. Crossover and Mutation Logistics Crossover and Mutation Assignments Checkpoint -- Problem Graded -- comments on mycourses Checkpoint --Framework Mostly all graded -- comments on mycourses Checkpoint -- Genotype / Phenotype Due

More information

Estoril Education Day

Estoril Education Day Estoril Education Day -Experimental design in Proteomics October 23rd, 2010 Peter James Note Taking All the Powerpoint slides from the Talks are available for download from: http://www.immun.lth.se/education/

More information

Database Searching and BLAST Dannie Durand

Database Searching and BLAST Dannie Durand Computational Genomics and Molecular Biology, Fall 2013 1 Database Searching and BLAST Dannie Durand Tuesday, October 8th Review: Karlin-Altschul Statistics Recall that a Maximal Segment Pair (MSP) is

More information

Gasoline Consumption Analysis

Gasoline Consumption Analysis Gasoline Consumption Analysis One of the most basic topics in economics is the supply/demand curve. Simply put, the supply offered for sale of a commodity is directly related to its price, while the demand

More information

CS 5984: Application of Basic Clustering Algorithms to Find Expression Modules in Cancer

CS 5984: Application of Basic Clustering Algorithms to Find Expression Modules in Cancer CS 5984: Application of Basic Clustering Algorithms to Find Expression Modules in Cancer T. M. Murali January 31, 2006 Innovative Application of Hierarchical Clustering A module map showing conditional

More information

A GUIDE TO GETTING SURVEY RESPONSES

A GUIDE TO GETTING SURVEY RESPONSES FROM EMAIL INVITATIONS TO PAID RESPONSES A GUIDE TO GETTING SURVEY RESPONSES CHOOSE SAMPLE CHOOSE MODE OF SURVEY SELECT PANEL IF NEEDED 1 2 SURVEY MODE: AN IMPORTANT CHOICE The way that you distribute

More information

PharmaSUG 2016 Paper 36

PharmaSUG 2016 Paper 36 PharmaSUG 2016 Paper 36 What's the Case? Applying Different Methods of Conducting Retrospective Case/Control Experiments in Pharmacy Analytics Aran Canes, Cigna, Bloomfield, CT ABSTRACT Retrospective Case/Control

More information

ACCELERATING GENOMIC ANALYSIS ON THE CLOUD. Enabling the PanCancer Analysis of Whole Genomes (PCAWG) consortia to analyze thousands of genomes

ACCELERATING GENOMIC ANALYSIS ON THE CLOUD. Enabling the PanCancer Analysis of Whole Genomes (PCAWG) consortia to analyze thousands of genomes ACCELERATING GENOMIC ANALYSIS ON THE CLOUD Enabling the PanCancer Analysis of Whole Genomes (PCAWG) consortia to analyze thousands of genomes Enabling the PanCancer Analysis of Whole Genomes (PCAWG) consortia

More information

Real-Time Predictive Modeling of Key Quality Characteristics Using Regularized Regression: SAS Procedures GLMSELECT and LASSO

Real-Time Predictive Modeling of Key Quality Characteristics Using Regularized Regression: SAS Procedures GLMSELECT and LASSO Paper 8240-2016 Real-Time Predictive Modeling of Key Quality Characteristics Using Regularized Regression: SAS Procedures GLMSELECT and LASSO Jon M. Lindenauer, Weyerhaeuser Company ABSTRACT This paper

More information

LAB. POPULATION GENETICS. 1. Explain what is meant by a population being in Hardy-Weinberg equilibrium.

LAB. POPULATION GENETICS. 1. Explain what is meant by a population being in Hardy-Weinberg equilibrium. Period Date LAB. POPULATION GENETICS PRE-LAB 1. Explain what is meant by a population being in Hardy-Weinberg equilibrium. 2. List and briefly explain the 5 conditions that need to be met to maintain a

More information

Adaptive Design for Clinical Trials

Adaptive Design for Clinical Trials Adaptive Design for Clinical Trials Mark Chang Millennium Pharmaceuticals, Inc., Cambridge, MA 02139,USA (e-mail: Mark.Chang@Statisticians.org) Abstract. Adaptive design is a trial design that allows modifications

More information

ANALYSING QUANTITATIVE DATA

ANALYSING QUANTITATIVE DATA 9 ANALYSING QUANTITATIVE DATA Although, of course, there are other software packages that can be used for quantitative data analysis, including Microsoft Excel, SPSS is perhaps the one most commonly subscribed

More information

The Comet Assay How to recognise Good Data

The Comet Assay How to recognise Good Data The Comet Assay How to recognise Good Data William Barfield 4 th September 2015 ICAW Content Regulatory Genetic Toxicology JaCVAM trial overview and results Protocols Historical control data Statistics

More information

Neutral theory: The neutral theory does not say that all evolution is neutral and everything is only due to to genetic drift.

Neutral theory: The neutral theory does not say that all evolution is neutral and everything is only due to to genetic drift. Neutral theory: The vast majority of observed sequence differences between members of a population are neutral (or close to neutral). These differences can be fixed in the population through random genetic

More information

Creating a Split-Plot Factorial Protocol

Creating a Split-Plot Factorial Protocol Creating a Split-Plot Factorial Protocol In this tutorial, we will demonstrate: how to set up a factorial protocol, fill in the treatments, and then view a Split-Plot trial to see how the treatments are

More information