Stata v 12 Illustration. One Way Analysis of Variance

1. Preliminary: Download anovaplot
2. Descriptives - Graphs
3. Descriptives - Numerical
4. Assessment of Normality
5. Analysis of Variance Model Estimation
6. Tests of Equality of Variances
7. Post-hoc Pairwise Comparison of Groups
8. Post-hoc Graphs

Source
Hulley et al (1998) Randomized trial of estrogen plus progestin for secondary prevention of heart disease in postmenopausal women. The Heart and Estrogen/progestin Replacement Study. Journal of the American Medical Association, 280(7).

Source Data
The actual data set contains information on n=2,763. The study was a randomized controlled trial of hormone therapy for the prevention of heart attack and death.

Notes
(1) For this illustration, I have taken a subsample of size n=612 so that students with Small Stata will be able to reproduce this analysis.
(2) Specifically, I took a random sample of 300 whites, all 218 African-Americans, plus all 94 women of other race-ethnicity. The total sample size of the data set used in this illustration is thus 612.

Analysis Question
In this hypothetical data set, does systolic blood pressure (sbp) vary by race-ethnicity (raceth)?

Stata Data Set used in this Illustration

\STATA v 12 one way anova.doc Page 1 of 16
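Note (2) describes how the subsample was drawn but not the commands used. A minimal sketch of one way such a subsample could be drawn in a do-file, assuming the full HERS file is already in memory and that raceth is coded 1=White, 2=African American, 3=Other (the coding used in the graph labels of this illustration). The seed value is arbitrary and hypothetical; this is not the author's actual sampling code.

```stata
* Sketch only: assumes the full HERS data (n=2,763) are in memory
* and that raceth is coded 1=White, 2=African American, 3=Other.
set seed 12345                      // arbitrary seed, for reproducibility only
sample 300 if raceth == 1, count    // keep 300 randomly chosen whites
                                    // observations failing the if condition are all kept
tabulate raceth                     // confirm the group sizes
```

Because sample leaves observations that do not satisfy the if condition untouched, all African-American women and all women of other race-ethnicity remain in the data.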

Key: Green = Stata comments; Black = Stata commands; Blue = output; Red = annotations

1. Preliminary: Download anovaplot

. * Download the "add-on" anova command anovaplot if you don't already have it
. findit anovaplot

In the results window that opens, click the anovaplot link to download.

Input Data

. * Input HERS data (small version for class)
. use "

2. Descriptives - Graphs

. ** Set scheme for graphs - user choice
. set scheme s1color

. * get min and max of y=sbp for y-axis tick marks
. tabstat sbp, stat(min max)

[tabstat output: min and max of sbp]

. * retrieve correspondence between codes and code labels and get sample sizes
. numlabel, add
. tabulate raceth

[tabulate output: Freq., Percent, Cum. of race/ethnicity for White, African American, Other, Total]

. ** Side-by-side dot plot - PLAIN
. sort raceth
. dotplot sbp, over(raceth)

The sbp distributions are similar in the 3 groups. It's unlikely that the anova will be significant.

. ** Side-by-side dot plot - PRETTY
. dotplot sbp, over(raceth) msymbol(d) msize(vsmall) mcolor(blue)
> ylabel(75(25)200, labsize(small))
> xlabel(1 "Whites (n=300)" 2 "African-Americans (n=218)" 3 "Other (n=94)", labsize(small) angle(45))
> title("Systolic BP (mm Hg), by Race-Ethnicity") subtitle("n=612")
> caption("dot_pretty.png", size(vsmall))

. ** Side-by-side box plot - PLAIN
. graph box sbp, over(raceth)

. ** Side-by-side box plot - PRETTY
. graph box sbp, over(raceth) ylabel(75(25)200, labsize(small))
> title("Systolic BP (mm Hg), by Race-Ethnicity") subtitle("n=612")
> caption("box_pretty.png", size(vsmall))

3. Descriptives - Numerical

. * Descriptives - Numerical Descriptives of Raw Data
. tabstat sbp, by(raceth) stat(n mean sd sem min q max)

[tabstat output: Summary for variables: sbp, by categories of: raceth (race/ethnicity); N, mean, sd, se(mean), min, p25, p50, p75, max for White, African American, Other, Total]

The numerical descriptives tell the same story: the group means (in mm Hg) differ little.

4. Assessment of Normality

. * Shapiro-Wilk Test (NULL: Distribution is normal and is retained for pvalue=large)
. swilk sbp

[swilk output: Shapiro-Wilk W test for normal data; Obs, W, V, z, Prob>z for sbp]

The null hypothesis of normality is rejected (p <.00001). We need a graphical look to see if we're really in trouble.

. ** histogram with overlay normal - PLAIN
. histogram sbp, normal
(bin=24, start=95, width=4.125)

. ** histogram with overlay normal - PRETTY
. histogram sbp, normal start(70) width(5) percent xlabel(75(25)200, labsize(small))
> title("Distribution of Y=sbp") subtitle("Assessment of Normality")
> caption("histogram_pretty.png", size(vsmall))
(bin=25, start=70, width=5)

Not bad actually!! For purposes of this illustration, we'll proceed under the assumption of normality.
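The checks above are applied to the pooled outcome sbp, whereas the normality assumption of one way anova concerns the within-group errors. A minimal sketch of running the same checks on the model residuals, assuming the HERS subsample (sbp, raceth) is in memory. The variable name resid is hypothetical and is not part of the original illustration.

```stata
* Sketch only: assumes sbp and raceth are in memory; resid is a hypothetical name
quietly anova sbp raceth
predict double resid, residuals    // observed sbp minus its fitted group mean
swilk resid                        // Shapiro-Wilk test on the residuals
qnorm resid                        // normal quantile plot of the residuals
```

When group means are as close as they are here, the residual-based check and the pooled check will usually agree. The variable resid can be dropped once the check is done.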

5. Analysis of Variance Model Estimation

A one way anova can be obtained with either of two commands: anova or oneway.

. * The command ANOVA uses deviation from means parameterization
. * anova YVARIABLE FACTOR
. anova sbp raceth

[anova output: Number of obs = 612, R-squared, Root MSE, Adj R-squared; table of Partial SS, df, MS, F, Prob > F for Model, raceth, Residual, Total]

As expected, the F-test for the null hypothesis of equality of means is not statistically significant (p=.32).

. * The command ONEWAY uses deviation from means and provides Bartlett test of equal variances
. * oneway YVARIABLE FACTOR
. oneway sbp raceth

[oneway output: Analysis of Variance table (SS, df, MS, F, Prob > F for Between groups, Within groups, Total); Bartlett's test for equal variances, chi2(2) and Prob>chi2]

. * The command ONEWAY with option TABULATE
. oneway sbp raceth, tabulate

[oneway output: Summary of systolic blood pressure by race/ethnicity (Mean, Std. Dev., Freq. for White, African American, Other, Total); Analysis of Variance table]

[oneway output, continued: Bartlett's test for equal variances, chi2(2) and Prob>chi2]

. *** Reference Cell Coding Approach requires two steps
. * Step 1 - Command is anova
. anova sbp raceth

[anova output: Number of obs = 612, R-squared, Root MSE, Adj R-squared; table of Partial SS, df, MS, F, Prob > F for Model, raceth, Residual, Total]

. * Step 2 - Command is regress
. regress

[regress output: Source table (SS, df, MS for Model, Residual, Total); Number of obs, F( 2, 609) = 1.14, Prob > F, R-squared, Adj R-squared, Root MSE; coefficient table for sbp (Coef., Std. Err., t, P>|t|, 95% Conf. Interval) with rows for raceth and _cons]
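The two-step approach above replays the anova fit as a regression. As a hedged alternative, Stata 11 and later accept factor-variable notation, so the reference cell coefficients can also be requested directly. The sketch below assumes the HERS subsample (sbp, raceth) is in memory and is not taken from the original handout.

```stata
* Sketch only: assumes sbp and raceth are in memory (Stata 11 or later)
regress sbp i.raceth        // lowest level of raceth is the reference cell
testparm i.raceth           // overall F-test of equal group means
regress sbp ib3.raceth      // same model with level 3 as the reference cell
```

Changing the reference cell changes the individual coefficients but not the overall F-test, which should match the F statistic reported by anova.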

6. Tests of Equality of Variances

. * BARTLETT's Test is provided in output from command oneway
. * Caution: This test is sensitive to the assumption of normality
. oneway sbp raceth

[oneway output: Analysis of Variance table (SS, df, MS, F, Prob > F for Between groups, Within groups, Total); Bartlett's test for equal variances, chi2(2) and Prob>chi2]

The null hypothesis of equal variances is not rejected (Bartlett test p-value=.20).

. * LEVENE and BROWN-FORSYTHE tests are obtained using the command robvar
. * These are good choices when assumption of normality is in question
. * W_0 = Levene test
. * W_50 = Brown-Forsythe modification of Levene test (mean is replaced by median)
. * W_10 = Brown-Forsythe modification of Levene test (mean is replaced by 10% trimmed mean)
. * robvar yvar, by(factor)
. robvar sbp, by(raceth)

[robvar output: Summary of systolic blood pressure by race/ethnicity (Mean, Std. Dev., Freq. for White, African American, Other, Total); W0 (Levene), W50 (Brown-Forsythe with median), and W10 (Brown-Forsythe with 10% trimmed mean), each with df(2, 609) and Pr > F]

The null hypothesis of equal variances is not rejected by Levene's test either (p-value=.24).
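The closing commands of this illustration drop variables named m and absolutediff that were "created to do Levene's test", but the commands that built them are not shown. A plausible reconstruction, offered as a sketch rather than the author's code: Levene's W0 is the F statistic from a one way anova of the absolute deviations of each observation from its group mean. The scalar name F_manual is hypothetical, and the HERS subsample is assumed to be in memory.

```stata
* Sketch only: assumes sbp and raceth are in memory
egen double m = mean(sbp), by(raceth)           // group means of sbp
generate double absolutediff = abs(sbp - m)     // absolute deviation from group mean
quietly oneway absolutediff raceth              // anova on the absolute deviations
scalar F_manual = r(F)                          // hypothetical scalar name
quietly robvar sbp, by(raceth)
display "manual F = " F_manual "   robvar W0 = " r(w0)
```

The two displayed values should agree up to rounding. Replacing the group mean with the group median in the egen step would give the W50 variant.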

7. Post-Hoc Pairwise Comparisons of Groups

Pairwise comparisons of groups are done using the command pwcompare. Note: You must have fit the model first using anova.

. * Be sure to have first fit model using anova
. anova sbp raceth

[anova output: Number of obs = 612, R-squared, Root MSE, Adj R-squared; table of Partial SS, df, MS, F, Prob > F for Model, raceth, Residual, Total]

. * No adjustment for multiple comparisons
. * pwcompare FACTOR
. pwcompare raceth

[pwcompare output: Pairwise comparisons of marginal linear predictions; Margins: asbalanced; unadjusted Contrast, Std. Err., 95% Conf. Interval for each pair of raceth levels]

For all pairwise comparisons of groups, the 95% confidence interval includes the null hypothesis value of zero.
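By default pwcompare reports confidence intervals only. A short sketch of the display options that add test statistics or show the group means themselves, assuming the HERS subsample (sbp, raceth) is in memory. These lines are illustrative and do not appear in the original handout.

```stata
* Sketch only: assumes sbp and raceth are in memory
quietly anova sbp raceth
pwcompare raceth, effects       // contrasts with t, P>|t| and 95% CI
pwcompare raceth, pveffects     // contrasts with t and P>|t| only
pwcompare raceth, cimargins     // group means with 95% CI
```

The effects option is convenient when both a p-value and an interval are wanted for each contrast in a single table.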

. * Bonferroni adjustment (NOT RECOMMENDED) for Multiple Comparisons - sorted and with p-values
. pwcompare raceth, mcompare(bonferroni) sort effects

[pwcompare output: Pairwise comparisons of marginal linear predictions; Margins: asbalanced; Number of Comparisons for raceth; Bonferroni-adjusted Contrast, Std. Err., t, P>|t|, 95% Conf. Interval for each pair of raceth levels, sorted]

Even with the stringent Bonferroni adjustment, for all pairwise comparisons of groups, no statistically significant differences are found. Not surprising, given what we've already seen.

. * Tukey adjustment for Multiple Comparisons
. * NOTE - This requires equal sample sizes in all groups
. * So, technically, I should not have done this
. pwcompare raceth, mcompare(tukey)

[pwcompare output: Pairwise comparisons of marginal linear predictions; Margins: asbalanced; Number of Comparisons for raceth; Tukey-adjusted Contrast, Std. Err., 95% Conf. Interval for each pair of raceth levels]

Note: The tukey method requires balanced data for proper level coverage. A factor was found to be unbalanced.

Yes, yes, I know. I should not have done the Tukey procedure.

8. Post-Hoc Graphs

Note: You must have fit the model first using anova.

. anova sbp raceth

[anova output: Number of obs = 612, R-squared, Root MSE, Adj R-squared; table of Partial SS, df, MS, F, Prob > F for Model, raceth, Residual, Total]

. * anova plot - PLAIN
. anovaplot
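If the add-on anovaplot cannot be installed, a comparable plot of the fitted group means can be drawn with the built-in margins and marginsplot commands (marginsplot is available from Stata 12). This is a sketch of an alternative, not the handout's method; it assumes the HERS subsample (sbp, raceth) is in memory, and the title text is made up for illustration.

```stata
* Sketch only: assumes sbp and raceth are in memory (Stata 12 or later)
quietly anova sbp raceth
margins raceth                                    // fitted mean sbp in each group, with 95% CI
marginsplot, title("Fitted mean sbp by raceth")   // plot the means and their intervals
```

Unlike anovaplot, this plot shows only the fitted means and their confidence intervals, not the individual observations.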

. * anovaplot - PRETTY
. anovaplot, legend(off) title("One Way ANOVA of Y=sbp over X=Race-Ethnicity") subtitle("n=612")
> ylabel(75(25)200, labsize(small))
> xlabel(1 "Whites (n=300)" 2 "African-Americans (n=218)" 3 "Other (n=94)", labsize(small) angle(45))
> caption("anovaplot_pretty.png", size(vsmall))

. * I think I'll drop the things I created to do Levene's test before exiting
. drop m
. drop absolutediff
. drop yhat

. log close


Guideline on evaluating the impact of policies -Quantitative approach- Guideline on evaluating the impact of policies -Quantitative approach- 1 2 3 1 The term treatment derives from the medical sciences and has more meaning when is used in that context. However, this term

More information

ECON Introductory Econometrics Seminar 9

ECON Introductory Econometrics Seminar 9 ECON4150 - Introductory Econometrics Seminar 9 Stock and Watson EE13.1 May 4, 2015 Stock and Watson EE13.1 ECON4150 - Introductory Econometrics Seminar 9 May 4, 2015 1 / 18 Empirical exercise E13.1: Data

More information

Appendix C: Lab Guide for Stata

Appendix C: Lab Guide for Stata Appendix C: Lab Guide for Stata 2011 1. The Lab Guide is divided into sections corresponding to class lectures. Each section includes both a review, which everyone should complete and an exercise, which

More information

!! NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA ! NOTE: The SAS System used:!

!! NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA ! NOTE: The SAS System used:! 1 The SAS System NOTE: Copyright (c) 2002-2010 by SAS Institute Inc., Cary, NC, USA. NOTE: SAS (r) Proprietary Software 9.3 (TS1M0) Licensed to UNIVERSITY OF TORONTO/COMPUTING & COMMUNICATIONS, Site 70072784.

More information

A Little Stata Session 1

A Little Stata Session 1 A Little Stata Session 1 Following is a very basic introduction to Stata. I highly recommend the tutorial available at: http://www.ats.ucla.edu/stat/stata/default.htm When you bring up Stata, you will

More information

Unit 6: Simple Linear Regression Lecture 2: Outliers and inference

Unit 6: Simple Linear Regression Lecture 2: Outliers and inference Unit 6: Simple Linear Regression Lecture 2: Outliers and inference Statistics 101 Thomas Leininger June 18, 2013 Types of outliers in linear regression Types of outliers How do(es) the outlier(s) influence

More information

Problem Points Score USE YOUR TIME WISELY SHOW YOUR WORK TO RECEIVE PARTIAL CREDIT

Problem Points Score USE YOUR TIME WISELY SHOW YOUR WORK TO RECEIVE PARTIAL CREDIT STAT 512 EXAM I STAT 512 Name (7 pts) Problem Points Score 1 40 2 25 3 28 USE YOUR TIME WISELY SHOW YOUR WORK TO RECEIVE PARTIAL CREDIT WRITE LEGIBLY. ANYTHING UNREADABLE WILL NOT BE GRADED GOOD LUCK!!!!

More information

CHAPTER 8 T Tests. A number of t tests are available, including: The One-Sample T Test The Paired-Samples Test The Independent-Samples T Test

CHAPTER 8 T Tests. A number of t tests are available, including: The One-Sample T Test The Paired-Samples Test The Independent-Samples T Test CHAPTER 8 T Tests A number of t tests are available, including: The One-Sample T Test The Paired-Samples Test The Independent-Samples T Test 8.1. One-Sample T Test The One-Sample T Test procedure: Tests

More information

Never Smokers Exposure Case Control Yes No

Never Smokers Exposure Case Control Yes No Question 0.4 Never Smokers Exosure Case Control Yes 33 7 50 No 86 4 597 29 428 647 OR^ Never Smokers (33)(4)/(7)(86) 4.29 Past or Present Smokers Exosure Case Control Yes 7 4 2 No 52 3 65 69 7 86 OR^ Smokers

More information

Question Total Points Points Received 1 16

Question Total Points Points Received 1 16 ame: Check one: Mon.-Wed. Section: Tues.-Thurs. Section: Statistics 0 Midterm # ovember, 000 6-8pm This exam is closed book. You may have two pages of notes. You may use a calculator. You must write the

More information

PREDICTIVE MODEL OF TOTAL INCOME FROM SALARIES/WAGES IN THE CONTEXT OF PASAY CITY

PREDICTIVE MODEL OF TOTAL INCOME FROM SALARIES/WAGES IN THE CONTEXT OF PASAY CITY Page22 PREDICTIVE MODEL OF TOTAL INCOME FROM SALARIES/WAGES IN THE CONTEXT OF PASAY CITY Wilson Cordova wilson.cordova@cksc.edu.ph Chiang Kai Shek College, Philippines Abstract There are varied sources

More information

Psy 420 Midterm 2 Part 2 (Version A) In lab (50 points total)

Psy 420 Midterm 2 Part 2 (Version A) In lab (50 points total) Psy 40 Midterm Part (Version A) In lab (50 points total) A researcher wants to know if memory is improved by repetition (Duh!). So he shows a group of five participants a list of 0 words at for different

More information

Two Way ANOVA. Turkheimer PSYC 771. Page 1 Two-Way ANOVA

Two Way ANOVA. Turkheimer PSYC 771. Page 1 Two-Way ANOVA Page 1 Two Way ANOVA Two way ANOVA is conceptually like multiple regression, in that we are trying to simulateously assess the effects of more than one X variable on Y. But just as in One Way ANOVA, the

More information

Milk Data Analysis. 1. Objective: analyzing protein milk data using STATA.

Milk Data Analysis. 1. Objective: analyzing protein milk data using STATA. 1. Objective: analyzing protein milk data using STATA. 2. Dataset: Protein milk data set (in the class website) Data description: Percentage protein content of milk samples at weekly intervals from each

More information

Post-Estimation Commands for MLogit Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017

Post-Estimation Commands for MLogit Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017 Post-Estimation Commands for MLogit Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017 These notes borrow heavily (sometimes verbatim) from Long &

More information

Regression diagnostics

Regression diagnostics Regression diagnostics Biometry 755 Spring 2009 Regression diagnostics p. 1/48 Introduction Every statistical method is developed based on assumptions. The validity of results derived from a given method

More information

for var trstprl trstlgl trstplc trstplt trstep: reg X trust10 stfeco yrbrn hinctnt edulvl pltcare polint wrkprty

for var trstprl trstlgl trstplc trstplt trstep: reg X trust10 stfeco yrbrn hinctnt edulvl pltcare polint wrkprty for var trstprl trstlgl trstplc trstplt trstep: reg X trust10 stfeco yrbrn hinctnt edulvl pltcare polint wrkprty -> reg trstprl trust10 stfeco yrbrn hinctnt edulvl pltcare polint wrkprty Source SS df MS

More information

BUS105 Statistics. Tutor Marked Assignment. Total Marks: 45; Weightage: 15%

BUS105 Statistics. Tutor Marked Assignment. Total Marks: 45; Weightage: 15% BUS105 Statistics Tutor Marked Assignment Total Marks: 45; Weightage: 15% Objectives a) Reinforcing your learning, at home and in class b) Identifying the topics that you have problems with so that your

More information

Read and Describe the SENIC Data

Read and Describe the SENIC Data Read and Describe the SENIC Data If the data come in an Excel spreadsheet (very common), blanks are ideal for missing values. The spreadsheet must be.xls, not.xlsx. Beware of trying to read a.csv file

More information

Number of obs = R-squared = Root MSE = Adj R-squared =

Number of obs = R-squared = Root MSE = Adj R-squared = Appendix for the details of statistical test results Statistical Package used:stata/se 11.1 1. ANOVA result with dependent variable: current level of happiness, independent variables: sexs, ages, and survey

More information

Checking the model. Linearity. Normality. Constant variance. Influential points. Covariate overlap

Checking the model. Linearity. Normality. Constant variance. Influential points. Covariate overlap Checking the model Linearity Normality Constant variance Influential points Covariate overlap 1 Checking the model: linearity Average value of outcome initially assumed to be linear function of continuous

More information

Timing Production Runs

Timing Production Runs Class 7 Categorical Factors with Two or More Levels 189 Timing Production Runs ProdTime.jmp An analysis has shown that the time required in minutes to complete a production run increases with the number

More information

Categorical Variables, Part 2

Categorical Variables, Part 2 Spring, 000 - - Categorical Variables, Part Project Analysis for Today First multiple regression Interpreting categorical predictors and their interactions in the first multiple regression model fit in

More information

ADVANCED ECONOMETRICS I

ADVANCED ECONOMETRICS I ADVANCED ECONOMETRICS I Practice Exercises (1/2) Instructor: Joaquim J. S. Ramalho E.mail: jjsro@iscte-iul.pt Personal Website: http://home.iscte-iul.pt/~jjsro Office: D5.10 Course Website: http://home.iscte-iul.pt/~jjsro/advancedeconometricsi.htm

More information

Compartmental Pharmacokinetic Analysis. Dr Julie Simpson

Compartmental Pharmacokinetic Analysis. Dr Julie Simpson Compartmental Pharmacokinetic Analysis Dr Julie Simpson Email: julieas@unimelb.edu.au BACKGROUND Describes how the drug concentration changes over time using physiological parameters. Gut compartment Absorption,

More information

Interactions made easy

Interactions made easy Interactions made easy André Charlett Neville Q Verlander Health Protection Agency Centre for Infections Motivation Scientific staff within institute using Stata to fit many types of regression models

More information

Statistical Modelling for Social Scientists. Manchester University. January 20, 21 and 24, Modelling categorical variables using logit models

Statistical Modelling for Social Scientists. Manchester University. January 20, 21 and 24, Modelling categorical variables using logit models Statistical Modelling for Social Scientists Manchester University January 20, 21 and 24, 2011 Graeme Hutcheson, University of Manchester Modelling categorical variables using logit models Software commands

More information

Center for Demography and Ecology

Center for Demography and Ecology Center for Demography and Ecology University of Wisconsin-Madison A Comparative Evaluation of Selected Statistical Software for Computing Multinomial Models Nancy McDermott CDE Working Paper No. 95-01

More information

I. INTRODUCTION II. LITERATURE REVIEW. Ahmad Subagyo 1, Armanto Wijaksono 2. Lecture Management at GCI Business School, Indonesia

I. INTRODUCTION II. LITERATURE REVIEW. Ahmad Subagyo 1, Armanto Wijaksono 2. Lecture Management at GCI Business School, Indonesia 218 IJSRST Volume 4 Issue 8 Print ISSN: 2395-611 Online ISSN: 2395-62X Themed Section: Science and Technology Test Quality of Variance and Tabulation : Case study from Indonesia Ahmad Subagyo 1, Armanto

More information

Term Test #2, ECO220Y, January 31, 2014 Page 1 of 12

Term Test #2, ECO220Y, January 31, 2014 Page 1 of 12 Term Test #2, ECO220Y, January 31, 2014 Page 1 of 12 Last Name: First Name: There are 5 questions on 12 pages with varying point values for a total of 88 possible points. The last page is a formula sheet

More information

SPSS 14: quick guide

SPSS 14: quick guide SPSS 14: quick guide Edition 2, November 2007 If you would like this document in an alternative format please ask staff for help. On request we can provide documents with a different size and style of

More information

Survey commands in STATA

Survey commands in STATA Survey commands in STATA Carlo Azzarri DECRG Sample survey: Albania 2005 LSMS 4 strata (Central, Coastal, Mountain, Tirana) 455 Primary Sampling Units (PSU) 8 HHs by PSU * 455 = 3,640 HHs svy command:

More information

Statistics: Data Analysis and Presentation. Fr Clinic II

Statistics: Data Analysis and Presentation. Fr Clinic II Statistics: Data Analysis and Presentation Fr Clinic II Overview Tables and Graphs Populations and Samples Mean, Median, and Standard Deviation Standard Error & 95% Confidence Interval (CI) Error Bars

More information