Tutorial: Regression & correlation. Presented by Jessica Raterman and Shannon Hodges

+ Access & assess your data
- Install and/or load the MASS package to access the dataset birthwt.
- Familiarize yourself with the data:
  - Structure, i.e. what type of data?
  - Number of observations?
  - Parametric or nonparametric?
  - Number and names of columns?
  - Are you working with complete or incomplete data?
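A minimal sketch of this first look at the data, using base R functions only. The particular set of inspection calls is a suggestion rather than the presenters' exact script.

```r
# install.packages("MASS")   # only needed if MASS is not already installed
library(MASS)

data(birthwt)                 # make the dataset available in the workspace

str(birthwt)                  # type of each column, plus number of observations
dim(birthwt)                  # number of rows (observations) and columns
names(birthwt)                # column names
head(birthwt)                 # first few rows

anyNA(birthwt)                # TRUE if any value is missing
sum(complete.cases(birthwt))  # number of rows with no missing values
```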

+ Access & assess your data
- Explore the variables and put them in a more meaningful context.
  - What does the variable lwt measure? What type of variable is it?
  - Look through the rest of the variables. Hypothesizing yet?
- Produce simple summary statistics. Anything noteworthy, or is more information needed?
- Optional: rename the data for easier coding.
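One way to carry out this step is sketched below. The short name `bw` is a hypothetical choice for the optional renaming, not a name from the tutorial.

```r
library(MASS)

help("birthwt", package = "MASS")  # variable definitions, including lwt

class(birthwt$lwt)                 # what type of variable is lwt?
summary(birthwt$lwt)               # five-number summary and mean for one variable
summary(birthwt)                   # the same for every column

# Optional: a shorter name for easier coding ("bw" is an arbitrary choice).
bw <- birthwt
```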

+ Access & assess your data
- Now that you have a better handle on what the data are, start reasoning:
  - Generate a few scatterplots. You can look at all pairs, or just those of interest if you have some ideas about which variables might be interesting.
  - Any relationships?
  - Do a quick test of your preliminary suspicions by asking for the correlation between two variables of interest.
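The scatterplots and the quick correlation check might look like the sketch below. The pair `lwt` and `bwt` is an assumption made for illustration, since the tutorial leaves the choice of variables to you.

```r
library(MASS)

pairs(birthwt)                       # scatterplot matrix of all pairs

# A single pair of interest (lwt and bwt are an illustrative choice).
plot(bwt ~ lwt, data = birthwt,
     xlab = "lwt", ylab = "bwt")

r <- cor(birthwt$lwt, birthwt$bwt)   # Pearson correlation by default
r
cor.test(birthwt$lwt, birthwt$bwt)   # the same estimate with a test and CI
```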

+ Access & assess your data
- Decide on two variables to use for the tutorial (for practice's sake; don't spend too much time on this!).
  - Again, use a help function to remind yourself what's being measured or to start reasoning through what might be related.

+ Access & assess your data
- Check normality & distribution.
  - Visual assessment: do your variables follow a normal distribution?
  - Leverage points/outliers?
  - Consider transformations if necessary.
  - Don't forget to note/deal with missing values in your own datasets.
- After changes:
  - Visualize again. Recheck the distribution.
  - Has the correlation changed?
  - Has the scatterplot changed?
  - Think through what these changes mean.
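A sketch of these checks follows, again assuming `lwt` and `bwt` as the two variables. The log transformation and the Shapiro-Wilk test are illustrative additions that the slides do not name, so treat them as one option among several.

```r
library(MASS)

x <- birthwt$lwt   # assumed predictor
y <- birthwt$bwt   # assumed response

# Visual assessment: histograms and normal Q-Q plots.
par(mfrow = c(2, 2))
hist(x, main = "lwt"); hist(y, main = "bwt")
qqnorm(x); qqline(x)
qqnorm(y); qqline(y)
par(mfrow = c(1, 1))

boxplot(x, horizontal = TRUE)  # points beyond the whiskers are possible outliers
shapiro.test(x)                # optional formal normality test
colSums(is.na(birthwt))        # missing values per column

# One possible transformation, then look again.
log_x <- log(x)
hist(log_x, main = "log(lwt)")
plot(y ~ log_x)                # has the scatterplot changed?
cor(x, y)                      # correlation before
cor(log_x, y)                  # correlation after
```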

+ Parametric: Linear Regression
- Regress y on x.
- Check the model's summary.
  - What values are of interest?
  - Check model assumptions.
  - How much variation does your model explain?
  - How much and in what direction does y change for each unit of x (i.e. explain the slope)?
  - Put together the predictive equation.
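A minimal sketch of fitting and inspecting the model, assuming `bwt` as y and `lwt` as x (an illustrative choice, not one fixed by the tutorial):

```r
library(MASS)

fit <- lm(bwt ~ lwt, data = birthwt)  # regress y (bwt) on x (lwt)

summary(fit)             # coefficients, standard errors, p-values, R-squared
coef(fit)                # intercept and slope: change in y per unit of x
summary(fit)$r.squared   # proportion of variation in y explained by the model

# Standard diagnostic plots for checking model assumptions.
par(mfrow = c(2, 2))
plot(fit)
par(mfrow = c(1, 1))
```

In simple linear regression the R-squared value equals the squared Pearson correlation between x and y, which is a handy consistency check against the correlation computed earlier.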

+ Parametric: Linear Regression
- Confidence and prediction
  - Confidence intervals for all parameters: check B0, B1.
  - CI for mean response: what interval of y values do we expect given x?
  - Single predicted values of mean response.
  - What about single values of y for a given x?
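The difference between the two kinds of interval can be sketched as follows. The model (`bwt ~ lwt`) and the three `lwt` values in `new_x` are hypothetical, chosen only to illustrate the calls.

```r
library(MASS)

fit <- lm(bwt ~ lwt, data = birthwt)

confint(fit, level = 0.95)    # confidence intervals for B0 and B1

# Hypothetical x values at which to predict.
new_x <- data.frame(lwt = c(100, 130, 160))

# Interval for the MEAN response at each x.
ci_mean <- predict(fit, newdata = new_x, interval = "confidence", level = 0.95)

# Interval for a SINGLE new observation of y at each x.
pi_single <- predict(fit, newdata = new_x, interval = "prediction", level = 0.95)

ci_mean
pi_single
```

Both calls return the same fitted values in the `fit` column. The prediction interval is always wider, because it adds the scatter of individual observations to the uncertainty in the estimated mean.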

+ Parametric: Linear Regression
- Add the line of best fit to visually assess how well it fits your data. Remember that you need to rerun your plot if you've closed it.
- Find the regression equation y = B0 + B1 x (B1 is negative when y decreases as x increases).
  - Use the summary to get these values; you can plug in numbers and predict values this way, too.
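A sketch of drawing the fitted line and using the equation by hand. The variables `bwt` and `lwt` and the value `lwt = 130` are assumptions for illustration.

```r
library(MASS)

fit <- lm(bwt ~ lwt, data = birthwt)

plot(bwt ~ lwt, data = birthwt)     # the plot must be open before adding a line
abline(fit, col = "red", lwd = 2)   # line of best fit

b0 <- coef(fit)[["(Intercept)"]]    # B0
b1 <- coef(fit)[["lwt"]]            # B1
cat(sprintf("bwt = %.2f %+.2f * lwt\n", b0, b1))

# Plugging a number into the equation matches predict().
by_hand    <- b0 + b1 * 130
by_predict <- predict(fit, newdata = data.frame(lwt = 130))
all.equal(by_hand, unname(by_predict))
```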

+ Nonparametric
Use when the residuals are not normally distributed or when you cannot assume a linear relationship between x and y.
- Correlation
  - You will first need to change your coefficient of correlation to a suitable nonparametric method (e.g. Spearman). Check the help file.
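Switching the correlation method is a one-argument change in base R, as sketched here with the assumed pair `lwt` and `bwt`. `exact = FALSE` is set only to avoid the warning R gives when the data contain ties.

```r
library(MASS)

x <- birthwt$lwt
y <- birthwt$bwt

help("cor")                            # lists the available methods

rho <- cor(x, y, method = "spearman")  # rank-based correlation
rho
cor(x, y, method = "kendall")          # another rank-based option

cor.test(x, y, method = "spearman", exact = FALSE)
```

Spearman's coefficient is the Pearson correlation computed on the ranks of the data, which is why it is less sensitive to outliers and to monotone but nonlinear relationships.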

+ Nonparametric
- Smooth with loess, then use linear regression.
  - Check residuals again with summary. Improved?
  - Does it meet the requirements for linear regression now?
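The slide is terse, so the sketch below is one possible reading of it: fit a loess curve, draw it alongside the linear fit, and inspect the residuals. The variables `bwt` and `lwt` and the default loess settings are assumptions.

```r
library(MASS)

# Sort by x so the smoothed curve is drawn left to right.
d <- birthwt[order(birthwt$lwt), ]

lo  <- loess(bwt ~ lwt, data = d)   # local regression smoother, default span
fit <- lm(bwt ~ lwt, data = d)      # linear fit for comparison

plot(bwt ~ lwt, data = d)
lines(d$lwt, predict(lo), col = "blue", lwd = 2)  # loess curve
abline(fit, col = "red", lty = 2)                 # straight line

summary(lo)                 # includes the residual standard error
summary(residuals(lo))      # spread of the loess residuals
summary(residuals(fit))     # compare with the linear model's residuals

qqnorm(residuals(lo)); qqline(residuals(lo))      # normality of residuals
```

If the blue curve stays close to the red line, a straight-line model is probably adequate. Clear bends in the curve suggest a transformation or a nonlinear model.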

+ Further practice
- Try one run-through of the tutorial with a new set of data that meets parametric requirements, and one with data that meets nonparametric requirements.
- Find new data of interest to practice with: https://stat.ethz.ch/r-manual/r-patched/library/datasets/html/00index.html

+ Sources
- Hartlaub, BA. 2011. Introduction to R. [internet]. Downloaded on January 26, 2015. Available at http://www2.kenyon.edu/depts/math/hartlaub/Math305%20Fall2011/R.htm
- Hosmer DW, Lemeshow S, and Sturdivant RX, editors. 1989. Applied Logistic Regression, 3rd edition. New York: John Wiley & Sons Inc.
- Stack Exchange. [internet]. Fit a Line with LOESS in R. Downloaded on January 30, 2015. Available at http://stackoverflow.com/questions/15337777/fit-a-line-with-loess-in-r