R Short Course Session 5
1 R Short Course Session 5 Daniel Zhao, PhD Sixia Chen, PhD Department of Biostatistics and Epidemiology College of Public Health, OUHSC 11/20/2015
2 Outline: Linear Regression. Fit linear regression and check results; examine normality, independence and heteroscedasticity; examine outliers and influence points; examine collinearity; model selection procedure; model comparisons.
3 Outline (2): Logistic Regression. Fit logistic model and check results; odds ratio and confidence intervals; model selection procedure; goodness of fit test.
4 Linear Regression
Fitting linear regression of y vs x: lm(y~x, data=, weights=, singular.ok=TRUE)
Example: input data is the Prestige dataset in R package car
Variables: education, income, women, prestige, census and type
5 Linear Regression (2)
summary(Prestige)
6 Linear Regression (3)
pairs(Prestige)
7 Fit linear regression
reg1 <- lm(prestige~education+log2(income)+women, data=Prestige)
8 Check the results
summary(reg1)
9 Check the results (2)
attributes(reg1)
10 Check the results (3)
reg1$coefficients
reg1$df.residual
11 Examine Normality
QQ plot for studentized residuals (qqPlot is in R package car):
qqPlot(reg1, main="QQ Plot")
Distribution of studentized residuals:
library(MASS)
sresid <- studres(reg1)
hist(sresid, freq=FALSE, main="Distribution of Studentized Residuals")
xfit <- seq(min(sresid), max(sresid), length=40)
yfit <- dnorm(xfit)
lines(xfit, yfit)
12 Examine Normality (2)
13 Examine Normality (3)
shapiro.test(reg1$residuals)
ad.test(reg1$residuals) # ad.test is in R package nortest
14 Plots: Examine Independence
15 Examine Independence (2)
Durbin-Watson test: durbinWatsonTest(reg1)
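The Durbin-Watson statistic behind this test can be computed directly from the residuals. The sketch below is only an illustration: it uses a stand-in model on the built-in mtcars data (an assumption, chosen so it runs without extra packages) rather than reg1. Values near 2 suggest little first-order autocorrelation; values toward 0 or 4 suggest positive or negative autocorrelation.

```r
# Stand-in model on built-in data (assumption; not the Prestige model from the slides)
fit <- lm(mpg ~ wt + hp, data = mtcars)
e <- residuals(fit)

# Durbin-Watson statistic: squared successive differences of the residuals
# divided by the residual sum of squares
dw <- sum(diff(e)^2) / sum(e^2)
print(dw)
```

The statistic depends on the row order of the data, so it is only meaningful when the observations have a natural ordering (for example, time).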
16 Plots: Examine heteroscedasticity
17 Examine heteroscedasticity (2)
Goldfeld-Quandt test: gqtest(reg1). Note that we need to install R package lmtest.
Non-constant error variance test: ncvTest(reg1)
18 Examine Outliers
outlierTest(reg1) # Bonferroni p-value for most extreme obs
lm.influence(reg1) # Calculate the diagonal of the hat matrix and the influence of each point on the regression coefficient and residual standard deviation estimates
19 Examine Outliers (2)
lm.influence(reg1)$hat # calculate leverage
lm.influence(reg1)$hat[lm.influence(reg1)$hat > 3*(3+1)/102] # identify high-leverage cases
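The cutoff 3*(3+1)/102 is three times the average leverage (k+1)/n, with k = 3 predictors and n = 102 observations. As a hedged sketch of where the leverages come from, the code below computes the diagonal of the hat matrix by hand and compares it with lm.influence. It uses a stand-in model on the built-in mtcars data (an assumption), not reg1.

```r
# Stand-in model on built-in data (assumption; not the Prestige model from the slides)
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(fit)                    # design matrix, including the intercept column
H <- X %*% solve(crossprod(X)) %*% t(X)   # hat matrix H = X (X'X)^{-1} X'
h_manual <- diag(H)                       # leverage of each observation
h_builtin <- lm.influence(fit)$hat

n <- nrow(X)
k <- ncol(X) - 1                          # number of predictors, excluding the intercept
cutoff <- 3 * (k + 1) / n                 # three times the average leverage
high_leverage <- h_builtin[h_builtin > cutoff]
print(high_leverage)
```

Because the leverages sum to k+1, the average leverage is (k+1)/n, which is why the cutoff scales with the number of coefficients.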
20 Examine influence points
influence.measures(reg1) # calculate DFFITS, Cook's D, DFBETAS and covariance ratios
21 Examine influence points (2)
Cook's D plot (identify D values > 4/(n-k-1)):
cutoff <- 4/(nrow(Prestige) - length(reg1$coefficients)) # length(reg1$coefficients) is k+1, so this is 4/(n-k-1)
plot(reg1, which=4, cook.levels=cutoff)
Influence plot:
influencePlot(reg1, id.method="identify", main="Influence Plot", sub="Circle size is proportional to Cook's Distance")
22 Cook's D plot
23 Influence Plot
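Besides reading the plot, the observations that exceed the 4/(n-k-1) rule of thumb can be listed directly. This is a minimal sketch on a stand-in model fitted to the built-in mtcars data (an assumption), using the base R function cooks.distance.

```r
# Stand-in model on built-in data (assumption; not the Prestige model from the slides)
fit <- lm(mpg ~ wt + hp, data = mtcars)

d <- cooks.distance(fit)          # Cook's distance for every observation
n <- nrow(mtcars)
k <- length(coef(fit)) - 1        # number of predictors, excluding the intercept
cutoff <- 4 / (n - k - 1)

flagged <- d[d > cutoff]          # observations above the rule-of-thumb cutoff
print(sort(flagged, decreasing = TRUE))
```

The cutoff is a screening convention rather than a formal test, so flagged points are candidates for a closer look, not automatic deletions.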
24 Examine Collinearity
Variance inflation factors: vif(reg1)
Problem? sqrt(vif(reg1)) > 2
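For a model with only numeric predictors, the variance inflation factor of predictor j is 1/(1 - R_j^2), where R_j^2 comes from regressing that predictor on the others. The sketch below computes this by hand on the built-in mtcars data; the dataset and the choice of predictors are assumptions made so the example runs without the car package.

```r
# Stand-in predictors from built-in data (assumption; not the Prestige variables)
predictors <- c("wt", "hp", "disp")

vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  # Regress predictor p on the remaining predictors and take the R-squared
  r2 <- summary(lm(reformulate(others, response = p), data = mtcars))$r.squared
  1 / (1 - r2)
})

print(vif_manual)
print(sqrt(vif_manual) > 2)   # same rule of thumb as on the slide
```

sqrt(VIF) is the factor by which the standard error of a coefficient is inflated relative to the case of uncorrelated predictors, which is what the "> 2" check is screening for.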
25 Examine Collinearity (2)
Added variable plots: avPlots(reg1)
26 Model Selection Procedures
step(object=, scope=list(lower=, upper=), direction=c("both", "backward", "forward"), steps=1000, k=2, ...)
#object is an object representing a model
#scope defines the range of models examined in the stepwise search
#direction controls the mode of stepwise search
27 Model Selection Procedures (2)
edu2 <- Prestige$education^2
loginc2 <- log2(Prestige$income)^2
edulogin <- Prestige$education*log2(Prestige$income)
reg2 <- lm(prestige~education+edu2+log2(income)+loginc2+edulogin+women, data=Prestige)
step(reg2, direction='both')
28 Model Selection Procedures (3)
29 Model Comparisons
anova(reg1, reg2)
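For two nested linear models, anova() reports a partial F test. As a rough sketch of what it computes, the code below rebuilds the F statistic and p-value from the residual sums of squares. The two models are stand-ins fitted to the built-in mtcars data (an assumption), not reg1 and reg2.

```r
# Stand-in nested models on built-in data (assumption; not reg1/reg2 from the slides)
small <- lm(mpg ~ wt, data = mtcars)
big   <- lm(mpg ~ wt + hp + disp, data = mtcars)

rss_small <- sum(residuals(small)^2)
rss_big   <- sum(residuals(big)^2)
df_diff   <- df.residual(small) - df.residual(big)   # number of extra terms in the larger model

# Partial F statistic: drop in RSS per extra term, relative to the larger model's error variance
f_manual <- ((rss_small - rss_big) / df_diff) / (rss_big / df.residual(big))
p_manual <- pf(f_manual, df_diff, df.residual(big), lower.tail = FALSE)

tab <- anova(small, big)
print(tab)
print(c(F = f_manual, p = p_manual))
```

The comparison is only valid when the smaller model is nested in the larger one and both are fitted to the same observations.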
30 Logistic Regression
Input data: plasma in package HSAUR
Variables:
fibrinogen: the fibrinogen level in the blood
globulin: the globulin level in the blood
ESR: the erythrocyte sedimentation rate, either less than or greater than 20 mm/hour
31 Logistic Regression (2)
plasma data: head(plasma)
Objective: fit a logistic regression using ESR as the dependent variable and the other two as independent variables
32 Logistic Regression (3)
glm(formula, data=, family=, weights=, intercept=, ...)
#formula is of the form y~x
#data is the input dataset
#family can be gaussian, binomial or others
#weights specifies weighted or unweighted analysis
#intercept is logical (do we need an intercept or not?)
33 Fit logistic model
fit <- glm(ESR~fibrinogen+globulin, data=plasma, family=binomial('logit'))
34 Fit logistic model (2)
summary(fit)
35 Fit logistic model (3)
attributes(fit)
36 Logistic regression plot
attach(plasma)
ESR2 <- rep(1, dim(plasma)[1])
ESR2[ESR=='ESR > 20'] <- 0
fit <- glm(ESR2~fibrinogen, data=plasma, family=binomial('logit'))
plot(fibrinogen, ESR2)
lines(fibrinogen[order(fibrinogen)], fit$fitted.values[order(fibrinogen)])
37 Logistic regression plot (2)
38 Odds ratio and confidence intervals
Calculate odds ratio: exp(coef(fit))
Calculate variance-covariance matrix for coefficients: vcov(fit)
39 Odds ratio and confidence intervals (2)
Confidence interval for coefficients: confint.default(fit)
Confidence interval for odds ratios: exp(confint.default(fit))
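confint.default returns Wald intervals, that is, estimate plus or minus a normal quantile times the standard error, where the standard errors are the square roots of the diagonal of vcov(fit). The sketch below reproduces them by hand and exponentiates to the odds-ratio scale. It uses a stand-in logistic model on the built-in mtcars data (an assumption), not the plasma model.

```r
# Stand-in logistic model on built-in data (assumption; not the plasma model from the slides)
fit <- glm(am ~ wt, data = mtcars, family = binomial("logit"))

est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))      # standard errors from the variance-covariance matrix
z   <- qnorm(0.975)               # normal quantile for a 95% interval

ci_manual  <- cbind(est - z * se, est + z * se)
ci_builtin <- confint.default(fit)

# Odds ratios with Wald confidence limits: exponentiate estimate and both limits
or_table <- exp(cbind(OR = est, lower = ci_manual[, 1], upper = ci_manual[, 2]))
print(or_table)
```

Because exp() is monotone, exponentiating the limits of the coefficient interval gives a valid interval for the odds ratio; the resulting interval is not symmetric around the odds ratio.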
40 Model selection
fit2 <- glm(ESR2~fibrinogen+globulin, data=plasma, family=binomial('logit'))
step(fit2)
41 Model selection (2)
42 Hosmer-Lemeshow goodness of fit test
hoslem.test(x, y, g=10) in R package ResourceSelection
#x is a numeric vector of observations, binary (0/1)
#y is the expected values
#g is the number of bins to use to calculate quantiles
43 Example
hoslem.test(ESR2, fit2$fitted.values)
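To make the roles of x, y and g concrete, here is a hedged sketch of the Hosmer-Lemeshow statistic as it is commonly defined: group the observations into g bins by quantiles of the fitted probabilities, compare observed and expected counts of 1s and 0s in each bin, and refer the sum to a chi-squared distribution with g-2 degrees of freedom. The data are simulated (an assumption), and hoslem.test may differ in implementation details, so treat this as an illustration rather than a replacement.

```r
# Simulated data (assumption; not the plasma data from the slides)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + x))
fit <- glm(y ~ x, family = binomial("logit"))
yhat <- fitted(fit)

g <- 10
# Bin observations by quantiles of the fitted probabilities
breaks <- quantile(yhat, probs = seq(0, 1, length.out = g + 1))
bin <- cut(yhat, breaks = breaks, include.lowest = TRUE)

n_bin <- tapply(y, bin, length)    # observations per bin
obs1  <- tapply(y, bin, sum)       # observed 1s per bin
exp1  <- tapply(yhat, bin, sum)    # expected 1s per bin
obs0  <- n_bin - obs1              # observed 0s per bin
exp0  <- n_bin - exp1              # expected 0s per bin

hl_stat <- sum((obs1 - exp1)^2 / exp1 + (obs0 - exp0)^2 / exp0)
hl_p <- pchisq(hl_stat, df = g - 2, lower.tail = FALSE)
print(c(statistic = hl_stat, p.value = hl_p))
```

With small samples or many tied fitted values the quantile breaks may not be unique, in which case a smaller g is needed.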
44 Questions Contact and