Practical Aspects of Modelling Techniques in Logistic Regression Procedures of the SAS System

Rainer Muche 1, Josef Högel 1 and Olaf Gefeller 2
1 Clinical Documentation, University of Ulm, Germany
2 Department of Medical Statistics, University of Göttingen, Germany

ABSTRACT

Practical applications of logistic regression analysis often involve model-building strategies to find an appropriate final model that gives a comprehensible description of the relationship under study. Statistical software for logistic regression modelling therefore has to be evaluated not only for its features for analysing one given model but also for its capability to support the model-building process. In this paper, the procedures within the SAS software that perform logistic regression modelling are examined to determine the extent to which they offer the features necessary for convenient model-building. The deficits found in this investigation are described together with a proposal for their resolution.

INTRODUCTION

Logistic regression modelling is the methodological approach frequently applied in epidemiology and clinical research to detect and evaluate risk factors for dichotomous disease variables. In this context, information is usually obtained on a multitude of explanatory variables (potential risk factors) and on the response variable (the disease). A variable selection algorithm is then routinely employed to reduce the number of explanatory variables. Variable selection techniques as part of statistical modelling in logistic regression are applied to obtain an appropriate model fitting the data, which should (i) lead to stable parameter estimates, (ii) predict future values accurately, and (iii) allow comprehensible interpretation, since only a few important explanatory variables are selected [1].

In order to meet these requirements, variable selection algorithms should cover two statistical strategies: incorporation of the hierarchy principle in selecting interaction terms, and coding of categorical variables via dummy variables.

In practice, the choice of a variable selection algorithm usually depends not only on its statistical properties but also on the availability of the corresponding software. Since SAS is the leading software package for statistical analyses, the statistical features of the variable selection techniques implemented in the SAS procedures for logistic regression will have a major impact on the current practice of variable selection in applications. In this paper, the present situation within the SAS/STAT procedures performing logistic regression modelling is investigated, and a proposal for resolving the deficits found in this investigation is provided.

STATISTICAL BACKGROUND

Logistic Regression Analysis

To analyse the influence of explanatory variables on the risk for a disease described by a dichotomous variable, the logistic regression model is usually employed [2]. This model can be used to analyse simultaneously explanatory variables and interaction terms of different data types (D: disease; E_1, ..., E_k: explanatory variables; I_{k+1}, ..., I_l: interaction terms):

$$P(D = 1 \mid E_1 = e_1, \ldots, E_k = e_k, I_{k+1} = i_{k+1}, \ldots, I_l = i_l) = \frac{1}{1 + \exp\left(-\left(\beta_0 + \sum_{i=1}^{k} \beta_i e_i + \sum_{j=k+1}^{l} \beta_j i_j\right)\right)}$$

Variable Selection Procedures

In epidemiology and clinical research, a variable selection algorithm is routinely employed in order to reduce the number of explanatory variables in the final model [3-7]. Such a procedure attempts to yield a final model which should primarily lead to unbiased, stable estimators of the regression coefficients. Several selection procedures exist in the literature and in statistical software (e.g. stepwise methods, all possible subset methods, tree structure testing strategy). The criterion for selecting variables for the final model is often based on significance testing.

Hierarchy Principle

In model-building strategies, the hierarchy principle refers to the rule that no interaction term should be included in a model without simultaneously incorporating all components (main effects and lower-order interactions) involved in that term [8-10]. This requirement is essential for the interpretability of estimates in the final model. Applying the principle means starting the analysis at the highest-order interactions. If an interaction stays in the model after variable selection, the components involved are forced into the model. This procedure is then repeated from the highest-order interactions down to the main effects.

Dummy Variables

Categorical variables are usually included in the model after coding them as a set of dummy variables [11]. In this situation, their common effect is assessed by simultaneous criteria (e.g. a simultaneous likelihood-ratio test). Most textbooks on model-building give the rule that categorical main effects have to be represented in the final model by the whole set of their dummy variables.

IMPLEMENTATION OF LOGISTIC REGRESSION IN PROCEDURES OF THE SAS SOFTWARE

First, a brief overview is given of how logistic regression modelling, and variable selection techniques as part of the modelling process, are realized in procedures of the SAS software. Logistic regression is directly implemented in three procedures of SAS/STAT [12]: PROC LOGISTIC, PROC CATMOD, and PROC PROBIT. PROC PROBIT will not be considered in this paper because it is not the usual procedure for performing logistic regression analysis. Automatic variable selection algorithms are only available in PROC LOGISTIC. Syntax and relevant features for model-building of PROC LOGISTIC [A] and PROC CATMOD [B] are as follows:

[A]  PROC LOGISTIC DATA=data ;
     MODEL d = e1 ... ek ik+1 ... il / LINK=logit SELECTION= INCLUDE= ;

PROC LOGISTIC gives maximum-likelihood estimators and a likelihood-ratio test to assess the effects of the variables included in the MODEL statement. Four automatic variable selection algorithms are implemented in the procedure: forward, backward, and stepwise selection, and an all possible subset selection method based on the score statistic [13]. Interaction terms have to be coded explicitly in the DATA step as a product of the main effects entered into the model. The dummy variables corresponding to a categorical or ordinal explanatory variable also have to be constructed explicitly in the DATA step.
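The explicit DATA step coding described above might look like the following minimal sketch. The data set RAW and the variables d, age and smoke (with levels 0, 1, 2) are hypothetical names introduced only for illustration; the sketch also assumes a SAS/STAT release in which the DESCENDING option of PROC LOGISTIC is available, so that the probability of d = 1 is modelled.

```sas
/* Hypothetical input data set RAW:
   d      disease status (1 = diseased, 0 = not diseased)
   age    continuous explanatory variable
   smoke  categorical explanatory variable with levels 0, 1, 2 */
data model_data;
   set raw;
   /* Dummy coding with level 0 as reference category.
      The dummies stay missing when smoke itself is missing. */
   if smoke ne . then do;
      smoke1 = (smoke = 1);
      smoke2 = (smoke = 2);
   end;
   /* Interaction terms as explicit products of the main effects */
   age_smoke1 = age * smoke1;
   age_smoke2 = age * smoke2;
run;

/* Backward elimination over all constructed variables;
   each variable is treated as a separate candidate */
proc logistic data=model_data descending;
   model d = age smoke1 smoke2 age_smoke1 age_smoke2
         / selection=backward slstay=0.05;
run;
```

The guard on missing values is a design choice of this sketch: a bare comparison such as (smoke = 1) evaluates to 0 for a missing value, which would silently assign such observations to the reference category.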
Hence, using this procedure, dummy variables are treated as independent variables in model-building without recognizing their interrelationship. There is no possibility of testing the common effect of several dummy variables. The hierarchy principle can only be applied manually by using the INCLUDE= option. With this option, all variables except the highest-order interactions can be fixed. In this context, the variable selection algorithm will only give the relevant variables among these highest-order interactions. However, for each level of interaction, model-building has to be started again.

[B]  PROC CATMOD DATA=data ;
     DIRECT ;
     MODEL d = e1 ... ek ik+1 ... il / ML ;

PROC CATMOD offers an alternative way to perform logistic regression analysis. Maximum-likelihood estimators for the model parameters are calculated using the ML option. However, there is no automatic variable selection algorithm implemented in this procedure and, consequently, variable selection has to be carried out "by hand" using differences of the model likelihood values. Proceeding like this, the hierarchy principle can be applied, but in a very time-consuming way: each new model that results from removing a "non-explanatory" variable has to be fitted from scratch. The main advantage of using PROC CATMOD is that categorical variables not listed in the DIRECT statement are coded automatically as dummy variables and a test of the common effect of these dummy variables (simultaneous likelihood-ratio test) is performed. Table 1 summarizes the statistical features of both procedures that are relevant for model-building purposes.

Table 1: Features of the SAS/STAT procedures PROC LOGISTIC and PROC CATMOD for logistic regression model-building

Name of SAS procedure   automatic variable selection   hierarchy-principle   automatic dummy-coding
PROC LOGISTIC           x                              -                     -
PROC CATMOD             -                              -                     x

DEMAND FOR IMPLEMENTATION OF HIERARCHY-PRINCIPLE AND DUMMY-CODING IN VARIABLE SELECTION ALGORITHMS

As already mentioned, implementation of the hierarchy principle and simultaneous analysis of dummy variables are necessary for using statistical software in model-building. The results of the SAS software ballots of the last years demonstrate that a substantial proportion of SAS users also miss these tools within the SAS software. Table 2 gives the corresponding items of the ballot together with the number of votes and ranking.

Table 2: SAS software-ballot results ( ) for topics connected with model-building in logistic regression procedures

SAS procedure   SUGI-Ballot   Rank   Votes   Item
PROC LOGISTIC                                provide a mechanism for variables to enter and leave the model as a group during model selection
PROC LOGISTIC                                add a CLASS statement allowing all resulting dummy variables for a variable to enter and leave the model together during model selection
PROC LOGISTIC                                add a CLASS statement allowing all resulting dummy variables for a variable to enter and leave the model together during model selection
PROC LOGISTIC                                provide a TEST statement to test linear combinations of parameters
PROC CATMOD                                  implement stepwise selection of variables
PROC CATMOD                                  implement stepwise selection of variables
PROC CATMOD                                  allow automatic model selection

It should not be a problem to implement such analytical model-building strategies in PROC LOGISTIC. Interactions could be specified directly in the MODEL statement, using the same notation as in PROC GLM, where main effects are multiplied with an asterisk (*). In doing so, the procedure will know which variables are interactions and of what order they are. As a consequence, implementing the hierarchy principle in automatic variable selection algorithms will not present any problem. Activation of the hierarchy principle could be implemented by a separate option (like HIERARCHY). The following syntactical approach can handle this problem:

     PROC LOGISTIC DATA=data ;
     MODEL d = e1 ... ek e1*e2 e1*e3 ... ek-1*ek / LINK=logit SELECTION=... HIERARCHY ;

By implementing an extra statement (like CLASS), the identification of categorical or ordinal variables becomes possible. As in PROC CATMOD, automatic dummy coding and simultaneous assessment of the common effects can then be applied.
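For comparison, the manual INCLUDE= workaround for PROC LOGISTIC described earlier, which a HIERARCHY option would make unnecessary, could be sketched roughly as follows. The data set STUDY, the variables d, e1, e2, e3 and the outcome assumed for step 1 are hypothetical and serve only as illustration; the sketch also assumes a SAS/STAT release that supports the DESCENDING option.

```sas
/* Hypothetical data set STUDY: d = disease (1/0), e1, e2, e3 = explanatory variables */
data hier;
   set study;
   /* Two-way interactions coded as explicit products */
   e1e2 = e1 * e2;
   e1e3 = e1 * e3;
   e2e3 = e2 * e3;
run;

/* Step 1: the three main effects are listed first and fixed via INCLUDE=3,
   so backward elimination acts only on the two-way interactions */
proc logistic data=hier descending;
   model d = e1 e2 e3 e1e2 e1e3 e2e3
         / selection=backward slstay=0.05 include=3;
run;

/* Step 2 (assuming, for illustration only, that e1e2 was the sole
   interaction retained in step 1): e1e2 and its components e1 and e2
   are fixed via INCLUDE=3, and selection acts on the main effect e3 */
proc logistic data=hier descending;
   model d = e1e2 e1 e2 e3
         / selection=backward slstay=0.05 include=3;
run;
```

INCLUDE=n fixes the first n variables listed in the MODEL statement, which is why the order of the variables changes between the two steps. With higher-order interactions, one such run is needed per interaction level.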

REFERENCES

[1] Kron, M., Gefeller, O. (1992). A critical appraisal of variable selection methods in regression models. In: A. Westlake, R. Blanks, C. Payne, T. Orchards (Eds.): Survey and statistical computing. North-Holland, Amsterdam.
[2] Hosmer, D.W., Lemeshow, S. (1989). Applied Logistic Regression. John Wiley, New York.
[3] Aldrich, J.H., Nelson, F.D. (1984). Linear probability, logit, and probit models. SAGE Publications, Beverly Hills.
[4] Cox, D.R., Snell, E.J. (1989). Analysis of binary data. Chapman and Hall, London.
[5] Greenland, S. (1989). Modelling and variable selection in epidemiologic analysis. Am. J. Public Health 79.
[6] Harrell, F.E., Lee, K.L., Califf, R.M., Pryor, D.B., Rosati, R.A. (1984). Regression modelling strategies for improved prognostic prediction. Statistics in Medicine 3.
[7] Miller, A.J. (1984). Selection of subsets of regression variables (with discussion). J.R. Statist. Soc. A 147.
[8] Kleinbaum, D.G., Kupper, L.L., Morgenstern, H. (1982). Epidemiologic Research: Principles and quantitative methods. Lifetime Learning Publications, Belmont/California.
[9] Kleinbaum, D.G., Kupper, L.L., Chambless, L.E. (1982). Logistic regression analysis of epidemiologic data: Theory and practice. Commun. Stat. - Theor. Meth. 11, John Wiley, New York.
[10] Bishop, Y.M.M., Fienberg, S.E., Holland, P.W. (1975). Discrete multivariate analysis. MIT Press, Cambridge.
[11] Cohen, A. (1991). Dummy variables in stepwise regression. The American Statistician 45.
[12] SAS Institute Inc. (1990). SAS/STAT User's Guide, 4th Edition. SAS Institute Inc., Cary, NC.
[13] Gefeller, O., Muche, R. (1993). Variable Selection Techniques implemented in Procedures of the SAS software. Proceedings of SEUGI 93.

Address for correspondence:
Dipl.-Stat. Rainer Muche
Klinische Dokumentation, Universität Ulm
Ulm, Germany

SAS and SAS/STAT are registered trademarks of SAS Institute Inc., Cary, NC, USA.


More information

From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here.

From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here. From Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Full book available for purchase here. Contents List of Figures xv Foreword xxiii Preface xxv Acknowledgments xxix Chapter

More information

Consumer Demand Preference Patterns:

Consumer Demand Preference Patterns: Consumer Demand Preference Patterns: Using PROC IML* to Forecast Future Demand Daric Brummett, University of Notre Dame Lawrence Marsh, University of Notre Dame Economic models of consumer demand generally

More information

CASE CONTROL MATCHING: COMPARING SIMPLE DISTANCE- AND PROPENSITY SCORE-BASED METHODS

CASE CONTROL MATCHING: COMPARING SIMPLE DISTANCE- AND PROPENSITY SCORE-BASED METHODS Paper 1861-2014 CASE CONTROL MATCHING: COMPARING SIMPLE DISTANCE- AND PROPENSITY SCORE-BASED METHODS Lovedeep Gondara, BC Cancer Agency; Colleen McGahan, BC Cancer Agency ABSTRACT A case control study

More information

Monitoring, analyzing, and optimizing Waterflood Responses Leon Fedenczuk, Gambit Consulting Ltd. Calgary, Alberta*

Monitoring, analyzing, and optimizing Waterflood Responses Leon Fedenczuk, Gambit Consulting Ltd. Calgary, Alberta* Paper 123-28 Monitoring, analyzing, and optimizing Waterflood Responses Leon Fedenczuk, Gambit Consulting Ltd. Calgary, Alberta* ABSTRACT During a waterflood, large amounts of injected water are used to

More information
