Measures of Association for Larger Tables

We have illustrated the calculation and interpretation of measures of association for contingency tables with two rows and two columns, so-called two-by-two tables. The interpretation of the measures of association for larger tables is analogous, but the calculation is much more involved. Here we illustrate the calculation of gamma and lambda for the three-by-three cross-tabulation in Table 16.11. This example will show how useful it can be to apply different measures of association to a contingency table.

Even though the table is larger, calculation of gamma follows the same three-step procedure elaborated earlier. First, calculate the number of concordant pairs and the number of discordant pairs of cases in the cross-tabulation. Next, calculate the difference between the number of concordant pairs and the number of discordant pairs. Finally, divide this difference by the sum of the number of concordant pairs and the number of discordant pairs.

To begin, because measures of association are calculated from the raw frequencies rather than from percentaged data, we must convert the percentages in Table 16.11 to frequencies. Table 16.12 shows the result.

Table 16.11 Percentaged Cross-Tabulation of Hierarchy and Job Satisfaction

                        Hierarchy
Job Satisfaction    Low     Middle   High
Low                 75%     10%      20%
Medium              15%     10%      70%
High                10%     80%      10%
Total               100%    100%     100%
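The conversion from percentages to frequencies can be sketched in a few lines of Python. This is only an illustration: it assumes 200 employees at each level of the hierarchy, a figure read from the column totals of Table 16.12 rather than from the percentaged table itself, and the names `percentages`, `column_totals`, and `to_frequencies` are introduced here for the example.

```python
# Column percentages from Table 16.11.
# Rows: job satisfaction (low, medium, high).
# Columns: hierarchy (low, middle, high).
percentages = [
    [75, 10, 20],
    [15, 10, 70],
    [10, 80, 10],
]

# Assumption: 200 employees per hierarchy level, taken from the column
# totals of Table 16.12; the percentaged table does not state these counts.
column_totals = [200, 200, 200]


def to_frequencies(pct_table, col_totals):
    """Convert column percentages to raw cell frequencies."""
    return [
        [round(pct * n / 100) for pct, n in zip(row, col_totals)]
        for row in pct_table
    ]


frequencies = to_frequencies(percentages, column_totals)
for row in frequencies:
    print(row)
# [150, 20, 40]
# [30, 20, 140]
# [20, 160, 20]
```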

Table 16.12 Cross-Tabulation of Hierarchy and Job Satisfaction (Frequencies)

                        Hierarchy
Job Satisfaction    Low     Medium   High     Total
Low                 150     20       40       210
Medium              30      20       140      190
High                20      160      20       200
Total               200     200      200      600

As explained before, gamma is based on the number of concordant pairs of cases versus the number of discordant pairs in the table; the concordant pairs demonstrate support for a positive relationship, whereas the discordant pairs show support for a negative relationship.

To find the number of concordant pairs, work through the table, moving downward and to the right simultaneously. Begin with the cell in the top row and left column of the table. All table cells both below and to the right of this cell form concordant pairs with it. Four cells satisfy this condition: the middle-row middle-column cell of the table, the middle-row right-column cell, the bottom-row middle-column cell, and the bottom-row right-column cell. Sum the frequencies of the four cells (20 + 140 + 160 + 20 = 340) and multiply the result by the frequency in the top-row left-column cell (150). This multiplication gives the number of concordant pairs that can be formed with the top-row left-column cell: 150 × 340 = 51,000 pairs [see part (a) of Figure 16.1].

Move to the top-row middle-column cell of the table. Cells forming concordant pairs are again down and to the right: the middle-row right-column cell and the bottom-row right-column cell. Sum the frequencies in these two cells (140 + 20 = 160) and multiply by the frequency in the top-row middle-column cell (20). This multiplication gives the number of concordant pairs that can be formed with the top-row middle-column cell: 20 × 160 = 3,200 pairs [see part (b) of Figure 16.1].

Because no table cells are both to the right of and below the top-row right-column cell of the table, it forms no concordant pairs. Instead, move to the middle-row left-column cell of the table. Concordant pairs are formed with the cells below and to the right: the bottom-row middle-column cell and the bottom-row right-column cell. Sum these two cell frequencies (160 + 20 = 180) and multiply by the frequency in the middle-row left-column cell (30). This multiplication gives the number of concordant pairs that can be formed with this cell: 30 × 180 = 5,400 pairs [see part (c) of Figure 16.1].

Move to the middle-row middle-column cell of the table. With which cells does it form concordant pairs? Just one: the bottom-row right-column cell. Multiply the two cell frequencies to find the number of concordant pairs: 20 × 20 = 400 [see part (d) of Figure 16.1].

You may not realize it, but you have now found all concordant pairs in the table. Because no table cell is both below and to the right of the middle-row right-column cell, no concordant pairs can be formed with it. Similarly, since no table cell is both below and to the right of the cells in the bottom row of the table, no concordant pairs can be formed with any of them. The total number of concordant pairs is equal to the sum of the four sets of concordant pairs that we have calculated: 51,000 + 3,200 + 5,400 + 400 = 60,000 [see Figure 16.1, parts (a), (b), (c), and (d)].

Figure 16.1 Concordant and Discordant Pairs

To find the number of discordant pairs, the procedure is analogous to that for concordant pairs, except that you must start with the top-row right-column cell of the table and move downward and to the left simultaneously to form the pairs. Parts (e) through (h) of Figure 16.1 show the procedure schematically. To begin, multiply the frequency in the top-row right-column cell (40) by the sum of the frequencies in the cells both below and to the left (20 + 30 + 160 + 20 = 230), yielding 40 × 230 = 9,200 pairs. Move to the top-row middle-column cell; multiply this frequency (20) by the sum of the frequencies in the cells both below and to the left (30 + 20 = 50), giving 20 × 50 = 1,000 pairs. Move to the middle-row right-column cell of the table, and multiply this frequency (140) by the sum of the cell frequencies below and to the left (160 + 20 = 180), yielding 140 × 180 = 25,200 pairs. Finally, the discordant pairs for the middle-row middle-column cell are formed with the bottom-row left-column cell only; multiplying the relevant cell frequencies yields 20 × 20 = 400 pairs.
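The pair-counting rules described above generalize to any table whose rows and columns are both ordered from low to high. The following Python sketch is one way to express them; the function names `count_pairs` and `gamma` are hypothetical, chosen for this illustration, and the code assumes the table is laid out in the standard format with frequencies rather than percentages.

```python
def count_pairs(table):
    """Return (concordant, discordant) pair counts for an ordered r x c table.

    Assumes rows and columns are both ordered from low to high.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Cells strictly below and to the right form concordant pairs.
            below_right = sum(
                table[k][m]
                for k in range(i + 1, n_rows)
                for m in range(j + 1, n_cols)
            )
            # Cells strictly below and to the left form discordant pairs.
            below_left = sum(
                table[k][m]
                for k in range(i + 1, n_rows)
                for m in range(j)
            )
            concordant += table[i][j] * below_right
            discordant += table[i][j] * below_left
    return concordant, discordant


def gamma(table):
    """Gamma = (concordant - discordant) / (concordant + discordant)."""
    c, d = count_pairs(table)
    return (c - d) / (c + d)


# Cell frequencies from Table 16.12 (row and column totals omitted).
table_16_12 = [
    [150, 20, 40],
    [30, 20, 140],
    [20, 160, 20],
]

concordant, discordant = count_pairs(table_16_12)
print(concordant, discordant)        # 60000 35800
print(round(gamma(table_16_12), 2))  # 0.25
```

Looping over every cell, rather than only the cells named in the text, keeps the sketch general: cells in the bottom row or in an edge column simply contribute zero pairs.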
The total number of discordant pairs in the contingency table is the sum of the four sets of discordant pairs calculated above: 9,200 + 1,000 + 25,200 + 400 = 35,800 pairs [see Figure 16.1, parts (e), (f), (g), and (h)].

Recall that gamma is equal to the difference between the number of concordant pairs and the number of discordant pairs in the contingency table, divided by their sum. Thus, for the cross-tabulation in Table 16.12, gamma is equal to

$$\gamma = \frac{60{,}000 - 35{,}800}{60{,}000 + 35{,}800} = \frac{24{,}200}{95{,}800} \approx .25$$

This value of gamma suggests a modest degree of covariation or relationship between level in the hierarchy and job satisfaction.

Note how important it is to set up the cross-tabulation in the standard format displayed in Table 15.15 in Chapter 15. Had the ordering of the categories for either variable in the contingency table been reversed, concordant pairs would have been misidentified as discordant pairs, and vice versa. For calculating measures of association, whether by hand or by computer, the presumption is that the table has been set up in the standard format shown in that chapter.

Lambda is a measure of association for nominal data based on the ability to predict values of the dependent variable. Like all statistics for nominal data, it can always be applied to higher levels of measurement, such as the ordinal variables cross-tabulated in Table 16.12. Lambda is a proportional reduction in error statistic. The formula for lambda presented earlier indicates that we must (1) determine the number of errors in predicting the value of the dependent variable without knowledge of the independent variable, (2) subtract from this number the number of errors that we would make with knowledge of the independent variable to inform our predictions, and (3) see by what proportion the errors in predicting values of the dependent variable are reduced by introducing knowledge of the independent variable.

In Table 16.12, which category of job satisfaction would you predict that most employees have, if you did not know their level in the organizational hierarchy? Your best guess is low job satisfaction, because more employees gave this response than any other (210). You would make the correct prediction for these 210 employees, but you would be incorrect in making this prediction for employees with medium satisfaction (190) or high satisfaction (200). In all, you would make a total of 190 + 200 = 390 errors in predicting values of the dependent variable if you did not consider employees' position in the organizational hierarchy (the independent variable).

Now, introduce knowledge of the independent variable. For each category of hierarchy, select the category of the dependent variable that will minimize the number of errors in predicting employee job satisfaction. For employees who are low in the organizational hierarchy, what is your best guess of their level of job satisfaction? You should guess low satisfaction, because most employees low in the hierarchy gave this response (150). You would be correct in predicting the job satisfaction of these 150 employees, but you would make errors in prediction for the 30 employees low in the hierarchy who have medium job satisfaction and for the 20 who have high satisfaction, a total of 50 errors in prediction.

Which category of job satisfaction yields the fewest errors in prediction for the employees in the middle ranks of the organizational hierarchy? The best prediction is high job satisfaction. This prediction would be correct for 160 of the employees in the middle ranks, but it would be in error for the 20 employees with medium job satisfaction and for the 20 with low satisfaction in the middle of the hierarchy, a total of 40 errors.

Finally, for employees high in the organizational hierarchy, the best prediction of job satisfaction is medium. The prediction is correct for 140 employees but in error for 60 employees: the 40 with low job satisfaction and the 20 with high satisfaction in this category of hierarchy. In all, then, given knowledge of employees' standing in the organizational hierarchy (the independent variable), the total number of errors in predicting job satisfaction (the dependent variable) is 50 + 40 + 60 = 150.
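The two error counts can be reproduced with a short Python sketch. It assumes, as in Table 16.12, that rows hold the categories of the dependent variable and columns hold the categories of the independent variable; the function name `prediction_errors` is hypothetical and introduced only for this illustration.

```python
def prediction_errors(table):
    """Return (errors_without, errors_with) for a cross-tabulation.

    Assumes rows are categories of the dependent variable and columns are
    categories of the independent variable, as in Table 16.12.
    """
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    # Without the independent variable: always guess the modal row,
    # so every case outside that row is an error.
    errors_without = n - max(row_totals)
    # With the independent variable: within each column, guess that
    # column's modal cell; the rest of the column counts as errors.
    columns = list(zip(*table))
    errors_with = sum(sum(col) - max(col) for col in columns)
    return errors_without, errors_with


# Cell frequencies from Table 16.12 (row and column totals omitted).
table_16_12 = [
    [150, 20, 40],
    [30, 20, 140],
    [20, 160, 20],
]

errors_without, errors_with = prediction_errors(table_16_12)
print(errors_without, errors_with)  # 390 150
```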

Lambda evaluates how much prediction of the dependent variable has improved by introducing knowledge of the independent variable. In this example, we began with 390 errors in predicting employees' levels of job satisfaction, absent knowledge of their position in the organizational hierarchy. Introducing this knowledge, we made only 150 errors in prediction. By what proportion has our prediction been improved? The formula for lambda provides the answer:
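Following the three steps described for lambda above, the arithmetic for this example can be written out as below. The symbols $E_1$ and $E_2$ are labels introduced here for illustration (errors without and with knowledge of the independent variable, respectively); they are not notation taken from the text.

```latex
\lambda = \frac{E_1 - E_2}{E_1}
        = \frac{390 - 150}{390}
        = \frac{240}{390}
        \approx .62
```

On this reading, knowing an employee's position in the hierarchy reduces errors in predicting job satisfaction by roughly 62 percent.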