Measures of Association for Larger Tables

We have illustrated the calculation and interpretation of measures of association for contingency tables with two rows and two columns, so-called two-by-two tables. The interpretation of the measures of association for larger tables is analogous, but the calculation is much more involved. Here we illustrate the calculation of gamma and lambda for the three-by-three cross-tabulation in Table 16.11. This example will show how useful it can be to apply different measures of association to a contingency table.

Even though the table is larger, calculation of gamma follows the same three-step procedure elaborated earlier. First, calculate the number of concordant pairs and the number of discordant pairs of cases in the cross-tabulation. Next, calculate the difference between the number of concordant pairs and the number of discordant pairs. Finally, divide this difference by the sum of the number of concordant pairs and the number of discordant pairs.

To begin, because measures of association are calculated from the raw frequencies rather than from percentaged data, we must convert the percentages in Table 16.11 to frequencies. Table 16.12 shows the result.

Table 16.11 Percentaged Cross-Tabulation of Hierarchy and Job Satisfaction

                      Hierarchy
Job Satisfaction   Low     Medium   High
Low                75%     10%      20%
Medium             15%     10%      70%
High               10%     80%      10%
Total              100%    100%     100%
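The conversion from percentages to frequencies can be sketched in a few lines of Python. This is only an illustration: the function name `percentages_to_frequencies` is hypothetical, and the sketch assumes 200 cases in each hierarchy column, the column totals reported in Table 16.12.

```python
# Column percentages from Table 16.11. Rows are job satisfaction
# (low, medium, high); columns are hierarchy (low, medium, high).
PERCENTAGES = [
    [75, 10, 20],
    [15, 10, 70],
    [10, 80, 10],
]

# Assumption: 200 cases per hierarchy column, as in Table 16.12.
COLUMN_TOTALS = [200, 200, 200]


def percentages_to_frequencies(percentages, column_totals):
    """Convert column percentages to raw cell frequencies."""
    return [
        [round(pct * total / 100) for pct, total in zip(row, column_totals)]
        for row in percentages
    ]


frequencies = percentages_to_frequencies(PERCENTAGES, COLUMN_TOTALS)
for row in frequencies:
    print(row)
```

Each percentage is applied to the total of its own column, because the table is percentaged within categories of hierarchy.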
As explained before, gamma is based on the number of concordant pairs of cases versus the number of discordant pairs in the table; the concordant pairs demonstrate support for a positive relationship, whereas the discordant pairs show support for a negative relationship.

To find the number of concordant pairs, work through the table, moving downward and to the right simultaneously. Begin with the cell in the top row and left column of the table. All table cells both below and to the right of this cell form concordant pairs with it. Four cells satisfy this condition: the middle-row middle-column cell of the table, the middle-row right-column cell, the bottom-row middle-column cell, and the bottom-row right-column cell. Sum the frequencies of the four cells (20 + 140 + 160 + 20 = 340) and multiply the result by the frequency in the top-row left-column cell (150). This multiplication gives the number of concordant pairs that can be formed with the top-row left-column cell: 150 × 340 = 51,000 pairs [see part (a) of Figure 16.1].

Move to the top-row middle-column cell of the table. Cells forming concordant pairs are again down and to the right: the middle-row right-column cell and the bottom-row right-column cell. Sum the frequencies in these two cells (140 + 20 = 160) and multiply by the frequency in the top-row middle-column cell (20). This multiplication gives the number of concordant pairs that can be formed with the top-row middle-column cell: 20 × 160 = 3,200 pairs [see part (b) of Figure 16.1].

Table 16.12 Cross-Tabulation of Hierarchy and Job Satisfaction (Frequencies)

                      Hierarchy
Job Satisfaction   Low     Medium   High    Total
Low                150     20       40      210
Medium             30      20       140     190
High               20      160      20      200
Total              200     200      200     600

Because no table cells are both below and to the right of the top-row right-column cell of the table, it forms no concordant pairs. Instead, move to the middle-row left-column cell of the table. Concordant pairs are formed with the cells below and to the right: the bottom-row middle-column cell and the bottom-row right-column cell. Sum these two cell frequencies (160 + 20 = 180) and multiply by the frequency in the middle-row left-column cell (30). This multiplication gives the number of concordant pairs that can be formed with this cell: 30 × 180 = 5,400 pairs [see part (c) of Figure 16.1].

Figure 16.1 Concordant and Discordant Pairs

Move to the middle-row middle-column cell of the table. With which cells does it form
concordant pairs? Just one: the bottom-row right-column cell. Multiply the two cell frequencies to find the number of concordant pairs: 20 × 20 = 400 [see part (d) of Figure 16.1].

You may not realize it, but you have now found all concordant pairs in the table. Because no table cell is both below and to the right of the middle-row right-column cell, no concordant pairs can be formed with it. Similarly, because no table cell is both below and to the right of any cell in the bottom row of the table, no concordant pairs can be formed with any of them. The total number of concordant pairs is equal to the sum of the four sets of concordant pairs that we have calculated: 51,000 + 3,200 + 5,400 + 400 = 60,000 [see Figure 16.1, parts (a), (b), (c), and (d)].

To find the number of discordant pairs, the procedure is analogous to that for concordant pairs, except that you must start with the top-row right-column cell of the table and move downward and to the left simultaneously to form the pairs. Parts (e) through (h) of Figure 16.1 show the procedure schematically. To begin, multiply the frequency in the top-row right-column cell (40) by the sum of the frequencies in the cells both below and to the left (20 + 30 + 160 + 20 = 230), yielding 40 × 230 = 9,200 pairs. Move to the top-row middle-column cell; multiply this frequency (20) by the sum of the frequencies in the cells both below and to the left (30 + 20 = 50), giving 20 × 50 = 1,000 pairs. Move to the middle-row right-column cell of the table, and multiply this frequency (140) by the sum of the cell frequencies below and to the left (160 + 20 = 180), yielding 140 × 180 = 25,200 pairs. Finally, the discordant pairs for the middle-row middle-column cell are formed with the bottom-row left-column cell only; multiplying the relevant cell frequencies yields 20 × 20 = 400 pairs.
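The cell-by-cell procedure for concordant and discordant pairs can be checked with a short Python sketch. It is a minimal illustration, not a library routine: the function names `count_pairs` and `gamma` are hypothetical, and the sketch assumes the table is in the standard format, with categories of both variables ordered from low to high.

```python
# Frequencies from Table 16.12. Rows are job satisfaction
# (low, medium, high); columns are hierarchy (low, medium, high).
TABLE = [
    [150, 20, 40],
    [30, 20, 140],
    [20, 160, 20],
]


def count_pairs(table):
    """Return (concordant, discordant) pair counts for an ordered table."""
    concordant = discordant = 0
    for i, row in enumerate(table):
        for j, freq in enumerate(row):
            for lower_row in table[i + 1:]:
                # Cells below and to the right form concordant pairs.
                concordant += freq * sum(lower_row[j + 1:])
                # Cells below and to the left form discordant pairs.
                discordant += freq * sum(lower_row[:j])
    return concordant, discordant


def gamma(concordant, discordant):
    """Difference between the pair counts divided by their sum."""
    return (concordant - discordant) / (concordant + discordant)


concordant, discordant = count_pairs(TABLE)
print(concordant, discordant)  # 60000 35800
print(round(gamma(concordant, discordant), 2))  # 0.25
```

Cells in the bottom row and in the last column to be reached contribute nothing, because the slices below or beside them are empty, which mirrors the reasoning in the text.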
The total number of discordant pairs in the contingency table is the sum of these four sets of pairs: 9,200 + 1,000 + 25,200 + 400 = 35,800 pairs [see Figure 16.1, parts (e), (f), (g), and (h)].

Recall that gamma is equal to the difference between the number of concordant pairs and the number of discordant pairs in the contingency table, divided by their sum. Thus, for the cross-tabulation in Table 16.12, gamma is equal to

gamma = (60,000 − 35,800) / (60,000 + 35,800) = 24,200 / 95,800 ≈ 0.25

This value of gamma suggests a modest degree of covariation or relationship between level in the hierarchy and job satisfaction.

Note how important it is to set up the cross-tabulation in the standard format displayed in Table 15.15 in Chapter 15. Had the ordering of the categories for either variable in the contingency table been reversed, concordant pairs would have been misidentified as discordant pairs, and vice versa. For calculating measures of association, whether by hand or by computer, the presumption is that the table has been set up in the standard format shown in that chapter.

Lambda is a measure of association for nominal data based on the ability to predict values of the dependent variable. Like all statistics for nominal data, it can always be applied to higher levels of measurement, such as the ordinal variables cross-tabulated in Table 16.12. Lambda is a proportional reduction in error statistic. The formula for lambda presented earlier indicates that we must (1) determine the number of errors in predicting the value of the dependent variable without knowledge of the independent variable, (2) subtract from this number the number of errors that we would make with knowledge of the independent variable to inform our predictions, and (3) see by what proportion the errors in predicting values of the dependent variable are reduced by introducing knowledge of the independent variable.

In Table 16.12, which category of job satisfaction would you predict that most employees have, if you did not know their level in the organizational hierarchy? Your best guess is low job satisfaction, because more employees gave this response than any other (210). You would make the correct prediction for these 210 employees, but you would be incorrect in making this
prediction for employees with medium satisfaction (190) or high satisfaction (200). In all, you would make a total of 190 + 200 = 390 errors in predicting values of the dependent variable if you did not consider employees' position in the organizational hierarchy (the independent variable).

Now, introduce knowledge of the independent variable. For each category of hierarchy, select the category of the dependent variable that will minimize the number of errors in predicting employee job satisfaction. For employees who are low in the organizational hierarchy, what is your best guess of their level of job satisfaction? You should guess low satisfaction, because most employees low in the hierarchy gave this response (150). You would be correct in predicting the job satisfaction of these 150 employees, but you would make errors in prediction for the 30 employees low in the hierarchy who have medium job satisfaction and for the 20 who have high satisfaction, a total of 50 errors in prediction.

Which category of job satisfaction yields the fewest errors in prediction for the employees in the middle ranks of the organizational hierarchy? The best prediction is high job satisfaction. This prediction would be correct for 160 of the employees in the middle ranks, but it would be in error for the 20 employees with medium job satisfaction and for the 20 with low satisfaction in the middle of the hierarchy, a total of 40 errors.

Finally, for employees high in the organizational hierarchy, the best prediction of job satisfaction is medium. The prediction is correct for 140 employees but in error for 60 employees: the 40 with low job satisfaction and the 20 with high satisfaction in this category of hierarchy. In all, then, given knowledge of employees' standing in the organizational hierarchy (the independent variable), the total number of errors in predicting job satisfaction (the dependent variable) is 50 + 40 + 60 = 150.
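The two error counts can be reproduced with a brief Python sketch. It is an illustration under stated assumptions: the function name `prediction_errors` is hypothetical, job satisfaction (the rows) is treated as the dependent variable, and hierarchy (the columns) as the independent variable, as in the text.

```python
# Frequencies from Table 16.12. Rows are job satisfaction (dependent);
# columns are hierarchy (independent).
TABLE = [
    [150, 20, 40],
    [30, 20, 140],
    [20, 160, 20],
]


def prediction_errors(table):
    """Return (errors without, errors with) knowledge of the column variable."""
    row_totals = [sum(row) for row in table]
    # Without the independent variable, always guess the largest row category.
    errors_without = sum(row_totals) - max(row_totals)
    # With it, guess the largest cell within each column.
    errors_with = sum(sum(col) - max(col) for col in zip(*table))
    return errors_without, errors_with


errors_without, errors_with = prediction_errors(TABLE)
print(errors_without, errors_with)  # 390 150

# Proportional reduction in error, following the three steps for lambda.
reduction = (errors_without - errors_with) / errors_without
print(round(reduction, 3))
```

Swapping the roles of rows and columns would change both error counts, so the choice of dependent variable has to be made before running the calculation.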
Lambda evaluates how much prediction of the dependent variable has improved by introducing knowledge of the independent variable. In this example, we began with 390 errors in predicting employees' levels of job satisfaction, absent knowledge of their position in the organizational hierarchy. Introducing this knowledge, we made only 150 errors in prediction. By what proportion has our prediction been improved? The formula for lambda provides the answer: