CHECKING INFLUENCE DIAGNOSTICS IN THE OCCUPATIONAL PRESTIGE DATA

PLS 802
Spring 2018
Professor Jacoby

This handout shows the log from a Stata session that examines the Duncan Occupational Prestige data for influential observations. Recall that, previously, the data seemed to support the meritocracy theory but not the materialistic theory. Does an analysis of the influence statistics lead to a different conclusion about the determinants of occupational prestige?

(FIRST FEW LINES OMITTED FOR SPACE)

. set more off

. #delimit ;
delimiter now ;

Read data from text file, "occprest.txt".

. infile str21 occup income educ prestige
>      using "occprest.txt";
(15 observations read)

Estimate multiple regression model.

. regress prestige income educ, beta;

      Source |       SS       df       MS              Number of obs =
-------------+------------------------------           F(  2,    12) =
       Model |                                         Prob > F      =
    Residual |                                         R-squared     =
-------------+------------------------------           Adj R-squared =
       Total |                                         Root MSE      =

------------------------------------------------------------------------------
    prestige |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
      income |
        educ |
       _cons |
------------------------------------------------------------------------------

Use post-estimation commands to obtain added variable plots. First, get basic versions of plots, then enhance with graph options.

. avplot income,
>      name(avplot1, replace);

. graph export avplot1.pdf, replace;
(file avplot1.pdf written in PDF format)

. avplot educ,
>      name(avplot2, replace);

. graph export avplot2.pdf, replace;
(file avplot2.pdf written in PDF format)

. avplot income,
>      scheme(s1color)
>      msymbol(oh)
>      mcolor(black)
>      msize(*1.5)
>      xaxis(1 2)
>      yaxis(1 2)
>      ylabel(, axis(2) nolabel)
>      xlabel(, axis(2) nolabel)
>      ylabel(#4, axis(1) labsize(small))
>      xlabel(#3, axis(1) labsize(small))
>      ylabel(#4, axis(2) labsize(small))
>      xlabel(#3, axis(2) labsize(small))
>      xtitle("", axis(2))
>      ytitle("", axis(2))
>      mlabel(occup)
>      mlabsize(small)
>      aspectratio(1)
>      name(avplot3, replace);

. graph export avplot3.pdf, replace;
(file avplot3.pdf written in PDF format)

. avplot educ,
>      scheme(s1color)
>      msymbol(oh)
>      mcolor(black)
>      msize(*1.5)
>      xaxis(1 2)
>      yaxis(1 2)
>      ylabel(, axis(2) nolabel)
>      xlabel(, axis(2) nolabel)
>      ylabel(#4, axis(1) labsize(small))
>      xlabel(#3, axis(1) labsize(small))
>      ylabel(#4, axis(2) labsize(small))
>      xlabel(#3, axis(2) labsize(small))
>      xtitle("", axis(2))
>      ytitle("", axis(2))
>      mlabel(occup)
>      mlabsize(small)
>      aspectratio(1)
>      name(avplot4, replace);

. graph export avplot4.pdf, replace;
(file avplot4.pdf written in PDF format)
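The avplot command hides the construction of an added variable plot, so a by-hand version may help. The sketch below is not part of the original session. It assumes "occprest.txt" is in the working directory, is written as a separate do-file with the default line delimiter rather than "#delimit ;", and uses the illustrative names b_income, e_prestige, and e_income. It regresses prestige and income each on educ, then regresses one set of residuals on the other. By the Frisch-Waugh-Lovell result, that slope should match the income coefficient from the full model.

```stata
* Sketch: build the added variable plot for income by hand.
* Assumes occprest.txt is available, as in the session above.
clear
infile str21 occup income educ prestige using "occprest.txt"

* Full model; keep the income coefficient for comparison.
regress prestige income educ
scalar b_income = _b[income]

* Step 1: the part of prestige not explained by educ.
regress prestige educ
predict double e_prestige, residuals

* Step 2: the part of income not explained by educ.
regress income educ
predict double e_income, residuals

* Step 3: the residual-on-residual slope equals the full-model slope.
regress e_prestige e_income
display "full-model b = " b_income "   residual-on-residual b = " _b[e_income]

* The scatter of the two residual series is the added variable plot.
twoway (scatter e_prestige e_income, mlabel(occup) mlabsize(small)) ///
       (lfit e_prestige e_income),                                  ///
       ytitle("e( prestige | X )") xtitle("e( income | X )") legend(off)
```

Only the slope is reproduced exactly. The standard error from the residual regression uses different degrees of freedom, so its t statistic will differ slightly from the one avplot reports.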

Calculate and print influence statistics.

. predict hatvalue, leverage;

. predict studresid, rstudent;

. predict cookdist, cooksd;

. predict dffits, dfits;

. list occup hatvalue studresid cookdist dffits;

  occup                    hatvalue   studresid   cookdist   dffits
  Accountant
  Author
  Professor
  Civil Engineer
  Physician
  RR Conductor
  Store Manager
  Mail Carrier
  Carpenter
  Machinist
  Gas Station Attendant
  Taxi Driver
  Barber
  Cook
  Janitor

Re-run regression, omitting the influential observation.

. regress prestige income educ if
>      occup != "RR Conductor", beta;

      Source |       SS       df       MS              Number of obs =
-------------+------------------------------           F(  2,    11) =
       Model |                                         Prob > F      =
    Residual |                                         R-squared     =
-------------+------------------------------           Adj R-squared =
       Total |                                         Root MSE      =

------------------------------------------------------------------------------
    prestige |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
      income |
        educ |
       _cons |
------------------------------------------------------------------------------
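The handout identifies the influential observation by reading the listing. As a hedged supplement, the sketch below flags cases with commonly cited rules of thumb: hat values above 2p/n, absolute studentized residuals above 2, Cook's distance above 4/n, and absolute DFFITS above 2*sqrt(p/n), where p counts the coefficients including the constant. These cutoffs are conventions and do not come from the handout. The scalar names n_obs and n_coef and the flag_* variables are illustrative. The code is written as a separate do-file with the default line delimiter.

```stata
* Sketch: flag potentially influential cases with rule-of-thumb cutoffs.
* Assumes occprest.txt is available, as in the session above.
clear
infile str21 occup income educ prestige using "occprest.txt"

regress prestige income educ

predict double hatvalue,  leverage
predict double studresid, rstudent
predict double cookdist,  cooksd
predict double dffits,    dfits

* Sample size and number of coefficients (slopes plus constant).
scalar n_obs  = e(N)
scalar n_coef = e(df_m) + 1

* Conventional cutoffs; these are assumptions, not taken from the handout.
generate byte flag_hat    = hatvalue > 2*n_coef/n_obs
generate byte flag_rstud  = abs(studresid) > 2
generate byte flag_cook   = cookdist > 4/n_obs
generate byte flag_dffits = abs(dffits) > 2*sqrt(n_coef/n_obs)

* List only the cases that exceed at least one cutoff.
list occup hatvalue studresid cookdist dffits ///
     if flag_hat | flag_rstud | flag_cook | flag_dffits, noobs
```

The cutoffs are screening devices rather than tests. A case that trips several of them at once is the natural candidate for the re-estimation step that follows.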

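Create observation-specific dummy variable and include it in regression.

. generate rrcond = 0;

. replace rrcond = 1 if occup == "RR Conductor";
(1 real change made)

. regress prestige income educ rrcond;

      Source |       SS       df       MS              Number of obs =
-------------+------------------------------           F(  3,    11) =
       Model |                                         Prob > F      =
    Residual |                                         R-squared     =
-------------+------------------------------           Adj R-squared =
       Total |                                         Root MSE      =

------------------------------------------------------------------------------
    prestige |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |
        educ |
      rrcond |
       _cons |
------------------------------------------------------------------------------

Create standardized variables and re-run regression.

. egen sprestige = std(prestige);

. egen sincome = std(income);

. egen seduc = std(educ);

. regress sprestige sincome seduc rrcond;

      Source |       SS       df       MS              Number of obs =
-------------+------------------------------           F(  3,    11) =
       Model |                                         Prob > F      =
    Residual |                                         R-squared     =
-------------+------------------------------           Adj R-squared =
       Total |                                         Root MSE      =

------------------------------------------------------------------------------
   sprestige |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     sincome |
       seduc |
      rrcond |
       _cons |
------------------------------------------------------------------------------

. log close;
       log:  l:\pls 802, spring 2018\influence\influence in stata\influence1.smcl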
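The dummy-variable regression and the regression that omits RR Conductor are two routes to the same estimates. A short check may make that link concrete. The sketch below is not part of the original session. It assumes "occprest.txt" is available, uses the default line delimiter, and introduces the illustrative scalars b_inc_omit, b_educ_omit, and t_rr. Under the standard mean-shift outlier result, the income and educ coefficients should agree across the two regressions, and the t statistic on rrcond should equal the studentized residual for RR Conductor from the full-sample fit.

```stata
* Sketch: compare the case-deletion and case-dummy regressions.
* Assumes occprest.txt is available, as in the session above.
clear
infile str21 occup income educ prestige using "occprest.txt"

* Regression omitting the RR Conductor case.
regress prestige income educ if occup != "RR Conductor"
scalar b_inc_omit  = _b[income]
scalar b_educ_omit = _b[educ]

* Studentized residual for that case from the full-sample regression.
regress prestige income educ
predict double studresid, rstudent
summarize studresid if occup == "RR Conductor", meanonly
scalar t_rr = r(mean)

* Regression on all 15 cases with the observation-specific dummy.
generate byte rrcond = occup == "RR Conductor"
regress prestige income educ rrcond

display "income:  dummy model " _b[income] "   omitted-case model " b_inc_omit
display "educ:    dummy model " _b[educ]   "   omitted-case model " b_educ_omit
display "t for rrcond " _b[rrcond]/_se[rrcond] "   studentized residual " t_rr
```

One practical reason to prefer the dummy version is that the case stays in the data set, and its coefficient gives a direct estimate and test of how far that occupation sits from the fit of the remaining cases.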

Figure 1: Added variable plot for income. Graphical display created with Stata defaults.
[Plot: horizontal axis e( income | X ); fitted-line note reports t = 1.59]

Figure 2: Added variable plot for education. Graphical display created with Stata defaults.
[Plot: horizontal axis e( educ | X ); fitted-line note reports t = 4.55]

Figure 3: Added variable plot for income. Display is enhanced with graph options.
[Plot: points labeled with occupation names; horizontal axis e( income | X ); fitted-line note reports t = 1.59]

Figure 4: Added variable plot for education. Display is enhanced with graph options.
[Plot: points labeled with occupation names; horizontal axis e( educ | X ); fitted-line note reports t = 4.55]