A Spatial Analysis of Exit Poll Interviewers During the 2008 Presidential Election


Clint W. Stevenson, Edison Research
Joe Lenski, Edison Research
Allan McCutcheon, University of Nebraska - Lincoln
Rene Bautistia, University of Nebraska - Lincoln

Questions can be sent to Clint Stevenson: cstevenson@edisonresearch.com

Abstract

The National Election Pool Exit Poll (or NEP Exit Poll) conducted by Edison/Mitofsky provides a unique opportunity to measure not only voting behavior but also the characteristics of the interviewers who conduct an exit poll survey. Although there is a large body of literature on interviewer effects, the data collected during the 2008 National Exit Poll and presented in this paper are unique in several respects. First, this is a large survey that includes 47 states. Second, this paper uses a geospatial dataset collected during the 2008 election year. Third, this paper examines the distance between each interviewer's home and the polling location where they administered the questionnaires. Fourth, this paper describes issues relating to spatial distances and how they influence completion rates while conducting the survey. Because many surveys and data gathering techniques incorporate in-person interviewing, it is important to understand more fully how the interviewer interacts with the polling location. The analysis in this paper uses multiple regression and spatial analysis to describe the interviewers and to examine how well they performed during the 2008 Presidential Election.

Contents

1 Introduction
2 Data Collection
    2.1 GeoCoding
    2.2 Topographical Data
    2.3 Interviewers
3 Methods
    3.1 Dependent and Independent Variables
4 Analyses and Findings
    4.1 Interviewer Profile State Summaries
    4.2 National Geospatial Data Models
    4.3 Interviewers Proximity When Conducting the Survey
5 Discussion
6 Conclusion
7 Appendix: Question Wording

List of Figures

1 Completion Rates by State for Exit Poll Interviewers
2 Spherical Distance Exit Poll Interviewers Traveled
3 Male Interviewers Completion Rate
4 Female Interviewers Completion Rate
5 Age Interviewers Completion Rate
6 Age Interviewers Completion Rate
7 Age 60+ Interviewers Completion Rate
8 Elevation Change by State for Exit Poll Interviewers

List of Tables

1 Completion Rate Regression Model
2 Spherical Distance Regression Model
3 Distance and Percent Vote for Obama
4 Interviewer Elevation Change Regression Model
5 Completion Rates by Survey Location
6 Contrasts of Survey Locations of the Exit Poll Interviewer
7 Contrasts of Inside v. Outside Locations of the Exit Poll Interviewer
8 Summary of How Interviewers Were Moved on Election Day
9 Contrasts of Interviewing Movement of the Exit Poll Interviewer

1 Introduction

On November 4th, 2008, Edison Research and Mitofsky International conducted the exit poll for the National Election Pool. For this election, 1337 precincts across the country were randomly selected for the exit poll sample, and 300 of those precincts were selected to be part of a national sample. Colorado, Oregon, and Washington are not included in this sample because of the way those states conduct their elections: Oregon and Washington are mainly absentee voting states, whereas Colorado has a very high absentee rate and uses centralized polling locations.

Each national exit poll precinct selected is staffed by an interviewer who is generally hired from that geographic area. However, the distance the interviewer traveled varies greatly from precinct to precinct. This paper looks at three distance components in conjunction with the interviewer's completion rate: the Great Circle distance the interviewer traveled (or spherical distance), the change in elevation between the interviewer's home and their assigned polling location, and the interviewer's distance from the building when actually conducting the survey.

We specifically describe three levels of geo-proximity in this paper: 1) a geospatial profile of interviewers in each state across the nation; 2) national regression modeling for the interviewer's home and assigned polling location using geospatial data and completion rates from the national sample data; and 3) the proximity between the polling location and where the interviewers were required to stand while administering the questionnaires.

2 Data Collection

2.1 GeoCoding

The United States Census Bureau provides geospatial data through the Tiger/Line Files. The analysis for this paper uses the 2006 Second Edition Tiger/Line Files.¹ These data provide, among other things, the ability to convert street addresses into latitude and longitude coordinates. To calculate the Great Circle distance, two latitude and longitude pairs are needed: one for the home address of the interviewer and one for the precinct address where the interviewer administered exit poll questionnaires. This makes it possible to use spherical geometry to calculate the distance between the two locations.

Custom software was developed to organize and geospatially investigate interviewers. The software manages and maintains each interviewer's home address and the polling precinct where they conducted the survey. Additionally, the software geocodes each address into latitude and longitude coordinates. In a few cases no exact address is available for the precinct location (e.g. only directions to the precinct location) and the software is not able to locate the corresponding Tiger/Line address. Consequently, those observations are not included in the analysis.

Once the latitude and longitude coordinates are available for interviewers and their precinct locations, the distance between the two points is calculated as the Great Circle distance. Fixed constants are used for the equatorial and polar radii of the earth.

¹ Data is available at

$$R = \frac{1}{2}\left[\sqrt{\frac{\left(a^{2}\cos(Lat_1)\right)^{2} + \left(b^{2}\sin(Lat_1)\right)^{2}}{\left(a\cos(Lat_1)\right)^{2} + \left(b\sin(Lat_1)\right)^{2}}} + \sqrt{\frac{\left(a^{2}\cos(Lat_2)\right)^{2} + \left(b^{2}\sin(Lat_2)\right)^{2}}{\left(a\cos(Lat_2)\right)^{2} + \left(b\sin(Lat_2)\right)^{2}}}\right]$$

$$\Delta Long = Long_2 - Long_1$$

$$distance = R \cdot \arccos\left[\sin(Lat_1)\sin(Lat_2) + \cos(Lat_1)\cos(Lat_2)\cos(\Delta Long)\right]$$

Here $a$ and $b$ denote the equatorial and polar radii of the earth, respectively, so that $R$ is the mean of the earth's radius at the two latitudes.

This analysis calculates only the spherical distance. Road layout and traffic conditions are not considered when determining how far the interviewer lives from the polling location they are assigned.

2.2 Topographical Data

Once the latitude and longitude coordinates are determined, the United States Geological Survey Seamless Elevation dataset is used to compute the elevation of each coordinate pair. This service makes it possible to determine the elevation of any location in the United States. Thus the elevation change between the interviewer's home address and the assigned polling location where the interviewer administers the questionnaires is easily obtained.

2.3 Interviewers

Due to logistics and feasibility, interviewers are generally hired and assigned to work at a previously selected polling location based on their geospatial proximity to that polling location. Once an interviewer is hired they are given training manuals and complete a rehearsal phone call. The training format and instructional materials are the same for all exit poll interviewers.
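The Great Circle calculation from Section 2.1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the custom software described in the paper: the radii are WGS 84 values converted to miles (the constants used in the paper may differ), the earth radius is taken as the mean of the geocentric radii at the two latitudes, and the function names are hypothetical.

```python
import math

# Assumed WGS 84 radii in miles; the paper's own constants may differ.
EQUATORIAL_RADIUS_MI = 3963.191
POLAR_RADIUS_MI = 3949.903


def geocentric_radius(lat_deg, a=EQUATORIAL_RADIUS_MI, b=POLAR_RADIUS_MI):
    """Earth radius at a given latitude (degrees) on an ellipsoid with radii a, b."""
    lat = math.radians(lat_deg)
    numerator = (a * a * math.cos(lat)) ** 2 + (b * b * math.sin(lat)) ** 2
    denominator = (a * math.cos(lat)) ** 2 + (b * math.sin(lat)) ** 2
    return math.sqrt(numerator / denominator)


def great_circle_distance(lat1, lon1, lat2, lon2):
    """Spherical distance in miles between two points given in decimal degrees."""
    # R: mean of the radii at the two latitudes.
    radius = (geocentric_radius(lat1) + geocentric_radius(lat2)) / 2
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_long = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(delta_long))
    # Guard against rounding pushing the value slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)


# Hypothetical home and precinct coordinates (not taken from the exit poll data).
home = (40.8136, -96.7026)
precinct = (41.2565, -95.9345)
miles = great_circle_distance(*home, *precinct)
```

As in the paper, this gives a straight spherical distance only; it says nothing about driving distance or travel time.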

Interviewer-specific data are obtained through a questionnaire that the interviewer returns to Edison Research immediately after Election Day. These interviewer data are then merged with the geospatial, topographical, and exit poll datasets.

3 Methods

3.1 Dependent and Independent Variables

Three dependent variables are investigated in this analysis: completion rate, the interviewer's distance between their home and the polling location, and the interviewer's elevation change. Because of the moderate number of potential predictor variables, a backward elimination procedure with the AIC criterion is used for model selection. In this way each variable is evaluated in the regression function after adjusting for all the other potential predictor variables.

The interviewer's distance and change in elevation between their home and their polling location are determined from a fixed dataset using the custom software prior to Election Day. Completion rate, however, is determined on Election Day by the interviewer, based on their observation of voters who respond or refuse to respond to the questionnaire. The three components of completion rate are completes, refusals, and misses. Completes are the voters who agree to complete a questionnaire, refusals are the voters who directly refuse to complete a questionnaire, and misses are the voters the interviewers are, for any reason, unable to reach. The completion rate is calculated for each interviewer as:

$$CR = \frac{Completes}{Completes + Refusals + Misses}$$
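The completion rate definition above translates directly into code. The following is a small illustrative sketch; the function name and the tallies in the example are hypothetical and are not taken from the exit poll data.

```python
def completion_rate(completes, refusals, misses):
    """Completion rate CR = completes / (completes + refusals + misses)."""
    approached = completes + refusals + misses
    if approached == 0:
        # No voters were tallied, so the rate is undefined.
        raise ValueError("completion rate is undefined when no voters are tallied")
    return completes / approached


# Hypothetical tally for one interviewer: 86 completes, 74 refusals, 40 misses.
cr = completion_rate(completes=86, refusals=74, misses=40)
```

With these made-up counts the interviewer approached 200 voters and 86 completed a questionnaire, so the completion rate is 86/200 = 0.43.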

Many of the interviewer characteristic variables are categorical. Though many of the questions on the interviewer survey have multiple response levels, only the top box is investigated in this analysis. The variables included in the models are either continuous or recoded dichotomously.

4 Analyses and Findings

4.1 Interviewer Profile State Summaries

The following graphs provide information on each of the three dependent variables to better understand the geospatial profile of the interviewers in each state. First, the mean completion rates, as shown in Figure 1, range from 38% to 59%. In a handful of states the interviewers tend to achieve a higher completion rate; the states with the highest completion rates are Utah, Wyoming, and Vermont. Greater granularity on the completion rate is given by interviewer age and gender. Figure 3 provides a completion rate profile of the male interviewers, Figure 4 of the female interviewers, Figure 5 of the interviewers age 18-45, Figure 6 of the interviewers age 45 to 59, and Figure 7 of the interviewers age 60 and over.

Second, the mean distance traveled in each state ranges from a low of 7 miles to a high of 60 miles (using the Great Circle distance). The states with the greatest distance traveled are California, Florida², and Mississippi. Figure 2 shows the mean distance the interviewers in each state traveled to arrive at their assigned polling location.

² There was one interviewer in Florida that traveled cross country. Without that interviewer Florida would be in the category.

Third, the topographical data provides an additional

profile on how the interviewers traveled to their polling location. The states with the greatest decrease in elevation for an interviewer were Utah, Maine, and Arkansas. The state with the greatest increase in elevation for an interviewer was California. Figure 8 shows the summary of these elevation changes.

Figure 1: Completion Rates by State for Exit Poll Interviewers

4.2 National Geospatial Data Models

To facilitate interpretation, three sets of univariate multiple regression analyses (multivariate analysis indicates comparable results) are conducted to estimate the nationwide completion rates, distances, and elevation changes. First, the completion rate is investigated to determine the best predictive model. All geospatial, topographical, and interviewer characteristics were established as potential predictor

variables. In the completion rate analysis no geospatial or topographical characteristics were found to contribute significantly to the model. However, several interviewer and precinct characteristics contributed significantly to predicting the completion rate.

Figure 2: Spherical Distance Exit Poll Interviewers Traveled

The four predictor variables that contribute significantly to the model come as no surprise (see Table 1). If an interviewer feels that voters are not cooperative, this translates into voters not completing the questionnaire. The effect of the location where the interviewer must begin interviewing also makes sense: an interviewer located inside the polling location not only gives a greater impression of legitimacy but also finds it easier to approach voters. When an interviewer states that the weather affected their ability to conduct the survey, one would expect the completion rate to decrease. This seems to be an indication that people simply do not want to

stand around in adverse weather to complete a questionnaire. Interviewer age provides an interesting component to the model in that an older interviewer tends to achieve a higher completion rate than a younger interviewer. These findings are similar to those reported in the Edison/Mitofsky report on the 2004 election (Edison/Mitofsky (2005)).

Figure 3: Male Interviewers Completion Rate

Table 1: Completion Rate Regression Model

Variable                    Estimate (Std Err)   Standardized Estimate   Sig.
Voters Cooperative (Yes)    .080 (.017)          .254                    ***
Survey Location (Inside)    .058 (.018)          .176                    ***
Weather (Yes)               (.028)
Interviewer Age             .003 (.001)          .309                    ***
Intercept                   .251 (.025)                                  ***

*** p<.001, ** p<.01, * p<.05, - p<.10

Using the interviewer's distance as the dependent variable provides a unique perspective on the exit poll and interviewer data. Table 2 shows the best predictive model using distance as the dependent variable. Using this geospatial data we can first determine

the best national model to predict the distance an interviewer travels to arrive at their assigned polling location. Three predictor variables contribute significantly (using α = 0.10) to the regression model: interviewer shyness, Democratic vote, and voter registration. Interviewers who describe themselves as very outgoing (5 on a scale of 1 to 5) seem to be willing to drive additional miles to work at their polling location. The precinct's vote for Barack Obama and the precinct's voter registration are negatively associated with the interviewer's distance from their assigned location. These latter two variables show that during the 2008 election exit poll, nationally, the interviewers traveled a shorter distance when the precinct had a higher voter registration and a higher Obama (Democratic) percent vote. That is, polling locations with a higher Obama percent vote tend to have interviewers who did not travel as far. Table 3 shows that the lower the percent voting for Obama, the farther the interviewers traveled.³

Figure 4: Female Interviewers Completion Rate

³ Note that for two (2) precincts the actual Obama percent vote was not available when this analysis was conducted.
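To show how the estimates reported in Table 1 combine, the fitted completion rate for a given interviewer can be computed from the printed coefficients. The sketch below is illustrative only. It assumes that interviewer age enters the model in years and that the two indicator variables are coded 0/1, and it covers only interviewers who reported no weather problems, since no weather estimate is used here. The function name is hypothetical, and because the coefficients are rounded as printed the results are approximate.

```python
# Coefficients as printed in Table 1 (rounded). The weather term is left out,
# so predictions apply only to interviewers reporting no weather problems.
INTERCEPT = 0.251
VOTERS_COOPERATIVE = 0.080   # indicator: interviewer found voters cooperative
SURVEY_INSIDE = 0.058        # indicator: interviewer began inside the building
INTERVIEWER_AGE = 0.003      # assumed to be per year of interviewer age


def fitted_completion_rate(cooperative, inside, age_years):
    """Approximate fitted completion rate when weather was not a problem."""
    return (INTERCEPT
            + VOTERS_COOPERATIVE * int(cooperative)
            + SURVEY_INSIDE * int(inside)
            + INTERVIEWER_AGE * age_years)


# Two hypothetical interviewers.
older_inside = fitted_completion_rate(cooperative=True, inside=True, age_years=60)
younger_outside = fitted_completion_rate(cooperative=False, inside=False, age_years=25)
```

Under these assumptions the first interviewer has a fitted rate of about 0.57 and the second about 0.33, which illustrates how the age, location, and cooperation terms add up in the model.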

Figure 5: Age Interviewers Completion Rate

Table 2: Spherical Distance Regression Model

Variable                    Estimate (Std Err)   Standardized Estimate   Sig.
Shyness Assessment (5)      3.2 (1.91)
Actual % Voting Obama       (4.94)                                       ***
Voter Registration          (.001)                                       ***
Intercept                   28.5 (3.37)                                  ***

*** p<.001, ** p<.01, * p<.05, - p<.10

The interviewer's change in elevation provides topographical insight into the interviewer's characteristics. There is no a priori reason to investigate the change in elevation. However, the elevation data was readily available and easy to obtain. Furthermore, it provides a unique perspective on the way exit poll interviewers must travel to reach their assigned polling location. Table 4 indicates that the only two variables that contribute significantly to the regression model are voter cooperation and actual percent voting for Barack Obama.

Figure 6: Age Interviewers Completion Rate

Table 3: Distance and Percent Vote for Obama

Actual % Voting Obama   Distance   Precincts
<

4.3 Interviewers Proximity When Conducting the Survey

An interesting component in determining how well an interviewer performs in the national sample is the distance the interviewer is required to stand from the polling location. Each interviewer was asked how far they were required to stand from the polling location. There appears to be a very distinct change in completion rate depending on where the interviewer stands.

Figure 7: Age 60+ Interviewers Completion Rate

Table 4: Interviewer Elevation Change Regression Model

Variable                    Estimate (Std Err)   Standardized Estimate   Sig.
Voters Cooperative (Yes)    .019 (.007)          .170                    ***
Actual % Voting Obama       .034 (.017)          .118                    *
Intercept                   (.010)                                       **

*** p<.001, ** p<.01, * p<.05, - p<.10

Comparing the interviewers who worked inside to each of the other groups yields a list of completion rate contrast estimates. The contrasts shown in Table 6 indicate that interviewers who stand inside the building of the polling location achieve a significantly higher completion rate. A Bonferroni correction for joint confidence intervals was applied to the lower and upper bounds.

To shed further light on how the completion rate decreases as the interviewer stands farther from the building, we compared the interviewers who were inside the building to all other interviewers. The contrast estimate (using

α = 0.10) of interviewers who were inside the building versus those at any distance outside the building is shown in Table 7.

Similarly, interviewers who were first positioned away from the polling location and then later allowed to move closer to the building also achieved a higher completion rate (see Table 8). Interviewers who are allowed to stand closer achieve a much higher completion rate than interviewers who remain at the same position or who are moved farther away. Likewise, the contrast estimates (seen in Table 9) for interviewers who were not required to move from their original survey location versus those who were moved suggest (using α = 0.10) that when interviewers are able to be closer they are able to improve their completion rate.

Figure 8: Elevation Change by State for Exit Poll Interviewers
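The Bonferroni adjustment applied to the contrast intervals can be sketched as follows. This is a hedged illustration rather than the authors' procedure: it uses a normal critical value (the paper may have used a t distribution), it takes α = 0.10 as the joint level, and the standard errors are hypothetical placeholders. Only the completion rate differences are derived from Table 5; the function names are made up for this example.

```python
from statistics import NormalDist


def bonferroni_critical_value(num_contrasts, alpha=0.10):
    """Two-sided normal critical value with alpha split across the contrasts."""
    return NormalDist().inv_cdf(1 - alpha / (2 * num_contrasts))


def bonferroni_intervals(estimates, std_errors, alpha=0.10):
    """Joint confidence intervals (lower, upper) for a family of contrasts."""
    z = bonferroni_critical_value(len(estimates), alpha)
    return [(est - z * se, est + z * se) for est, se in zip(estimates, std_errors)]


# Differences in completion rate (percentage points) between interviewers
# inside the building and each outside group, computed from Table 5.
inside = 47.8
outside_groups = [42.5, 40.1, 39.2, 42.2, 35.3]
differences = [inside - rate for rate in outside_groups]

# Hypothetical standard errors, used only to make the example runnable.
assumed_std_errors = [2.0] * len(differences)

intervals = bonferroni_intervals(differences, assumed_std_errors, alpha=0.10)
```

With five contrasts each interval is built at the 0.10/5 = 0.02 level, so the intervals are wider than unadjusted 90% intervals. A contrast is judged significant at the joint level when its interval excludes zero.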

Table 5: Completion Rates by Survey Location

Distance                     % of Interviewers   Completion Rate
Inside the Building          33%                 47.8%
Right Outside the Building   30%                 42.5%
10 to 24 Feet                15%                 40.1%
25 to 49 Feet                13%                 39.2%
50 to 99 Feet                7%                  42.2%
100+ Feet                    2%                  35.3%
Overall                      100%                43.1%

Table 6: Contrasts of Survey Locations of the Exit Poll Interviewer

Contrast                 CR Difference   Std Err   Lower   Upper   p
Inside - Right Outside
Inside                                                             <.01
Inside                                                             <.01
Inside
Inside

Bonferroni joint confidence interval adjustment is applied to lower and upper bounds

Table 7: Contrasts of Inside v. Outside Locations of the Exit Poll Interviewer

Contrast                    CR Difference   Std Err   Lower   Upper   p
Inside - Anywhere Outside                                             <.01

Table 8: Summary of How Interviewers Were Moved on Election Day

How Positioned              Completion Rate
Remained at the Same Spot   42.7%
Moved Farther               41.1%
Allowed to Move Closer      49.2%

5 Discussion

In this discussion we specifically address the geospatial and completion rate characteristics of the 2008 exit poll. This analysis of exit poll interviewers does not establish any causal relationship. However, certain groups of characteristics appear to be associated with completion rate and other geospatial variables. The geospatial component provides a distinctive perspective on the interviewers, and combining it with the interviewers' completion rates allows for a unique analysis of exit poll interviewers. Though completion rate is not the silver bullet of an exit poll survey, it does provide additional insight into the performance of the interviewer and the survey in general.

Our findings suggest that during the 2008 election exit poll the best fitting regression model for completion rate contains four variables: voter cooperation, interviewer's starting location, weather, and interviewer age. We did not find any direct link between completion rate and the geospatial proximity of the interviewer's home to their assigned polling location. However, our findings suggest that the interviewer's proximity to the building when conducting the survey is positively associated with completion rate. Additionally, the weather, interviewer age, and the interviewer's assessment of voter cooperation are all associated with completion rate.

6 Conclusion

Our analyses suggest several key factors when assessing interviewers' geospatial proximity to the polling location and using completion rate as a measure of performance. Our data suggest that four key elements are associated with a higher completion rate for an exit poll interviewer. First, proximity to the building when administering questionnaires and conducting the survey is a critical factor. Once an interviewer arrives at the polling location it is important for them to get as close to the building as possible. Interviewers who are inside the building perform significantly better than interviewers

who are required to stand outside. Second, weather is associated with completion rate. If an interviewer is allowed to stand inside the building during adverse weather, the data show that the interviewer will have a much higher completion rate. Third, the interviewer's assessment of voter cooperation seems to be a de facto result of completion rate; when voters do not cooperate they will not respond to an invitation to complete a survey. Finally, the data show that the age of the interviewer is significant; interviewer age is positively associated with completion rate.

Table 9: Contrasts of Interviewing Movement of the Exit Poll Interviewer

Contrast           CR Difference   Std Err   Lower   Upper   p
Remain - Farther
Remain - Closer

Bonferroni joint confidence interval adjustment is applied to lower and upper bounds

7 Appendix: Question Wording

Voters Cooperative: How cooperative did you find the VOTERS at your location?
    Very Cooperative
    Somewhat Cooperative
    Not Very Cooperative

Interviewer's Starting Location: Where were you permitted to stand to conduct your interview when you BEGAN interviewing?
    INSIDE the building
    Right outside of the building exit
    10 to 24 feet away from the building exit
    25 to 49 feet away from the building exit
    50 to 99 feet away from the building exit
    100 or more feet away from the building exit

Weather: Did problems with the weather affect your ability to conduct your survey on Election Day?
    YES
    NO

Interviewer Age: Date of Birth / /

Interviewer's Self-Assessment of Shyness: On a scale of 1 to 5, with 1 being very shy and 5 being very outgoing, where would you place yourself?

References

Edison/Mitofsky (2005). Evaluation of Edison/Mitofsky election system