MODELING THE WEEKLY DATA COLLECTION EFFICIENCY OF FACE-TO-FACE SURVEYS: SIX ROUNDS OF THE EUROPEAN SOCIAL SURVEY


Journal of Survey Statistics and Methodology (2017) 5

MODELING THE WEEKLY DATA COLLECTION EFFICIENCY OF FACE-TO-FACE SURVEYS: SIX ROUNDS OF THE EUROPEAN SOCIAL SURVEY

CAROLINE VANDENPLAS* GEERT LOOSVELDT

Understanding the dynamics of data collection efficiency is key to developing strategies to increase response rates. In this article, we model data collection efficiency, specifically the number of completed interviews, the number of contacts, the ratio of completed interviews to contact attempts, and the ratio of completed interviews to refusals per time unit in the field. We apply this concept to the first six rounds of the European Social Survey. The time course of efficiency for the 148 country-round combinations is analyzed using a multilevel repeated-measurement model in which the weekly data collection efficiency measures are repeated measurements nested in the surveys. The results show that the data collection has four main characteristics over time: the initial efficiency, the initial increase in efficiency (speed), the initial decrease in speed, and the start of a tail when the efficiency levels off. The values of these characteristics seem to be linked with the length of the data collection period. Moreover, across all surveys, the analysis suggests that although most interviews are completed and most contacts established in the first weeks, the fieldwork becomes more productive at the end, with fewer but more successful contact attempts. The covariance parameters, however, show large differences between the surveys in terms of fieldwork dynamics. The results from a model with explanatory variables show that more interviewers, a good survey climate, and a nonindividual frame type lead to a higher efficiency, while higher percentages of refusal conversion slow down its decrease.

KEYWORDS: Data collection efficiency; Fieldwork process; Response-enhancement techniques; Response rates.
The authors thank the associate editor and the reviewers for their helpful comments. *Address correspondence to Caroline Vandenplas, Centre for Sociological Research, Parkstraat 45 - box, Leuven, Belgium; caroline.vandenplas@kuleuven.be. doi: 10.1093/jssam/smw034. Advance access publication 13 January 2017. © The Author 2017. Published by Oxford University Press on behalf of the American Association for Public Opinion Research. All rights reserved. For permissions, please email journals.permissions@oup.com.

INTRODUCTION

Over the last twenty years, two phenomena have been observed by survey researchers. First, response rates have been decreasing, as shown in the much-cited paper by de Leeuw and de Heer (2002) and supported by more recent publications (Dixon and Tucker 2010; Bethlehem, Cobben, and Schouten 2011; Brick and Williams 2013; Kreuter 2013). Lower response rates are undesirable because they increase the potential for nonresponse bias and reduce the precision of survey estimates, both of which lead to less accurate survey outcomes. Second, in an attempt to control the decline in response rates, fieldwork efforts have been expanded and fieldwork strategies have been improved (Koch, Fitzgerald, Stoop, Widdop, and Halbherr 2012). These efforts are aimed at increasing the number of completed interviews, and hence the response rate, as well as at balancing the responding sample so that a lower response rate does not necessarily produce higher nonresponse bias (for example, Groves and Couper 1998; Peytchev, Baxter, and Carley-Baxter 2009; Stoop, Billiet, Koch, and Fitzgerald 2010).

To develop an efficient fieldwork strategy, it is important to understand how the fieldwork yield progresses during the data collection period. In the current article, we model data collection efficiency per time unit instead of after the data collection period is over. Such an analysis should offer a deeper understanding of the mechanisms involved: when is the data collection efficiency at its highest, when is it at its lowest, when is it increasing or decreasing, and what can we expect in the next weeks, days, or hours (depending on the time unit chosen)? Analyzing the efficiency per time unit should allow for a more efficient management of resources during the data collection period, with a view to improving the response rate.
A sign of decreasing efficiency could trigger a reaction: a switch of mode, a retraining of the interviewers, or an increase in the offered incentive. Modeling the data collection efficiency per time unit might also help to explain some nonintuitive findings, such as the negative association between response rates on the one hand, and the data collection duration or the percentage of refusal conversion attempts on the other hand (Vandenplas, Loosveldt, and Beullens 2015). Modeling the data collection efficiency also allows us to assess the impact of different survey characteristics, such as the sampling frame, number of interviewers, mode of contact, mode of data collection, intensity of refusal conversion effort, increase in incentive, or survey climate, on the time course of the fieldwork. In the next section, we introduce and define the key concept that we use to model the progress of the data collection: data collection efficiency.

2. DATA COLLECTION EFFICIENCY

To study the data collection yield per time unit, we need to consider the fieldwork as a machine designed to produce completed interviews (or returned questionnaires), contacts, refusals, etc. Although data collection efficiency could be defined in numerous ways, we concentrate on four specifications. These relate to the potential of the fieldwork per time unit: the weekly number of completed interviews, the weekly number of contacts (independent of the outcome), the weekly ratio of the number of completed interviews to the number of contact attempts (productivity), and the weekly ratio of the number of completed interviews to the number of refusals (performance).

The time unit can also take different forms: hours, days, or weeks. The most appropriate form depends on the conditions of the fieldwork (such as the contact and data collection mode): the measurement points should be frequent enough to analyze the dynamics but spaced enough to be able to gather the necessary information and to avoid reporting unimportant fluctuations. The definition of the time unit is also linked to the mode of contact and data collection: web surveys have almost continuous information on who is logged in or has completed the survey, while face-to-face surveys may have a longer delay before summary information on the progress of the fieldwork is available, relying on the interviewers' timely reports (although technology is progressing here). The European Social Survey, on which we base our analysis, is a face-to-face survey; we will measure time in number of weeks. From now on, we will concentrate on face-to-face surveys and more specifically on the European Social Survey, although all the concepts and analyses presented in this paper could be adapted to surveys with other data collection designs and modes. As we want to compare surveys with different sample sizes, we standardize the number of completed interviews, contacts, and refusals to 100 sampling units.
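A minimal sketch of this standardization, assuming counts are expressed per 100 sampled units as described in the text (the function name is ours):

```python
def standardize(count, sample_size, per=100):
    """Scale a raw weekly count (interviews, contacts, or refusals)
    to a common base of `per` sampled units so that surveys with
    different sample sizes are comparable."""
    return count / sample_size * per

# e.g., 120 completed interviews in a week, in a survey of 2,400 sampled units
print(standardize(120, 2400))  # -> 5.0
```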
Thus, we replace the number of completed interviews, contacts, and refusals with the corresponding standardized numbers, dividing the original values by the sample size and multiplying the outcome by 100. The time course of the data collection efficiency can be presented as a graph, with the weeks of data collection on the x-axis and the data collection efficiency on the y-axis. In the next section, we describe the data used in our analysis. Subsequently, we specify the model for the weekly data collection efficiency and discuss the factors that can have an effect on it.

3. DATA

To explore the data collection efficiency introduced in the previous section and to model its time course, we use data from the European Social Survey (ESS).¹ For each round of this academically driven survey, run biennially since 2002, a random sample is selected in each participating country. Data is then collected

1. More information about the ESS and the specifications for its implementation is available on the website.

through standardized computer-assisted or paper personal interviews lasting about 60 minutes. From the first six rounds, a total of 148 surveys (round-country combinations) can be analyzed. Although each survey follows the ESS guidelines, each has its own particularities, which could influence the time course of data collection efficiency. These properties include, among others, the sampling procedure, the contact procedure, whether or not a refusal conversion procedure was applied, and the duration of the fieldwork. Appendix 1 contains tables showing the different fieldwork characteristics of the various surveys, as well as their fieldwork outcomes in terms of response rates, contact rates, and percentages of ineligibles.

4. MODELING THE WEEKLY DATA COLLECTION EFFICIENCY OVER THE DATA COLLECTION PERIOD

We analyze the weekly efficiency of the data collection and its temporal dynamics for the 148 ESS surveys in two steps. Our aim in the first step is to describe a general time course of efficiency for the ESS over all surveys. We therefore use a repeated-measurement multilevel model with the 148 surveys as macro units and the weekly efficiency as repeated measurements at the lowest level (Model 1). Each survey contributes to the estimation of the model parameters to a different extent, depending on the duration of its data collection; indeed, some surveys lasted only two weeks and others almost a year. From this analysis, we obtain a description of the shape of the weekly data collection efficiency in the ESS (determined by the fixed effects), from which each survey deviates (random effects). Some surveys have a higher efficiency in the first week, which drops sooner than the general ESS shape, while others have a lower efficiency at the start, which is maintained for a longer period.
Given the large range of data collection periods, we repeat the overall analysis, grouping the surveys by their duration into four groups: very short (fewer than 9 weeks), short (between 9 and 18 weeks), long (between 18 and 27 weeks), and very long (more than 27 weeks) surveys. We comment on the summary shapes of the data collection efficiency of the four different types of surveys. In the next step, we expand this model by adding some survey design characteristics that can influence the weekly efficiency (Model 2). In doing so, we partly explain why there are such large differences in the way the data collection evolves in terms of efficiency for the different ESS surveys considered.

4.1 The Basic Model

First, the shape of the weekly efficiency over the data collection period is examined. For each survey, we calculate the data collection duration, the sample size, the weekly number of interviews completed, the weekly number of contacts, the weekly ratio of the number of completed interviews to the number of contact attempts (productivity), and the weekly ratio of the number of completed interviews to the number of refusals (performance). These statistics are based on data in the contact forms (the short questionnaire that interviewers are asked to complete for each contact attempt). Among other things, the form records the date, time, and outcome of the contact attempt (contact, noncontact, refusal, ineligible, appointment, interview, etc.).

The curves of the weekly data collection efficiency over the data collection period of all the surveys in round 6 are presented in the following figures: figure 1 displays the weekly number of completed interviews, figure 2 the weekly number of contacts, figure 3 the weekly productivity, and figure 4 the weekly performance, each week for all countries in round 6.

[Figure 1. Evolution of the Weekly Number of Completed Interviews. Panels by country; x-axis: fieldwork weeks.]

[Figure 2. Evolution of the Weekly Number of Contacts. Panels by country; x-axis: fieldwork weeks.]

[Figure 3. Evolution of the Productivity. Panels by country; x-axis: fieldwork weeks.]

[Figure 4. Evolution of the Performance. Panels by country; x-axis: fieldwork weeks.]

The information in the figures hints at two general shapes of the data collection efficiency, illustrated in figure 5. The first shape (bold line) is a monotonic decline in efficiency that levels off toward the end of the data collection period (the specific week of that leveling off is survey dependent). The second shape (dashed line) is an increase in efficiency followed by a monotonic decline that levels off.

[Figure 5. The Theoretical Shapes of the Weekly Data Collection Efficiency.]

These shapes have four distinguishing features: the initial efficiency, the initial speed (amplitude of the decrease or increase of the efficiency), the initial acceleration (how fast the decrease becomes larger or the increase becomes smaller), and the start of the tail, which is defined as the point when the final decrease levels off. These shapes also coincide with the general experience of surveys in the ESS: in most surveys, a large portion of the interviews are completed in the first five to seven weeks (or sooner), followed by a less productive period that can last almost a year before the minimal required responding sample size is reached. Because we distinguish four important features of the weekly data collection efficiency, a cubic formulation is needed in the multilevel model. The model studied is as follows:

P(s, w) = b0 + b1·w + b2·w² + b3·w³ + ε(s,w),   (Model 1)
b0 = γ0 + u0s; b1 = γ1 + u1s; b2 = γ2 + u2s; b3 = γ3,

where w runs from 0 to 32 (with 0 representing week 1),² s stands for the surveys, and the residual term is normally distributed, ε(s,w) ~ N(0, σ²). The weeks are counted from the first contact attempt of the specific survey. We do not consider the data collection period for each sampling unit, which would be defined as the number of weeks since the address was released. This decision is supported by our aim to describe the data collection as a whole and not the data collection for particular sampling units. Spreading the release of the addresses over time is, however, a fieldwork strategy that may influence the general shape of the weekly efficiency. The random effects u0s, u1s, and u2s are the survey-specific intercept, linear, and quadratic parameters that express the survey-specific deviations from the overall evolution. The random effects come from multivariate normal distributions with zero expectations. As we are mainly interested in the general shape of the weekly efficiency and deviations during the first weeks, we consider the cubic term as a fixed effect (no random part): adding a random cubic effect was almost never significant and, for some models, did not lead to convergence.

The level-2 covariance matrix is given by

    | σ0²    σ0,1   σ0,2 |
    | σ0,1   σ1²    σ1,2 |
    | σ0,2   σ1,2   σ2²  |

In total, 2,580 measurements were entered into the model. The variances of the survey-specific parameters (intercept, linear regression parameter, and quadratic regression parameter) are displayed on the diagonal of the matrix. These parameters are useful for testing the differences between surveys in the characteristics of the evolution of data collection efficiency.
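The mean structure of Model 1 and a survey's deviation from it can be sketched as follows (a sketch only: the coefficient values used in the example are the overall estimates reported below, and the actual fit would require a mixed-model routine, which is not shown):

```python
def overall_curve(w, g0, g1, g2, g3):
    """Fixed-effects part of Model 1: P(w) = g0 + g1*w + g2*w^2 + g3*w^3,
    with w = 0 denoting the first fieldwork week."""
    return g0 + g1 * w + g2 * w ** 2 + g3 * w ** 3

def survey_curve(w, gammas, u):
    """Survey-specific curve: the overall coefficients shifted by the
    survey's random effects (u0, u1, u2); the cubic term has no random part."""
    g0, g1, g2, g3 = gammas
    u0, u1, u2 = u
    return overall_curve(w, g0 + u0, g1 + u1, g2 + u2, g3)

# Overall ESS curve for weekly completed interviews per 100 sampled units
ess = (4.83, 0.23, -0.05, 0.001)
print(round(overall_curve(0, *ess), 2))  # first-week efficiency -> 4.83
```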
The covariances (off-diagonal elements of the matrix) are informative for assessing the relationship between different characteristics of the evolution of data collection efficiency (e.g., the initial efficiency and speed).

Distribution shape of the weekly efficiency over the data collection period: the general ESS curve. In table 1, we show the estimated model parameters for the time course of data collection efficiency: the weekly number of completed interviews,

2. For the model to converge, we limit the analyses to 32 weeks. Only Ireland in round 6 (46 weeks) and the Netherlands in round 4 (39 weeks) exceeded 32 weeks of fieldwork.

the weekly number of contacts, the productivity (the ratio of the number of completed interviews to the number of contact attempts), and the performance (the ratio of the number of completed interviews to the number of refusals). The main result is that the fixed effects of the multilevel model support the hypothesized overall shapes of the data collection efficiency in figure 5. The overall ESS shape of the weekly efficiency can be described as follows. Looking at the intercept values: on average over all the surveys, in the first week of the fieldwork, 4.83 interviews out of 100 are completed, contacts with sampling units are established (possibly with the same sample unit more than once), 22.4 percent of the contact attempts are converted into completed interviews, and 2.59 times as many completed interviews as refusals are achieved. Moreover, the positive sign of the linear terms points to an overall increase in efficiency (positive speed) at the beginning of the fieldwork (the dashed-line shape). The negative quadratic coefficients indicate a leveling off of this increase in efficiency after a few weeks. The significant positive cubic regression coefficients for all specifications of data collection efficiency support the necessity of a tail in the shape.

If we consider the efficiency as the number of completed interviews in a week, we can also conclude from the coefficients in the first column of table 1 that, averaged over all the surveys, 34 percent of the total sample become completed interviews after six weeks, about 50 percent after 11 weeks, and 57 percent after 16 weeks, as shown in figure 6. To obtain this result, we calculate the weekly number of completed interviews for each week and sum them. The overall weekly number of completed interviews in a specific week w is given by 4.83 + 0.23w − 0.05w² + 0.001w³.
Interestingly, the standardized number of completed interviews is closely related to the response rate: without ineligible cases in the sample, both values would be the same.

Table 1. Fixed Effects for the Multilevel Model Describing the Weekly Efficiency of Data Collection (Model 1)

Effect              Completed interviews (SE)   Contacts (SE)      Productivity (SE)    Performance (SE)
γ0 intercept        4.83 (0.35)***              (0.8)***           0.224 (0.012)***     2.59 (0.17)***
γ1 linear term      0.23 (0.07)**               0.42 (0.18)*       0.014 (0.003)***     0.18 (0.08)*
γ2 quadratic term   −0.05 (4E-3)***             −0.13 (0.01)***    −0.001 (2E-4)***     −0.03 (0.01)***
γ3 cubic term       1E-3 (1E-4)***              3E-3 (3E-4)***     1E-3 (6E-6)***       1E-3 (2E-4)***

*p < .05, **p < .01, ***p < .001.
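As a quick arithmetic check on these figures, summing the fitted weekly values of the completed-interviews curve reproduces the cumulative shares quoted above:

```python
# Overall ESS curve for weekly completed interviews per 100 sampled units
# (w = 0 denotes the first fieldwork week).
def weekly_interviews(w):
    return 4.83 + 0.23 * w - 0.05 * w ** 2 + 0.001 * w ** 3

cum_11 = sum(weekly_interviews(w) for w in range(11))  # through the 11th week
cum_16 = sum(weekly_interviews(w) for w in range(16))  # through the 16th week
print(round(cum_11, 1), round(cum_16, 1))  # -> 49.6 57.3 (about 50% and 57%)
```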

[Figure 6. The Time Course of the Weekly Number of Completed Interviews.]

[Figure 7. The Time Course of the Weekly Number of Established Contacts.]

Figure 7 illustrates the general evolution of the weekly number of contacts for the ESS. The area under the curve in this figure cannot be interpreted as representing the number of sample units that have been contacted, because an individual (or household) can be contacted more than once. An alternative specification of the efficiency of the data collection, which would be closely related to the contact rate, would be the weekly number of newly contacted individuals, but it is not considered here. Overall, around 14 contacts per 100 sample units are established each week in the first five weeks of the fieldwork. Subsequently, the number of established contacts rapidly declines, showing once more that the core of the data collection production occurs during the first weeks of the field period. However, one should bear in mind that the number of contacts established each week will always decline, because during the fieldwork some cases are classified as final and are not visited anymore (interview completed, individual recorded as ineligible, a hard refusal who should not be re-approached, etc.). Finally, at around 14 weeks in general, the decline starts to level off to form the tail.

[Figure 8. Evolution of the Weekly Productivity.]

Figure 8 shows the time course of the weekly productivity (number of completed interviews/number of contact attempts). Surprisingly, although the model supports the hypothesized shapes, the productivity appears to increase over the data collection period. The increase in productivity initially levels off at around five weeks and then starts to rise again, reaching a relatively steep rate of growth at around eight weeks. This means that contact attempts made toward the end of the fieldwork are more successful. This probably reflects the evolving knowledge of the interviewers, together with the organization of the survey during the data collection period, which allows for a more targeted approach: prioritizing appointments made at a previous contact or by telephone, avoiding recontacting hard refusals, or redistributing unfinished cases to the better, more experienced interviewers toward the end of the fieldwork. It may also be that at this point in time the survey managers choose to rework the cases that are most likely to become completed interviews. However, these are only hypotheses, and further research should explore the possible explanations.

Figure 9 illustrates the evolution of the weekly performance (number of completed interviews/number of refusals) over the field period. Across all ESS surveys, the performance grows until week 6, when it reaches almost three times as many completed interviews as refusals. It then declines to its minimum of 2.6 in the 15th week.
[Figure 9. Evolution of the Weekly Performance.]

However, it should be noted that the scale, and hence the difference, is quite narrow, ranging from 2.5 to a maximum of 3. The increase after week 15 is most probably an artifact of the model and its cubic term, as well as a consequence of the very few active cases left at this point, when many countries have nearly completed the fieldwork, so that most cases must already have been finalized.

Having learned a lot from the overall ESS curves by studying the fixed parameters, we can now turn to the covariance parameters (see appendix 2). The first important observation is that the variances of the intercepts and of the linear and quadratic terms are, for all efficiency specifications, significantly (at the 0.001 level) different from zero. These results hint at different approaches to the allocation of fieldwork efforts, and different degrees of success, in different countries and different rounds of the ESS. Some surveys start off with many completed interviews and contacts, and hence high productivity and performance, whereas others have a lower yield in the first week. Moreover, the residual variance of the model with random slopes is 36 percent lower than the residual variance of the random-intercept-only model (8.94 percent).

The covariance terms (see appendix 2) have the same sign across the different specifications of data collection efficiency and are almost all significantly different from zero (the exceptions being the covariances between the intercept and the linear and quadratic coefficients for the productivity and performance). Examining the scenario represented by the dashed line in figure 5, the negative covariance between the intercept and the linear term implies that the higher the intercept (which represents the efficiency in the first week), the lower the increase in efficiency at the start of the fieldwork. This can be interpreted as follows.
The better the start of the data collection period in terms of completed interviews or contacts established, the harder it is to improve the yield in the subsequent weeks. This is also true for the productivity and the performance, but there the covariance is much smaller and not significant. Moreover, the positive covariance between the intercept and the quadratic coefficient indicates that the higher the intercept, the faster the increase in efficiency levels off. This means that the higher the number of completed interviews, the number of contacts, the productivity, or the performance at the outset, the more difficult it is to maintain a high efficiency. Last, the positive covariance between the linear and quadratic coefficients entails that the steeper the slope at the beginning of the data collection period, the faster the increase in efficiency levels off: the larger the increase in the number of completed interviews or contacts, the productivity, or the performance, the faster these increases level off.

Distribution shape of the weekly efficiency over the data collection period: ESS curves for different survey types. Although the model for the general ESS curve has already given us some insight into the data collection process of these face-to-face surveys, modeling so many diverse surveys together may not be optimal. The surveys differ in many ways, but one of the most relevant aspects here is the duration of the data collection period. Therefore, we report in table 2 the parameter estimates of Model 1 when the model is estimated separately for surveys with very short (fewer than 9 weeks), short (between 9 and 18 weeks), long (between 18 and 27 weeks), and very long (more than 27 weeks) data collection periods. For the weekly numbers of completed interviews and contacts established, a shape similar to the general ESS curve appears for all types of surveys (see the dashed line in figure 5). The only exception is that for very long surveys the number of contacts seems to stay constant over the data collection period.
The performance among very short surveys, and the productivity among very short, short, and long surveys, seem not to vary much during the data collection period, as the linear, quadratic, and cubic coefficients are not significantly different from 0 at the .05 level. Even though the largest intercepts are found among short (and not very short) surveys, there is a general trend indicating that longer surveys are less efficient in the first week. Similarly, very short surveys show a large increase in efficiency in the first week, while short, long, and very long surveys show a decline in this increase in efficiency in the first week. In summary, surveys with shorter data collection periods have a narrow, high peak in efficiency at the very beginning of the data collection period, and surveys with longer data collection periods have an extended, lower peak or bulge in efficiency.

4.2 The Elaborated Model

Having examined the general shape of the time course of data collection efficiency, we now examine the fieldwork particularities that can influence this time course, focusing in particular on the weekly number of completed interviews. In general, we assume three possible effects: (1) effects on initial efficiency, (2)

effects on the efficiency each week (the height of the shape), and (3) effects on maintaining a high efficiency during the data collection (the width of the shape). Ideally, the shape of the data collection efficiency curve over time should be as high and/or as wide as possible, because the quantity that needs to be maximized (the final number of completed interviews, contacted units, etc.) corresponds to the area under this shape.

Table 2. Fixed Effects for the Multilevel Model Describing the Weekly Data Collection Efficiency (Model 1), by Groups of Surveys with Different Durations

Effect              Completed interviews (SE)   Contacts (SE)     Productivity (SE)   Performance (SE)

Very short (15 surveys)
γ0 intercept        3.91 (1.86)                 7.74 (4.24)       0.37 (0.4)***       2.81 (1.23)*
γ1 linear term      1.29 (2.1)***               (4.13)***         0.1 (0.5)           0.3 (3.11)
γ2 quadratic term   2.78 (75)**                 6.83 (1.37)***    0.1 (0.2)           0.3 (1.4)
γ3 cubic term       0.17 (0.8)*                 0.5 (0.13)***     1E-3 (2E-3)         0.5 (0.9)

Short (57 surveys)
γ0 intercept        3.97 (0.56)***              11.4 (1.39)***    0.27 (0.2)***       3.42 (0.46)***
γ1 linear term      1.96 (0.22)                 4.25 (0.5)***     0.1 (0.1)           0.9 (0.17)
γ2 quadratic term   0.36 (0.3)***               0.84 (0.6)***     2E-3 (1E-3)*        0.3 (0.2)
γ3 cubic term       0.1 (0.0)***                0.4 (2E-3)***     2E-4 (4E-5)***      2E-3 (1E-3)

Long (51 surveys)
γ0 intercept        2.34 (0.34)***              9.35 (1.6)***     0.16 (0.2)***       2.4 (0.47)***
γ1 linear term      0.71 (0.8)***               1.72 (0.23)***    0.2 (4E-3)***       2E-3 (0.14)
γ2 quadratic term   0.9 (0.1)***                0.23 (0.2)***     2E-3 (4E-4)***      5E-3 (0.1)
γ3 cubic term       0.2 (2E-4)***               0.6 (5E-4)***     7E-5 (1E-5)***      3E-4 (4E-4)

Very long (11 surveys)
γ0 intercept        1.74 (0.42)**               (2.84)***         0.12 (0.2)***       1.66 (0.76)
γ1 linear term      0.24 (0.8)*                 0.5 (0.35)        0.2 (0.1)**         0.36 (0.16)*
γ2 quadratic term   0.2 (0.0)***                0.2 (0.2)         2E-3 (4E-4)***      0.4 (0.1)*
γ3 cubic term       4E-4 (8E-6)***              2E-4 (4E-4)       4E-5 (8E-6)***      9E-4 (3E-4)

*p < .05, **p < .01, ***p < .001.
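The duration groups used above can be encoded as a simple binning rule; the handling of surveys lasting exactly 9, 18, or 27 weeks is our assumption, since the text gives overlapping bounds:

```python
def duration_group(weeks):
    """Classify a survey's fieldwork duration into the four groups of
    section 4; surveys lasting exactly 9, 18, or 27 weeks are assigned
    to the longer group by assumption."""
    if weeks < 9:
        return "very short"
    if weeks < 18:
        return "short"
    if weeks < 27:
        return "long"
    return "very long"

print(duration_group(6), duration_group(20))  # -> very short long
```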
The shape of the data collection efficiency curve can be affected by many factors; we concentrate on three: the type of sampling frame, fieldwork strategies, and the survey climate.

First, the type of frame is expected to have an effect on the efficiency in the first week of the data collection period. Research concerning the impact of the sampling frame on nonresponse bias indicates that interviewers can play an important role in the sample selection procedure when a sample of households/addresses is used. There seems to be a tendency to select less reluctant sample units when interviewers are responsible for the selection (Koch, Halbherr, Stoop, and Kappelhof 2014). Therefore, we expect a negative effect of a sampling frame of individuals on the initial data collection efficiency.

Second, fieldwork strategies can be very diverse. Since its first round in 2002, the ESS has made great efforts to collect high-quality data to ensure cross-national and cross-cultural comparability. Much of this effort aims at obtaining a high response rate (directly linked to the standardized total number of completed interviews) and includes contact procedure protocols as well as advice on refusal conversion procedures. To increase the chance of contact, the ESS requires at least four contact attempts, at least one of which should take place in the evening and one at the weekend; the contact attempts should be spread over at least two weeks. The guidelines on refusal conversion procedures are less strict, recommending re-issuing all soft refusals and as many hard refusals as possible (Billiet, Koch, and Philippens 2007), as well as assigning a different, more experienced interviewer for refusal conversion (European Social Survey 2011; Koch et al. 2012). Interviewers classify refusals as "hard" (will definitely/probably not cooperate in the future) or "soft" (may possibly/will cooperate in the future). We analyze the effect of compliance with the ESS guidelines in terms of contact attempts and refusal conversion, as well as the effect of the percentage of refusal conversions, on the weekly data collection efficiency. The contact procedure and refusal conversion strategy are expected to influence the width of the curve and the leveling off of the increase or decrease in the weekly number of completed interviews. Therefore, for each survey a contact score and a refusal score are calculated, as described in appendix 3. Furthermore, the percentage of conversion attempts is calculated as the percentage of sampled people who initially refused but were subsequently re-approached.
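The percentage of conversion attempts just defined can be sketched from contact-form records as follows; the record layout and the rule that any attempt after a first refusal counts as a re-approach are our illustrative assumptions, not the paper's exact operationalization:

```python
def conversion_attempt_rate(records):
    """records: chronologically ordered (unit_id, outcome) pairs from the
    contact forms. Returns the share of units with at least one refusal
    that were approached again after that refusal, in percent."""
    refused, reapproached = set(), set()
    for unit, outcome in records:
        if unit in refused:
            reapproached.add(unit)  # any later attempt counts as a re-approach
        if outcome == "refusal":
            refused.add(unit)
    return 100 * len(reapproached) / len(refused) if refused else 0.0

attempts = [(1, "refusal"), (1, "interview"), (2, "refusal"), (3, "noncontact")]
print(conversion_attempt_rate(attempts))  # -> 50.0 (unit 1 re-approached, unit 2 not)
```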
The number of interviewers active each week is considered as a fieldwork strategy factor that may influence the height of the weekly data collection efficiency curve. A standardized version of the number of interviewers is used: the weekly number of active interviewers divided by the sample size and multiplied by 100. Third, the survey climate is generally defined as the ease of contact and the ease of obtaining cooperation, or the general propensity of the target population to participate in surveys. It is influenced by many factors intrinsic to each specific survey, but also by external factors such as the overall level of respondent burden resulting from market research. Although no direct indicator of the survey climate exists, it is measured here as the response rate for the first contact attempt. It is clear that this specific measurement of the survey climate will have a direct influence on the initial efficiency (intercept). Clearly, this list of possible factors influencing the shape of the weekly data collection efficiency is not exhaustive. Many other factors probably play a role in the weekly efficiency of the data collection: the contact mode (although in the ESS, a first contact in any mode other than face-to-face is exceptional), the experience of the interviewers, incentives, geographical constraints, or specific

country legislation, to name a few. Although we strongly believe that one of the strengths of our analyses is the large number of surveys, which permits us to evaluate the model quantitatively, it is also a weakness. ESS surveys are conducted in different years and different countries; it is therefore not straightforward to gather information about these additional survey characteristics (e.g., the use of incentives) in a comparable way. We therefore concentrate on the three factors described above, acknowledging that other factors could influence the weekly efficiency. To test our hypotheses, we elaborate on Model 1 and analyze the following model (Model 2):

$$
\begin{aligned}
P(s,w) &= b_0 + b_1 w + b_2 w^2 + b_3 w^3 + b_4\,\text{standardized no. of interviewers} + \epsilon_{s,w},\\
b_0 &= \gamma_{00} + \gamma_{01}\,\%\,\text{interviews completed after 1st contact attempt} + \gamma_{02}\,\text{frame type} + u_0,\\
b_1 &= \gamma_{10} + u_1,\\
b_2 &= \gamma_{20} + \gamma_{21}\,\text{contact score} + \gamma_{22}\,\text{refusal score} + \gamma_{23}\,\text{conversion attempts} + u_2,\\
b_3 &= \gamma_{30},
\end{aligned}
$$

where $w$ runs from 0 to 32 (with 0 representing week 1), $s$ stands for the surveys, and $\epsilon_{s,w} \sim N(0, \sigma^2)$. The same covariance matrix is specified as for Model 1. The model parameter estimates are shown in table 3 (fixed effects) and table 4 (covariance). Due to missing information in the contact forms, only 118 surveys and 2,173 measurements remain in the analysis. The standardized number of interviewers has a highly significant effect on the standardized weekly number of completed interviews: on average, per 100 sample units, each extra interviewer generates 2.6 additional completed interviews each week across all the surveys. Moreover, the survey climate and the type of frame significantly influence the shape of the data collection efficiency curve. Surveys with a better survey climate and a household or address frame have a larger intercept and, hence, more completed interviews per 100 sampling units in the first week.
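To make the two-level structure of Model 2 concrete, the following sketch simulates survey-specific efficiency curves as fixed effects plus survey-level random deviations on the intercept, linear, and quadratic terms, with the cubic term held fixed. All coefficient values and standard deviations are illustrative choices, not the paper's estimates.

```python
# Simulation sketch of Model 2's two-level structure: each survey s gets
# coefficients b0..b2 equal to a fixed effect (gamma) plus a survey-level
# random deviation (u0..u2); the cubic coefficient gamma_30 has no random part.
# All numbers are illustrative, not the estimated values.
import random

random.seed(1)
G0, G1, G2, G3 = 2.3, 0.9, -0.05, 0.0007   # fixed effects
SD0, SD1, SD2 = 1.0, 0.1, 0.005            # random-effect standard deviations

def survey_curve(weeks=33):
    b0 = G0 + random.gauss(0, SD0)   # survey-specific intercept
    b1 = G1 + random.gauss(0, SD1)   # survey-specific linear term
    b2 = G2 + random.gauss(0, SD2)   # survey-specific quadratic term
    return [b0 + b1 * w + b2 * w**2 + G3 * w**3 for w in range(weeks)]

curves = [survey_curve() for _ in range(148)]   # one curve per survey-round

# The between-survey spread of week-1 efficiency reflects the intercept variance.
week1 = [c[0] for c in curves]
print(min(week1) < G0 < max(week1))
```

Fitting such a model to real data would estimate the gammas, the random-effect (co)variances, and the residual variance jointly, for example with a mixed-model routine; the simulation only mirrors the data-generating structure.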
For 1 percent more completed interviews at the first attempt, the standardized number of completed interviews in the first week goes up by 0.6, while surveys with a nonindividual sampling frame produce 1.2 more interviews per 100 sampling units in the first week. The latter result is in line with our expectation that surveys with an individual sampling frame have lower efficiency at the beginning because the interviewers have fewer selection opportunities. Third, although the refusal score and the contact score have no significant effect on the quadratic term, the percentage of refusal

conversion attempts helps to maintain the efficiency at a higher level for a longer period.

Table 3. Fixed Effects for the Multilevel Model Testing the Effects of Some Fieldwork Characteristics on the Weekly Data Collection Efficiency (Model 2)

Effect                                              Estimate (SE)
γ00  intercept                                      2.28 (0.46)***
γ40  standardized no. of interviewers active in w   2.64 (0.4)***
γ01  survey climate                                 0.6 (0.1)***
γ02  sampling frame (not individual)                1.16 (0.28)***
γ10  linear term                                    0.2 (0.05)***
γ20  quadratic term                                 0.1 (.)
γ21  contact score                                  4E-6 (2E-5)
γ22  refusal score                                  2E-5 (4E-5)
γ23  percentage of refusal conversion attempts      2E-4 (5E-5)***
γ30  cubic term                                     1E-5 (1E-4)

*p < .05, **p < .01, ***p < .001.

Table 4. Covariance Parameters

Covariance parameter                                Estimate (SE)
σ²0   variance of intercept                         5.22 (0.8)***
σ²1   variance of linear term                       0.11 (0.03)***
σ²2   variance of quadratic term                    2E-4 (8E-5)**
σ01   covariance intercept/linear                   0.28 (0.12)*
σ02   covariance intercept/quadratic                9E-4 (6E-3)
σ12   covariance linear/quadratic                   2E-4 (8E-5)*
σ²ε   residual variance                             2.11 (0.7)***

*p < .05, **p < .01, ***p < .001.

The variances are all still significantly different from zero (α = 0.01), implying that the differences between surveys are only partially explained by the fieldwork characteristics examined. Comparing the base model (fitted on the same 118 surveys) with the elaborated model yields the following R² values at the different levels (see Bryk and Raudenbush 1992, p. 65): R²1 = 0.64, R²2 = 0.65, R²b1 = 0.07, and R²b2 = 0.16. This yields the results

that 65 percent of the residual variance, 64 percent of the intercept variance, 7 percent of the variance of the linear regression parameter, and 16 percent of the variance of the quadratic parameter are explained in the elaborated model compared with the base model (Model 1). The covariance parameters between the linear term and both the intercept and the quadratic term remain significant, suggesting that the number of interviews completed in the first week influences the increase/decrease in efficiency at the start of the fieldwork. In turn, the amplitude of this increase/decrease influences how fast the efficiency levels off or how fast the decrease becomes steeper.

5. DISCUSSION

The present study introduces the data collection efficiency per time unit as a tool to analyze the dynamics of the fieldwork process. We considered four specifications for the efficiency: the weekly number of completed interviews, the weekly number of contacts, the weekly ratio of the number of completed interviews to contact attempts (productivity), and the weekly ratio of the number of completed interviews to refusals (performance). We use a multilevel model with repeated measurements to apply this concept to the first six rounds of the European Social Survey. In this model, the macro-level units are the 148 survey-round combinations, and the weekly measures of efficiency are the repeated measurements nested within each survey. The main finding of our analysis is that the weekly data collection efficiency follows the hypothesized cubic shape. The shape has four important characteristics: the initial efficiency, the initial rate of the increase in efficiency (positive linear term in the model), a leveling off of the initial increase (negative quadratic term in the model), and a tail (significant positive cubic term). The results show that for all the specifications of the data collection efficiency, the overall curves for the ESS increase at the start of the fieldwork.
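These four characteristics can be read directly off a fitted cubic: the intercept is the initial efficiency, the linear term the initial speed, and the two roots of the first derivative locate the efficiency maximum and the onset of the tail. A small sketch, using illustrative coefficients rather than the estimated ones:

```python
# Critical points of P(w) = b0 + b1*w + b2*w^2 + b3*w^3 come from the
# quadratic P'(w) = b1 + 2*b2*w + 3*b3*w^2 = 0. With b1 > 0, b2 < 0, b3 > 0,
# the smaller root is the efficiency maximum and the larger root is where the
# decline flattens into the tail. Coefficients are illustrative values only.
import math

b0, b1, b2, b3 = 2.3, 0.9, -0.05, 0.0007

disc = (2 * b2) ** 2 - 4 * (3 * b3) * b1       # discriminant of P'(w) = 0
peak = (-2 * b2 - math.sqrt(disc)) / (6 * b3)  # week of maximal efficiency
tail = (-2 * b2 + math.sqrt(disc)) / (6 * b3)  # week where the tail starts

print(round(peak, 1), round(tail, 1))  # → 12.0 35.6
```

With these toy coefficients the tail root falls beyond the 33-week observation window, which is consistent with curves that are still declining gently at the end of the field period.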
In the case of completed interviews and contacts, the curve then reaches a maximum, subsequently decreases, and finally levels off to form a tail. The weekly productivity and performance both increase toward the end of the fieldwork, plausibly because fewer cases remain to be worked and the remaining cases receive a more tailored approach based on the greater knowledge accumulated about them. The study of these curves also suggests that fieldwork reaches a turning point around five to six weeks after the start, when a third of the sampling units have been converted to completed interviews, the productivity reaches its maximum, and the performance its minimum. The results clearly indicate that the model can be considered a good general description of the data collection efficiency in a large number of face-to-face surveys. A more detailed analysis of the ESS surveys, classified by the duration of their data collection period, reveals that the surveys with shorter field periods have higher efficiency at the start and a steeper increase of

efficiency than the surveys with longer field periods, but also that their efficiency drops sooner. We further explain differences between surveys in the time course of the weekly number of completed interviews. The main findings are that the type of sampling frame, the survey climate, and the number of active interviewers have an impact on the number of interviews each week (the height of the curve), whereas fieldwork strategies mostly have an impact on the period during which the number of interviews completed each week remains high (the width of the curve). Indeed, to increase the number of completed interviews before the start of the tail, the area under the curve needs to be enlarged, either by increasing the weekly efficiency or by maintaining this efficiency at a high level for a longer time. Two strategies can be exploited: (1) involving as many active interviewers as possible every week, thereby implicitly increasing the number of contact attempts, and (2) re-approaching as many refusals as possible. Both strategies may affect the survey costs, but they highlight the importance of having enough interviewer capacity to conduct a face-to-face survey. Furthermore, the number of interviews completed in the first week is influenced by the survey climate, as measured by the response rate after the first contact attempt, and by the type of sampling frame. The number of completed interviews in the first weeks increases with a better survey climate and with a sampling frame that does not consist of individuals. The first finding is expected, at least partly because of how the survey climate is operationalized. The second is in line with previous research on the impact of the sampling design on nonresponse bias and the role played by the interviewer therein. The basic and elaborated models show how useful the concept of data collection efficiency per time unit can be in the analysis of the fieldwork dynamics.
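The area-under-the-curve argument can be made concrete by accumulating the weekly efficiencies: the cumulative sum gives the share of the sample converted by each week, and the week in which it passes one third marks the turning point discussed above. The weekly values below are illustrative, expressed per 100 sampling units, and not taken from the ESS data.

```python
# Sketch: cumulative conversion from illustrative weekly efficiencies, and the
# week in which a third of the sample has been converted to completed
# interviews. Values are made up for the example (per 100 sampling units).
import itertools

weekly = [2, 4, 6, 7, 6, 5, 4, 3, 2, 2, 1, 1, 1]   # completed interviews per week
cumulative = list(itertools.accumulate(weekly))

# First (1-based) week in which cumulative completions pass a third of the sample.
turning = next(i + 1 for i, c in enumerate(cumulative) if c >= 100 / 3)

print(turning, cumulative[-1])  # → 7 44
```

Raising the curve (more active interviewers) or holding it up longer (more conversion attempts) both increase the final cumulative value, which is exactly the area-under-the-curve reasoning in the text.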
Such analyses can assist decision-making in the design and budget preparation of future surveys. There are, however, several limitations to this research. The large number of surveys involved in the model is a strength but also a drawback, because specific deviations from the general hypothesized cubic shape cannot easily be accommodated. One characteristic of the time course of fieldwork that is not modeled in this research is a possible "double peak" in data collection efficiency (two increases in the efficiency, such as those observed in Belgium, Germany, and Ireland during round 6 of the ESS); this might occur when a new set of interviewers is activated or when the sample units are released in two steps. The very general approach also entails a lack of precision and of knowledge regarding the fieldwork strategies actually applied. One example is the use of incentives, which are difficult to include because their nature and value vary substantially across countries and rounds. Moreover, the analysis relies on the quality of the contact forms. Especially in the first rounds, we cannot be certain that all the contact attempts were properly recorded in a timely manner. A number of surveys had to be removed from our analysis due to missing information in the contact forms, such as the date, time, or interviewer, reducing the strength of the analysis. However,

great efforts have been made to improve the quality of this kind of paradata, which are important for survey methodology research. The new concepts raise new research questions, one of which concerns the impact of differences in the weekly data collection efficiency on data quality. Future research should also investigate whether the increase in efficiency and performance at the end of the fieldwork reflects the interviewers' increasing knowledge of the remaining cases and better, more tailored fieldwork strategies, or whether it is instead due to a less strict application of some rules intended to guarantee data quality. Researchers and survey managers might consider the weekly efficiency as a tool to monitor and manage the fieldwork process and progress. A weekly comparison of the time course of data collection efficiency with the evolution in previous years would permit the early detection of problems that could lead to a low final response rate, so that early interventions to increase the weekly data collection efficiency can be undertaken. Finally, the concept of data collection efficiency per time unit is not limited to computer-assisted personal interviews. The temporal dynamics of Internet surveys could be analyzed in a similar way to the approach taken here. An interesting question is whether the shape of the efficiency curve would be similar to the one found here for face-to-face surveys.

Supplementary Materials

Supplementary materials are available online at org/our_journals/jssam/.

References

Bethlehem, J., F. Cobben, and B. Schouten (2011), Handbook of Nonresponse in Household Surveys, Hoboken, NJ: John Wiley & Sons.

Billiet, J., A. Koch, and M. Philippens (2007), "Chapter 9: Understanding and Improving Response Rates," in Measuring Attitudes Cross-Nationally: Lessons from the European Social Survey, eds. R. Jowell, C. Roberts, R. Fitzgerald, and G. Eva, London: Sage Publications.

Brick, J., and D.
Williams (2013), "Explaining Rising Nonresponse Rates in Cross-Sectional Surveys," The ANNALS of the American Academy of Political and Social Science, 645.

Bryk, A. S., and S. W. Raudenbush (1992), Hierarchical Linear Models: Applications and Data Analysis Methods, Newbury Park, CA: Sage.

de Leeuw, E., and W. de Heer (2002), "Trends in Household Survey Nonresponse: A Longitudinal and International Comparison," in Survey Nonresponse, eds. R. Groves, D. Dillman, J. Eltinge, and R. J. Little, New York: Wiley.

Dixon, J., and C. Tucker (2010), "Survey Nonresponse," in Handbook of Survey Research (2nd ed.), eds. P. Marsden and J. Wright, Bingley, UK: Emerald.

European Social Survey (2011), Round 6 Specification for Participating Countries, London: Centre for Comparative Social Surveys, City University London.

Groves, R., and M. Couper (1998), Nonresponse in Household Interview Surveys, New York: Wiley.

Kreuter, F. (2013), "Facing the Nonresponse Challenge," The ANNALS of the American Academy of Political and Social Science, 645.

Koch, A., R. Fitzgerald, I. Stoop, S. Widdop, and V. Halbherr (2012), Field Procedures in the European Social Survey Round 6: Enhancing Response Rates, Mannheim, Germany: European Social Survey, GESIS. Available at ESS6_response_enhancement_guidelines.pdf.

Koch, A., V. Halbherr, I. A. L. Stoop, and J. W. S. Kappelhof (2014), Assessing ESS Sample Quality by Using External and Internal Criteria, Mannheim, Germany: European Social Survey, GESIS. Available at ple_composition_assessment.pdf.

Peytchev, A., K. B. Baxter, and L. R. Carley-Baxter (2009), "Not All Survey Effort Is Equal: Reduction of Nonresponse Bias and Nonresponse Error," Public Opinion Quarterly, 73.

Stoop, I., J. Billiet, A. Koch, and R. Fitzgerald (2010), Improving Survey Response: Lessons Learned from the European Social Survey, Chichester, UK: John Wiley & Sons Ltd.

Vandenplas, C., G. Loosveldt, and K. Beullens (2015), "Better or Longer? The Evolution of Weekly Number of Completed Interviews over the Fieldwork Period in the European Social Survey," paper presented at the Nonresponse Workshop.