Statistical Design and Estimation taking account of Quality


1 Statistical Design and Estimation taking account of Quality
Chris Skinner
NTTS, 23 February 2011

2 Quality
- Quality as accuracy
- Aim: quality improvement

3 Quality Context
- Despite increasing threats to quality from nonresponse... increasing information about quality
- "New survey data quality evaluation techniques have provided more information regarding the validity and reliability of survey results than was previously thought possible" (Biemer and Lyberg, 2003)
- "Unprecedented amounts of information about the data collection processes" (Groves and Heeringa, 2006)

4 Quality Improvement
- Quality control (during data collection & operations)
- Statistical design and estimation (before and after data collection): design, data collection, estimation

5 Statistical Design & Estimation
- Survey sampling approaches established by 1950s; successful in dealing with sampling error
- Total survey error project: extend these approaches to take more realistic account of quality
- Measure quality as mean squared error with respect to all sources of error
- Scepticism about the success of this project: Platek & Särndal (2001), Groves & Lyberg (2010)

6 Quality Control
- Increasing use of quality indicators ("key process variables", Biemer and Lyberg, 2003) during data collection
- May be based on paradata: data collected as part of the survey process
- Monitoring of indicators may suggest design options
- Responsive design (Groves & Heeringa, 2006): design decisions during data collection (making use of paradata)

7 Themes of Paper
- Thesis: there is an opportunity to extend statistical design and estimation methodology to take better account of quality, but this is not straightforward
- Extensions may be prompted by recent developments in quality control methodology and by increased information on quality
- The paper surveys some examples where quality concerns are relevant to design and estimation and quality information is available

8 Outline of Remainder of Paper
- Nonresponse
- Measurement error
- Summary and extensions

9 Nonresponse

10 Nonresponse in Surveys
- Quality indicators
- Design options
- Statistical design and estimation example

11 Representativity Indicators for Survey Quality
- Coordinated by Barrie Schouten, Statistics Netherlands

12 R-indicators
- Response propensities: estimated probabilities of response given auxiliary variables measured on respondents and nonrespondents
- Auxiliary variables may be derived from frame data, paradata
- R-indicator: a measure of variation in response propensities
- R = 1 if no variation, R = 0 if maximum variation
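The R-indicator described on this slide has a simple closed form, R = 1 - 2*S(rho), where S(rho) is the standard deviation of the estimated response propensities, so R = 1 when all propensities are equal. A minimal sketch, assuming the propensities have already been estimated (e.g. by logistic regression of response on the auxiliary variables):

```python
import numpy as np

def r_indicator(propensities, weights=None):
    """R-indicator: R = 1 - 2 * S(rho), where S(rho) is the (optionally
    weighted) standard deviation of the estimated response propensities.
    R = 1 means no variation (maximally representative response);
    smaller R means more variable propensities."""
    rho = np.asarray(propensities, dtype=float)
    w = np.ones_like(rho) if weights is None else np.asarray(weights, dtype=float)
    mean = np.sum(w * rho) / np.sum(w)
    var = np.sum(w * (rho - mean) ** 2) / np.sum(w)
    return 1.0 - 2.0 * np.sqrt(var)

# Identical propensities -> no variation -> R = 1
print(r_indicator([0.6, 0.6, 0.6]))  # 1.0
# More variable propensities -> lower R
print(r_indicator([0.2, 0.8]))       # 0.4
```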

13 Example: Short Term Statistics
- Monthly business survey at Statistics Netherlands in 2007
- Two main categories of economic activity of interest: Retail (n = 93,799) and Industry (n = 64,413)
- Schouten, Shlomo and Skinner (2010)

14 Timing
- Estimates required 30 days after end of reference period
- 3-5 days needed to process, edit, impute and aggregate data
- Interested in quality of estimates using data available between 25 and 30 days
- Auxiliary variables: business type and business size

17 Design Implications?
- Alternative stopping rules (number of days), possibly dependent on sector
- Targeting of underrepresented subgroups in order to maximise the overall R-indicator within a given number of days
- Use of partial R-indicators to identify subgroups in relation to auxiliary variables

18 Design (Intervention) Options
- Stopping rules: varying number of contact attempts, visits or reminders
- Differential stopping rules according to subgroup (including subsampling for follow-up)
- Other targeting: assigning better-performing interviewers; assigning different modes of data collection; selective incentives

19 Tailored Interventions
- Assume interactions between intervention and respondent characteristics in their effect on the response outcome
- Tailor treatments to respondents to enhance response (Wagner and Raghunathan, 2007)
- Cf. clinical trials and personalized medicine

20 Design Specification
1. Specification of protocol before (a wave of) data collection, where the protocol depends on outcomes during data collection
2. Design decisions taken during data collection (responsive survey design: decisions taken between design phases, dependent on outcomes of previous phases; Groves and Heeringa, 2006)

21 Design Specified Before Data Collection
- Natural in repeated surveys, using information from previous waves
- Dependence of protocol on outcomes during data collection almost inevitable in the face of nonresponse
- e.g. call scheduling using interviewer call record data (Kulka and Weeks, 1988; D'Arrigo, Durrant & Steele, 2009)
- e.g. decisions about nonresponse follow-up using a score function with imputed values derived from the previous quarter's data (Mackenzie, 2000)

22 Statistical Design & Estimation
- Hansen and Hurwitz (1946): subsample nonrespondents using a more intensive intervention
- Treat responding & nonresponding populations as fixed strata for estimation
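The Hansen-Hurwitz estimator follows directly from this stratum view: weight the mean of the initial respondents and the mean of the followed-up nonrespondent subsample by the two strata's shares of the initial sample. A minimal sketch with illustrative (made-up) data:

```python
import numpy as np

def hansen_hurwitz_mean(y_resp, n_nonresp, y_followup):
    """Hansen-Hurwitz (1946) two-phase estimator of the population mean.
    An initial sample of size n splits into n1 respondents (y_resp) and
    n2 = n_nonresp nonrespondents; a subsample of the nonrespondents is
    followed up with a more intensive intervention, yielding y_followup.
    The two strata are weighted by their sizes in the initial sample."""
    n1 = len(y_resp)
    n = n1 + n_nonresp
    return (n1 / n) * np.mean(y_resp) + (n_nonresp / n) * np.mean(y_followup)

# 3 early respondents, 2 nonrespondents of whom 1 followed up:
# (3/5) * mean([10, 12, 14]) + (2/5) * mean([20]) = 15.2
est = hansen_hurwitz_mean([10, 12, 14], 2, [20])
```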

23 Example: Canadian National Household Survey 2011
- Voluntary survey in June 2011 replacing the census long form
- Sample of 4.5 million dwellings (30%)
- Paper questionnaire mailed out or dropped at door, plus internet for census internet respondents
- Subsampling of nonrespondents for interview

24 Planning Assumptions
- 32-34% of households respond without follow-up
- 28-35% subsample of nonrespondents selected for follow-up
- 75% of subsample converted to respondents

25 Stratification of Subsampling
- Stratification by collection unit (CU): about 300 dwellings
- Uniform allocation: subsample at a fixed fraction
- Low response rate allocation: only pick subsamples from CUs with a response rate before follow-up of < 50%
- Simulation by Mike Bankier (Statistics Canada)
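The two allocation rules can be sketched as functions of each CU's response rate before follow-up. This is only an illustration of the rules as stated on the slide, not Statistics Canada's implementation; the 30% fraction is an arbitrary example value:

```python
import numpy as np

def subsample_fractions(initial_rates, fraction=0.3, rule="uniform"):
    """Per-CU subsampling fraction for nonrespondent follow-up.
    'uniform': subsample nonrespondents at a fixed fraction in every CU.
    'low': subsample only in CUs whose response rate before follow-up
    is below 50%; no follow-up elsewhere."""
    rates = np.asarray(initial_rates, dtype=float)
    if rule == "uniform":
        return np.full_like(rates, fraction)
    return np.where(rates < 0.5, fraction, 0.0)

# Under the low response rate rule, only CUs 1 and 3 get follow-up
fracs = subsample_fractions([0.2, 0.6, 0.45], rule="low")
```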

26 Uniform Allocation
[Figure: final response rate plotted against initial response rate under uniform allocation]

27 Estimation
- Hansen-Hurwitz approach fails under low response rate allocation because some strata have no sample
- Could also have very high variance with a targeted allocation designed to improve the R-indicator
- Hansen-Hurwitz approach assumes no borrowing of information from initial respondents in estimation for nonresponse strata
- Instead use double sampling estimation
- Alternatives: pooled estimation or e.g. a Bayesian approach (Ericson, 1967)

28 Statistical Design & Estimation in Responsive Designs
- Beaumont (2005): estimation using variable numbers of auxiliary variables observed during data collection
- Wagner and Raghunathan (2007): stratified allocation at different phases based on estimated response propensities

29 Measurement Error

30 Measurement Error Quality Indicators
- Failed edits (including during data collection)
- Respondent behaviour, e.g. heaping, time to respond to questions, satisficing, other paradata
- Interviewer effects, estimated using multilevel modelling; feedback to interviewer operations
- Interviewer observations

31 Interviewer Observations
- "How accurate do you think the answers given by the respondent were?"
- English Longitudinal Study of Ageing (ELSA), weekly earnings

33 Estimation Using Quality Indicators
- Latent class modelling (Biemer, 2011), especially with multiple indicators
- Treatment of outliers representing measurement error
- Downweighting error-prone observations
- Bias correction: using more accurate measures on a subsample; using reinterview data

34 Downweighting Error-Prone Observations
- Aim: to improve efficiency
- Estimate an error model using accuracy indicators (Battistin et al., 2003)
- Weight observations inversely proportionally to estimated measurement error variances
- Efficiency gain potentially useful if measurement error variance is non-negligible relative to observation variance
- But must take care not to introduce bias if the accuracy indicator is correlated with the true variables of interest
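The inverse-variance weighting idea can be sketched as follows. This assumes the per-observation measurement error variances have already been estimated from the accuracy indicators; the numbers are purely illustrative:

```python
import numpy as np

def downweighted_mean(y, obs_var, meas_err_var):
    """Precision-weighted mean of error-prone observations.
    Each observation's total variance is the common observation variance
    plus its estimated measurement error variance; weighting inversely to
    total variance downweights the error-prone cases."""
    y = np.asarray(y, dtype=float)
    total_var = obs_var + np.asarray(meas_err_var, dtype=float)
    w = 1.0 / total_var
    return np.sum(w * y) / np.sum(w)

# The third observation has a large estimated error variance,
# so it is heavily downweighted relative to a simple mean
est = downweighted_mean([10, 11, 30], obs_var=1.0, meas_err_var=[0.0, 0.0, 99.0])
```

Note the caveat on the slide: if the accuracy indicator itself correlates with the true variable of interest, such weights can introduce bias even as they reduce variance.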

35 Bias Correction
- Zero-mean measurement error can lead to bias in estimation of distribution functions, gross flows, regression coefficients
- Can correct for bias: using accurate measures on a subsample; or using estimates of measurement-error variances (continuous variables) or misclassification matrices (categorical variables) from reinterviews on a subsample

36 Use of More Accurate Measures on a Subsample
- Measurement of hourly pay (£/hour) in the United Kingdom Labour Force Survey
- X derived from questions on earnings and hours (derived variable)
- Y derived from a direct question about hourly pay (direct variable), available only on a subsample
- Skinner et al. (2002), Durrant and Skinner (2006)

37 [Figure: direct variable plotted against derived variable]

38 [Figure: distribution of hourly earnings from £2 to £4 per hour for age group 22+, JA; series: derived variable, hot deck imputation (wor), NN10 imputation, propensity score weighting; y-axis: percent, x-axis: hourly earnings in £/hour]

39 Using Reinterviews to Estimate Gross Flows
- Estimate misclassification rates using reinterviews
- Use these to adjust the estimator of e.g. the proportion employed at two time points
- Unadjusted estimator biased; adjusted estimator unbiased in large samples
- Example of the bias-variance trade-off in Fuller (1990)
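The adjustment for categorical variables can be sketched in its simplest form: if M is the misclassification matrix estimated from reinterviews, with M[i, j] the probability that true category j is recorded as category i, then the observed distribution satisfies p_obs = M p_true, and inverting M corrects the estimate. The 2x2 example below (employed vs not employed, 10% symmetric misclassification) is purely illustrative:

```python
import numpy as np

def adjust_for_misclassification(p_observed, M):
    """Correct an observed category distribution for misclassification.
    Solves M @ p_true = p_observed for p_true, where M[i, j] is the
    reinterview-estimated probability that true category j is recorded
    as category i."""
    return np.linalg.solve(np.asarray(M, dtype=float),
                           np.asarray(p_observed, dtype=float))

# 10% of each category misrecorded as the other; the observed
# distribution [0.66, 0.34] then corresponds to a true [0.7, 0.3]
M = [[0.9, 0.1],
     [0.1, 0.9]]
p_true = adjust_for_misclassification([0.66, 0.34], M)
```

The same inversion amplifies sampling noise in p_observed, which is the bias-variance trade-off Fuller (1990) examines.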

40 [Figure: ratio of MSE of unadjusted estimator to MSE of adjusted estimator, plotted against sample size]

41 Optimal Design
- Assuming all interviews are of equal cost, it can be demonstrated that about one quarter of the resources should be used for the reinterview study (Fuller, 1990)

42 Summary & Extensions

43 Summary
- Challenge: to extend methods of statistical design and estimation to take more realistic account of quality
- In particular, to make better use of information about quality
- Lessons from the total survey error literature

44 Extensions
- Focus here on survey nonresponse & measurement error
- But should look beyond the single survey
- Combining data sources from different frames, different surveys, administrative data and register data generates many estimation challenges, especially with quality varying between sources
- Also need to consider more complex sample size options for design when estimates are required across time and place

45 Statistical Models
- Models crucial for taking account of quality in estimation
- Less reliance on probability sampling and design-based methods

46 References
- Battistin, E. et al. (2003) What do we learn from recall consumption data. J. Human Resources
- Beaumont, J.-F. (2005) On the use of data collection process information. Survey Methodology
- Biemer, P. (2011) Latent Class Analysis of Survey Error. Wiley
- Biemer, P. and Lyberg, L. (2003) Introduction to Survey Quality. Wiley
- D'Arrigo, J., Durrant, G. & Steele, F. (2009) Using field process data to predict best times. S3RI Working Paper
- Durrant, G. and Skinner, C. (2006) Using missing data methods to correct for measurement error. Survey Methodology
- Ericson, W. (1967) Optimal sample design with nonresponse. JASA
- Fuller, W. (1990) Analysis of repeated surveys. Survey Methodology
- Groves, R. and Heeringa, S. (2006) Responsive design for household surveys. J. Roy. Statist. Soc. Series A
- Groves, R. and Lyberg, L. (2010) Total survey error: past, present & future. Public Opinion Quarterly
- Hansen, M. and Hurwitz, W. (1946) The problem of non-response in sample surveys. JASA
- Kulka, R. and Weeks, M. (1988) Toward the development of optimal calling protocols. JOS
- Mackenzie, R. (2000) A framework for priority contact of non-respondents. ICES 2 paper, Buffalo
- Schouten, B., Shlomo, N. and Skinner, C. (2010) Indicators for monitoring and improving. S3RI Working Paper
- Skinner, C. et al. (2002) The measurement of low pay in the UK LFS. Oxford Bull. Econ. and Statistics
- Wagner, J. and Raghunathan, T. (2007) Bayesian approaches to sequential selection. ASA Surv. Res.

47 [Figure: cumulative distributions of hourly earnings for cases where both variables observed; series: derived variable, direct variable; y-axis: percent, x-axis: hourly earnings in £]