SOME ASPECTS OF QUALITY ASSESSMENT OF USAGE OF ADMINISTRATIVE DATA IN OFFICIAL STATISTICS

Rudi Seljak
Statistical Office of the Republic of Slovenia
Sector for General Methodology and Standards
Vožarski pot, Ljubljana

Key words: administrative data, quality assessment, quality dimensions

1 Introduction

The use of administrative data in the production of official statistics has increased considerably in recent years. Although such usage has been going on for many years, the purpose of administrative data usage has changed essentially over the last decades. While in the past administrative data were mostly used for the construction of sampling frames or as auxiliary variables in the estimation process, it is now more and more common to use administrative data also as a direct data source.

The main motivation for such a wide range of activities in this area is the clear possibility of essential budget cuts if the costly data collection of a classical statistical survey is replaced by the much cheaper gathering of data from administrative records. The tendency to decrease survey costs is of course nothing new, and statisticians have always been faced with demands for cost cuts, but only the rapid development of the IT environment in the last decade, which now enables quick and efficient processing of large data sets, has provided the technical platform that makes such a wide usage of different data sources feasible. On the other hand, the level of understanding of the usefulness of closer cooperation between different government authorities has also significantly increased in many countries. From the point of view of the statistical producer, the most useful result of such intensified cooperation is the willingness to share data. Although there are still some hesitations, mostly justified by "big brother" arguments, more and more administrative records are available to the producers of official statistics.

Although the advantages of the usage of administrative data are quite evident, especially in the field of cost and response burden reduction, it is also necessary to always consider the possible disadvantages and shortcomings of such practice. First of all, it is too often believed that the administrative data coming to the statistical agency are free of errors, forgetting that the administrative authorities also use some kind of collection process, which inevitably produces different kinds of errors in the collected data. Besides these measurement errors, which are due to the hidden collection process, some other quality aspects exist which are very specific to surveys based on administrative data, and all these aspects should be studied in as much detail as possible.

In the paper we provide a brief overview of (in our opinion) the most important aspects of quality assessment in the case of surveys where administrative data are used as the direct data source. In the first part of the paper we discuss some general issues connected with the transition from traditional data collection to the usage of administrative records, trying to point out the main changes that such a transition causes. In the second part we move to the concrete quality dimensions as defined in the European Statistical System. We discuss the different role of these dimensions in the case of the new approaches to data collection and how these changes in the quality assessment affect quality reporting. The theoretical discussion is then supplemented with two cases from the real practice of our statistical office.

2 Administrative data: some general reflections

In this section we discuss three aspects deriving from three questions, all of them referring to the general consideration of the main features of the administrative data which are planned to be used in the statistical process.

2.1 Do the administrative data reflect the reality?

When using administrative data one must never forget that the data were originally collected not for statistical but for administrative purposes. In practice these data are usually collected in the process of implementation of a law or some other administrative act. The direct consequence of this is that administrative data are often determined by the legal world. In other words, administrative data tend to show the world as it is de iure rather than as it is de facto. Since statisticians are committed to describing the world as it is de facto, the discrepancy between the de iure and the de facto representation of the observed phenomena is one of the crucial points when the quality of statistics based on administrative data is considered. Of course, in many cases the two worlds are very close and the administrative data can serve the statistical purposes perfectly, but sometimes these discrepancies should not be neglected.

2.2 Which data are more accurate: administrative or statistical?

This very general question is commonly raised when the quality of some administrative data source is questioned. When dealing with a concrete data source, the question could also be formulated as follows: Would the data, if gathered in the classical way, using a statistical questionnaire, be more accurate than the data obtained from the administrative source? As usual with too general questions, no exact answer can be given. The preference for one or the other data source simply depends on too many case-specific factors and should be studied for each case separately. Here we give just some general considerations which should be taken into account when dealing with this kind of question.

As already pointed out, we must first of all be aware that both types of data can be contaminated by errors, but the sources of error are usually quite different. While in the case of a classical statistical survey the quality of the incoming data largely depends on how good the measurement instruments (questionnaires, appointment letters, etc.) used in the data collection process are and how skilled and experienced the interviewers are, the accuracy of administrative micro data usually depends on quite different factors. Two of them can be presented through the following questions:

What are the consequences of incorrect data reporting?
Here we refer to data that have to be provided (by natural or legal persons) to some administrative authority, where this authority has the right to check the accuracy of the provided data and also has the competence to punish the respondent if inaccurate data are provided. A typical example of such an authority is the tax office, which systematically controls the accuracy of the incoming data and has the right to penalize the persons for whom incorrect data reporting has been proved. Tax data are therefore usually considered to be of high accuracy. We must emphasize that only accuracy in the sense of differences between what should be and what actually is reported is meant here. The differences in the observed concepts are a completely different matter and will be treated later.

What benefits could be gained from incorrect data reporting?

Reporting data to an administrative authority can often result in some kind of (material or non-material) benefit. Since it is in human nature to seek the highest possible level of utility for oneself and one's community, there is a chance in such cases that the data could be slightly adapted if such adaptation would yield a higher benefit. The probability of such cases is usually highly correlated with the question of penalties treated above.

2.3 Are the administrative data from different sources coherent?

This question usually arises when administrative data are used as the source for exhaustive surveys where a large number of data items have to be collected at the micro level. In such cases we are usually faced with the situation that the data have to be gathered from several different data sources, and this can cause all kinds of integrity and consistency problems. If all these data were gathered with a classical survey, all the needed questions would be included in the questionnaire and it would be much easier to obtain coherent data. Data from different administrative sources, on the other hand, can refer to different reference periods, use different observation units and different target populations, and can be based on different conceptual approaches. If such a survey is carried out, it is of crucial importance that all these differences are studied carefully, that their impact on the quality of the final results is minimized as much as possible through the statistical process, and that the possible deficiencies deriving from such an approach are transparently reported to the users.

3 Administrative data: overview of quality components

In the quality assessment framework which has been widely accepted and used in the European Statistical System, the quality of statistical products and services is assessed through six quality dimensions: relevance, accuracy, timeliness and punctuality, accessibility and clarity, comparability, and coherence. As the seventh, additional component we usually include costs and burden, which is by its content not a direct quality component but rather a significant factor which can influence all the other components.

Looking at the definitions of the above stated quality dimensions, we can see that they are influenced very differently by the fact that the classical data collection process is replaced by administrative data usage. While the accessibility and clarity component should be insensitive to the data collection mode, relevance and accuracy should be treated significantly differently in such cases. In the remainder of the paper we consider each of the components (except accessibility and clarity) from the point of view of their adjustment for the cases when administrative or combined data sources are used.

3.1 Relevance

The role of the relevance component in the quality assessment process changes significantly if administrative sources are used. In the case of classical surveys it is more or less a product-oriented component, mostly assessing the relevance of the final statistical result in the sense of how well it meets user needs. In the case of administrative data usage, on the other hand, relevance becomes a strongly process-oriented component, since a large number of the factors that determine the relevance component derive directly from the first part of the process, when different administrative sources are gathered together in order to be used in the statistical process. In other words, if in the first case relevance is mostly studied from the perspective of the user, in the second case the relevance component should become a tool for assessing the appropriateness of the incoming sources for the planned purposes. There are two aspects of relevance which should be especially thoroughly studied in the phase when fitness for our purposes is considered:

Are the methodological concepts that define the variables in the administrative sources sufficiently close to the statistical concepts stated in the design of our survey? The fact is that the quality in the case of administrative data usage can be predominantly determined by such conceptual discrepancies.

Is the reference period of the variables in the administrative sources compliant with the period targeted by the survey? If this is not the case, it must be clearly stated in the report.

3.2 Accuracy

In the case of classical surveys the accuracy component, usually presented through different types of errors, is theoretically by far the most precisely defined and described component. Terms such as sampling error, non-response error, measurement error, etc. are well known to everyone dealing with the execution of statistical surveys. However, there is a clear lack of a strong and consistent framework for the cases where administrative or combined data sources are used. Here we give just a few reflections on the adaptations that should be taken into account when the well known error categories are considered in such cases.

In most of the cases when the decision is made to use administrative data, these data are available for a large part of the survey population and the sampling approach is therefore not sensible. Hence, the sampling error is rarely present in the case of administrative data usage. However, the price of getting rid of the sampling error is often an increase in the bias of the results. The main source of the bias usually comes from the coverage problems of the administrative source or sometimes from problems with the compliance of the reference dates. Although it is not an easy task, an effort should be made to at least approximately estimate the bias deriving directly from the fact that the administrative source is used.

The measurement error is an especially difficult component to assess in the case of administrative data. Since in such cases the collection process is separated from the statistical process, we are usually limited to editing procedures aimed at finding erroneous or suspicious data, while the verification of these data at the source is usually not possible. In these cases close cooperation with the data provider is of crucial importance. Namely, the data provider (administrative authority) can, besides the data themselves, sometimes provide useful information which can then be used for quality assessment purposes. In any case we should avoid the temptation of considering the data coming from the administrative source as being of such high quality that no additional data editing is needed.

The concept of non-response can be quite ambiguous in the cases when only administrative data are used in the survey. In such cases it is usually difficult to distinguish it from the concept of coverage error. Let us assume the classical situation when we have a list of units determined in advance which represents our target population, and one or more administrative sources are then merged (using a direct or indirect linkage approach) to this list. It is almost inevitable that after the integration process there are units for which some or even all target variables are missing. Such a situation is presented in Figure 1.

Figure 1: Missing values after the integration phase (schematic representation of the target population, the administrative sources and the units with missing data)

The problem is that in a situation like this, without some detailed information from the data provider, it is difficult to separate absence due to non-response from absence due to coverage problems.
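To make the integration step described above more tangible, the following sketch (in Python, using the pandas library) shows how administrative sources could be linked to a predetermined frame and how units with missing target variables could be flagged. The toy data, the column names and the remark about a provider status flag are our own illustrative assumptions, not a description of any actual production system.

# A minimal sketch (not production code) of the integration step described
# above: a target population frame is merged with two hypothetical
# administrative sources and units with missing target variables are flagged.
# All names and values are illustrative assumptions.

import pandas as pd

# Target population frame, determined in advance (unit identifier "unit_id").
frame = pd.DataFrame({"unit_id": [1, 2, 3, 4, 5]})

# Two administrative sources, each delivering one target variable.
vat = pd.DataFrame({"unit_id": [1, 2, 4], "turnover": [120.0, 85.5, 230.1]})
wages = pd.DataFrame({"unit_id": [1, 3, 4], "wage_bill": [40.0, 12.3, 77.8]})

# Direct linkage of both sources to the frame (left joins keep all frame units).
merged = (frame.merge(vat, on="unit_id", how="left")
               .merge(wages, on="unit_id", how="left"))

# Units with at least one missing target variable after integration.
merged["any_missing"] = merged[["turnover", "wage_bill"]].isna().any(axis=1)
print(merged)

# Without additional metadata from the data provider (e.g. a status flag saying
# whether a unit is out of scope of the source or simply a late reporter),
# "any_missing" mixes undercoverage and non-response; such metadata would have
# to be requested from the administrative authority and linked in the same way.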

3.3 Timeliness

There is no need to adapt the concept of the timeliness component itself when we move to administrative data usage. In both cases it is simply defined as the time lag between the end of the reference period and the date of the release of the results. However, changing the data source from statistical to administrative can have quite a distinct impact on timeliness. The question is, of course, whether the usage of administrative data improves or deteriorates the timeliness component. All that we can say here is that there is no general answer to this question, but according to our experience deterioration is much more frequent than improvement. In most cases we face a classical trade-off between essential cost cuts and a reasonable prolongation of the time lag until the release of the results.

3.4 Comparability

Comparability is another component that can be significantly influenced when we move from classical data collection to the usage of already collected administrative data. If we consider geographical comparability, it is clear that the usage of administrative sources in different countries does not really help to raise the level of comparability. Namely, the administrative data in different countries are determined by different legislative universes, and it is much more difficult to reach a sufficient degree of harmonization than in the case when all the countries agree to use (at least approximately) the same survey instrument in the data collection phase.

The second view of comparability which should be taken into account in the quality assessment process is comparability over time. Here, too, the data collection mode can be an important factor influencing the comparability component. The problematic data are the administrative data which strongly depend on legislation, which in turn tends to change frequently and significantly. First of all, it has to be recognized that the administrative authority does not care much about the consistency of the data over time; these organizations are very much oriented towards cross-sectional verification of the data. Therefore, it is the obligation of the statistical organization which uses these data in the statistical process to carefully follow the underlying legislation and to take all the necessary actions to diminish the possible impact of legislative changes on the statistical results.

3.5 Coherence

If we consider coherence only in the sense of coherence with statistical results from other areas (e.g. national accounts), the influence of administrative data usage can be both positive and negative. If different statistical surveys covering the same (or at least a similar) area use the same administrative source, this should increase the level of coherence of the results. On the other hand, if one survey uses an administrative source and the other a statistical one, the impact on coherence could be exactly the opposite. But even in the latter case, this shortcoming could be turned into an advantage if the data from the two surveys are properly combined in order to increase the quality. For example, if an exhaustive field survey is employed for a structural survey, its data could be used to overcome eventual imperfections in the short-term survey which is based on the administrative data.

4 Assessment of quality when statistical and administrative data are combined: two case studies

4.1 Monthly Turnover Indices in Retail Trade and Other Services

The monthly turnover indices are among the most important short-term indicators provided by our office. Until 2006 all these indicators were calculated on the basis of data collected by a classical postal survey, where a random sample of enterprises was selected at the beginning of each year and the data were then (more or less successfully) collected from these units for the following 12 months. To decrease the response burden, especially for small enterprises, in 2006 we started to introduce a new methodology, which is strongly based on the usage of VAT data obtained from the tax office every month.

The new methodology for the estimation of the monthly turnover indices uses two types of data. For the small number of the largest units (according to turnover) the data are still collected in the classical way, meaning that the units are surveyed with the postal questionnaire. The units that are still surveyed classically represent 3% of the whole population in terms of the number of units, but they cover more than 50% of the total turnover. For the remaining, majority part of the population, we use the tax authority's data to estimate the monthly turnover.
These units are hence not contacted by the Statistical Office for the purposes of these surveys.
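To make the combination of the two types of data more concrete, the following sketch (in Python, with invented toy figures) shows one simple way of computing a turnover index from survey data for the cut-off (largest) units and VAT-based estimates for the remaining units. The names, the figures and the simple ratio-of-sums index are illustrative assumptions only, not the formulas actually used in the production process.

# Hypothetical sketch of combining survey data for the largest (cut-off) units
# with VAT-based turnover estimates for the remaining units into one monthly
# index. All names and figures are invented for illustration.

# Monthly turnover of the cut-off units, collected with the postal questionnaire.
survey_turnover = {"A": 1200.0, "B": 950.0}           # current month
survey_turnover_base = {"A": 1150.0, "B": 900.0}      # base period

# Turnover of the remaining units, estimated from the VAT records.
vat_turnover = {"C": 55.0, "D": 80.0, "E": 40.0}       # current month
vat_turnover_base = {"C": 50.0, "D": 85.0, "E": 38.0}  # base period

def ratio_of_sums_index(current: dict, base: dict) -> float:
    """Simple ratio-of-sums index (current / base * 100) over matched units."""
    common = current.keys() & base.keys()
    return 100.0 * sum(current[u] for u in common) / sum(base[u] for u in common)

# Combined index: the two sources are simply pooled before the index is computed.
current_total = {**survey_turnover, **vat_turnover}
base_total = {**survey_turnover_base, **vat_turnover_base}
print(f"Combined turnover index: {ratio_of_sums_index(current_total, base_total):.1f}")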

As was shown in the feasibility study conducted before the actual introduction of the new methodology, the estimates derived from the items of the tax form are not completely in line with the methodological definition of turnover, but they can certainly serve for the purposes of estimating changes (indices). It is obvious that the largest benefit of the new methodology is the essential decrease in costs on the side of the office and the response burden reduction on the side of the reporting units. What about the other quality components? How are they influenced by the new methodology? Here we give just a few reflections on this subject.

4.1.1 Relevance

The relevance of the VAT data for the purposes of estimating the monthly turnover indices was thoroughly studied in the feasibility study mentioned above. The fact is that the turnover estimated from the VAT data does not fully comply with the formal statistical definition of turnover, but these departures would only really be problematic if the level of turnover were the target estimate. The results of the feasibility study showed that the influence of the departures from the turnover definition is relatively small when just indices are estimated. All in all, it could be summarized that the introduction of the new data source has slightly lowered the relevance of the data source, but the benefits far outweigh these shortcomings.

4.1.2 Accuracy

Besides the introduction of the new data source, another very important change that came with the new methodology was the transition from random sampling to a cut-off selection process. The main reason for that change was the fact that with the introduction of the exhaustive administrative data source the data for many units are now available at no additional cost, and it is quite obvious that the use of the cut-off procedure should result in much more precise results than random sampling. On the other hand, the tax data also do not cover the whole population of interest. This is due to the fact that units whose annual turnover is under a certain threshold are not obliged to report their data. In addition, some enterprises that are obliged to report are not obliged to report monthly but quarterly.

The obvious consequence of the above described change in the selection of the set of observation units was that the sampling error was replaced by the bias due to undercoverage. The advantage of the sampling approach is that the sampling error is (at least theoretically) much easier to estimate than the bias, but at least at the annual level, when more auxiliary information for the whole population is available, the bias can be estimated quite precisely. So far, such estimations of the bias of the annual results have shown that the relative bias is for most domains lower than or at least at the same level as the relative sampling errors under the old methodology.
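As an illustration of the kind of annual bias check mentioned above, the following sketch compares an index computed only from the covered part of the population (cut-off survey units plus VAT reporters) with a benchmark index computed from exhaustive annual data for the whole population. The data and the relative-bias calculation are illustrative assumptions; the actual evaluation carried out at the office may differ.

# Hypothetical annual bias check: compare the index based only on the covered
# units (cut-off survey units plus VAT reporters) with a benchmark index based
# on exhaustive annual data for the whole population. All figures are invented.

# Annual turnover (current year, previous year) for every unit in the population.
population = {
    "A": (14500.0, 13900.0), "B": (11200.0, 10800.0),  # cut-off (surveyed) units
    "C": (640.0, 610.0), "D": (930.0, 975.0),          # VAT reporters
    "E": (45.0, 40.0), "F": (38.0, 41.0),               # units below the VAT threshold
}
covered_units = {"A", "B", "C", "D"}  # units observable under the new methodology

def annual_index(units) -> float:
    """Ratio-of-sums annual index (current year / previous year * 100)."""
    current = sum(population[u][0] for u in units)
    previous = sum(population[u][1] for u in units)
    return 100.0 * current / previous

benchmark = annual_index(population.keys())  # exhaustive annual information
estimate = annual_index(covered_units)       # what the new methodology observes

relative_bias = (estimate - benchmark) / benchmark
print(f"Benchmark {benchmark:.1f}, estimate {estimate:.1f}, relative bias {relative_bias:.2%}")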

4.1.3 Timeliness

Timeliness is an important issue in the case of short-term statistics. There are quite strict deadlines stated in the regulations, and in recent years these deadlines have become shorter and shorter. At the moment the most demanding deadline is in the retail trade area, where the first results have to be disseminated 30 days after the end of the reference month (T+30). The VAT data are delivered to our office at T+45, which means that these data cannot serve for the first retail trade estimates. Hence, the procedure is as follows. The first retail trade estimates are derived solely from the statistical data of the large enterprises, which are gathered with the postal questionnaires. Since most of these large enterprises have already been in the survey for many years, they can without much doubt be considered good respondents, whose data are sent quickly and are mostly of very high quality. Later, when the VAT data and the data from late respondents are also obtained, all these data are merged, and the revised estimates for retail trade and the first estimates for the other services are calculated and disseminated.

The fact is that when the timeliness of short-term statistics is considered, it is very much a matter of a trade-off between timeliness and accuracy. It is again difficult to assess which methodology (the old one or the new one) is more successful in dealing with this trade-off, but in our (probably highly subjective) opinion it would be difficult to assure such accurate data in such a short time with just the survey data.

4.1.4 Comparability

Assuring comparability over time is a very challenging task as far as the VAT data are concerned. The problem with these data is that they are very much determined by the legal regulations, and the tax legislation tends to change very frequently. Since the beginning of the usage of the VAT data we have already used four different formulas for estimating the turnover from the items of the VAT questionnaire. Regular monitoring of all the changes in the legislation and precise, in-depth analyses of the influence of these changes on the provided data are of crucial importance if we want to preserve a sufficient degree of comparability over time.

4.2 Survey on Income and Living Conditions

The European Survey on Income and Living Conditions (EU-SILC) is a project aiming at setting up a European harmonized survey for gathering comparative statistics on income distribution and social exclusion from EU member states, Norway and Iceland. The project was launched in 2003 (at that time still on the basis of a gentlemen's agreement) in 6 European member states, was widened in 2004 to 12 old member states, Estonia and Iceland, and then in 2005 included all (at that time) member states, Norway and Iceland. In Slovenia the EU-SILC was first carried out in 2005.

In the planning and setting-up phase we tried to follow Eurostat's recommendation that as many already existing data sources as possible should be used in order to reduce the response burden and consequently to increase the response rate. Therefore, we carefully studied all the existing administrative sources and their quality in order to identify all the sources which could serve as a data source for the survey. Hence, in Slovenia the micro-data for the EU-SILC are gathered from three types of sources: the first part of the data is collected by the "classical" survey using the CAPI and CATI techniques, the second part is gathered from other statistical sources, and the third part from registers and administrative sources. Among others, all the income-related variables (which are usually considered highly sensitive) are gathered from the different administrative sources. Although the extensive use of administrative sources has many advantages, especially in the field of response burden and survey cost reduction, such an approach can also cause certain disadvantages. In this section we briefly discuss some quality aspects of this survey connected with the administrative data usage.
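To illustrate the combination of the three types of sources described above, the following sketch assembles a single EU-SILC-like micro record by taking each variable from a designated source. The variable names, the source assignment and the values are purely illustrative assumptions and do not reflect the actual variable list of the survey.

# Purely illustrative sketch: assembling one person-level record for an
# EU-SILC-like survey from three types of sources. The variable names, the
# source mapping and the values are invented.

# Which source delivers which variable (a design decision taken in advance).
variable_source = {
    "household_size": "interview",          # CAPI/CATI survey
    "education_level": "statistical",       # other statistical source
    "employment_income": "administrative",  # tax register
}

# Data already linked to the same person identifier.
sources = {
    "interview": {"household_size": 3},
    "statistical": {"education_level": "tertiary"},
    "administrative": {"employment_income": 18250.0},
}

# Build the combined micro record variable by variable.
record = {var: sources[src].get(var) for var, src in variable_source.items()}
print(record)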

4.2.1 Relevance

When the survey was set up, all the sources which were to be used in the survey were thoroughly studied and only the relevant ones were chosen for use in the regular production. Hence, from the basic methodological point of view all the sources satisfy the statistical requirements. There are, of course, still some departures arising from the administrative nature of the data, but as the analyses show, these influences should not be too significant. Perhaps the largest problem here is that a part of the data (the income-related data) has a different reference period than the survey data. This is a direct consequence of the usage of the tax data and cannot be overcome in the statistical process. However, these differences are clearly stated in the explanatory notes when the results are disseminated.

4.2.2 Accuracy

Accuracy itself is a multi-dimensional concept; therefore, there are many aspects of accuracy that could be studied, especially in complex surveys such as EU-SILC. The most positive consequence of the administrative data usage for the accuracy of the results is the fact that the questionnaire is much shorter and all the sensitive questions are omitted. This certainly results in higher (unit and item) response rates and hopefully in fewer measurement errors. Considering the particular data items, it could be discussed for each of them which source (administrative or statistical) could provide more accurate data. For the income-related data we are convinced that the administrative source is better than the statistical one, but in some other cases, e.g. where several administrative sources are used for one variable (e.g. employment status), the situation could be the opposite.

4.2.3 Timeliness

The timeliness of the results is the quality dimension where we lose the most because of the administrative data usage. The main problem here is the income data provided by the tax office. Of course, the tax office also needs some time to collect the data for the whole population, to process and verify them, and then to deliver them to our office, where they are included in the statistical process. We estimate that the delay of the release due to the delay of the tax data is approximately 10 months. But still, all the advantages of the usage of administrative data far exceed this shortcoming.

5 Conclusion

In the paper we tried to throw light on some aspects of quality assessment in surveys where administrative data are used as a direct data source. As has already been stated in many other papers, the quality assessment framework developed and used in the European Statistical System is mostly tailored to classical surveys, where the data are collected by using statistical questionnaires. Since the extent of the usage of administrative data has grown enormously in recent years, the need for adjusted quality assessment concepts is more and more evident. In the paper we tried to use the general theory as well as two concrete examples from the practice of the Statistical Office of the Republic of Slovenia to provide a modest contribution to this complex subject.
