Implementing Electronic Data Collection and Scanning at Statistics Denmark

Nordic Conference of Statisticians, Åbo, Finland, August 2004

Bente Dyrberg, Statistics Denmark

Table of Contents
1. Introduction
2. Paperless handling of data: virk.dk and scanning
3. Virk.dk
4. Scanning
5. Organisation of the work
6. Conclusion
Annex: Overview of Digital Reporting at Statistics Denmark

1. Introduction

Statistics Denmark has three general aims for the collection of data:

- To enhance the efficiency of data collection, and thereby of the production of statistics
- To reduce the response burden
- To ensure a high data quality of the statistics

For several years Statistics Denmark has benefited from replacing traditional data collection with administrative register-based data. Administrative data sources have been used especially intensively in the field of social statistics, but administrative registers, e.g. data on VAT, employment and housing conditions, are also widely used in the field of enterprise-related statistics.

A large number of fields remain, especially within business statistics, where data must be collected directly from the enterprises. This is necessary if the information is not available from registers, or if the registers are not updated in time for the data to be used. When new registers are established, Statistics Denmark pays great attention to whether the information can be used in a statistical context.

However, the present possibilities of utilizing registers have been exhausted. Against this background Statistics Denmark, like other statistical institutes, is at the moment faced with major challenges in enhancing the efficiency of compiling statistics while at the same time reducing the response burden.

To tackle these challenges Statistics Denmark has put forward an overall goal of making the handling of collected data paperless by [...]. This is described in paragraph 2. In paragraphs 3 and 4, the focus is on the reporting of data via the Internet portal virk.dk and the handling of data by scanning the questionnaires. Paragraph 5 raises some questions regarding organisational aspects, and paragraph 6 gives a conclusion. An overview of various other methods of digital reporting applied by Statistics Denmark is given in the Annex.

2. Paperless handling of data: virk.dk and scanning

Statistics Denmark is currently in a process of digitalisation, and our aim is to process questionnaire data digitally, i.e. without the use of paper. It is intended that during 2005 the handling of questionnaires will no longer involve any paper. To achieve this goal, work proceeds in parallel on two general possibilities covering all statistics: electronic data reporting via the Internet portal virk.dk, and scanning of questionnaires.

The goal is that it must generally be possible to report data digitally for all questionnaires as from June [...]. However, it is uncertain when all enterprises will make use of this possibility. It must be assumed that data will be reported on paper questionnaires for a very long time (unless enterprises are obliged to report data electronically).

The fact that a presumably high, although declining, number of enterprises will still prefer to report data on paper implies that the process of digitalisation must also include an electronic transformation of the data reported on paper questionnaires. This is done by scanning the questionnaires, which creates both a file with images of the questionnaires (tif-files) and a file with data that can be incorporated into the further processing.

Electronic data reports combined with scanned paper questionnaires open up the possibility of conducting the further processing of data without the use of paper. Paperless processing is decisive for the efficiency gains that Statistics Denmark can achieve from the process of digitalisation.

Besides the general digital Internet portal, Statistics Denmark has already established several digital solutions for specific surveys, cf. the Annex.
It should be mentioned that the technical database (XIS) which has to be established for receiving data for each survey is the same for both electronically reported data and scanned data.

3. Virk.dk

One of the most significant initiatives taken by the Danish government to reduce the administrative burdens on the business community was to enable Danish enterprises to communicate digitally with public authorities in all areas. In 2002, it was therefore decided to establish a business portal, virk.dk, on the Internet. Virk.dk is developed in a public-private partnership. On this portal, business enterprises must be able to find all business-related digital services offered by public authorities. That is, it must be possible for them to obtain all relevant information from public authorities on virk.dk, and to submit via virk.dk all information that they are obliged to supply.

The business portal was officially put into operation in the autumn of 2003, but there are still some unsolved technical problems, which are expected to be solved in [...]. These problems have restricted the enterprises' use of the portal, including the reporting of data by enterprises to Statistics Denmark.

When it was decided to establish virk.dk, Statistics Denmark took the strategic decision to aim at this portal as the electronic medium for reporting data, instead of continuing to develop its own electronic solutions. A natural prerequisite, for each statistical system as well as in general, is the establishment or alteration of a number of IT systems at Statistics Denmark in order to make use of virk.dk. The general structure is established, and each system is in the process of being adjusted to virk.dk.

This strategic decision was taken with the clear expectation that virk.dk during 2004 would be able to:

- handle all questionnaires, so that questionnaires completed in advance with auxiliary information are more easily completed by the enterprises
- assist the enterprises and show them exactly those questionnaires that they have to complete, with respect to individualized enquiries

When these prerequisites are fulfilled, the reporting of data via virk.dk will reduce the response burden of enterprises and municipalities, although the size of the reduction might be small.

The Intrastat form is the first questionnaire form available from virk.dk. Work was carried out in 2003 to develop a specially designed form for reporting data via virk.dk to Intrastat, and at the beginning of May 2004 the form was made available on virk.dk. Each respondent is provided with an overview of the reporting units, and the business-specific information has been completed in the questionnaire in advance. Respondents can also transfer the task of reporting data for each individual unit to a third party.

Information submitted via the Intrastat form will, in principle, be automatically validated (invalid commodity codes, etc.).
It is also possible for the respondents to subsequently make their own corrections of errors that relate to the nature of the commodity (DKK per unit, etc.). Normally, the outcome of the automatic validation will be that the respondents need not spend any subsequent time on correcting this type of error. However, the automatic validation should not be so comprehensive that the respondent considers the electronic reporting of data to be too complicated. There must be an appropriate balance between automatic validation and user-friendliness.

Any errors in the data reports via virk.dk to Intrastat can be corrected at Statistics Denmark. This possibility will at first cover information reported via virk.dk, but will later also cover information reported via other media (paper, disc, etc.). Respondents can also add comments to the data in cases where there are no errors but the commodities differ from previous data reports, for example because of new developments. These new initiatives will reduce the need for the slower and more time-consuming process of correcting errors via ordinary mail.
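The kind of automatic validation described for the Intrastat form can be illustrated with a minimal sketch. It is not the virk.dk implementation: the field names, the code list and the price ranges below are all hypothetical, and the eight-digit code format is assumed from the Combined Nomenclature used for Intrastat reporting. The sketch checks the commodity code first and then flags a unit price (DKK per unit) outside an expected range, which is the type of error the respondent can correct directly.

```python
from dataclasses import dataclass


@dataclass
class ReportLine:
    # Hypothetical fields; the real form layout is not given in the paper.
    commodity_code: str
    value_dkk: float
    quantity: float


# Made-up example data, for illustration only.
KNOWN_CODES = {"12345678", "87654321"}
PRICE_RANGES = {"12345678": (100.0, 5000.0)}  # DKK per unit (low, high)


def validate_line(line, known_codes, price_ranges):
    """Return a list of problems found in one report line (empty if none)."""
    problems = []
    code = line.commodity_code
    # Format check first, then look the code up in the code list.
    if not (code.isascii() and code.isdigit() and len(code) == 8):
        problems.append("commodity code must be 8 digits")
    elif code not in known_codes:
        problems.append("unknown commodity code")
    # A unit price can only be checked when the quantity is positive.
    if line.quantity <= 0:
        problems.append("quantity must be positive")
    elif code in price_ranges:
        low, high = price_ranges[code]
        unit_price = line.value_dkk / line.quantity
        if not (low <= unit_price <= high):
            problems.append("unit price outside expected range")
    return problems
```

Keeping the rule set this small reflects the balance discussed above: each check returns a short message the respondent can act on, and further rules would only be added where they do not make reporting feel complicated.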

It is intended that this solution is to be used in other statistical fields during 2004 and [...]. This will imply that digital communication with the enterprises will be as efficient as possible.

4. Scanning

As already mentioned, the main reason for establishing a scanning facility is to enable completely paperless handling of the data reported to Statistics Denmark. Only when paper questionnaires are scanned is the handling of data completely paperless, and scanning is therefore important for creating greater flexibility and enhancing efficiency in the way the questionnaires are processed.

At present the in-house scanning facility at Statistics Denmark is in full operation for two statistics, and more statistics are in the process of entering into regular operation. Before the end of 2005, it is intended to scan data for all statistics for which more than 500 data reports are received annually. Many considerations went into reaching this point, including:

- test of external scanning (2002)
- test of external scanning in operation for 3 quarters of 2003
- in-house scanning from January 2004

In 2002 the quality of scanning was tested on existing batches of questionnaires from two surveys. In 2003 the test was extended to external scanning of data for the same two surveys, in full operation, over 3 consecutive quarters. After these tests it was concluded in general that there is a wide range of advantages inherent in the scanning of questionnaires. However, on the basis of the experience gained from the external scanning in operation, the advantages and disadvantages of in-house scanning were weighed up against each other in a report.
The advantages were described as:

- The statistical divisions responsible for producing statistics will be in greater contact with the survey
- Enhanced efficiency in the data processing, especially for complicated surveys, as the scanning of data is conducted by a professionally competent employee
- More all-round work for the statistical divisions, which receive and scan the paper questionnaires
- No delays in the reporting of data caused by external scanning
- No time has to be spent on communicating with an external enterprise
- Competencies concerning the scanning of data are built up and maintained, in a broad sense

On the basis of the overall conclusions from the report, Statistics Denmark's Management decided in December 2003 that all questionnaires at Statistics Denmark must be made suitable for in-house scanning before the end of [...].

Experience with in-house scanning has so far shown:

- It is not difficult to learn the professional competencies needed to operate the scanning equipment in the statistical divisions, and verifying the scanned questionnaires is also a simple process.
- The scanning of the questionnaires is itself a very fast process, and the subsequent mass verification of the questionnaires is also a relatively fast and rather uncomplicated task to learn and to perform.
- Because each statistical section responsible for producing statistics receives and scans the questionnaires directly, the responsible section is in greater contact with the survey: the number of questionnaires received is known, and the processing of complex questionnaires can be decided by the employee(s) responsible for the survey in question.
- There is, naturally, a better knowledge of the scanning hardware and software, which is instrumental in understanding how the questionnaires should be designed, etc.
- Experience has been gained with respect to optimising the utilization of the licences. Consequently, no more variables than needed should be mass-verified, as this takes up licence time as well as network resources and network time.

Still, some problems remain to be solved regarding questionnaire design, monitoring the status of the scanning of a survey (i.e. measuring the number of questionnaires scanned, the number of questionnaires verified, etc.) and some technical issues.

To sum up, as can be seen from the above, there are many advantages of a decentralised in-house scanning model linked to the validation of data.
The advantages are not immediately of an economic nature, but mainly consist in an improvement of the quality of the surveys, both with respect to data quality and with respect to the services offered to our respondents. The decentralised scanning is also considered to result in greater variation in the tasks performed by the statistical divisions, thereby increasing the job satisfaction of Statistics Denmark's employees.

5. Organisation of the work

Statistics Denmark is in the process of digitalizing the data collection. This process should be finalized by the end of [...]. We will continue to gain experience until then.

However, a central question is whether the changes in the work processes should be accompanied by an adaptation of the organisation of the work. It must be ensured that the advantages of digitalized data reporting are fully reaped. This means that any obvious quality improvements must be incorporated, and that the statistical surveys are conducted using fewer resources in the future.

Statistics Denmark has not yet determined which organisational changes would be best. First, we must ensure that the systems to be used for digitalizing the production of statistics are fully implemented. Concurrently with this, we must discuss possible changes in our organisation in the light of the proposals that the Director of Business Statistics is responsible for preparing in the autumn of [...]. Experience gained by other countries must also be incorporated.

Without anticipating events, a picture is forming in which the following elements are to be discussed:

- Is there to be a central respondent register?
- Is there to be a central unit to send out questionnaires?
- Is there to be a special unit responsible for questionnaire design and for coordinating the population in the various statistics?
- How are micro-data editing, macro-data editing and publishing to be organised?
- How is variation in the employees' tasks ensured?
- Is there to be a distinction between short-term statistics and structural statistics?
- Must a cross-sectional analysis unit be established?
- How is efficient cooperation with the IT centre ensured?

The discussions to be held in the autumn of 2004 will show how we make further progress in this field.

6. Conclusion

The aim of Statistics Denmark is that the questionnaire-based surveys must be fully digitalized during the year [...]. This implies that there must be possibilities of electronic data submission as well as in-house scanning of paper questionnaires for all statistics for which more than 500 data reports are submitted annually. This has been an ongoing process over several years, but large resources are now allocated to implementing the project.

The implementation process is in itself very difficult. Recognizing this fact, Statistics Denmark does not want to undertake any radical organisational changes at the same time. It is important that the production of statistics can be maintained according to schedule. However, during the autumn of 2004, the first discussions will be initiated with respect to possible organisational changes.

Annex: Overview of Digital Reporting at Statistics Denmark

At present, a variety of possibilities have been opened up for digital reporting of data, both in general and for four specific statistics.

General digital system for data reports: In some statistical fields it is possible for the enterprises to report data digitally through the Internet portal virk.dk. This possibility will generally be considerably extended during 2004 and [...].

Digital systems for specific statistics:

1. The external trade statistics
2. Statistics on earnings
3. Pig surveys
4. Statistics on transport of goods by lorry

1. Intrastat

In 2003, it was made possible for respondents to use the programme IDEP for direct submission of data, in an encrypted form via [...], to the Central Customs and Tax Administration. This implies that respondents no longer have to generate a file, which is first saved and subsequently submitted on a disc to the Central Customs and Tax Administration.

2. Statistics on earnings

The material used for reporting data for the statistics on earnings is developed in cooperation with the Danish Employers' Confederation and the Danish Employers' Association. These two employers' organisations collect information from their member enterprises, which they use for compiling their own statistics on earnings, and the information collected is passed on to Statistics Denmark. Duplicate reporting of data is thus avoided.

To ease the response burden, information is as far as possible collected directly from the computerized pay transfer systems. In collaboration with the Danish Employers' Confederation, Statistics Denmark has developed a standard scheme for reporting data electronically to the statistics on earnings. This standard scheme is incorporated into all computerized pay transfer systems and also into the systems operated by all agencies providing services with respect to pay transfers.
In cases where an enterprise does not make use of a computerized pay transfer system or the services of an agency, data can be reported either via an electronic questionnaire or via an Internet questionnaire. Since the survey year 2000, it has been possible to report data for the statistics on other labour costs via the Internet by means of an interactive Internet questionnaire. This possibility was used by about 40 per cent of the enterprises in the survey year [...].

If the enterprise is registered with the joint electronic reporting system LetLøn, either directly or via an agency providing services with respect to pay transfers, the data reported to Statistics Denmark is coordinated in a joint reporting of data to the reception centre. The employers report data and make payments only to LetLøn, which subsequently passes on the information and payments to the respective recipients (the Central Customs and Tax Administration, the Holiday Accounts of the Labour Market Supplementary Pension Scheme, Statistics Denmark and the Danish Employers' Confederation). In addition to being a reporting system for data on earnings, LetLøn has a number of earnings-related services linked to it. LetLøn is administered by the LetLøn Centre at the Central Customs and Tax Administration. At the moment, only a few enterprises make use of this possibility.

3. Pig surveys

In collaboration with the software firm AgroSoft, Statistics Denmark has developed the programme WinSvin for use in the pig surveys conducted by Statistics Denmark. This programme enables farmers to report data on pig stocks via an automatic extract from the farm's computerized system. The information can be submitted automatically via [...], but this type of reporting is used by only a few farms.

4. Transport of goods by lorry

Statistics Denmark has developed an EDI standard for the reporting of data on transport of goods by lorry, which can be used instead of completing the questionnaire forms that have been sent out. This system was developed in collaboration with a number of interested parties. The data reports can be submitted to Statistics Denmark via [...], but at the moment no enterprises have made use of this type of data reporting.