28 March 2001. Doc. CoRD 047. Use of the Internet in Statistics Finland's data collection


COLLECTION OF RAW DATA TASK FORCE
28 MARCH 2001
Doc. CoRD 047
Use of the Internet in Statistics Finland's data collection
For information

Abstract

Registers and other administrative sources supply 95% of Statistics Finland's data needs; the remaining 5% is collected via paper, diskettes, the WWW, and CATI/CAPI (for which Blaise is used). Data collection via the Internet uses intermediaries who produce and operate data collection services and compile their own databases. Statistics Finland uses FTP to retrieve the data it needs from these databases. This is known as "the TYVI model" (the letters stand for "data flow from enterprises to authorities" in Finnish). Existing projects for web-based solutions are described for schools, educational institutes and municipalities. An EDISENT-based solution for business activities is also mentioned. Future projects include solutions for the building cost index, financial statements data (using the TYVI model and XML in cooperation with other authorities), tourism statistics and other municipal statistics. Particularly interesting is an evaluation of the costs and benefits of web-based solutions versus traditional methods. In the future it is thought technically feasible that Statistics Finland's statistical systems might be integrated with the data suppliers' information systems, and that the data required might then be extracted automatically.

Keywords: cost / benefit; intermediaries; internet; TYVI model; web-based; data collection; WWW; XML

Use of the Internet in Statistics Finland's data collections

Data collections at Statistics Finland

Data collections are usually divided into two types: direct and indirect. Indirect data collections comprise, among others, the use of administrative and register data. Direct collections refer to collections of data from enterprises, authorities (e.g. municipalities) and individual persons.

The use of data from administrative and register sources fulfils 95 per cent of the data collection needs. The 5 per cent of data that are collected directly are obtained with paper questionnaires, diskette questionnaires, www questionnaires and computer-assisted telephone interviews.

The strategy Statistics Finland has adopted in its data collections comprises the replacement of direct collections by the exploitation of register and administrative data, the elimination of overlapping data collections, the combining of inquiries in co-operation with other authorities, and other measures aimed at the methodological rationalisation of data collections.

The primary objectives of electronic data collection are faster data production and a reduced data supply burden and, in consequence, lower data collection costs to society. A further important aim is to improve the quality of the collected data.

Co-operation between authorities on data collections seeks to eradicate overlapping collections and promote joint use of the collected data. This collaboration between authorities is constantly being increased.

Statistics Finland uses Blaise software in its computer-assisted interview surveys. The interviews are conducted either by visiting the interviewee in person with a laptop PC or by telephone.

Data collected via the Internet used in Statistics Finland's production

The Internet collections of data that are used in production have been implemented using the TYVI model (data flow from enterprises to authorities), in which an outside operator produces the service and the related user administration, as well as the automated functions and validation of the questionnaires. The data supplier uses a web browser to download the questionnaires from the operator's server and fill them in. The supplied data are stored in a database built by the operator, and Statistics Finland then retrieves them by FTP transfer.
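The retrieval step at the end of the TYVI flow can be sketched in a few lines of Python using the standard `ftplib` module. This is only an illustration under stated assumptions: the paper says that FTP is used but gives no detail, so the host, credentials, directory layout and the idea of tracking already fetched file names are all hypothetical.

```python
import ftplib
from pathlib import Path


def fetch_new_files(ftp, dest_dir, already_fetched):
    """Download every remote file not fetched before; return the new names.

    `ftp` is a connected ftplib.FTP object positioned in the directory
    where the operator is assumed to place the exported questionnaire data.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in sorted(ftp.nlst()):
        if name in already_fetched:
            continue  # skip files picked up in an earlier run
        with open(dest / name, "wb") as out:
            ftp.retrbinary(f"RETR {name}", out.write)
        fetched.append(name)
    return fetched


def collect(host, user, password, remote_dir, dest_dir, already_fetched):
    """Connect to the operator's server; every argument is a placeholder."""
    with ftplib.FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        return fetch_new_files(ftp, dest_dir, already_fetched)
```

Separating `fetch_new_files` from the connection handling keeps the download logic testable without a live server. Plain FTP is unencrypted; where the server supports it, `ftplib.FTP_TLS` offers the same interface over TLS.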

[Figure: The TYVI model. Labels: Respondents; Accounting offices; Operators; Data collectors (National Board of Taxes, Statistics Finland, LEL, TEL pension insurance companies, Customs, others).]

Collection of data from comprehensive schools. The electronic data collection, which has been in production for three years, has covered all the data suppliers, i.e. 4,300 comprehensive schools, since [...]. The coverage also extends to the Åland Islands, and no paper questionnaires are used. The collection was launched in 1997 as a pilot financed jointly by JUHTA (the advisory committee on central government data control), the Association of Finnish Local and Regional Authorities and Statistics Finland. The aims were to minimise overlapping and reduce the workloads of, e.g., educational administration and regional governments, as well as to speed up the collection of the data. The actual collection is implemented in two phases: first, the schools fill in the web questionnaire; the municipal authorities then approve the information the schools have supplied. Only after this approval are the data ready for transmission to Statistics Finland.

Collection of data from educational institutes. The collection was conducted for the first time in autumn 2000. The institutes covered were upper secondary general schools, adult education centres and folk high schools. The educational institutes that still remain outside the electronic collection are universities, polytechnics and military educational institutes.

Collection of data for statistics on the financial statements and budgets of municipalities and municipal associations. These data were collected electronically with www questionnaires in January [...]. The co-operative partners in the project were JUHTA, the Association of Finnish Local and Regional Authorities and Statistics Finland. A paper questionnaire had been used previously, but the web questionnaire was now presented as the preferred mode of responding.
A paper questionnaire was only sent to the data suppliers upon request. The goals were to speed up and rationalise the production of the statistics on municipal finances. The web questionnaire shows the corresponding data from the previous year as comparison data. More than four out of five municipalities and municipal associations supplied their data with the web questionnaire.

Collection of data for part I of statistics on the finances of municipalities and municipal associations. These data were previously collected with diskette questionnaires. In 2000, a web questionnaire was introduced in the collection, but the data suppliers were also given the possibility to use either the diskette questionnaire or a paper one. The paper questionnaire was only sent out upon request. Over 75 per cent of the data suppliers filled in the web questionnaire. The web questionnaire showed the corresponding data from the previous year as comparison data.

Monthly indicators of business activities. This was the first ever operational data collection implemented using the EDISENT software, which was specially developed for electronic data collections within an EU project. Approximately three per cent of the data suppliers used this option.

Pending projects for data collections via the Internet

1. Project for the development of electronic data collection, 1 January [...] December [...]. The project aims to analyse the situation regarding direct data collections, assess the needs for electronic collections and prioritise them. The targets are to find the best practices for changing over to electronic data collections and to create model solutions (questionnaire templates). The project group acts as an expert body in Statistics Finland's ongoing electronic data collection projects. As a pilot, the project will implement the electronic collection of price data for the building cost index with web questionnaires. The project is being implemented with Statistics Finland's own resources without an outside operator.
The testing phase is scheduled for January-February [...]. Results from the pilot will be used to determine further actions and to assess the types of data collections that Statistics Finland can implement with this in-house system. The project will continue through to the year [...].

2. Collection of financial statements data. This is a co-operative project between authorities, executed jointly by Statistics Finland, the Board of Taxation and the Board of Patents and Registration. Work to harmonise data collections has gone on since the 1980s, and questionnaires for joint use have been completed. The intention is to set up three different projects (on preliminary study, form and tools) for the implementation of electronic data collection, scheduled to be carried out by the beginning of the year [...]. XML will be used in this project. The collection will exploit the TYVI model. In spring 2001, each of the co-operating authorities will receive the first electronic data on the financial statements and balance sheets of enterprises for the year [...].

3. VIRATI data collection. A co-operative project in which the Financial Supervision Authority, the Bank of Finland and Statistics Finland have merged nearly all the data they collect from credit institutions and investment service enterprises into the so-called VIRATI collection, so that each authority can extract the data it requires from the combined data. Statistics Finland has co-ordinated the data content, and designed and produced the workbooks to be filled in. The authorities involved have also used the VIRATI concept to carry out data collections that pertain to only one or two of them instead of all three. Technically, 99 per cent of the data are produced in the Excel spreadsheet format, with electronic mail as the means of transmission. Future plans include making the workbooks downloadable from the Internet. The aim is also to improve the quality and user friendliness of the data collection tools. Further actions and their scheduling within the project are at the planning stage.

4. Data concerning social service activities for the statistics on the finances and activities of municipalities and municipal associations. Web questionnaires for this collection are at the design stage. The work will be commissioned from an operator, as in previous collections of data on municipal finances. The first collection will take place in February [...].

5. Part II of the statistics on the finances of municipalities and municipal associations. This is the most extensive direct collection of data for the statistics on municipalities. Implementation as an electronic collection is scheduled for the year [...]. Trials will be carried out in [...].

6. Quarterly statistics on municipal finances. A data collection project of the EU. According to current plans, this will be implemented in 2001 by using web questionnaires to collect data on the first two quarters of the year after June. The sample consists of the ten largest plus 15 other municipalities. There are plans to increase the sample size.

7. Tourist accommodation statistics. The data suppliers are accommodation establishments (accommodation chains and bookkeeping firms), and the statistics describe tourists and overnight stays by month. Implementation of electronic data collection may become possible as early as [...].

8. Electronic collection of data for the statistics on causes of death. A planned co-operative project with health care organisations, the Population Register Centre and its subordinate units, Police Administration and regional governments.
The plans include rationalisation of the transmission and processing of data between the different authorities and institutions. The targeted preliminary scheduling period is [...]. The project has not been officially set up as yet.

Experiences gained, the costs involved and the benefits

It is fairly difficult to assess the impact that changing over to electronic collections might have in the long run on the cost of compiling and producing statistics. The high costs of setting up systems are counterbalanced by a reduced amount of manual work and by easy system maintenance, provided no major changes take place from one year to the next. Improved data quality, more up-to-date data and faster completion of processing are positive side effects of electronic data collections. Judging from the feedback received from data suppliers (for the statistics on municipal finances), using a web application would seem to reduce the resource requirements of municipalities and municipal associations.

If a data collection can be repeated identically from one point of time to another, the design and collection costs can decrease considerably. Conversely, if the data content of the collection changes, this can cause high maintenance costs.
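The trade-off described above (a high set-up cost against lower recurring costs) can be illustrated with a toy cost model. This is a sketch only: the three cost components and every figure in it are hypothetical and are not taken from Statistics Finland's accounts; only the 4,300 respondents echo the number of comprehensive schools mentioned earlier.

```python
from dataclasses import dataclass


@dataclass
class CollectionMode:
    setup_cost: float         # one-off design and system cost
    cost_per_round: float     # running and maintenance charge per round
    cost_per_response: float  # handling cost per returned questionnaire

    def total_cost(self, rounds, responses_per_round):
        """Cumulative cost after the given number of identical rounds."""
        return (self.setup_cost
                + rounds * self.cost_per_round
                + rounds * responses_per_round * self.cost_per_response)

    def unit_cost(self, rounds, responses_per_round):
        """Average cost per response over all rounds so far."""
        return (self.total_cost(rounds, responses_per_round)
                / (rounds * responses_per_round))


def break_even_rounds(web, paper, responses_per_round, max_rounds=50):
    """First round count at which web is no dearer than paper, else None."""
    for rounds in range(1, max_rounds + 1):
        if (web.total_cost(rounds, responses_per_round)
                <= paper.total_cost(rounds, responses_per_round)):
            return rounds
    return None


# Hypothetical figures in arbitrary currency units.
web = CollectionMode(setup_cost=100_000, cost_per_round=10_000,
                     cost_per_response=1)
paper = CollectionMode(setup_cost=10_000, cost_per_round=2_000,
                       cost_per_response=12)
```

With these made-up numbers and 4,300 responses per round, `break_even_rounds(web, paper, 4300)` returns 3: the web collection is dearer for the first two rounds and cheaper from the third onwards. A change in data content would correspond to paying part of `setup_cost` again, which pushes the break-even point further out.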

Comparing web collection of data from comprehensive schools with collection by paper questionnaires, Statistics Finland (the data collector) incurs costs in the web collection from, e.g., the design and updating of the web questionnaire, the creation and administration of user identifiers, having to collect data on new or abolished schools by fax, the creation of a prefilled file for each school, updating the text pages of the web application, guidance in the use of the web application, liaison with the operator, and the running and maintenance charges of the application. Statistics Finland saves on the design, printing and receipt of paper questionnaires and on sending reminders for them, as well as on data input costs and on the cost of envelopes and postage. In other respects, Statistics Finland's costs are the same for both web and paper questionnaire collections.

It is difficult to compare the cost of the web collection with that of a paper questionnaire collection for the statistics on educational institutes, because no corresponding, exhaustive paper questionnaire collection covering every individual institute has ever been done. Based on estimates of the volume of work involved, the running and maintenance charges, and the cost of envelopes and postage, it does, however, seem that Statistics Finland's costs from the web collection and from a paper questionnaire collection are about equal.

With regard to the required human resources, a paper questionnaire collection increases these considerably as the quantity of forms handled grows. The situation is different in electronic data collections, where the collection costs do not depend on the volume of data or the number of data suppliers to the same extent and, consequently, the cost per unit decreases.

Data collected via the web can be processed faster than data collected with paper questionnaires, and this helps to eliminate overlapping collections.
For example, as of 1998, regional governments have no longer needed to collect data on comprehensive schools from municipalities. Statistics Finland can offer its customers first-rate statistical and research data that are more or less up to the minute. The indirect impact of the data suppliers' better motivation to respond should also be taken into account in system evaluations, as regards the ease of questionnaire completion, guidance in the use of the Internet, advancing municipal networking, updating of web browsers, enjoyment in producing statistics, etc.

In the transition stage, costs are pushed up by the need to use several data collection modes simultaneously. This is especially true of collections of data from enterprises, because not all of them have the facility or the possibility to respond via the Internet.

Future outlook

Administrative data are constantly being improved to serve administrative purposes better. The introduction of relational databases and open applications often means that the exploitation of data for statistics becomes more difficult. The cost of using administrative data has also gone up, because it can be many times more expensive to gather statistical data from distributed systems than from the previously used sequential files.

When administrative data are lost, e.g. as a result of the introduction of the pre-completed tax proposal procedure, the need for direct collections increases. Finland's membership of the EU has also added, and will continue to add, to the pressure to collect more data directly.

Looking far ahead, it is feasible to imagine our statistical systems being integrated directly with the data suppliers' own systems. In theory, this is already possible, and the first practical applications are certain to be introduced within a couple of years. The same also applies to the administrative data systems discussed above.

There is a great need for Statistics Finland to make a concerted effort to develop electronic data collection further. The data collections that can be implemented electronically, as well as the relevant types of electronic collection, also need to be identified precisely.

It seems that at the moment Statistics Finland has adopted two main modes for implementing data collections via the Internet:

- The TYVI model, using an external operator
- Direct collection without the TYVI operator

Collections where data are needed by several different authorities would seem to suit implementation with the TYVI model. Data collections that are carried out less often than once a year are also suitable for commissioning from an operator as a bought-in service. In these cases one could talk about one-off collections, because both the content of the collected data and the technology used are likely to change so much that their re-usability is negligible.

Trials are currently being carried out without an outside operator to collect data for the building cost index. The usability of direct data collections via the Internet without an outside operator in Statistics Finland's various statistics can be better assessed after this pilot study.

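The two rules of thumb stated above for choosing between the modes can be written down as a small decision function. This is a loose paraphrase under stated assumptions, not an official procedure: the function name, its parameters and the `Mode` labels are invented for illustration, and the fallback reflects only that the paper defers judgement on in-house collection until the building cost index pilot has been evaluated.

```python
from enum import Enum


class Mode(Enum):
    TYVI_OPERATOR = "TYVI model with an external operator"
    ASSESS_AFTER_PILOT = "in-house collection, to be assessed after the pilot"


def suggest_mode(authorities_needing_data, collections_per_year):
    """Apply the two heuristics from the text; names are hypothetical.

    authorities_needing_data: number of authorities that use the data.
    collections_per_year: frequency of the collection (0.5 = every 2 years).
    """
    if authorities_needing_data > 1:
        # Data shared by several authorities suits the TYVI model.
        return Mode.TYVI_OPERATOR
    if collections_per_year < 1:
        # Rarer than yearly: effectively a one-off, so buy it in.
        return Mode.TYVI_OPERATOR
    return Mode.ASSESS_AFTER_PILOT
```

For instance, `suggest_mode(3, 1)` yields `Mode.TYVI_OPERATOR`, which matches the joint collection of financial statements data by three authorities, while a monthly collection used by Statistics Finland alone falls through to `Mode.ASSESS_AFTER_PILOT`.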
With regard to the work process, more planning resources than before will be needed in the definition and design stage, whereas fewer and fewer human resources are needed towards the end of the implementation stage.

It looks probable that all the data collected from municipalities can be collected electronically within a fairly short time, as the number of municipalities is limited and all of them by now have good Internet facilities. Internet facilities are also improving continuously in the enterprise sector. With enterprises, the transition to data collections via the Internet will, nevertheless, take longer than with municipalities, and will mean that several data collection tools must be used simultaneously for quite some time.

Collection of individual data can be divided into direct collection from individual persons and collection of data on individual persons from, e.g., educational institutes or enterprises. Once the problems connected with data protection in collections of individual data are solved, indirect collection of individual data can be implemented within the same timetable as the transition to collections via the Internet was accomplished with municipalities and enterprises. At the moment, Statistics Finland has no plans concerning direct collection of data from individual persons.

In collecting and disseminating data, Statistics Finland is in direct contact with the outside world. It is extremely important to keep the technical know-how and methods concerning these two areas up to date. Statistics Finland has made, and will continue to make, major investments in their development. Co-operation between different branches of administration is also essential in these areas.

Summary

Statistics Finland's costs will not essentially go down during the period of transition to collections via the Internet. However, the quality of the collected data will improve and their collection will be faster. The volume of collected data can be increased without increasing staff, but the skills requirements of the staff will go up.

Municipalities, educational institutes, financial institutions and the like are the targets from which all the required data can soon be collected via the Internet. Multiple forms of data collection will continue for quite some time in the enterprise sector. Collections of data from individual persons via the Internet have not yet been initiated.

Co-operation between authorities is under way in collections of data from enterprises and financial institutions. Co-operation with the Tax Administration has started particularly well. Co-operation with the Bank of Finland and the Financial Supervision Authority has been going on for a considerable time, and data collections are gradually being implemented via the Internet.