ESSnet on Free and Open Source Software for Statistical Production


Project proposal, 08.02.2013

Prepared by: Giulio Barcaroli (ISTAT), Duncan Elliot (ONS), Mark van der Loo (Statistics Netherlands), and Matthias Templ (Statistics Austria, Vienna University of Technology)

Summary. This document proposes an ESSnet project on methods and developments in free and open source software. The project aims to investigate the opportunities offered by Free and Open Source Software (FOSS) projects that are useful to NSIs and the ESS. The main objectives include stocktaking, standard setting, and benchmarking. The project will also investigate opportunities for a Centre of Expertise on FOSS projects for official statistics, and launch it when a clear case can be made.

1. Aim

The purpose of this proposed ESSnet is to document, explore, and educate the ESS on Free and Open Source Software (FOSS) projects for statistical production within the ESS.

2. Background

A communication from the Commission to the European Parliament and the Council on "the production method of EU statistics: a vision for the next decade" notes that the traditional stovepipe model of numerous parallel processes is no longer fully adapted to the changing environment. One significant area of change in recent years is the availability of statistical methods implemented as free or open source software. It has become customary for scientists and (official) statisticians to publish new methods not only as a scientific document but also in the form of a software package. Some NSIs have adopted open source tools as strategic software and are actively contributing to the family of open source tools in the field of official statistics. Examples include tools for time series, imputation, automated data editing, survey sampling, and estimation for complex surveys. There are also strong connections between open source communities and developments for dealing with large data.

The use of FOSS projects for statistical analyses has become widespread among global players in the market. Companies such as Google, New York Times, Merck and Bank of America are known to use, and sometimes contribute to, FOSS projects related to statistical computing, high performance computing or big data analyses. The importance of open source software is further demonstrated by the fact that many commercial software developers advertise that they integrate their products with certain open source software.

There is a real opportunity for NSIs and the ESS to benefit from the growth of free and open source software. However, there are potential stumbling blocks and resource constraints where NSIs attempt this in relative isolation, in particular the continuation of the traditional stovepipe approach to production. The ESS would benefit from the opportunity to explore the issues raised by FOSS through sharing resources and knowledge as offered by an ESSnet on FOSS.

3. Motivation

The changing environment of statistical production is well recognised: the ESS is looking to move away from a stovepipe production model, and the High Level Group for the Modernisation of Statistical Production and Services has been established to provide strategic guidance on this issue. Part of this change has been driven by the growing popularity of free and open source software. An open source or communal approach to software use and development can have significant benefits for the ESS. Besides the fact that FOSS tools come without licence costs, such benefits include:

- Implementation of standards. For example, the investment in reusing a method described in a European methodological handbook is lower when a FOSS implementation is available. Commercial implementations are only available to NSIs that can and wish to afford a licence.
- Cost reduction, as the open source paradigm allows full software sharing.
- Sharing of knowledge.
- Improved comparability of figures and statistical processes across the ESS.
- Contribution to the longer-term aim of the ESS to move away from the traditional stovepipe approach to statistical production and towards a plug and play approach.

As NSIs begin to explore the possibilities of efficiency and cost savings offered by some FOSS solutions to methodological and statistical production problems, it is imperative that NSIs and the ESS are in a position to make informed choices on the adoption of such solutions.

In the community of Official Statistics there are several examples of FOSS tools, some of which are widespread, and others that are used by a select few. There is a lack of familiarity in parts of the ESS with the range of FOSS tools available and in development by other NSIs. There can also be mistrust of FOSS solutions due to concerns about reliability, maintenance and support, and issues surrounding recourse to licence. Many NSIs also lack the resources to fully understand and research FOSS solutions, as it requires some investment to overcome the inertia of established systems that consume people's time and energy. Finally, because of the widespread use of FOSS tools in academia, young graduates with a statistical background are nowadays leaving universities with experience in FOSS products rather than commercial software.

The purpose of this proposed ESSnet is to document, explore, and educate the ESS on the use of FOSS projects for statistical production and to evaluate the case for an ongoing Centre of Expertise to support the Official Statistics FOSS community. In doing so it will provide some investment to properly address the potential role of FOSS within a Generic Statistical Information Model and its applicability in the ESS. It will also spread knowledge of FOSS solutions and how they may benefit the business architecture, and provide NSIs and the ESS with informed research on which to base decisions on the future of statistical production.

The importance of plug and play solutions to statistical production has been noted by the High Level Group for the Modernisation of Statistical Production and Services. A one-off project in the form of an ESSnet to assess some of these plug and play solutions will be valuable. Evaluating the case for a Centre of Expertise with a strong methodological viewpoint to support the Official Statistics FOSS community is equally critical to moving away from the stovepipe model of statistical production and embracing new and improved software as it is developed.

The form of an ESSnet is well suited to pursuing these goals, providing the appropriate structure and sharing of knowledge across NSIs within the ESS. It will also provide an excellent basis upon which to evaluate the case for a Centre of Expertise.

4. Project definition

4.1 Objectives

The overall objective is to build on the active community in Official Statistics related FOSS projects and to educate the ESS by providing a critical assessment of their use in statistical production. In the table below, several subtasks have been identified that will meet this objective.

Number  Objective
1       Documentation of current and planned FOSS projects for statistical production in NSIs
2       Criteria for assessing/benchmarking FOSS projects for statistical production
3       Assessing FOSS projects for statistical production
4       Development and delivery of training on FOSS projects
5       Evaluating the case for a European Centre of Expertise to support the Official Statistics FOSS Community, and establishing it if found beneficial
6       Dissemination and evaluation of results

Ad. 3: Applications of FOSS in the production process might involve all aspects of survey methodology, from drawing complex sampling designs, editing and imputation, calibration, point and variance estimation, and indicator methodology to issues with large data sets, high performance computing and visualisation.

Ad. 5: Establishing centres of expertise is an important feature of the ESS VIP programme that is currently being developed. The CoE on FOSS for official statistics would have the strategic function of facilitating a FOSS community in Official Statistics. Its tasks could include:

- offering expert advice on the use and implementation of standards, such as those delivered by the High Level Group for the Modernisation of Statistical Production and Services;
- selecting candidate software that can become the recommended standard for the ESS;
- organising user meetings and conferences on FOSS tools for Official Statistics;
- initiating and guiding research projects related to statistical computing;
- being at the forefront of technological developments, spotting and communicating important trends for official statistics;
- maintaining the deliverables of the proposed ESSnet.

4.2 Scope

Statistical production is defined here as the production, analysis and publication of data. Free and Open Source Software is defined here as software that is licensed to grant users the right to use, copy, study and modify its design [1] through the availability of its source code. Such software projects are usually released under an open licence, such as GPL or EUPL.

4.2.1 In Scope

- Current FOSS projects applicable to statistical production in NSIs.
- Planned FOSS projects applicable to statistical production in NSIs.
- FOSS projects that implement statistical methodology.

4.2.2 Out of Scope

- FOSS projects with no application in statistical production.
- FOSS projects bearing no relation to statistical methodology, e.g. FOSS word processors.

4.3 Dependencies and Relationships

The following (international) projects have deliverables or input related to the proposed project.

- MEMOBUST: the inventory phase can build on the results of the methodological handbook for business statistics. The handbook holds an overview of statistical methods that can be used as a framework to identify available software and white spots.
- ESS-VIP: the results of this ESSnet can feed directly into the ESS-VIP programme.
- ESSnet on Data Warehousing: lessons from the pilots on European Centres of Expertise can be used to develop a European Centre of Expertise on FOSS products.
- High Level Group for the Modernisation of Statistical Production and Services: awareness and communication of relevant work concerning the Generic Statistical Information Model (GSIM).
  The work of the High Level Group provides a framework within which FOSS projects can be assessed as suitable for modern process development.
- AMELI: example of other projects delivering FOSS.
- Software Sustainability Institute: resources and expertise on developing sustainable software.
- Sponsorship on Standardisation and ESSnet on standardisation: the Sponsorship provides strategic guidelines for standardisation, while the ESSnet delivers a structured inventory of current standards; the intended FOSS inventory and the inventory of normative documents should be harmonised where possible. The ESSnet will also map standards against a common reference architecture, allowing for the identification of white spots which may provide useful input for the ESSnet on FOSS. Finally, phase II of the ESSnet on standardisation may include the development of open source tools related to standardisation. Evidently, such initiatives should be coordinated.
- UNECE Inventory of non commercial software: provides valuable input for the stocktaking process, such as documentation templates and a previous inventory.
- The SDMX technical working group: delivers examples of open source IT tools used for statistical production.
- The R project: has a web page ("task view") dedicated to software for official statistics.

5 Glossary

Concept  Description
FOSS     Free and Open Source Software
COTS     Commercial off-the-shelf
NSI      National Statistical Institute
ESS      European Statistical System
CoE      Centre of Expertise

[1] Free Software Foundation: http://www.gnu.org/philosophy/free-sw.html