Complementing Conventional with Innovative Data Sources


1 Complementing Conventional with Innovative Data Sources
International Workshop on Sustainable Development Goals Indicators, 26 to 29 June 2018
Arturo Martinez Jr., Statistician, Asian Development Bank

2 Outline of Discussion
- Conventional Data and SDGs
- Complementing with Innovative Data
- Data for Development TA
- Overview of SAE methodologies

3 SDG Monitoring
- Many countries in Asia and the Pacific have examined the link between their respective development targets and the SDGs' strategic priorities
- Countries have also assessed data availability, classifying indicators as (i) immediately available, (ii) can be made available using conventional data sources, or (iii) needs more effort

4 Conventional Data Sources for SDGs
Administrative Reporting Systems
- Education
- Health
- Trade and Industry (e.g., business permits, letters of credit)
- Civil Registration
Censuses
- Population and Housing
- Agriculture
- Enterprise
Surveys
- Households
- Enterprises
- Farm Holdings

5 Uses of Conventional Data Sources
About 20-50% of SDG indicators can be derived from household surveys and censuses
[Figure: Proportion of SDG indicators derived from household surveys, by goal (SDG 01 to SDG 17); axis from 0% to 80%]

6 Limitations of Conventional Data Sources

7 Limitations of Conventional Data Sources

8 Sample size of latest household surveys vs. total number of households of select ADB DMCs
[Figure: scatter plot. Vertical axis: total number of households ('000), 0 to 40,000. Horizontal axis: sample size (households), 0 to 80,000. Economies plotted, with survey year where legible: BGD, PHL (2015), VNM (2010), THA (2015), MYS (2012), NPL, LKA, AFG, KHM (2016), MMR, GEO (2016), PNG, LAO, TKM (2003), ARM (2016), MNG (2011), TLS (2011), FJI, BTN (2012), SLB, VUT (2010), MDV, WSM, TON (2009), FSM (2005), KIR (2006), PLW (2006), COK, TUV (2010)]
Source: Compiled data from ADB Portal of Statistics Resources and International Household Survey Network

9 Availability of Disaggregated Data Source: Serrao (2017) Presentation on SDG Data Compilation

10 Situation in Asia and the Pacific
- In some countries, major surveys and censuses are conducted only if donor funds are available; many countries are donor dependent, with donors covering 70-80% of the budget in some
- Poor coverage and quality of administrative reporting systems, both economic and social, increases the dependence on surveys
- For disaggregated data, surveys alone may not be sufficient; administrative data, such as from civil registration and administrative registries, need strengthening for long-term sustainability

11 Small Area Estimation
- Let's focus on indicators that are compiled through surveys, but surveys are unable to provide reliable estimates at a very granular level!
- A small area is a small geographical area or a population segment for which reliable statistics of interest cannot be produced due to the survey's limitations
- Small area estimation techniques borrow strength from other auxiliary information
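One common way to picture "borrowing strength" is a composite estimator that blends an area's noisy direct survey estimate with a prediction built from auxiliary data. The sketch below is illustrative only and is not taken from the presentation: the function name and inputs are hypothetical, and the model variance is assumed to be known, whereas in practice it would be estimated from the data.

```python
def composite_estimate(direct, direct_var, synthetic, model_var):
    """Blend a direct survey estimate with a model-based (synthetic) one.

    direct      -- direct survey estimate for the small area
    direct_var  -- sampling variance of the direct estimate
    synthetic   -- prediction from a model that uses auxiliary data
    model_var   -- variance of the area-level model error (assumed known here)
    """
    if direct_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    total = direct_var + model_var
    if total == 0:
        # No uncertainty on either side: keep the direct estimate.
        return direct
    # Weight on the direct estimate shrinks as its sampling variance grows.
    gamma = model_var / total
    return gamma * direct + (1 - gamma) * synthetic
```

An area with a small sample (large `direct_var`) is pulled toward the model prediction, while an area with a large sample keeps an estimate close to its direct survey value.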

12 Limitations of SAE methods
- Good auxiliary data are usually required
- A gap between census and survey years can increase model error
- Only time-invariant variables can be added to the model if the gap between census and survey periods is too wide

13 Big data can help address the limitations of SAE
- Variety
- Velocity
- Volume
Big data can be much faster in providing granular auxiliary data than conventional data sources.

14 How can big data be useful for data disaggregation?
- To create proxy indicators
- To generate new covariates for small area models
- To validate small area estimates
Source: Marchetti et al. (2015) Small Area Model-Based Estimators Using Big Data Sources

15 Create Proxy Indicators Video source:

16 Create Proxy Indicators

17 Create Proxy Indicators
[Figure: four scatter plots titled "Relationship between Poverty and Nighttime Lights - PHL" (one panel labelled 2012). Vertical axis: ln (Poverty Rate (%)). Horizontal axis: ln (sum of nighttime lights) in the first pair of panels and ln (average of nighttime lights) in the second pair. Each panel reports an R² value, which is missing from the transcription.]
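The panels on this slide relate ln(poverty rate) to ln(nighttime lights) and report an R² for each fit. A minimal sketch of such a log-log least-squares fit is shown below; the function name is hypothetical, the presentation does not state how its fits were computed, and the sample values in the usage lines are made up purely to exercise the code.

```python
import math

def fit_loglog(lights, poverty_rates):
    """Fit ln(poverty) = intercept + slope * ln(lights) by ordinary least squares.

    Both inputs must contain strictly positive values of equal length.
    Returns (slope, intercept, r_squared).
    """
    if len(lights) != len(poverty_rates) or len(lights) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    xs = [math.log(v) for v in lights]
    ys = [math.log(v) for v in poverty_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must vary for the fit to be defined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R² equals the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Made-up values for illustration only (not data from the presentation).
toy_lights = [1.0, 4.0, 16.0, 64.0]
toy_poverty = [48.0, 30.0, 21.0, 9.0]
slope, intercept, r_squared = fit_loglog(toy_lights, toy_poverty)
```

A negative slope would indicate that brighter areas tend to have lower poverty rates, and R² summarizes how much of the variation in ln(poverty rate) the lights measure accounts for.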

18 Create Proxy Indicators
[Figure: two scatter plots titled "Difference and 2009 Poverty and Nighttime Lights, PHL". Vertical axis: Diff, Poverty Rate (%). Horizontal axis: Diff, sum of nighttime lights. Each panel reports an R² value, which is missing from the transcription.]

19 Generate new covariates
SDG indicator: Proportion of the rural population who live within 2 km of an all-season road
- Indicator can be generated using satellite images
- Method to calculate the rural access index (RAI) requires the following data:
  - Population distribution (e.g. LandScan and WorldPop gridded population maps)
  - Road network (e.g. OpenStreetMap)
  - Road condition (e.g. satellite images, user reports from apps)
New RAI Estimates at the Subnational Level
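Once the three inputs have been combined into a grid, the RAI itself is a population-weighted share. The sketch below assumes, hypothetically, that each grid cell already carries a population count, a rural flag, and the distance to the nearest all-season road; deriving those fields from gridded population maps, road networks and road condition data is the hard part and is not shown. The class and function names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class GridCell:
    population: float             # people in the cell (e.g. from a gridded population map)
    is_rural: bool                # rural/urban classification of the cell
    km_to_all_season_road: float  # distance to the nearest all-season road

def rural_access_index(cells, threshold_km=2.0):
    """Share of the rural population living within threshold_km of an all-season road."""
    rural = [c for c in cells if c.is_rural]
    total = sum(c.population for c in rural)
    if total == 0:
        raise ValueError("no rural population in the supplied cells")
    served = sum(c.population for c in rural
                 if c.km_to_all_season_road <= threshold_km)
    return served / total
```

Running the function separately on the cells of each province or district would give subnational RAI values, which could then serve as covariates in a small area model.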

20 Validate Small Area Estimates
- Socioeconomic indicators derived from big data can be compared to similar measures obtained from survey data
- E.g. poverty estimates based on satellite images vs small area estimates based on household expenditure surveys
Source: Marchetti et al. (2015) Small Area Model-Based Estimators Using Big Data Sources
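A simple form of this comparison is to line up the two sets of estimates by area and summarize how closely they agree. The sketch below is one hedged possibility, not the procedure of Marchetti et al. or of the presentation: it reports the Pearson correlation and the mean absolute difference, and the function name and inputs are hypothetical.

```python
import math

def compare_estimates(sae_estimates, proxy_estimates):
    """Compare small area estimates with big-data proxies for the same areas.

    Both lists must be ordered by the same areas.
    Returns (pearson_correlation, mean_absolute_difference).
    """
    n = len(sae_estimates)
    if n != len(proxy_estimates) or n < 2:
        raise ValueError("need two equal-length series with at least 2 areas")
    mean_s = sum(sae_estimates) / n
    mean_p = sum(proxy_estimates) / n
    cov = sum((s - mean_s) * (p - mean_p)
              for s, p in zip(sae_estimates, proxy_estimates))
    var_s = sum((s - mean_s) ** 2 for s in sae_estimates)
    var_p = sum((p - mean_p) ** 2 for p in proxy_estimates)
    if var_s == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant series")
    pearson = cov / math.sqrt(var_s * var_p)
    mean_abs_diff = sum(abs(s - p)
                        for s, p in zip(sae_estimates, proxy_estimates)) / n
    return pearson, mean_abs_diff
```

A high correlation with a large mean absolute difference would suggest the proxy ranks areas similarly but differs in level, which matters if the proxy is meant to replace rather than cross-check the survey-based figures.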

21 Other uses of Big Data: Now-casting Food Prices in Indonesia Using Social Media Signals Source: UN Global Pulse

22 Other Uses of Big Data: Algal Bloom Early Warning Alert System Source: Group on Earth Observations, 2017

23 Key Considerations
- Some types of big data may not be representative of the whole population of interest (self-selection bias)
- Other types of big data are held by the private sector
- Big data needs a different technological infrastructure

24 Data for Development Technical Assistance
- Aims to build the capacity of DMCs in compiling disaggregated data for select SDG indicators, using a combination of traditional and innovative forms of data, in accordance with the granular data requirements of the SDGs' "leave no one behind" principle
- ADB is collaborating with UNESCAP, PARIS21 and other development partners

25 Data for Development Technical Assistance
- Country-Specific Case Studies on Data Disaggregation and Big Data Analytics
- Issue: What is the benefit of complementing conventional with innovative data sources?

26 Data for Development Technical Assistance
- Technical Manual on Disaggregation of Official Statistics and SDGs
- Strategically designed training workshops targeted to NSO staff
- Online Course Modules on SAE and Big Data Analytics

27 Thank you very much!

28 ADB's Statistics Capacity Building Efforts and Some Lessons
- First statistics capacity building project in the 1970s (for Singapore, on national accounts)
- Approximately 100 technical assistance projects on various topics since then:
  - Statistics management and strengthening of national statistical systems
  - Development of statistics master plan
  - Strengthening of selected areas in statistics (national accounts, financial statistics, social statistics, etc.)
  - Improving data collection strategies (household surveys, administrative reporting system, dissemination practices)
- Established partnerships with other development agencies in the region

29 ADB's Statistics Capacity Building Efforts and Some Lessons
- International Comparison Programme for Asia and the Pacific
- Updating and Constructing the Supply and Use Tables for Selected Developing Member Economies
- Statistical Business Registers (SBR) for Improved Information on Small, Medium-Sized, and Large Enterprises
- Evidence and Data for Gender Equality (EDGE)
- Innovative data collection methods for agricultural and rural statistics
- Implementing Information and Communication Technology Tools to Improve Data Collection and Management of National Surveys in Support of the Sustainable Development Goals