Country Paper: Use of Information Technology for Annual Survey of Industries in India

For Expert Group Meeting: Opportunities and advantages of enhanced collaboration on statistical information management in Asia and the Pacific, 20-22 June 2011, Bangkok, Thailand

Dr. Ravendra Singh
Deputy Director General, Central Statistics Office, Ministry of Statistics & Programme Implementation, New Delhi (India)

Introduction:

The Central Statistics Office, under the Ministry of Statistics & Programme Implementation, is responsible for coordination of statistical activities in the country and for evolving and maintaining statistical standards. Its activities include compilation of National Accounts, conduct of the Annual Survey of Industries and Economic Censuses, and compilation of the Index of Industrial Production as well as Consumer Price Indices. It also deals with various other areas such as social statistics, training, international cooperation and Industrial Classification.

The industrial sector is one of the most important sectors of the Indian economy, and hence compilation of industrial statistics assumes crucial importance for both research and policy-making. With the emergence of a highly competitive environment at both national and global levels, along with the shift in policies from a regime of control, permit and license towards a system of liberalization and free flow of market forces through the process of economic reforms, the need for ensuring the availability of timely, adequate and reliable statistics on this sector cannot be overemphasized.

In India, for the purpose of collection of data, the entire industrial activity is divided into factory and non-factory sectors based on the size of employment in the different producing units under the activity. The factory sector covers units registered under the Factories Act, 1948. The non-factory sector consists of the remaining manufacturing units.
The factory sector is designated as the registered or organised sector, and the non-factory sector is called the unregistered or unorganised sector. Moreover, electricity and major minerals are also parts of the organised industrial sector, while minor minerals belong to the unorganised sector. The main source of data pertaining to the organised sector is the Annual Survey of Industries (ASI), while the data on the unregistered sector are collected mainly through periodic sample surveys conducted by the National Sample Survey Office (NSSO), including the follow-up surveys of the Economic Census.

Annual Survey of Industries (ASI)

The Annual Survey of Industries (ASI), which started in the year 1946, is the principal source of industrial statistics (organised sector) in India. It provides statistical information
to assess and evaluate, objectively and realistically, the changes in the growth, composition and structure of the organised manufacturing sector, comprising activities related to manufacturing processes, repair services, gas and water supply and cold storage. Some of the major uses of ASI data are given below:

(i) Estimation of the contribution of the manufacturing industries as a whole, and of each type of industry, to the National Income and also to the State Income for each State.
(ii) Selection of the item basket and weighting diagram for revision of the base year of the Index of Industrial Production (IIP).
(iii) Selection of items and allocation of weights for the manufacturing products in the Wholesale Price Index Number.
(iv) Preparation of input for the Input Output Transaction Table (IOTT), which depicts the inter-industry relations of an economy and shows how the output of one industry is an input to each other industry.
(v) Systematic study of the structure of the industry as a whole and of each type of industry.
(vi) Analysis of the various factors influencing industries in the country.
(vii) Construction of a comprehensive, factual and systematic basis for formulation of policy, based on a comprehensive database on industry.
(viii) Productivity analysis of Indian industry.

ASI Schedule:

The ASI schedule is the basic tool to collect the required data from the factories. The schedule for ASI, at present, has two parts. Part-I of the ASI schedule aims to collect data on assets and liabilities, employment and labour cost, receipts, expenses, input items (indigenous and imported), products and by-products, distributive expenses, etc. Part-II of the ASI schedule aims to collect data on different aspects of labour statistics, namely working days, man-days worked, absenteeism, labour turnover, man-hours worked, etc. The reference period (year) for ASI is the financial year (April-March) for all items. The actual survey period for ASI is generally September to April every year.
Unit of Enumeration:

The primary unit of enumeration in the survey is a factory in the case of manufacturing industries, a workshop in the case of repair services, an undertaking or a licensee in the case of electricity, gas and water supply undertakings, and an establishment in the case of other industries. All the factories in the updated frame are divided into two sectors, viz. the Census sector, consisting of the relatively bigger units (in terms of number of workers), and the Sample sector, consisting of the remaining units of the frame. While the census sector units are completely enumerated every year, only a part of the sample sector is selected for survey in a particular year.
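The division of the frame into Census and Sample sectors can be sketched as follows. This is a minimal illustration only: the worker cut-off of 100, the `Factory` record layout and the use of simple random sampling are all assumptions made for the example, since the paper states neither the threshold nor the sampling design.

```python
import random
from dataclasses import dataclass

# ASSUMPTION: the paper only says census units are the "relatively bigger"
# ones by number of workers; the cut-off of 100 is purely illustrative.
CENSUS_WORKER_THRESHOLD = 100


@dataclass(frozen=True)
class Factory:
    dsl_no: int   # illustrative serial number of the unit in the frame
    workers: int  # number of workers reported for the unit


def split_frame(frame, threshold=CENSUS_WORKER_THRESHOLD):
    """Divide the frame into (census sector, sample sector) by worker count."""
    census = [f for f in frame if f.workers >= threshold]
    sample_sector = [f for f in frame if f.workers < threshold]
    return census, sample_sector


def select_units(frame, sampling_fraction, seed=0,
                 threshold=CENSUS_WORKER_THRESHOLD):
    """Return the units to survey in one year.

    All census sector units are taken; from the sample sector a fraction is
    drawn. ASSUMPTION: simple random sampling stands in for the actual
    (unspecified) sampling design.
    """
    census, sample_sector = split_frame(frame, threshold)
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    n = round(len(sample_sector) * sampling_fraction)
    return census + rng.sample(sample_sector, n)
```

For example, `select_units(frame, 0.4)` returns every unit at or above the cut-off plus 40 per cent of the remaining units.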
The Existing IT strategy:

The IT strategy for the Annual Survey of Industries in India comprises a large array of activities, ranging from e-schedules, verification and scrutiny of information in e-schedules and in physical form, and merging of electronic data with those received in physical form, to tabulation, data analysis and dissemination through publications and through the National Data Warehouse Centre.

E-Schedule System

During ASI 2008-09, data were captured on a pilot basis through an e-schedule developed in MS Excel. A number of problems were faced in transferring the data from the e-schedule (MS Excel) to the existing system based on an Oracle database. Later on, an e-schedule in MS Access was developed to overcome some of these problems. Currently, for ASI 2009-10, the e-schedules are received in MS Access format only. Some of the difficulties being faced in the existing system of receiving e-schedules in MS Access through e-mail are listed below:

- E-schedules received from field offices do not contain the name of the location in the file name; hence, to know which location the data belong to, one has to open the file and look at the data.
- E-schedules are received as e-mail attachments, and downloading them delays further processing.
- Some of the files received contain one schedule only, which makes the files difficult to manage because of their large number.
- PSL/DSL are interchanged in some of the schedules.
- Units are received whose DSL Nos. are not in the selection list, which requires checking again with the field.
- Some mandatory fields are missing and are not checked in the e-schedule.
- The files sent for processing have two names for each block; for example, both Block_C and Block_C_Final tables are present, and it is not clear which one has to be taken for data transfer.
- The data structure of the existing tables is not consistent with the table structure followed at processing; hence, further labour has to be put in to make the data consistent with the Oracle structure followed at the processing centre.
- Sometimes duplicates are also received, i.e. the same schedule arrives both as an e-schedule and as a paper schedule.
- Block-F and Block-G are received row-wise, whereas they should be column-wise as required for processing.
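Several of the difficulties listed above (missing mandatory fields, DSL Nos. outside the selection list, duplicate schedules, and row-wise blocks that processing expects column-wise) lend themselves to automated checks on receipt. The following is a minimal sketch, not the CSO's actual package: the field names, the list of mandatory fields and the record layout are hypothetical and chosen only for illustration.

```python
from collections import Counter

# ASSUMPTION: illustrative field names; the real mandatory fields are
# defined by the ASI schedule, not by this sketch.
MANDATORY_FIELDS = ("dsl_no", "psl_no", "state_code", "industry_code")


def check_eschedules(records, selection_list):
    """Return (dsl_no, problem) pairs for a batch of e-schedule records.

    records: list of dicts, one per received schedule (hypothetical layout).
    selection_list: set of DSL Nos. selected for the survey year.
    """
    problems = []
    for rec in records:
        dsl = rec.get("dsl_no")
        # A mandatory field counts as missing if absent or left blank.
        missing = [f for f in MANDATORY_FIELDS if rec.get(f) in (None, "")]
        if missing:
            problems.append((dsl, "missing mandatory fields: " + ", ".join(missing)))
        if dsl is not None and dsl not in selection_list:
            problems.append((dsl, "DSL No. not in selection list"))
    # Report each duplicated DSL No. once, however many copies arrived.
    counts = Counter(rec.get("dsl_no") for rec in records)
    for dsl, n in counts.items():
        if dsl is not None and n > 1:
            problems.append((dsl, "duplicate schedule"))
    return problems


def rows_to_columns(rows, key="dsl_no", item="item", value="value"):
    """Pivot row-wise block data (one row per item) into one record per unit."""
    wide = {}
    for row in rows:
        unit = wide.setdefault(row[key], {key: row[key]})
        unit[row[item]] = row[value]
    return list(wide.values())
```

Running such checks at the field office before transmission, rather than at the processing centre, would avoid the repeated references back to the field that the list above describes.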
For ASI 2008-09, about 1486 schedules in MS Excel format were received, whereas in ASI 2009-10 the target is to receive 10000 schedules in MS Access format out of a total of 61080 schedules.

Existing IT Setup of processing centre, i.e. CSO (IS Wing)

- Compaq Alpha Server ES40, 64-bit RISC processor, 02 nos.
- Two Alpha Servers in OS cluster
- Operating System: Tru64 Unix Ver. 4.0f with TruCluster V1.2
- External storage box: RA3000 (3 x 18 GB HDD)
- Data entry cum validation package: in VB, developed in-house
- Database: Oracle 8i Enterprise Edition
- Data processing: Visual FoxPro 6.0

Data Processing at CSO (IS Wing)

1. On receipt of schedules from the field offices, a proper entry is made in the receipt register.
2. Before the schedules are sent for processing, the following aspects are verified:
   - ID checking and reporting in the receipt register.
   - Checking whether all blocks are present.
   - Checking whether the scrutiny sheet is attached.
   - Checking of item-wise ex-factory value in the relevant blocks.
   - Checking whether there is a change in the reported industry code at the 2-digit level.
3. Pre-data-entry scrutiny is carried out in detail as per the scrutiny guidelines. In case of discrepancies in schedules, references are made to the field office for clarification.
4. In respect of schedules received in physical form, data entry is done at CSO (IS Wing) after verification of the required data in all blocks. For e-schedules, the data are transferred online for processing and verified through the CSP (Computer Scrutiny Program); after complete coverage checking, the error lists are generated.
5. Error lists are corrected in consultation with the schedules, block by block. If required, references are also made to the field offices.
6. Data consistency is checked against the previous year's data at the 2-digit classification level and State-wise for the census units.
7. Trial tables are generated after data entry and validation.
8. Provisional results are sent to the different States for their comments.
9. Their comments are examined and corrections, if any, are incorporated after further verification.
10. The results are released both in hard and in soft copy.
11. The results are disseminated free of cost or at cost, as per the approved price.

Future Plan of IT Setup

The present IT setup, installed about 10 years back to manage data related to the Annual Survey of Industries, would not be able to support the expected requirements for e-filing, security of data, data processing and data dissemination. Accordingly, a proposal has already been approved by the Government of India to upgrade the IT setup at CSO (IS Wing) so as to keep pace with the requirements of the e-schedule system as well as the future requirement of e-filing of schedules by the industries.

Components of CSO (IS Wing) Web Portal

a) Existing servers to be replaced by high-end servers.
b) Upgradation of the Oracle database to the latest version.
c) Web-based front-end development.
d) Historical data migration from the old Oracle 8i database to the new one.
e) Integration of existing applications with the online portal.
f) Online data dissemination.
g) Online payment gateway.
h) Network security implementation.
i) Integrated backup solution.
j) Company-wise security and anti-virus solution.

Benefits of CSO (IS Wing) Web Portal

- Distributed data entry at the source, with built-in validation as part of digitisation, leads to data accuracy and time saving.
- No physical movement of schedules from field offices.
- On-line scrutiny at Regional/field offices.
- Aggregation of data through on-line transmission.
- Centralised database management, administration and maintenance of the portal will lead to substantial savings.
- Download/upload facility at CSO (IS Wing).
- More emphasis on on-line unit-wise scrutiny against past data and on referencing to field offices.
- More emphasis on on-line scrutiny of input-output data.
- Easy flow of information among the various stakeholders.
- On-line dissemination of information.
- 24x7 availability in a secured environment.
- Increased usage of the portal, leading to faster delivery of data analysis reports to different Government and non-Government organisations.
- Provision to capture the desired monthly output data for construction of State as well as all-India IIP.

The e-schedule/e-filing package is one part of the Web Portal. Besides this, processing, dissemination, backup and 24x7 availability of data are the other aspects for which the web portal would be useful. Above all, the Web Portal will provide a user-friendly environment for accessing the metadata of ASI.

Linkage of ASI data base with National Data Warehousing Centre:

CSO (IS Wing) is responsible for collation of filled-in schedules in the case of physical collection of data, and for online collection of e-schedules from the field offices. After proper verification and scrutiny of schedules, data processing, tabulation and dissemination/publication of results are carried out. The results of the ASI are published as under:

- Annual Survey of Industries: Vol-I. This publication contains State and national level data on important economic parameters such as capital, output, input, gross value added, net value added, profit/loss, employment and wages.
- Annual Survey of Industries: Vol-II. This publication contains State-wise and industry-wise input and output data.
- Software-supported CD-ROM publication.
- Summary Results for Factory Sector at NIC 4-digit classification.
The soft copies of the publications, along with the unit level data collected under ASI, are forwarded to the Computer Centre of the Ministry of Statistics & Programme Implementation for further dissemination and for inclusion in the National Data Warehouse of India, which has been established to develop an integrated repository of current and historical data. This encompasses data generated by various Government agencies at Centre and State levels in the country. The National Data Warehouse has been created as a data warehouse with online analytical processing (OLAP) capabilities and enables web-based access to the data.

Constraints:

There are a few areas which need to be taken care of in future:

(i) Identification of appropriate high-end hardware and software which can meet the future requirements of ASI.
(ii) Replacement of hardware without hampering the present work plan.
(iii) Filling up of the vacant positions with technically qualified personnel.
(iv) Provision of training to existing staff so as to have a smooth transition.
(v) Organising training for personnel of manufacturing units so that they succeed in e-filing of returns.

Conclusions:

A large quantity of data is generated under the Annual Survey of Industries, and it keeps growing as the number of industrial units increases every year. In order to obtain quick information, a system of e-schedules was introduced in 2008-09 on a pilot basis, and it has been extended to cover almost all the bigger industrial units during 2009-10. It is further proposed to extend the system to permit all industrial units to file e-returns for the purpose of the Annual Survey of Industries. It is, therefore, pertinent to enhance IT use in the ASI so as to cater to the future requirements of data processing, tabulation and data analysis, along with the requirements of the data dissemination policy. There is a further need for proper linkage with the National Data Warehouse, so as to facilitate better dissemination of information.