Experimental monthly filled-job series


Crown copyright

This work is licensed under the Creative Commons Attribution 4.0 International licence. You are free to copy, distribute, and adapt the work, as long as you attribute the work to Statistics NZ and abide by the other licence terms. Please note you may not use any departmental or governmental emblem, logo, or coat of arms in any way that infringes any provision of the Flags, Emblems, and Names Protection Act. Use the wording 'Statistics New Zealand' in your attribution, not the Statistics NZ logo.

Liability

While all care and diligence has been used in processing, analysing, and extracting data and information in this publication, Statistics New Zealand gives no warranty it is error free and will not be liable for any loss or damage suffered by the use directly, or indirectly, of the information in this publication.

Citation

Statistics New Zealand (2015). Experimental monthly filled-job series.

Published in March 2016 by Statistics New Zealand Tatauranga Aotearoa, Wellington, New Zealand.

Contact

Statistics New Zealand Information Centre

Contents

Purpose
Introduction to the new series
Background to the filled-job series
Approach
Coverage of jobs
Breakdowns for monthly filled jobs
Comparison with official Statistics NZ sub-annual labour market outputs
Methodology
Quality indicators
Assumptions
Limitations

Tables
1 Purpose of labour market outputs
2 Industry and person coverage
3 Design

Figure
1 Experimental filled jobs, monthly

Purpose

This paper introduces a new experimental series we have developed to produce a measure of monthly filled jobs. The measure covers all jobs belonging to workers who were paid wages or salaries at any point in the month and uses data from the Employers Monthly Schedule (EMS) tax form.

We are publishing the series so our customers can provide feedback over a six-month period. The release will allow our customers to determine whether the experimental series is a viable labour market indicator that can meet their needs.

Your feedback

We'd like to receive comments about relevance, consistency, and timeliness. Feedback collected over the six months will be used to decide on the future of the series. It could lead to improvements or additional information being released, or to the series being promoted to an official statistic.

Go to the Experimental monthly filled jobs Innovation site to give your feedback.

Introduction to the new series

Coverage of jobs by the experimental series differs from that of Statistics NZ's current official labour market statistics (ie measures produced from the Quarterly Employment Survey (QES), Linked Employer Employee Data (LEED), and the Household Labour Force Survey (HLFS)).

The monthly filled-job series is an additional timely statistic that provides an early indication of changes in the labour market. There are about three months between the end of the reference month and its publication date. We'll aim to publish the series at the end of each month for up to a year. The series is not an official statistic.

Background to the filled-job series

Before we developed the new series, Statistics NZ had investigated whether data from the EMS could be used to replace or modify measures produced from the QES. One part of the investigation involved estimating a monthly filled-job series based on EMS data.

The EMS is a tax form that must be filed by all employers to report on earnings taxed at source. It comprises a record for each job in the month. EMS records processed by Inland Revenue since the previous supply are delivered to Statistics NZ around the 12th of each month.

We decided the QES would continue operating under its current design for the short term, because consultation with key customers in August 2015 indicated significant challenges in replacing it. However, during the consultation, customers expressed an interest in evaluating a monthly filled-job series based on EMS data.

Approach

We use a simple linear regression model to estimate the total number of monthly jobs at about 10 weeks after the end of the reference month (ie an estimation lag of 10 weeks). The estimates are released about two weeks later (ie a publication lag of about three months).

The model rates up the incomplete monthly total number of jobs. We expect at least 90 percent of all job records in the EMS to be delivered to Statistics NZ within 10 weeks after the end of the reference month. The rate-up factors account for the contribution of job records that are unavailable for the following reasons:
- the EMS had not been filed by the employer (ie late filing)
- the EMS had been filed but not processed by Inland Revenue
- the filed EMS was only partially processed (ie only some records were processed at 10 weeks).

The regression model uses data from an input dataset at the level of geographic locations (or geos). The input dataset doesn't contain any imputed data or adjustments to account for the above.

Note: A geo is a separate operating unit engaged in one, or predominantly one, kind of economic activity from a single physical location or base. It is the statistical unit for filled-job outputs from QES and LEED.

Coverage of jobs

The experimental series covers jobs belonging to all workers who are paid wages or salaries at any point in the month. It includes jobs for:
- full-time and part-time employees (these are not distinguished in the EMS)
- employees aged 15 years and over (15+)
- employees on paid leave
- employees with a non-zero IRD number (or identifier)
- foreign residents, diplomats, and members of the permanent armed forces with wages or salaries
- self-employed people who pay themselves a wage or salary.

Excluded are jobs associated with:
- employees on unpaid leave, and unpaid employees
- employees aged under 15 years
- employers with a zero IRD number
- workers with schedular payments reported in the EMS
- self-employment income not taxed at source.

Note: Schedular payments are made to people who are not employees, but are employed under a contract for service (eg labour-only contractors to the building industry).
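The coverage rules above can be read as a simple filter over job records. The following is a minimal sketch only: the `JobRecord` class and its field names are hypothetical and are not the layout of the EMS or of any Statistics NZ system, and treating "paid in the month" as a positive wage amount is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical, simplified job record; the field names are assumptions."""
    employee_age: int             # age in years during the reference month
    wages_paid: float             # wages or salary paid in the month
    employee_ird_is_zero: bool    # employee's IRD number is zero
    employer_ird_is_zero: bool    # employer's IRD number is zero
    is_schedular_payment: bool    # contract-for-service payment, not wages


def is_covered(record: JobRecord) -> bool:
    """Return True if the job would count under the coverage rules listed above."""
    if record.wages_paid <= 0:          # unpaid leave or unpaid employee
        return False
    if record.employee_age < 15:        # under 15 years
        return False
    if record.employee_ird_is_zero or record.employer_ird_is_zero:
        return False
    if record.is_schedular_payment:     # not an employee
        return False
    return True


records = [
    JobRecord(34, 4200.0, False, False, False),   # ordinary paid employee
    JobRecord(14, 300.0, False, False, False),    # under 15 years
    JobRecord(40, 0.0, False, False, False),      # on unpaid leave
    JobRecord(52, 5000.0, False, False, True),    # schedular payment
]
covered_jobs = sum(is_covered(r) for r in records)
print(covered_jobs)
```

Only the first of the four illustrative records passes every rule, so it is the only one counted as a filled job.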

Breakdowns for monthly filled jobs

For the first release, we'll make available estimates of monthly filled jobs by ANZSIC06 division and for all industries combined. For industry breakdowns, the most robust model was the one that produced estimates for ANZSIC06 divisions.

Only one-way tables of monthly job totals are available, because the results from a regression model based on cross-tabulated data involving region and industry were not of sufficient quality.

Comparison with official Statistics NZ sub-annual labour market outputs

The experimental job series complements Statistics NZ's other labour market outputs. It is a timely measure and has greater coverage of filled jobs than current official labour market statistics.

The main difference between the experimental series and other labour market statistics is the length of the reference period: because its reference period is longer, the experimental series covers more jobs.

As an example, consider a café that employs 20 people over the month. Let's say 10 employees work in the café in each week of the month and it has 5 employees on each day. In terms of quarterly employment levels:
- The experimental series counts 20 jobs because its reference period is the calendar month.
- LEED counts 5 jobs because its reference period is the 15th of the middle month of the quarter.
- The QES counts 10 jobs because its reference period is the pay week ending on or immediately before the 20th of the middle month of the quarter.
- The HLFS counts 10 wage or salary workers because it measures the average weekly number of employed people in the quarter.

The experimental series should be used for measuring labour demand. It is conceptually closer to LEED and the QES than to the HLFS.

Note: QES and quarterly LEED measure labour demand, or employment from the business perspective (ie labour demanded by businesses). Filled jobs is a labour demand measure. The HLFS is Statistics NZ's main data source for measuring labour supply, because it measures employment from the household perspective (ie households as the labour supplier). Employed people is the measure of labour supply.

The new series and LEED use the same data source. However, the following need to be noted when filled jobs from the new series are compared with those from LEED:
- Job levels in the experimental series are generally higher than those in LEED because of the longer reference period. This can result in quarterly movements, based on the middle month of the quarter, that are significantly different from those in LEED.
- LEED's methodology uses job records directly from the EMS. In comparison, the input data of the experimental series is at the geographic unit level.
- LEED outputs are released about a year after the end of the reference quarter. They are produced from data that cover almost 100 percent of all jobs in the reference period. The delay in the release of LEED ensures published estimates are very close to actual values.

The following three tables summarise the differences between the experimental series and official labour market outputs. These differences lead to different estimates of levels and change.
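The café example can be reproduced with a small simulation that counts distinct workers over reference periods of different lengths. This is an illustrative sketch only: the roster layout, the 28-day month, and the function names are assumptions chosen so that 20 people work over the month, 10 in each week, and 5 on each day.

```python
def build_roster(weeks: int = 4) -> dict[int, set[int]]:
    """Return {day: set of employee ids at work} for a hypothetical 28-day month."""
    roster = {}
    for day in range(1, weeks * 7 + 1):
        week = (day - 1) // 7
        # Weeks alternate between two pools of 10 employees (20 over the month).
        pool_start = 0 if week % 2 == 0 else 10
        # Days alternate between the two halves of the week's pool (5 per day).
        half = (day - 1) % 2
        first = pool_start + 5 * half
        roster[day] = set(range(first, first + 5))
    return roster


def jobs_in_period(roster: dict[int, set[int]], first_day: int, last_day: int) -> int:
    """Count distinct employees who worked on any day in the reference period."""
    workers = set()
    for day in range(first_day, last_day + 1):
        workers |= roster.get(day, set())
    return len(workers)


roster = build_roster()
monthly_jobs = jobs_in_period(roster, 1, 28)    # month-long reference period
weekly_jobs = jobs_in_period(roster, 15, 21)    # one-week reference period
daily_jobs = jobs_in_period(roster, 15, 15)     # single reference day
print(monthly_jobs, weekly_jobs, daily_jobs)
```

The same café yields 20, 10, or 5 jobs depending only on how long the reference period is, which is why levels from the experimental series sit above those from LEED and the QES.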

Table 1: Purpose of labour market outputs

Frequency
- Experimental monthly filled-job series: Monthly
- LEED: Quarterly
- QES: Quarterly
- HLFS: Quarterly

Reference period
- Experimental monthly filled-job series: Month
- LEED: 15th of the quarter's middle month
- QES: Pay week ending on or immediately before the 20th of the quarter's middle month
- HLFS: Each week in the quarter

Employment measure
- Experimental monthly filled-job series: Count of jobs that belong to workers with wages/salaries taxed at source
- LEED: Count of jobs that: 1) belong to workers with wages/salaries taxed at source, and 2) existed on the 15th of the quarter's middle month
- QES: 1) Total filled jobs (count of full-time and part-time employees, and working proprietors who employ staff). 2) Full-time equivalent employees (full-time plus 0.5 x part-time employees)
- HLFS: Count of employed people. A quarterly average calculated from data collected for each week in the quarter. Estimates of part-time and full-time employees are published

Purpose
- Experimental monthly filled-job series: Early indicator of changes and levels in monthly jobs
- LEED: Estimates of change and levels associated with quarterly labour dynamic measures (ie job and worker flows)
- QES: Estimates of quarterly changes and levels of average hourly and weekly (pre-tax) earnings, average weekly paid hours, and filled jobs
- HLFS: NZ's official employment measures. Includes comprehensive statistics relating to employment, unemployment, and people not in the labour force

Timeliness of publication
- Experimental monthly filled-job series: About 12 weeks after end of reference month
- LEED: First provisional estimate released about 12 months after end of quarterly reference date
- QES: Within 13 weeks of reference month (or within 6 weeks after end of reference quarter)
- HLFS: Within 6 weeks after end of reference quarter

Table 2: Industry and person coverage

Industry coverage
- Experimental monthly filled-job series: All industries
- LEED: All industries
- QES: Excludes: A01 Agriculture; A02 Aquaculture; A04 Fishing, hunting, & trapping; A052 Agriculture & fishing support services; L6711 Residential property operators; O7552 Foreign government representation; O76 Non-civilian defence staff; S96 Households employing staff; T99 Not included elsewhere
- HLFS: All industries

Person coverage
- Experimental monthly filled-job series: Workers with wages/salaries taxed at source. Includes jobs belonging to: 1) employees on paid leave, 2) employees 15 years and over (15+), 3) employees with a non-zero IRD number, 4) non-NZ residents, 5) diplomats, 6) armed forces, 7) self-employed who pay themselves a wage/salary. (Of jobs belonging to employees with non-valid IRD numbers, only those with a zero number are excluded.)
- LEED: Almost the same as the experimental series. LEED excludes: 1) Jobs belonging to employees with non-valid IRD numbers. The experimental series only excludes records with zero IRD numbers. 2) About 5,000 jobs of Inland Revenue employees. (In the experimental EMS job series, we obtain the monthly total for Inland Revenue employees from a separate supply of aggregate EMS data.)
- QES: All employees on the employer's payroll. Includes: employees under 15 years, non-NZ residents, and people temporarily absent from work (eg sick, on leave, strike, or temporary lay-off) whether paid or not
- HLFS: Employed people aged 15+ years who worked for pay or profit for one or more hours, including self-employed and unpaid relatives working in a family business

Table 3: Design

Collection unit
- Experimental monthly filled-job series: Generally at the enterprise/firm level (ie business that filed the EMS)
- LEED: Generally at the firm level (ie business that filed the EMS)
- QES: Economically significant enterprises with a non-zero employee count indicator on Statistics NZ's Business Register and in surveyed industries
- HLFS: Households

Statistical unit
- Experimental monthly filled-job series: Geographic units
- LEED: Jobs and geographic units
- QES: Geographic units
- HLFS: Individuals and households

Data used for producing estimates
- Experimental monthly filled-job series: Incomplete data from EMS (or data delivered within 10 weeks). Expect at least 90% of total job records are available
- LEED: Almost complete data from EMS (or data delivered after months). Expect close to 100% of total job records are available
- QES: Sample survey of approximately 18,000 geographic units
- HLFS: Sample survey of approximately 16,000 households in private dwellings. All working-age people living in selected households are surveyed

Sampling errors
- Experimental monthly filled-job series: None
- LEED: None
- QES: Yes
- HLFS: Yes

Non-sampling errors
- Experimental monthly filled-job series: Errors in processing and estimation methods
- LEED: Errors in processing and estimation methods
- QES: Inaccurate respondent replies. Errors in processing
- HLFS: Inaccurate respondent replies. Errors in processing

Methodology

Preparing an input dataset

We apply the following processes to raw data from the EMS.

1. Account for EMS filing groups. In an EMS group, an enterprise or a firm reports on behalf of a group of employers. Job proportions (calculated from the employee count indicator) on the Business Register (BR) are used to model the job total of each enterprise in the group. (This method is known as pro-rating or apportioning.)

Note: The BR is a database of individual private and public sector businesses and organisations engaged in producing goods and services in New Zealand. Attributes such as industry, region, and size indicator, or the employee count indicator, are associated with each unit on the BR.

2. Transform enterprise data to data at the level of geographic units (or geos).

Note: A geo is the statistical unit for producing industry and region outputs.

Data must be modelled for geos belonging to multi-geo enterprises (or enterprises with more than one geo). About half of all jobs in New Zealand belong to multi-geo enterprises. We use job proportions from the BR to model the job total of each geo in a multi-geo enterprise.

Data from the EMS filed by the Ministry of Education (MoE) is dealt with by a separate process. Its EMS form includes all teachers in state sector schools, but the data is not at the school level. We model the job total of each school using job proportions calculated from quarterly MoE school payroll data used by the QES. (MoE school payroll data is at the school level.)

3. Obtain industry codes for geos from the BR. We use a snapshot of the BR, at 10 weeks after the end of the reference month, to obtain industry codes. Some employers are missing from the snapshot because of data errors and delays associated with updates to the BR. For these cases, we use industry descriptions reported by employers for ACC purposes to impute the industry code.
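The pro-rating (apportioning) used in steps 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the geo identifiers, and the job and employee counts are hypothetical, and the production system may handle rounding and missing BR values differently.

```python
def apportion_jobs(total_jobs: float, employee_counts: dict[str, float]) -> dict[str, float]:
    """Split an enterprise-level job total across units in proportion to BR employee counts."""
    total_count = sum(employee_counts.values())
    if total_count <= 0:
        raise ValueError("employee counts must sum to a positive number")
    return {
        unit: total_jobs * count / total_count
        for unit, count in employee_counts.items()
    }


# Hypothetical multi-geo enterprise: the EMS reports 120 jobs for the month,
# and the BR employee count indicator gives 60, 30, and 10 for its three geos.
geo_jobs = apportion_jobs(120, {"geo_a": 60, "geo_b": 30, "geo_c": 10})
print(geo_jobs)
```

The modelled geo totals always sum back to the enterprise total, so apportioning redistributes jobs across geos without changing the overall count.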

Fitting a linear regression model to the input dataset

The rate-up factor (or the coefficient) in the linear regression model is based on historical relationships between incomplete and complete jobs data from the EMS, and on the dimension of interest (ie industry). The model, in essence, works by calculating the average rate-up factor, by dimension of interest, between complete data at one year and incomplete data at 10 weeks. Each estimate of monthly filled jobs uses at least five years of historical data.

Note: Complete EMS data is that available a year after the end of the reference month. A lag of this length is sufficient for capturing almost 99 percent of all jobs.

The linear regression model is:

Y(t) = c X(t)

where
- Y(t) is the estimated filled jobs in month t
- X(t) is the count of filled jobs in month t based on data with a 10-week lag
- c is the coefficient (or rate-up factor) based on five or more years of historical data.
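The model Y(t) = c X(t) has no intercept, so one way to obtain c is ordinary least squares through the origin. The sketch below assumes that estimator. The paper does not state the exact fitting procedure, the function names are hypothetical, and the short history shown is made-up data standing in for the five or more years the method uses.

```python
def fit_rate_up_factor(incomplete: list[float], complete: list[float]) -> float:
    """Least-squares slope through the origin: c = sum(x*y) / sum(x*x)."""
    if len(incomplete) != len(complete) or not incomplete:
        raise ValueError("need two non-empty series of equal length")
    sum_xy = sum(x * y for x, y in zip(incomplete, complete))
    sum_xx = sum(x * x for x in incomplete)
    return sum_xy / sum_xx


def estimate_filled_jobs(rate_up_factor: float, incomplete_count: float) -> float:
    """Apply Y(t) = c * X(t) to the latest 10-week-lag count."""
    return rate_up_factor * incomplete_count


# Hypothetical history for one industry: counts at a 10-week lag (x)
# and the complete counts available a year later (y).
x_history = [900.0, 950.0, 1000.0]
y_history = [990.0, 1045.0, 1100.0]

c = fit_rate_up_factor(x_history, y_history)
latest_estimate = estimate_filled_jobs(c, 980.0)
print(round(c, 3), round(latest_estimate, 1))
```

In this made-up history the complete count is always 10 percent above the 10-week count, so the fitted factor is 1.1 and a 10-week count of 980 is rated up to about 1,078 jobs.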

Time series and revisions

Only estimates based on EMS data associated with a lag of 10 weeks (or about two months) will be published. Each monthly release adds an estimate for the latest reference month to the previously published series. We'll only revise an estimate for a past reference month if the revision is deemed to be significant.

Each estimate is actually revised 10 times by the regression model, for three reasons:
- The production cycle is monthly. We prepare a new input dataset after EMS data is delivered to Statistics NZ (around the 10th of each month).
- The first estimate is based on data with a 10-week lag.
- The actual total (or final value) is assumed to be available from data with a 12-month lag.

Quality indicators

Two quality indicators are available for estimates produced by the regression model:
- The 95 percent confidence interval for the estimated value Y(t) (ie there is a 95 percent probability that the actual value lies within the confidence interval).
- R-squared values, or the proportion of the total variation explained by the model.

A back-series of monthly filled jobs, by ANZSIC06 industry divisions and all industries combined, is in a spreadsheet included with the release. It includes values for the above indicators. Figure 1 shows data from the spreadsheet (covering reference months December 2006 to 2015).

Figure 1: Experimental filled jobs, monthly

The spreadsheet shows all estimates by ANZSIC06 industry divisions are within the 95 percent confidence interval and R-squared values are close to 1. This indicates the linear regression model fits the input dataset very well.

The spreadsheet also shows a comparison of estimated monthly filled jobs (ie based on incomplete EMS data and associated with an estimation lag of 10 weeks) and actual totals (based on complete data).
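The two quality indicators can be illustrated for a no-intercept model Y(t) = c X(t). The sketch below rests on several assumptions that the paper does not confirm: a least-squares fit through the origin, an uncentred R-squared (one common convention for models without an intercept), and an interval for a new value built with a normal quantile rather than a t quantile. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def quality_indicators(x_hist, y_hist, x_new, level=0.95):
    """Return (estimate, lower, upper, r_squared) for a no-intercept linear fit."""
    n = len(x_hist)
    sum_xx = sum(x * x for x in x_hist)
    c = sum(x * y for x, y in zip(x_hist, y_hist)) / sum_xx

    # Residual sum of squares and residual variance (one fitted parameter).
    ssr = sum((y - c * x) ** 2 for x, y in zip(x_hist, y_hist))
    residual_var = ssr / (n - 1)

    # Uncentred R-squared, an assumed convention for regression through the origin.
    r_squared = 1.0 - ssr / sum(y * y for y in y_hist)

    # Approximate interval for a new actual value at x_new (normal quantile assumed).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    std_error = sqrt(residual_var * (1.0 + x_new * x_new / sum_xx))
    estimate = c * x_new
    return estimate, estimate - z * std_error, estimate + z * std_error, r_squared


# Hypothetical history: 10-week-lag counts and the matching complete counts.
x_hist = [100.0, 200.0, 300.0, 400.0]
y_hist = [112.0, 218.0, 333.0, 438.0]
estimate, lower, upper, r_squared = quality_indicators(x_hist, y_hist, 500.0)
print(round(estimate, 1), round(lower, 1), round(upper, 1), round(r_squared, 5))
```

With only a handful of observations a t quantile would give a wider interval. The normal quantile is used here to keep the sketch within the standard library.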

In figure 1, there are two periods with a relatively large gap between the experimental series and the actual total. For reference months belonging to these two periods, we received a relatively low number of job records within 10 weeks. As a result, the regression model underestimated the totals. Two events may have affected the filing and processing of EMS tax forms in these periods:
- July 2007 to January 2009: KiwiSaver's introduction in July 2007
- April 2012 to November 2012: changes to student loan repayment tax codes introduced in April.

Assumptions

We made five assumptions in the estimation methodology for the experimental series.

1. Economic events affecting the labour market are reflected by the input dataset. This is expected given that at least 90 percent of the total number of job records in the EMS are delivered to Statistics NZ within 10 weeks after the end of the reference month.

2. The impact of late filing and partial processing of EMS forms is modelled by relationships in the historical data.

3. Volatility caused by late filers and partially processed EMS returns doesn't significantly bias the input data. We don't use any methods to reduce the bias in the input dataset (eg imputing job totals for geos belonging to late filers or rating up the raw job total of geos with partial jobs data).

4. Job proportions (from the BR) we use to model job counts for geos belonging to multi-geo enterprises are approximately the same over time. Job proportions derived from the BR, and belonging to enterprises in EMS groups or geos in multi-geo enterprises, are assumed to be about the same in each month between consecutive February months. The proportions are calculated from the employee count size indicator, reported at the geo level by a sample of businesses surveyed in February of each year. Job proportions derived from the latest quarterly MoE school payroll data (used by the QES) are assumed to be about the same for the month of interest.

5. EMS groups and multi-geo enterprise structures, used for modelling job counts, are still valid at the time we prepare the input dataset. Up-to-date information about the structures of EMS groups and multi-geo enterprises is not available in a timely manner. The input data for the regression model uses information:
- about (non-MoE) EMS group structures that existed 18 months before the reference month
- at the school level, from MoE payroll data, in the quarter before the reference month
- from the BR, which has a two-month lag associated with updates. (Although the BR is updated monthly, it can take up to two months before some changes are reflected.)

Limitations

The experimental series has four limitations.

1. The experimental series is a non-standard employment statistic. A monthly reference period is a relatively long one for a timely sub-annual labour market measure. Current international standards refer to using a short reference period to measure currently employed persons. Statistics NZ's current official labour market measures (LEED, QES, and HLFS) are based on data collected for a short reference period.

Job levels from the experimental series are higher than those from LEED and the QES because of different reference periods. The reference period used by the experimental series includes the following jobs excluded by LEED and the QES:
- short-term jobs that exist outside the reference periods used by LEED and the QES
- jobs that ended before the start of the reference periods used by LEED and the QES
- jobs that started after the end of the reference periods used by LEED and the QES.

The experimental series covers a larger number of short-term jobs than LEED and the QES.

2. The regression may not be able to deal with shocks, or large unexpected or one-off events that significantly affect EMS filing or Inland Revenue's data processing.

3. A modelled monthly earnings series, by industry and region, isn't available because of estimation issues. The apportioning or pro-rating method based on job proportions from the BR is suitable for modelling job counts at the geo level in the input dataset. However, the BR does not have a variable that is highly correlated with earnings to enable the apportioning method to model monthly earnings at the geo level. Using job proportions isn't a good approach because it incorrectly assumes there are no differences in average earnings by industry and region. For example, Auckland and Wellington have significant wage premiums over other regions. Pro-rating earnings based solely on job counts results in an under-allocation of earnings to Auckland and Wellington, and an over-allocation to other regions. The ideal approach for modelling earnings at the geo level is to use job-level data.

4. The current system is set up to handle data associated with lags of 10 weeks and 12 months. We can't evaluate revisions due to late filing by employers or delayed processing by Inland Revenue, because data associated with lags of 3 to 11 months are not included in the input dataset. For example, the system is unable to assess the impact of job records that are delivered between 10 weeks and three months after the end of the reference month.

What we can evaluate:
- Revisions to estimates based on data with a 10-week lag. These are due to corrections Inland Revenue makes to data that had been delivered, and retrospective updates on the BR. We calculate these revisions after the regression model is fitted to the data. They are expected to be small.
- The difference between the estimate based on data with a 10-week lag and the final value. This can be calculated about a year after the end of the reference month.
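The earnings limitation can be made concrete with a small worked example. All figures below are hypothetical and chosen only to show the direction of the bias: an enterprise with the same number of jobs in Auckland and in another region, but with higher average earnings in Auckland.

```python
def pro_rate(total: float, weights: dict[str, float]) -> dict[str, float]:
    """Split a total across units in proportion to the given weights."""
    weight_sum = sum(weights.values())
    return {unit: total * weight / weight_sum for unit, weight in weights.items()}


# Hypothetical enterprise with two geos (all numbers are made up).
jobs = {"Auckland": 50, "Other region": 50}
average_earnings = {"Auckland": 6000.0, "Other region": 4000.0}

# Earnings actually paid in each geo, and the enterprise total reported on the EMS.
actual = {geo: jobs[geo] * average_earnings[geo] for geo in jobs}
enterprise_total = sum(actual.values())

# Pro-rating by job counts implicitly gives every job the same average earnings.
allocated = pro_rate(enterprise_total, jobs)
allocation_error = {geo: allocated[geo] - actual[geo] for geo in jobs}
print(allocation_error)
```

Job-count weights assign each geo half of the 500,000 total, so Auckland is under-allocated by 50,000 and the other region is over-allocated by the same amount.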