Does Employer Learning Vary by Occupation? [Job Market Paper]

Size: px
Start display at page:

Download "Does Employer Learning Vary by Occupation? [Job Market Paper]"

Transcription

1 Does Employer Learning Vary by Occupation? [Job Market Paper] Hani Mansour University of California at Santa Barbara November, 2008 Abstract In this paper I provide evidence that initial occupational assignments a ect the ability revelation of young workers in the labor market. Di erences in employer learning about worker s ability carry two implications: First, the growth in wage dispersion with experience will di er across initial occupations; Second, the growth in the coe cient of a variable that employers initially do not observe, such as the Armed Forces Quali cation Test (AFQT) score, will also vary across occupations. Empirically, I nd that occupations with high growth in the variance of residual wages over the rst ten years of the worker s career are also the occupations with high growth in the AFQT coe cient. In contrast, the results for occupations with low growth in the variance of residual wages suggest that employers have learned about the worker s AFQT score at the time of hire. I also investigate the informational role of race across occupations. The coe cient on the Black variable in occupations with low growth in the residual variance is positive and decreases over time, but the racial gap widens over time in occupations with high growth in the residual variance. These results have implications for di erent topics in labor economics, such as occupational mobility, and the signaling value of education. JEL Codes: J24, J31, J71, J62 Keywords: Wage Dynamics, Occupational Choice, Statistical Discrimination. I am grateful to Peter Kuhn, Kelly Bedard, Olivier Deschênes, Peter Rupert, Javier Birchenall, and Catherine Weinberger for helpful comments and continuous support.

2 1 Introduction Employers often hire new inexperienced workers without observing their full productivity. Instead, they assess a worker s value to the rm based on information they receive from job interviews, resumes, and recommendations (Spence, 1973). Altonji and Pierret (2001) [hereinafter AP] show that if employers learn about the worker s productivity, the coe cient of an ability correlate which is initially unobserved by employers, such as the Armed Forces Quali cation Test (AFQT) score, increases with experience. AP s analysis, and most other models that use a learning framework, assume that learning is independent of job assignment. It is likely, however, that the amount of initial information workers bring to the market varies across occupations and that employers learn about the productivity of their workers at di erent rates. 1 The main hypothesis in this paper is that initial occupational assignments a ect the ability revelation of young workers in the labor market. I test this hypothesis by combining AP s insight with another implication of employer learning, namely that the distribution of wages becomes more dispersed as a cohort of workers gains experience (Neal and Rosen, 1998). Combining these two observations, we would expect the AFQT coe cient to grow the most in occupations with high growth in wage dispersion. 2 The main contribution of the paper is to provide evidence that initial occupational as- 1 An exception is Altonji (2005) who presents a framework in which the rate of employer learning depends on the skill level of the job to show how, in this environment, statistical discrimination at the time of hire a ects employment rates and wage growth, but he does not test the model empirically. In addition, Antonovics and Golan (2007), motivate the idea of "job shopping" within rms using a model in which learning di ers across jobs. 2 Matching models, such as Jovanovic s (1979a) model, also generate increased wage dispersion with experience. Although I adopt a learning framework, the main point in this paper is to show that the ow of information (whether about worker s ability or quality of match) di ers across initial occupations. Miller (1984) discusses an environment where workers learn about the quality of their match at di erent rates across occupations. 1

3 signments are associated with di erent learning parameters. The results have important implications for various models in labor economics that use a learning framework, but assume that learning is independent of job assignment. These include models of statistical discrimination, occupational mobility, wage dynamics within rms, occupational wage differences, earnings inequality, and labor market signaling. 3 The model in the paper follows that of AP. 4 Identical employers form expectations about the worker s productivity, and in each period update their initial belief based on a noisy signal of output produced by the worker. Di erently from AP, the technology I specify assumes that the precision of the signal of output di ers across occupations. If these di erences are empirically signi cant, the growth in the AFQT coe cient and the growth in wage dispersion will vary across initial occupations. 5;6 Empirically, I use data from the Current Population Survey (CPS) Outgoing Rotation Group (ORG) for the period to calculate the two-digit occupation growth in wage dispersion for workers with 0-5 and 6-10 years of experience. 7 I then, merge the estimated growth rates with the rst occupation that workers report in the 1979 National 3 Some of the relevant references are Topel and Ward (1992), Gibbons et. al (2005), Gibbons and Waldman (1999; 2006), Lemieux (2006), Lange (2007), among others. 4 AP s learning model closely follows that of Farber and Gibbons (1996) [hereinafter FB]. One important di erence between the two is that FB estimate a wage level regression while AP s dependent variable is the logarithm of wages. 5 The ow of information about the worker is public and observed by all labor market participants. The debate in the literature wether information ows symmetrically across employers has not been resolved yet. Recent papers such as Kahn (2007) and Schönberg (2007) use a learning model to test whether information between employers is symmetric. Although they use the same data set they reach opposite conclusions. 6 Throughout the analysis, I do not model initial occupational choice and assume that it is associated with a xed learning parameter for the worker s entire career. In section 5, I show that the model s predictions and empirical results are consistent with an assignment rule in which higher-expected-ability workers choose to work in occupations where output is sensitive to individual ability. 7 Ideally, I would want to follow the growth in the variance of residual wages for a cohort of workers who started their career in the same occupation over time, rather than comparing the variance of two cohorts within an occupation. This is not feasible in the NLSY79 because of sample size limitations. In section 6, I perform robustness checks to address this concern. 2

4 Longitudinal Survey of Youth (NLSY79) and compare the growth in the AFQT coe cient across occupations with di erent growth rates in wage dispersion. 8 My main nding is that occupations with high growth in the variance of residual wages over the rst 10 years are also the occupations with high growth in the AFQT coe cient over the same period. This provides further support for the notion that the market learns some of the worker s characteristics, which are correlated with AFQT score, over the rst part of the career. It also suggests that initial occupational sorting may have important implications for learning. I call occupations with high levels of growth in wage dispersion and in the AFQT coe cient "high learning" occupations. As for "low learning" occupations, I nd that the AFQT coe cient is large and does not grow over time, and that they are associated with an initial high level of residual variance. This indicates that, in these occupations, the market has learned the worker s AFQT score at the start of the career. I apply my results to test for statistical discrimination, and show that the evolution of the Black coe cient di ers across high and low learning occupations. The rest of the paper proceeds as follows. Section 2 presents a learning model which incorporates the idea of di erential learning rates across jobs. Section 3 describes the data, and section 4 includes the empirical analysis and results. Section 5 speci es an initial occupational assignment rule consistent with the model, and section 6 includes robustness checks. Finally, section 7 concludes and discusses future work. 8 Focusing on occupation-speci c growth rates in the variance of residual wages compared to looking at levels of residual wages is important since it accounts for the fact that employers in some occupations might have richer information about worker s ability, that is not related to learning. It also accounts for the fact that levels of residual wages might be a mechanical consequence of how broadly or narrowly the occupation is de ned and not of employer learning. 3

5 2 Model and Predictions 2.1 A Model of Employer Learning The model builds on that of AP. Worker i enters the labor market with skill level, h, which is decomposed into four components: h i = as i + bq i + z i + i (1) In (1), s represents variables observed by the employer and the econometrician (such as education); q is a productivity component observed by the employer but is not in the data (such as information obtained during a recruitment interview); z is a variable observed and used only by the econometrician (such as AFQT score); and is a productivity component that neither the employer nor the econometrician observe. The focus in this paper is only on time-invariant productivity components, such as innate ability, and abstracts from timevariant components acquired, for example, from on-the-job training, although the two are likely to be correlated. Because employers do not observe z or, they form expectations conditional on s and q. In what follows, I suppress the index i. Assuming that these expectations are linear in both s and q, de ne: z = E(zjs; q) + = 1 q + 2 s + (2) = E(js; q) + " = 1 s + 2 q + " 4

6 By de nition of an expectation, both and " are uncorrelated with s and q and have mean zero. Calculating the expected value of the skill level in (1) conditional on the information available to employers upon entry to the labor market, and substituting (2) in (1) yields: h e = E(hjs; q) = (a )s + (b )q + ( + ") (3) The term ( + ") in (3) represents the initial error in the employer s expectation about the worker s ability and is uncorrelated with s and q. There are J jobs (or occupations) in the economy. 9 The output in each job is a function of the worker s skill level, h, and the experience pro le of productivity, G j (t). The worker s output if assigned to job j at time t is: y jt = exp(h + G j (t) + jt ), j = 1; :::; J (4) G j (t) is assumed to be independent from s; q; z;or. The error term jt is job-speci c and represents a productivity shock drawn from a normal distribution with mean 0 and variance 2 j. 10 Denote Y jt = log(y jt ) and de ne x jt = [Y jt (as+bq) G j (t)] = z++ jt to be a noisy signal of productivity that employers can compute at the end of each period. This signal is public and is observed by all employers. Di erently from previous studies the precision of the signal di ers across occupations. 11 Since employers form expectations about z and based 9 Throughout the paper I use the terms job type and occupation interchangeably. The empirical analysis will use occupations to test for di erential employer learning. 10 The paper s objective is to provide empirical evidence that learning rates di er across occupations. Assuming that the variance of the productivity shock di ers across occupations is one possible mechanism that generates such patterns. Other mechanisms, such as the one I present in section 5, also generate such di erences. 11 In the empirical analysis I use the two-digit occupation codes to look at learning across jobs. This implicitly assume that the i.i.d shock,, is the same across all the individual occupations under each code 5

7 on s and q, observing x jt is equivalent to observing b jt = x jt E(hjs; q) = +"+ jt, which is the sum of the initial error upon entry to the labor market and the productivity shock. The worker s output history in occupation j is summarized by the vector B jt = fb j1 ; b j2 ; :::; b jt g. Wages are set by spot-market contracting, and at the end of each period workers are paid their expected output. That is, W jt = E[y jt js; q; B jt ] exp( jt ) where exp( jt ) is a classical measurement error unrelated to other variables in the model. Using equations (1), (2), and (4) we can derive the wage process: W jt = e G j(t) e s(a ) e q(b ) e E[ +"jb jt] E(e jt ) E(e v jt ) e jt (5) where jt is the di erence between [ + "] and E[ + "jb jt ], and is assumed to be independent of s,q, or B jt. It follows that the log wage process in occupation j is: w jt = G j(t) + (a )s + (b )q + E[ + "jb jt ] + jt (6) where w jt = log(w jt ) and G j(t) = G j (t)+log(e(e jt ))+log(e(e v jt )). The term E[ +"jb jt ] re ects the idea that the wage in occupation j changes because new information about worker s ability is extracted. For the purpose of this paper, I de ne an occupation as the entire expected career track associated with choosing a given initial occupation, so that a worker s entire career is associated with a xed learning parameter. Initially, I do not model initial occupational sorting, but in section 5 I propose an assignment rule to initial occupations which is consistent with the ndings of the paper. and does not take into account the fact that within-occupation job heterogeneity might be substantial. 6

8 2.2 Estimation Framework Recall that only variables in s and z are observed in the data and can be used for the empirical analysis. Assuming that s and z are scalars, the log wage process from equation (6) can be summarized in the following regression: w jt = sjt s + zjt z + G j(t), t = 0; :::; T (7) The main insight of AP is that if employers learn and statistically discriminate, the coe cient on a correlate of productivity which is not observed by the employer upon hiring, such as AFQT score, should increase over time, while, using the same logic, the coe cient on an easily observed variable, such as education, should decrease over time. The hypothesis in regard to whether employers learn can be tested within each occupation. Proposition 1 summarizes this result and the proof can be found in AP. 12 Proposition 1 Under the assumptions of the model, the regression coe cient zjt is nondecreasing in t, and the regression coe cient sjt is nonincreasing in t. If learning varies by occupation, the evolution of zjt will di er across occupations. Specifically, zjt will grow more with experience in occupations where employers extract more information about the worker s ability. We can estimate equation (7) by occupation and compare the evolution of the hardto-observe variable over time in order to test for di erences in employer learning. However, because of sample size limitations, I use another insight from the model to group occupations 12 The wage regressions in AP control for initial occupations xed e ects, but do not analyze the wage dynamics within these occupations. The proof of proposition 1 is not a ected by introducing di erent informational structure across occupations. 7

9 based on the growth in the variance of residual wages, and then test for di erences in the evolution of hard-to-observe variables over time across these groups. 2.3 Variance of Residual Wages A straightforward implication of employer learning is that the distribution of wages of a cohort of workers becomes more dispersed as they gain experience (Neal and Rosen, 1998). If employer learning di ers across occupations then we should expect di erent growth rates in wage dispersion with experience. 13 The use of the growth in the variance of residual wages as a measure of employer learning is important since level di erences in the variance of residual wages across occupations might re ect, among other things, di erences in the amount of information employers have on their employees at the time of hire. It also accounts for the fact that levels of residual wages might be a mechanical consequence of how broadly or narrowly the occupation is de ned and not of employer learning. To clarify the idea consider the following wage equation: w ijt = x ijt r jt + ijt (8) where w ijt is the natural logarithm of the hourly wage rate for person i in occupation j with t years of labor market experience; x ijt is a vector of observable skills such as education; r jt 13 Theoretically employer learning generates growth in wage dispersion. However, there are other reasons which can explain growth in wage dispersion within and across occupations. For example, the growth in the wage dispersion in traditionally unionized occupations might be small even if employer learning rates are high (Lemieux, 2006). Ranking occupations based on growth in wage dispersion o ers a simple way to analyze proposition 1 where the learning hypothesis is tested directly. 8

10 is the price of observable skills in occupation j; and ijt is an occupation-speci c regression residual. The regression residual can be decomposed into two parts (Lemieux, 2006): ijt = p jt a ijt + ijt (9) where p jt is the price of unobserved skill in occupation j, a ijt is unobserved skill, and ijt is measurement error. Calculating the variance of the residual yields: V ar( ijt ) = p 2 jtv ar(a ijt ) + V ar( ijt ) (10) An increase in V ar( ijt ) with experience can be due to an increase in p jt, an increase in V ar(a ijt ), and an increase in V ar( ijt ). Focusing on low levels of experience, we can safely assume that the price of unobserved ability in occupation j does not change with experience and that the variance of the measurement error does not grow systematically with experience, so that p jt = p jr and V ar( ijt ) = V ar( ijr ) for t 6= r. Under these assumptions, the growth in residual variance between t and r years of experience in occupation j where r > t is: V ar( ijr ) V ar( ijt ) = p 2 j[v ar(a ijr ) V ar(a ijt )] (11) Equation (11) implies that even if the price of unobserved skill varies across occupations (for example, p j > p k for j 6= k), observing di erent growth rates in the variance of residual wage across occupations can be attributed, under reasonable set of assumptions, to di erences in the growth of the variance of unobserved ability across occupations. Thus, a faster growth in the variance of residual wages is evidence for faster updating of the employer s beliefs about 9

11 the worker s productive ability. This insight o ers a simple way to group occupations based on a consistent measure of learning, which I will use to test the results of proposition 1. 3 Data Most of the analysis in this paper uses data drawn from the NLSY79 covering the period. The NLSY79 is a nationally representative sample of 12,686 men and women who were between the ages of 14 and 22 when they were rst interviewed in The survey was conducted annually until 1994 and since then the participants are interviewed on a biennial basis. The NLSY79 has been used in many of the studies of employer learning because of two main features: detailed information on work experience allows the calculation of actual rather than potential labor market experience, and most importantly, the sample includes some variables that are correlated with ability but are not observed (or used) by the employer, such as AFQT scores and father s education. The work history le in the NLSY79 provides information on the hours worked in each week of the year. In order to calculate actual labor market experience, I follow AP s methodology and accumulate the number of weeks in which the worker reports to have worked more than 30 hours and divide it by 50. I focus only on jobs after an individual has left school for the rst time and drop all observations before that. The month and year when a person left school for the rst time is directly reported by the respondents and is used as the entry date to the labor market. An individual can go back to school and remain in the sample. To construct my sample, I include only white or black men which leaves me with 5,403 individuals. I drop 2,256 respondents who left school before 1979 because information on 10

12 their rst occupation and actual labor market experience is hard to determine. I drop 362 respondents who never report to have left school and 8 individuals who do not report the month they left school for the rst time. 5 respondents who never completed more than 8 years of schooling are excluded from the sample, and 101 individuals who do not have a valid AFQT score are dropped. I nally drop 709 additional individuals who do not report their rst occupation. Aside from initial occupation, individuals in the NLSY79 report information on all the jobs they held between two interviews. I only use information from the job they hold at the time of the interview (CPS item). I exclude jobs without pay, jobs at home, and military jobs. The wage measure in the sample is the hourly wage rate of pay at the most recent job from the CPS section of the NLSY79. I use CPS de ators to calculate real wages in 1984 prices and I drop wages below $1 or above $100. My nal sample includes 22,409 observations on 1,962 distinct individuals. Since NLSY79 participants took the AFQT at di erent ages, I standardize the AFQT scores to have mean zero and a standard deviation of 1. I do this by subtracting the mean score for a person of that age group and divide it by the standard deviation for that age group. Table 1 provides summary statistics for the main variables I use in the empirical analysis. In order to calculate growth in residual variance by occupation, I use information from the ORG CPS. The optimal way to calculate growth in the residual variance and attribute it to employer learning would be to track an individual at di erent points in the experience pro le. Since a large panel that allows analysis within occupations is not available, I use a cohort analysis in which the residual variance of workers with 0-5 years of experience in 11

13 an occupation is compared to workers with 6-10 years of experience in the same occupation and year. The ORG data set is adequate for the purposes of this paper because it contains information about the hourly wage rate, the same measure I use in the NLSY79 sample. Moreover, as Lemieux (2006) shows, the wage information in the ORG is less noisy compared to the March CPS because it measures directly the hourly wage of workers paid by the hour. I pool data from and keep in the sample only white or black men who are years old and have 0-10 years of potential experience. 14 I exclude from the sample self employed and full or part time students and individuals with below 8 years of schooling. I use the hourly wage rate for workers paid by the hour (about 60 percent of the sample) and calculate an hourly wage rate for the other workers by dividing their weekly earnings by their usual weekly hours. Real wages are calculated in 1984 prices using the CPS de ator. I drop wages that are below $1 and above $100, and adjust top-coded earnings by a factor of 1.4 (Lemieux, 2006). Since actual experience is not reported in the ORG, I calculate a measure of potential experience (age-education-6). As it is well known, in 1992 the U. S. Bureau of Labor Statistics changed the educational attainment question in the CPS from one that was based on years spent in school to one that focuses on the highest degree received. I use the method proposed by Jaeger (1997; 2003) to create a consistent measure of years of schooling (to calculate potential experience) and to group individuals in consistent educational categories. Table 2 presents some summary statistics from the ORG sample. Panel A reports the sample share of four educational groups in the period. As can be seen, the 14 I analyzed other shorter periods of time such as and The results regarding the growth in the variance of wage residuals within occupations are not sensitive to the time period. In order to increase my sample size I choose to pool the entire period of

14 distribution across the sample period remains relatively constant with a steady increase throughout the period in the share of college graduates and those with some college. Panel B reports the sample share of four experience groups: workers with less than ve years of experience and workers with 6-10 years of experience. The distribution across experience groups is similar across the sample years with a small increase in the share of workers with 6-10 years of experience. This suggests that compositional e ects over the sample period are small. In a separate analysis which I do not report here, I compute the distribution of occupations over the sample period and nd no evidence of substantial changes. Moreover, I analyzed the distribution of education by occupation-experience cells and found no evidence for signi cant changes in the education level within an occupation over the experience pro le. 4 Empirical Speci cations and Results 4.1 Ranking Occupations As mentioned earlier, the growth in the wage dispersion of workers over time can be attributed to employer learning. Ideally, in order to rank occupations based on employer learning we would want to follow the growth in wage dispersion of a cohort of workers who start their career in the same occupation, over time. Instead, I calculate the growth in the variance of residual wages for two experience-cohorts within occupation in a given period of time. Although this growth measure does not account for the possibility that occupational mobility might di er across initial occupations, it has two main advantages. First, this 13

15 growth measure does not confound the increase in wage dispersion due to learning with the increase in wage dispersion due to an increase in the price of unobserved skill because of skill-biased technological change (Katz and Murphy, 1992). Second, the growth measure will not be contaminated by potential composition e ects due to large changes in the education and experience of the U.S. population (Lemieux, 2006; Card and DiNardo, 2002). In section 6, I address the concern regarding occupational mobility, and show that it is not what derives my results. I start the empirical analysis by computing the variance of the residual wage by occupationexperience cells using ORG data. The residual wage is computed from a exible regression of log hourly wage on an unrestricted set of dummies for age and years of schooling. I also include interaction terms between six schooling dummies and a quartic in age. 15 To capture any year-speci c di erences I include a full set of year dummies. I also include a set of occupation dummies to capture di erences in the level of wage residuals between occupations. I weight the regression using weights provided by the CPS. To calculate the variance in the residual wages, I use the digit occupation codes to create 45 occupation dummies and interact them with 2 experience cells, one for workers with 0-5 years of experience, and the other for workers with 6-10 years of experience. The coe cients of a regression of the squared residuals from the wage regression on the occupation-experience cells (90 coe cients) are the coe cients of interest. I calculate the growth in the variance between these two experience categories and rank occupations based on this measure. I focus on the rst 10 years of the worker s career because employer learning 15 The six schooling dummies I use to interact with a quartic in age are 9-10, 11, 12, 13-15, 16, and 17+. I do not include an unrestricted set of age-education dummies because many of the cells are empty. I do not include other typical demographic variables like marital status and race in order to focus on direct measures of skill. 14

16 is likely to happen at the beginning of the career, so that the growth in the variance of wage residuals at later stages in the experience pro le might re ect other factors rather than employer learning (Lange, 2007). 16 I test whether the growth in the variances are statistically di erent from zero (reported in Table 3). I also apply an F-test and reject the hypothesis that the occupation speci c residual variances are all equal to each other. I want to emphasize, before presenting the results, that ranking occupations using growth in the variance of residual wages provides an indirect way to group occupations based on employer learning, but one can suggest alternative explanations for these di erences (such as di erent types of contracts across occupations, and di erential on-the-job training). Eventually, however, these occupation-speci c values are going to be matched with NLSY data where I test directly the learning hypothesis. Table 3 provides a list of the occupations ranked by the growth in the variance of residual wages, and lists the average years of schooling completed by occupation. A large growth rate indicates that the amount of learning that takes place in the rst 10 years in these occupations is large, while employers in occupations with low growth in the residual variance do not learn much about the worker in the same time period. Occupations with high growth in the residual variance include health diagnosing occupations, sales workers, and administrators while occupations with low growth in the residual variance include engineers, teachers, and farm operators. Notice that the ranking of occupations based on the growth in residual variance is di erent from ranking occupations based on mean education. For example, physicians 16 Calculating the growth in the variance of wage residuals between workers with 0-5 years of experience and workers with 6-10 years of experience allows more precision in the estimation because of larger sample size. I also experimented comparing the variances of workers with 0-2 years of experience and 3-5 years of experience. The results are very similar to the ones I report here (but are estimated less precisely) and the ranking of occupations is robust to this choice. 15

17 have the highest growth in the variance while college professors experience no growth, but both have high mean education. Nonetheless, some occupations with low or no growth in the residual variance appear to have high mean education, perhaps re ecting the fact that employers in these occupations have more information about their workers abilities upon their hire. 4.2 Evidence for Employer Learning In this section, I reproduce AP s results using the updated sample. As discussed in section 2, the idea of testing for employer learning hinges on the availability of a variable which is (positively) correlated with ability, is unobserved by the employer at the time of hiring but is available to the econometrician. The empirical speci cation that stems from equation (7) when learning is assumed to independent of occupation is: log W it = s0 s i + z0 z i + T i + s10 (s i T i 10 ) + z10(z i T i 10 ) + it (12) Throughout the empirical analysis education (s) and AFQT (z) are interacted with experience divided by 10, so that the coe cients on the interaction terms represent the change in the wage slope between T = 0 and T = 10. Table 4 reproduces AP s results using the original sample period of , and using the extended sample for the period. All the reported standard errors are based on White-Huber standard errors. In the regression, experience is modeled with a cubic polynomial, and I control for the 1980 two-digit occupation codes at the rst job, urban residence, race, year xed e ects, and a cubic trend interacted with education and race. The base year for the time trend is These interactions with 16

18 the time trend control for economywide changes in the wage structure during the sample period (Katz and Murphy, 1992). It is important to mention that the NLSY79 sample used in the analysis is very similar to the one constructed by AP with two exceptions. First, the sample covers a longer period, from , while in AP s sample 1992 is the last year used. Second, individuals who left school before 1979 are excluded from the sample while AP includes them. These workers are excluded because their work history before 1979 is not complete and it is hard to determine their rst occupation. Although this restriction weakens the results when compared to the ones reported by AP, the main picture regarding employer learning is not a ected. 17 Column 1 in Table 4 reports the results of estimating equation (12) when the sample covers the period and the interaction of AFQT with experience is excluded. The coe cient on education is 0.06 and statistically signi cant and does not change over time. As in other studies that have used AFQT scores in wage regressions, it enters the equation with a positive (0.03) and statistically signi cant coe cient (Neal and Johnson, 1996). Column 2 uses the same restricted sample but includes an interaction term between AFQT and experience. As predicted by the model, the coe cient on AFQT when T = 0 becomes zero and statistically insigni cant while the coe cient on the interaction term enters the regression with a large coe cient. At T = 10 the coe cient on AFQT is 0.07 and statistically signi cant. This indicates that employers do not observe the worker s AFQT score upon entry to the labor market but do learn about it over time. Extending the sample to cover the period does not change the main results. As reported in column 3 and 4 in 17 In a separate analysis using only the sample I include workers who left school before The results con rm quite closely the ones reported by AP. These results can be provided upon request. 17

19 Table 4, the magnitude of the AFQT-experience coe cient becomes smaller (0.05) but the notion of employer learning remains unchanged. 4.3 Occupation-Speci c Employer Learning To test whether employer learning di ers across occupations, I match the rst occupation of each worker in the NLSY sample with the appropriate growth in the variance of residual wages of that occupation. 18 To start the analysis, I group occupations in terciles based on the growth measure: The rst group includes occupations where the growth in the variance is greater or equal to 0.036, the second includes occupations where the growth in the variance is between and 0.036, and the third includes occupations where the growth in the variance is below (and is mostly not statistically signi cant). If the growth measure re ects (at least partially) employer learning, the growth in the AFQT coe cient should be increasing with the variance of residual wages. After 10 years of experience, we expect the highest growth in the AFQT coe cient in occupations with the highest growth rates. In contrast, we expect no growth in the AFQT coe cient in occupations with zero growth rates in the variance of residual wages. I will report results from running separate regressions for each of these groups, and then proceed to report results using di erent speci cations. Notice that all models include occupation xed e ects, which among other things, also capture level di erences in the residual variance upon the worker s hire. Table 5 reports the results from estimating equation (12) for the three separate groups 18 About 80 percent of the workers in my sample change their initial 2-digit occupation after 10 years in the workforce. However, it may be inappropriate to consider only workers who never switch occupations. For example, an entry level sales position in a rm might be the starting point for many possible career paths in the company, and thus we would not want to restrict the sample to workers who stayed in sales. The real interest of the paper is in the entire menu of career paths that are associated with di erent initial occupational assignments. 18

20 I describe above. The results on the AFQT variable (at T = 0) suggest that while it is very small and statistically insigni cant in occupations where the growth in variance is high, it increases monotonically as the growth in the variance decreases. The coe cient in occupations with intermediate growth rates is (0.009) and grows to (0.024) in occupations with no growth in variance (Columns 2 and 3, respectively). On the other hand, the coe cient on the AFQT-experience interaction is monotonically increasing with the growth in the residual variance. It is as high as (0.012) in occupations with the highest growth rates and is not signi cantly di erent from zero in occupations with low growth in the residual variance. The coe cient on the interaction term for occupations with intermediate growth rates is (0.01). I call occupations with high growth in the variance of residual wages and high growth in the AFQT coe cient "high learning" occupations. Occupations with low growth rates in both the variance and the AFQT coe cient are "low learning" occupations, and those in between are "intermediate learning" occupations. Of course, occupations can be low learning for two di erent reasons: First, it might be the case that the market has already learned the worker s AFQT at the start of the career; Second, it might mean that the market has not learned the worker s AFQT even after 10 years into the career. The large coe cient on AFQT at T = 0 in column 3 and the large level of residual variance for workers with 0-5 years of experience in these occupations (Column 1, Table 3) support the rst scenario. It indicates that the worker s AFQT in low learning occupations is revealed immediately, or soon after their hire. This is plausible if employers in these occupations have access to richer information about workers skills upon their hire (e.g. more information in q) which enables them to correctly estimate the worker s AFQT score. 19

21 It is also interesting to look at the coe cient on the black dummy in the regressions across the di erent groups. Table 5 shows that the coe cient on the black dummy in high learning occupations is negative and large, but is positive for workers in low learning occupations (Column 3). 19 A nal point that deserves attention is the behavior of the education variable over time. According to proposition 1, if education is used by the employer to assess the average ability of the worker upon hiring and employers learn, then the coe cient on education should be nonincreasing over time. In fact, the coe cient on the educationexperience interaction in high and intermediate learning occupations is negative (but small and statistically insigni cant) while the coe cient on the education-experience interaction in low learning occupations is positive (but small and statistically insigni cant). Table 6 reports the results from a regression where the variables are directly interacted with dummy variables based on the growth in variance (with similar classi cations as before). The results in Column 1 are based on estimating equation (12) with the addition of two interaction terms of the AFQT-experience variable with high and intermediate growth in variance dummies (the omitted group is occupations where the growth is below 0.010). Column 2 adds interactions of the AFQT variable (at T = 0) with the variance dummy variables and Column 3 is based on a fully interacted model, where both education and AFQT are interacted with the growth in the variance dummy variables. The results are consistent with di erential learning. Notice that in all models, the coe cient on the interaction of AFQT-experience with high variance growth is large and statistically signi cant, while the coe cient on the interaction with intermediate growth in variance is small (0.038), in column 19 Arcidiacono, Bayer, and Hizmo (2008) report similar results on the black dummy for high school graduates and college graduates and conclude that black college graduates do not face statistical discrimination in the labor market. 20

22 1, and decreases and becomes statistically insigni cant in column 3. The coe cients on the interaction terms with high and intermediate growth in variance are statistically di erent from each other in all models. Finally, although the patterns on the AFQT at T = 0 are not estimated precisely, the coe cients on AFQT when interacted with the high-variance growth dummy are smaller that the coe cients on the AFQT when interacted with the intermediate variance growth dummy. In Table 7, instead of classifying occupations into groups based on the growth in the residual variance, I interact the variables in the regression with the raw growth measure. To simplify the comparison, column 1 repeats the results of estimating equation (12) when learning is assumed to be similar across occupations. Column 2 adds an interaction term between the AFQT-experience interaction and the growth in variance, and column 3 adds an interaction of AFQT (at T = 0) with the growth in variance. Finally, column 4 adds interactions of education (both at T = 0 and at T = 10) with the growth in variance. In all models (column 2-4) the coe cient on the AFQT-experience-variance growth variable is large, positive, and statistically signi cant, suggesting that there is a monotonic relationship between the growth in variance and the growth of the AFQT coe cient over time. 4.4 Do Employers Discriminate on the Basis of Race? An important insight of AP is that we can use learning models to test for statistical discrimination based on race. Although the focus of this paper is di erent, it is interesting to show how the coe cient on the race variable evolves over time in occupations with di erent patterns of employer learning. The race variable can be treated as an s variable or a z variable. 21

23 AP show that if there is a negative correlation between race and skill, and employers use it as a source of information about workers skills, the coe cient on the black variable should be negative at T = 0. However, if employers initially statistically discriminate against black workers but learn over time, we would expect the racial gap to shrink, that is the coe cient on race to rise over time (decrease in absolute value). If employers, however, do not discriminate on the basis of race, the race variable can be treated as a z variable. In this case, if race is negatively correlated with productivity and employers learn, the race gap will widen as workers accumulate experience. In the rst and second column of Table 8 I use the full sample to reproduce AP s results. The rst column includes interaction terms of education and the Black dummy with experience. The coe cient on the Black dummy at T = 0 is (0.011) and the coe cient on the black dummy when interacted with experience is (0.012). Adding the AFQTexperience interaction to the regression changes the coe cients to (0.012) at T = 0 and the coe cient on the interaction term to (0.014) (columns 2). The results are similar to AP s, and suggest that employers use race as a source of information upon the worker s hire but that they do not statistically discriminate based on race because the race gap widens over time, rather than shrinking. Does this conclusion hold across occupations with di erent learning rates? Columns (3)- (5) report the results of a similar regression that is used in column (2) for high, intermediate and low learning occupations. The results in column (3) and (4) seem to be in line with the results of the full sample. For high learning occupations, including the Black-experience interaction does not reduce the coe cient on the Black dummy at T = 0. It is (0.018), and the racial gap widens slightly after 10 years. For intermediate learning occupations, the 22

24 coe cient on the Black dummy is small (0.016) at T = 0, and the race gap widens over time to about According to AP s interpretation, although the Black coe cient at T = 0 in both groups of occupations is di erent, the fact that the racial gap widens over time indicate that employers treat race as a z variable and that they do not statistically discriminate Black workers. The results in column (5) portray a di erent scenario. The coe cient on the Black dummy at T = 0 in low learning occupations is (0.043) and declines over time to ( ). 20 This result is puzzling because, on the one hand, it suggests that employers in these occupations use race as an "s" variable (similar to education) and update their beliefs over time. On the other hand, the low growth in the residual variance and in the AFQT coe cient indicate that employers in these occupations learn about the worker s AFQT at the beginning of the career. This suggests that the change in the race coe cient over time is driven by something other than employer learning about the worker s AFQT, and casts doubts about the interpretation of the change in the race coe cient in high learning occupations. 5 Initial Occupational Sorting In the previous analysis I did not model initial occupational sorting. In reality, of course, there are a number of factors that might in uence a worker s initial occupational assignment, including risk aversion, credit constraints, private information about ability, and others. Generally speaking, any occupational assignment rule that would disproportionately place 20 These results are consistent with the results of Arcidiacono et. al (2008) who nd that black workers are discriminated against in the high-school labor market but not in the college graduates labor market. 23

25 workers with high growth in the AFQT coe cient into occupations with high growth in the variance of residual wages for reasons that are unrelated to general learning would constitute a selection bias that could be an alternative explanation of the paper s main results. It is hard to think of a selection-into-occupations mechanism that could be a plausible alternative explanation of the results. Consider, however, an example of such an assignment rule: If occupations with high growth in the variance of residual wages o er disproportionately more on-the-job training, and if training is correlated with AFQT score, then an assignment rule that places the least trained workers into occupations with the most on-thejob training would be consistent with the results of the paper. Although this is a possible mechanism there is no evidence that indicate large di erences in training across occupations. Certainly a model in which workers with di erent expected mean ability select into di erent starting occupations, such as the one I present in this section, does not pose a problem, since it just a ects the level of wages in each occupation, not the growth in the residual, which is the variable of interest. In the remainder of this section, I examine the e ects of one factor that might in uence a worker s initial occupational assignment, namely the e ects of the market s initial, publiclyheld priors about the worker s ability. Consider the following production function: y jt = exp(d j + c j h + G j (t) + jt ), j = 1; :::; J (13) where d j and c j are constants, common knowledge among all labor market participants, and where c J > c J 1 > ::: > c 1 > 0 and d 1 > d 2 > ::: > d J. G j (t), as before, is the experience pro le of productivity, which is assumed to be independent from s; q; z;or. The error 24

26 term jt represents a productivity shock drawn from a normal distribution with mean 0 and variance 2. The signal that employers extract now is: b jt = x jt E(hjs; q) = + " + v jt c j. In this case the precision of the signal of productivity depends on the sensitivity of the output to the worker s productive ability. In particular, the higher c j is the higher is the signal-tonoise ratio. Thus, jobs characterized with higher c j are more informative about the worker s ability compared to jobs with low c j. As before, de ne an occupation as the entire expected career track associated with choosing a given initial occupation, and assume that each of the chosen career tracks are associated with a xed learning parameter for the worker s entire career. 21 Workers objective is to choose an initial occupation that maximizes expected present value of lifetime earnings, given the information they have when they enter the labor force. Denote with h j the skill level that solves d j + c j h j = d j+1 + c j+1 h j. That is, h j is the skill level at which a worker is equally productive at job j and j + 1, and assume that h j+1 > h j, 8 j. Employers (and workers) form initial priors about the worker s skill based on the information they have from s and q. Given these priors, a worker chooses (or is assigned to) an initial job that maximizes his expected lifetime output. Proposition 2 formalizes the assignment rule, with its proof provided in the appendix. Proposition 2 Suppose that at the beginning of a worker s career, employers in J occupations form expectations about the worker s skill level based on available information. Under the assumptions of the model, initial occupational assignment and the logarithm of wages are 21 Since workers objective is to maximize lifetime income, di erential learning might well involve reassignment of workers to occupations other that the initial occupation. 25

27 given by (i)-(iii): (i) If h e < h 1 ;then worker i is assigned to occupation 1 and is paid w 1t. (ii) If h j h e < h j+1, then worker i is assigned to occupation j + 1 and is paid w j+1t. (iii) If h e > h J 1, then worker i is assigned to occupation J and is paid w J t. This assignment rule implies that workers sort into initial occupations monotonically, based on the market s prior on their productivity, with higher-expected-ability workers choosing jobs where output depends more on individual ability. Thus, in relation to the learning model, workers with higher priors enter occupations in which output depends more on ability and are more informative about the worker s ability at the start of the career. One other implication of proposition 2 is that the starting wage level is higher for workers who enter the market with higher initial priors. In fact, the mean starting wage level for workers who enter low learning occupations is $9.74/hour, while the mean starting wages for the other groups of occupations are $7.25/hour and $6.6/hour for occupations with intermediate and high learning, respectively. This is consistent with the idea that workers with low expected ability sort into occupations where employers have a lot to learn about the workers, while high expected ability workers sort into occupations where ability is revealed (and compensated) soon after their hire. 26