Human Capital, Productivity, and Labor Allocation in Rural Pakistan

Size: px
Start display at page:

Download "Human Capital, Productivity, and Labor Allocation in Rural Pakistan"

Transcription

1 Human Capital, Productivity, and Labor Allocation in Rural Pakistan Marcel Fafchamps, and Agnes R. Quisumbing Stanford University August 1997 Abstract This paper investigates whether human capital affects the productivity and labor allocation of rural households in four districts of Pakistan. We find that households with better educated males earn higher off-farm income and divert labor resources away from farm activities toward non-farm work. Better fed males are also more productive in offfarm work. Education has no significant effect on productivity in crop and livestock production. Half of the effect of human capital on household incomes is realized through the reallocation of labor from low productivity activities to non-farm work. Female education and nutrition do not affect productivity and labor allocation in any systematic fashion, consistent with the marginal role women play in market oriented activities in Pakistan. As a by-product, our estimation approach also tests the existence of perfect labor and factor markets; this hypothesis is strongly rejected. Department of Economics, Stanford University, Stanford, CA Tel.: (650) fafchamp@leland.stanford.edu. International Food Policy Research Institute, th Street NW, Washington DC Tel.: (202) a.quisumbing@cgnet.com.

2 The role of human capital in the development process has attracted an lot of attention since the seminal contributions of Schultz (1961), Becker (1964), and Welch (1970). Recently, growth theorists such as Romer (1986, 1990), Lucas (1988, 1993), Stokey (1988, 1991) and others (e.g., Azariadis and Drazen (1990), Ciccone (1994)) have shown that the accumulation of human capital can sustain long-term growth. These theories have received support from the empirical work of economic historians such as Fogel (1990) and from macroeconomic regression analysis emphasizing the positive role of education on growth (e.g., Mankiw, Romer and Weil (1992), Barro and Sala-i-Martin (1992, 1995)). Microeconomic evidence on this issue is both abundant and varied (see Jamison and Lau (1982), and Psacharopoulos (1984, 1989) for surveys). Although there is little doubt that better educated workers earn higher wages in the modern sector, whether education raises farm productivity remains a contentious issue. A widely-cited survey by Lockheed, Jamison and Lau (1980) summarizes 39 equations from 18 different studies in 13 countries, concluding that education has a positive effect on farm productivity. Phillips (1987) argues that these results vary substantially by geographic region. Studies from Asia support the positive and significant relationship between education and farm efficiency, but the evidence from Latin America and Africa is mixed. The purpose of this paper is to revisit this issue using a panel of rural households from Pakistan. The relationship between human capital and productivity and the choice of farm and off-farm work have usually been addressed in two separate strands of the literature. One examines the effects of human capital on output and outcomes (e.g. Jamison and Lau (1982), and the sources cited therein) while the other analyzes the allocation of labor to farm and off-farm activities (e.g. Huffman (1980), Huffman and Lange (1989), Kimhi (1996a, 1996b)). Following Newman and Gertler (1994), Jolliffe (1996), and Yang

3 2 (1997), this paper combines both approaches and considers not only how human capital raises productivity but also how households with different human capital endowments allocate labor to different activities. 1 Indeed, if returns to, say, education are highest in a particular activity, better educated households should reallocate their manpower to that activity, thereby providing evidence about the effect of education on output. This paper also moves beyond studies that focus on either on crop production (e.g., Jamison and Lau (1982) and the studies reviewed therein) or wages (e.g., Alderman et al. (1996b), Haddad and Bouis (1991), Sahn and Alderman (1988)) and examines all the market-oriented activities of the household. This enables us to decompose the effect of human capital of total household income into a labor reallocation effect and activity-specific productivity effects. Our analysis also encompasses several complementary measures of human capital, enabling us to better disentangle the effects of education from other dimensions of human capital such as nutrition and innate ability. 2 Finally, our study contains several methodological innovations that ensure that the results are robust and as free as possible from endogeneity and omitted variable bias. The paper is organized as follows. We begin in section 1 by introducing the conceptual framework underlying our work and discussing various econometric issues. The data 1 Newman and Gertler (1994) estimate a structural model of wages, marginal returns to farm work, and marginal rates of substitution for different demographic groups within the household, taking into account the jointness of production and consumption among rural landholding households in Peru. Jolliffe (1996) estimates the returns to education in farm and off-farm work, and finds that they are much higher in the latter, thus affecting the allocation of labor in Ghanaian farm households. Yang (1997) considers selectivity of better educated household members into off-farm activities in China, and finds that schooling does not contribute to physical efficiency in farming but raises off-farm wages. The best educated person in the household, however, may make farm management decisions while participating in off-farm work. 2 The inclusion of several dimensions of human capital is a growing trend in the literature. For example, Haddad and Bouis (1991), Thomas and Strauss (1997), and Foster and Rosenzweig (1994) include individual-level calorie intake, height, and BMI in addition to education in their studies of wage determinants in the rural Philippines and urban Brazil. Alderman et al. (1996a) examine the effects of cognitive skills, BMI, and height in addition to experience and education in their work on men s wage labor in rural Pakistan.

4 3 is presented in section 2. Regression results are examined in sections 3 for income and 4 for labor. We find that households with better educated males earn higher off-farm income and divert labor resources away from farm activities toward non-farm work. Better fed males are also more productive in off-farm work. Education has no significant effect on productivity in crop and livestock production. The effect of human capital on household incomes is realized mainly through the reallocation of labor from low to high productivity activities, i.e., non-farm work. Female education and nutrition do not affect productivity and labor allocation in any systematic fashion. This is in line with the marginal role women play in market oriented activities in Pakistan. As a by-product, our estimation approach also tests the existence of perfect labor and factor markets: this hypothesis is strongly rejected. Finally, we find evidence of fixed cost in undertaking income generating activities. Conclusions are presented at the end. Section 1. Conceptual Framework We begin by presenting a simple conceptual framework for evaluating the effect of human capital on productivity and labor allocation. This framework is tailored to the situation of rural households in Pakistan (e.g., Alderman and Garcia (1993)). Rural households derive their livelihood from several competing income generating activities: crops grown in two separate seasons denoted k and r; livestock b; and off-farm work n. Each of these activities yields an income Y a, a = {k, r, b, n} that is function of labor allocated to that activity L a, of variable inputs such as fertilizer and pesticides, and of semi-fixed factors such as planted acreage A a, livestock B, farm tools T, and non-farm capital K. Y n includes income from non-farm self-employment and off-farm wage work. 3 3 This simplifying assumption is justified by the difficulty to disentangle pure wage work from selfemployment in rural areas of developing countries. See Reardon (1997) for a discussion. Agricultural

5 4 Total expenditures on variable crop inputs are denoted X a, a = {k, r}. Because of moral hazard, households must supervise hired workers. Labor productivity is assumed increasing in the proportion of labor supplied directly by the household, that is, increasing in F a /L a where F a stands for family labor in activity a. We further assume that households are endowed with a vector of human capital characteristics Z that potentially affect production in each of these activities, either by raising the efficiency of labor, improving the input mix, or increasing the household s management and supervision capacity. Assuming that crop, livestock, and non-farm income take a Cobb-Douglas form, these effects are formally equivalent and can be represented by a multiplicative factor of the form e λ Z. We thus have: Y a = α a L a α al F γ a α al a α A aa L a T α at α X ax a e λ a Z a for a = {k, r} (1) Y b = α b L b α bl F γ b α bl b B α bb e λ b Z L b (2) Y n = α n L n α nl L n F γ n α nl n K α nk e λ n Z (3) The α s and λ s are parameters representing the household s production technology. The γ s measure the efficiency of supervision; γ a = 0 means that hired labor can be perfectly supervised in the absence of any household member. The effect of human capital on productivity can be assessed by estimating the above equations and examining whether the λ a coefficients are significantly positive. This is the approach taken in numerous empirical studies (e.g., see Jamison and Lau (1982) for a review of the literature). This approach does not, however, recognize that, if human capital raises the wage work, one category of work for which the distinction from self-employment is clear, is negligible in our data (see infra).

6 5 household s productivity in certain activities, this should be reflected in labor choices. To see why, let household choices be represented as an optimization problem whereby available manpower F is allocated between leisure and the four activities F a _ with a = {k, r, b, m} to maximize joint utility: 4 Max U(S + L a, F a, X a _ Σ [Y a X a w (L a F a )], F a subject to equations (1) to (3) and to non-negativity constraints: Σ F a) (4) a L a F a 0 and F a 0 for all a (5) U (.) is the household s utility function defined over income and leisure. S stands for unearned income and w for the market wage rate for hired workers. Given the need to supervise workers, the opportunity cost of family labor is different from the market wage. Following de Janvry, Fafchamps and Sadoulet (1991), a household model with an imperfect labor market can be represented as a system of labor demand and supply with endogenous shadow cost of labor w *. Households are assumed free to allocate their labor across activities. 5 Consequently, all activities compete for household labor at the same shadow wage rate w * and returns to labor must be equalized across the activities undertaken by the household. Labor demand in each activity is formally obtained by momentarily treating w * as given and solving the following profit maximization problem: α Max ω a L al (1 γ a ) α al γ a a Fa w L a (w * w) F a (6) L a, F a subject to F a L a. Parameter ω a is shorthand for the non-labor part of production functions (1) to (3). First order conditions for an interior optimum are: 4 A collective model of the households does not seem required here given the extremely limited involvement of women in market oriented activities in rural Pakistan (e.g., Alderman and Chishti (1991), Brown and Haddad (1995)). 5 This need not be the case if there are additional restrictions on labor use within the household, such as norms regarding gender or age specific tasks; or if non-farm employment is random and farm production is characterized by irreversible decisions such as planting.

7 6 Dividing the first by the second yields: α al (1 γ a ) Y a = w (7) L a α al γ a Y a = w * w (8) F a F a F a = 1 γa L a γ a w w * w Since can at most be one, we see that the household will provide all the labor to La (9) activity a whenever: w * w (10) w 1 γa If supervision is required -- γ a > 0 -- and if the shadow cost of labor within the household equals the market wage rage -- w * = w -- then equation (10) is violated and F a = L a. The presence of supervision costs thus creates a strong tendency for the household to supply its own labor. In that case, the supervision term drops out of the production function and labor demand takes the following form: γ a 1 α ax F a = L a = α a 1 (w * ) α al+α ax 1 α aa A 1 α al α ax 1 F a = L b = (w * ) α bl 1 1 F n = L n = (w * ) α nl 1 α at T 1 α al α ax 1 α b 1 α bl 1 α n 1 α nl α bb B 1 α bl α nb K 1 α nl λ a Z e 1 α al α ax e 1 α bl for a={k, r}(11) λ b Z (12) λ n Z (13) e 1 α nl where α a 1 is a combination of model parameters. These equations show that labor use in activity a is a decreasing function of the shadow cost of labor and an increasing function of human capital and semi-fixed factors in that activity. If w * could be observed, equations (11) to (13) could easily be estimated through regression analysis. Unfortunately, this is not the case. But we can get a sense of what

8 7 factors influence w * by noting that utility maximization yields a household labor supply of the form: _ Σ a F a = F(w *, S + w * F + Σ Πa (w *, K a )) (14) a where Π a (.) is the profit function associated with activity a and K a is shorthand for semi-fixed assets in that activity (i.e., land A and tools T in crop production, livestock B on herding, non-farm capital K in non-farm activities). If leisure is a normal good, the derivative of F (.) with respect to w * is positive and that with respect to income is negative. In case the household does not hire outside labor, the aggregate labor supply (14) and the four labor demand equations (11) to (13) form a system of five equations that can be solved for the five unknowns, F k, F r, F b, F n, and w *. An immediate corollary is that the allocation of family labor to activity a depends, through w *, on household manpower _ F, unearned income S, and productive assets in other activities (e.g., Evenson (1978), de Janvry, Fafchamps and Sadoulet (1991)). Equations (11) to (13) can thus be estimated _ indirectly by replacing w * with a function of the household s manpower stock F, unearned income S, and all its productive assets. If the household has an abundance of human capital and productive assets, the w * that solves the above system of equations may rise sufficiently so that γ a w * w >. w 1 γa It this case, it becomes optimal for the household to hire outside help so that: F a = 1 γa L a Labor use in activity a then becomes: γ a w w * w 1 L a = (ω a α al ) 1 α al γ a α al ( γ a 1 α ) al 1 γa w 1 γ a α al α al 1 γ a α al (w * w) α al 1 (15)

9 8 where ω a is again shorthand for non-labor terms in the production function. If a = k or s, α al must be replaced with α al + α ax throughout. Equation (15) shows that, when hired labor is used, labor is a decreasing function both of the market wage w and the household specific shadow wage w *. If γ a = 0, the supervision term drops out of the production function for a. In that case, as equation (15) illustrates, L a is entirely determined by the market wage rate; w * and the factors that influence it play no role in determining labor allocation to activity a. Production choices L a are unaffected by any variable that does not enter the production function directly and the household model is then said to be separable (e.g., Singh, Squire and Strauss (1986), de Janvry, Fafchamps and Sadoulet (1991)). Consequently, if we _ empirically find that F, S, or other productive assets influence L a, this can be taken as evidence of supervision cost or other labor market imperfections (e.g., Benjamin (1992)). We are now in a better position to understand how human capital affects labor allocation among competing activities. Suppose, for instance, that returns to human capital are present in one activity, say n, but absent in others. From equations (11) to (13) and from equation (15) we see that a high λ n raises the productivity of labor in that activity and shifts the demand for L n to the right. If γ n = 0, the extra demand for labor is met by hiring more outside workers and the relationship between human capital Z and labor usage in that activity depends solely on λ n : L n rises if λ a > 0, it falls if λ a < 0. The productivity of human capital in n has no effects on activities other than n. If supervision is costly, however, the same equations indicate that it is optimal for the household to reallocate some of its own manpower resources to activity n. Formally, this is accomplished by raising the internal shadow wage w * to causes other activities to release labor until the

10 9 household s internal labor market equilibrates. In this case, other things being constant, households with more human capital will increase household labor in activity n and reduce it in other household activities. The direct substitution effect may be further compounded by an income effect. Human capital, by raising w * and increasing profits in n, results in higher income. If leisure is a normal good, households will want to enjoy part of this higher income in the form of increased leisure consumption. The lower labor supply will further raise w *, inducing the household to withdraw even more labor from less productive activities. How households reallocate labor across activities in response to differences in human capital thus provides useful information on the relative productivities of human capital. Regressing equations (11) to (13) or (15) on market wage rate, semi-fixed assets, and human capital thus provides another basis on which to test whether human capital affects productivity: if human capital raises labor productivity in activity a, i.e., if λ a > 0, then the household should respond by allocating more labor to a. In addition, if supervision costs are present -- or if other labor market imperfections cause the household to be partially self-sufficient in labor -- households are expected to reallocate labor away from activities with low returns. Labor use regressions thus complement direct evidence from production/income regressions (e.g., Jolliffe (1996)). Combining the two approaches is the objective of this paper. We proceed as follows. We first estimate λ a from the production/income equations (1) to (3) directly. We then estimate the labor use equations (11) to (13) or (15) for all four activities, as well as total labor supply (14). The market wage w is taken to depend on location and time dummies. The shadow cost of labor w * also depends on household manpower F, _ unearned

11 10 income S, and other productive assets. As in Benjamin (1992), the absence of supervision costs and the completeness of the labor market are tested by examining whether these variables have a significant effect on labor choices. A human capital variable, say, Z e can be concluded to have an unmistakable effect on productivity if it is significant and has the same sign in both the production/income and labor choice regressions. Moreover, if labor markets are incomplete and Z e only raises productivity in certain activities, it should lower labor use in other activities. Section 2. The Data The data on which our analysis is based come from 12 rounds of household survey conducted by the International Food Policy Research Institute (IFPRI) in four districts of Pakistan between July 1986 and September 1989 (see Nag-Chowdhury (1991) for details). A panel of close to 1000 randomly selected households in 44 randomly selected villages were interviewed at 3 to 4 months intervals on a variety of issues ranging from incomes, agricultural activities, and labor choices to anthropometrics, education, land, and livestock (see Adams and He (1995), Alderman and Garcia (1993)). Responses to these questions were combined by the authors to generate a consistent data set containing annual information about household composition, income, assets, inherited land, human capital, and labor. All asset variables refer to the beginning of the year. The basic characteristics of the surveyed households are presented in Table 1. The median household size is 8 people, half of which are adults. Sources of income are quite varied. Crops account for about one fourth of average income; livestock accounts for another 15%. Non-farm earned income -- a mix of wages and self-employment income from crafts, trade, and services -- represents 30% of average income; rental income and

12 11 remittances amount to another 30%. Agricultural wage income is negligible among sample households. As already noted by Alderman and Garcia (1993) and by Adams and He (1995), livestock and non-farm income are more equally distributed than crop income, rental income, or remittances. On average, households own 8 acres of land, half of which is either canal or well irrigated. The median is much smaller, however, indicating that land is unequally distributed. The data also shows large differences among households in inherited land and in the amount of land owned by the father of the head. These two variables are used throughout as instruments for family background. 6 Households spend roughly as much time herding as they do in crop production. Hired labor -- mostly male -- accounts on average for as little as 2.6% and 8.5% of total labor devoted to cultivation in the kharif and rabi seasons, respectively. 91% of kharif farmers and 89% of rabi farmers do not use any hired labor. The use of outside help is somewhat higher at harvest time: it accounts for 21.5% and 23.6% of total labor for kharif and rabi, respectively. Surveyed households do not report employing any wage worker for either herding or non-farm activities. Although surveyed households use some hired labor for crop production, they spend very little time hiring themselves out as laborers. The sample may thus underrepresent farm laborers who are the poorest segments of rural society. 7 Wage work in non-farm activities is common, though. Male members of the household do 84% of the crop work, 99% of herding, and 95% of non-farm work. This is largely a consequence of purdah, 8 a system of secluding women, restricting them from moving 6 No data are available about parents levels of education, but they are likely to be zero in most cases. 7 Panel surveys have a tendency to underrepresent wage laborers who are typically more mobile than farming households and have a higher probability of dropping out of subsequent survey rounds. The resulting attrition bias is not explicitly addressed in this paper due to the absence of suitable instruments, but it should be kept in mind when interpreting the results. 8 See Ibraz (1993) and Jefferey (1979). Although purdah is now seen by many Pakistanis as a religious obligation prescribed by Islam, it was practiced by Muslims and Hindus alike before the partition of India.

13 12 into public places and enforcing high standards of female modesty upon them. This system limits women s mobility outside the home and restricts their participation in market work. 9 Women work mostly in or around the home. Human capital variables are presented in Table 2. They include: experience proxied by age and age squared; education measured in years of schooling, innate ability measured by Raven s test scores; childhood nutrition measured by height; and current nutritional status measured by the body mass index or BMI. 10 As measure of experience we use age and age squared rather than years of post-schooling wage work because, unlike in Alderman et al. (1996b), rates of school attendance are extremely low among older adult males and among adult females. Age and age squared are also more appropriate to capture life-cycle effects. Years of schooling is a measure of formal investment in human capital. Raven s (1956) Colored Progressive Matrices Test involves the recognition of changes in patterns across a series of four pictures. It was initially developed to measure innate ability among illiterate children and has been widely used as a proxy for intelligence among illiterate adults in developing countries (e.g., Knight and Sabot (1990)). Since parents may choose to educate only those children with academic potential, years of schooling is likely to be correlated with innate ability. We therefore use Raven s test scores in an effort to distinguish the productivity of innate intelligence from that of skills acquired in school. 11 In his study of Punjab in the 1920 s, for instance, Darling (1925) notes that Hindu Rajputs were the most dedicated to the practice, a status symbol for which they pay dearly [in terms of wasted manpower and reduced profits]. 9 Because of purdah, respondents are likely to have underreported female participation in market oriented work. 10 See Strauss and Thomas (1995) for a comprehensive review of attempts to account for various dimensions of human capital in measuring labor markets, health, and nutrition outcomes. 11 Years of schooling also influences achievement as measured in test scores, e.g., Glewwe and Jacoby (1994). The impact of test scores on rural labor market outcomes in Pakistan has been investigated by Alderman et al. (1996b).

14 13 Height and body mass index (BMI) proxy health and nutrition aspects of human capital. The BMI is defined as weight (in kilograms) divided by height (in meters) squared, a commonly used measure of fitness and nutritional status. Combined with other simple anthropometric measurements such as height, it has been shown to be a good predictor of muscular mass and physical strength among populations of developing countries (e.g., Conlisk et al. (1992)). Height, when evaluated for adults, captures the cumulative effects of childhood and adolescent nutrition as well as genetic endowments. Unlike BMI, it is not subject to short-term fluctuations. In this paper, we use only adult height and BMI to minimize endogeneity, i.e., the possibility that taller and better fed parents may have taller and better fed offspring. We also experiment with self-reported days of illness as a measure of health status. While it is true that illness episodes may affect both the amount and efficiency of labor supplied, self-reported illness has been argued to be contaminated by self-reporting biases, with higher-income or more educated individuals more likely to report being ill (e.g., Sindelar and Thomas (1993)). Illness episodes may also be correlated with factors which affect individuals long-term productivity; a large literature on illness shows that the probability of illness is higher among less wealthy and less educated families (e.g, Akin, Guilkey and Popkin (1992)). For these reasons, we treat the available information on sick days with caution. Two separate sets of human capital variables are constructed for each household. In the first set, individual characteristics are averaged by gender over all household members 20 years old and above, irrespective of their relationship to the head of household. The second set contains only information about the head of household and his wife. 12 By construction, the second set exists only for households with both a head and at 12 In case of polygamous households, we take the average over all wives.

15 14 least one wife. 13 The reason for constructing this second set is twofold. First, using average human capital of adult males and females may mask variations within these categories. Indeed, the head of the household and his wife are likely to have more decision making power than other household members. Second, household averages may be subject to endogeneity bias: the prosperity and genes of the parents may be reflected in their offspring, thereby opening the door to a reverse causation between productivity and household-based human capital averages. Although less vulnerable to such problems, husband and wife human capital are only partial measures and therefore subject to measurement error. Moreover, if marriage market selection exists, characteristics of husbands and wives are likely to be correlated (e.g., Foster (1995)). Since neither measure is perfect, our analysis is conducted using both and regard results about human capital as robust when they are present in both formulations. The two sets of variables are summarized in Table 2. The average head has spent 2.8 years in school; the median is zero. Female members of the household have a much lower level of education than males. 40% of males have no education, vs. 86% for females. They also get a significantly lower score on Raven s test of progressive matrices, a test supposed to measure innate ability irrespective of literacy level. This may be attributed to socially acquired attitudes by which women try less hard to perform than men, compounded by less familiarity with formal tests due to their lack of schooling (e.g., Alderman et al. (1996a)). The correlation coefficient between years of education and Raven s test score is fairly low, however:.43 for men,.28 for women. The sample population is short and, with average BMI s as low as 20.4 for males and 21 for females, 13 There are less than 1% female-headed households in the sample.

16 15 only marginally well fed. Although women are less educated than men and rank lower in Raven s test, they have a higher BMI: the t-test statistic for equality of means between male and female BMI s is highly significant (6.99 with 1776 degrees of freedom for male and female averages; 6.81 with 1441 degrees of freedom for head and wife). This is a common result due to the fact that women are shorter and have more body fat as a proportion of body weight (e.g., Gibson (1990)); it does not indicate that women in the sample are better fed than men. The nutritional status of males and females within the same household appear unrelated: the coefficient of correlation between average male and female BMI s is.17. Women report less days lost to sickness, but we suspect that this may be due to self-reporting bias: women spend most of their time within the home where being sick is less disruptive and less noticeable. In contrast, men do all the work outside the home where their ability to work would suffer from reduced mobility and where sickness is harder to accommodate within one s routine. Section 3. Testing the Productivity of Human Capital We now test whether human capital raises productivity in any of the four activities in which the surveyed farmers are involved in: kharif and rabi crop production; livestock raising; and non-farm work. We proceed in two steps. In this section, we estimate production functions (1) to (3) for these four activities and examine whether any of the estimated λ a s is significantly different from 0. We then turn in the next section to labor allocation and estimate labor demand and supply equations (11) to (14). We complement this analysis by taking a brief look at crop choices and input use and examine whether households with higher levels of capital grow different crops and use more modern inputs.

17 16 Before we can estimate production functions (1) to (3), a number of econometric issues need to be addressed. First, suitable assumptions must be made about the distribution of errors. For rabi and kharif crop production, the dependent variable is the total value of output which, by definition, cannot be negative. For these two activities, it is therefore natural to adopt a multiplicative error formulation and to estimate kharif and rabi production functions in log-log form using ordinary least squares. In contrast, livestock and non-farm income data are net of production costs and capital losses. As a result, 21% of livestock income observations and 22% of non-farm income observations are null or negative. A logarithmic formulation would artificially truncate the sample, hence generating a truncation bias as well as a loss of information. To avoid this bias we opt for an additive error formulation and estimate the livestock and non-farm production functions via non-linear least squares. Second, harvesting labor is a priori subject to endogeneity bias: a good harvest requires more labor to gather crops in the field. To minimize endogeneity bias, we drop harvesting labor from L a in the two crop production functions and use only the reported labor for land preparation, irrigation, and cultivation. Since livestock and non-farm work are not subject to a similar bias, we use total reported labor instead. Third, since no data were collected on non-farm capital, K is replaced by factors thought to influence it such as family background, proxied by father s land and inherited land, and current wealth, proxied by owned irrigated and rainfed land. Fourth, an aggregation problem arises for land and livestock. Different qualities of these semi-fixed factors, while likely to contribute differently to output, are quite substitutable in production. Buffaloes and sheep, for instance, do not yield the same income but neither of them is essential for livestock raising. A Cobb-Douglas formulation in which

18 17 each category of semi-fixed factor enters at par with labor is therefore inappropriate. To resolve this issue, we assume that different qualities of land and livestock are perfectly substitutable but have different inherent productivities. After normalization, we have: B = Cattle + (1+β b ) Buffaloes + (1+β k ) Bullocks + (16) (1+β d ) Donkeys + (1+β s ) Sheep&Goats where B is now measured in cattle-equivalents. Parameter β b represents how much more productive buffaloes are relative to cattle, and similarly for the other β parameters. Dividing through by the total number of animals N, we get: B = N (1 +β b Buffaloes + βk N Bullocks + βd N Donkeys + βs N Provided that the β s are not too large, equation (17) can be approximated as: Buffaloes + βk N Bullocks + βd N Donkeys + βs N Sheep&Goats ) (17) N Sheep&Goats N B N e β b (18) The above approximation is adopted throughout whenever highly substitutable variables appear on the right hand side of the regression. Finally, yearly and village-level fixed effects are added to capture systematic variations in market and production conditions across time and space. For kharif and rabi crop output, the estimated equations are: A a I log Y a = α a + α aa log A a + α ai + αat log T + α ax log X a + (19) Aa F a α al log L a + γ a α al log ( ) + La Σ 12 λ ah Z h + h=1 Σ 2 α ag G g + g=1 α a 86 D 86 + α a 87 D 87 + Σ 43 α av D v + ε v=1 with a = {k, r}. The error term ε is assumed randomly distributed with mean zero and constant variance σ ε 2. Variable A a I stands for irrigated cultivated acreage in season a; rainfed land is the omitted category. There are 12 human capital variables, 6 for males

19 18 and 6 for females: age, age squared, years of education, Raven s test score, height, and BMI. Sickness days are not included because much of their effect is already captured by the labor variable. Variables G g stand for household background variables, namely, the land owned by the head s father and the land inherited by the household. They are included to control for possible omitted variable bias in the human capital variable. 14 D 86 and D 87 are dummies for crop years 1986/1987 and 1987/1988, respectively. Equation (19) is estimated with village fixed effects, hence the village dummies D v. Observations with no crop land or labor are excluded. The estimated equation for livestock is: α Y b = α b L bl Σ 4 b B α β bc B c /B bb e c=1 Σ 12 λ bh Z h + Σ 4 κ bh G g + α b 86 D 86 + α b 87 D 87 + Σ 3 α bd D d e h=1 g=1 d=1 + ε (20) F b Variable drops out because all livestock labor is provided by the household. Vari- Lb ables B c stand for four categories of animals: bullocks, buffaloes, donkeys, and sheep and goats; cattle is the omitted category. The same 12 categories of human capital variables are used as in the crop regressions. Background variables are included to minimize omitted variable bias; owned irrigated and rainfed land are added to control for possible wealth effects. Equation (20) is estimated via non-linear least squares. For reasons of manageability, district fixed effects are used instead of village fixed effects Individuals whose father was farming or who inherited more land probably received more exposure to farming (e.g, Rosenzweig and Wolpin (1985)), and may capture returns to specific experience in farming reflected in higher farm productivity. If children from landed households are also better fed and educated than those from landless families but family background is not controlled for, human capital variables may capture the effect of exposure to farming, not that of human capital itself. 15 To investigate whether the difference matters, we also ran the crop regressions with district effects. Results are virtually identical, which indicates that the main difference between villages are those that distinguish districts.

20 19 Observations with no livestock are excluded from the regression. The error term ε is assumed randomly distributed with mean zero and constant variance σ ε 2. The estimated equation for non-farm income is: F n α Y n = α n L nl Σ 12 λ hb Z h + Σ 4 κ gb G g + α b 86 D 86 + α b 87 D 87 + Σ 3 α bd D d n e h=1 g=1 d=1 + ε (21) Variable drops out because all the work in non-farm activities is provided by the Ln household. Background variables are included not only to control for possible omitted variable bias but also as instruments for unobserved non-farm capital K. The negligible amounts of off-farm agricultural wages and labor recorded in the data are included in Y n and L n, respectively. Equation (21) is estimated by non-linear least squares. We also estimate a total net income regression of the form: α Y t = α t L tl Σ 2 β tl L l /L t t e l=1 α B tb Σ 4 β tc B c /B e c=1 O α tb e β to O I /O α T tt (22) Σ 12 λ th Z h + Σ 2 κ tg G g + α t 86 D 86 + α t 87 D 87 + Σ 3 α td D d e h=1 g=1 d=1 + ε Variable L t stands for total labor; L l /L t for l = {1, 2} is the share of total labor devoted to crops and livestock, respectively. Variable O is total owned land; O I is owned land that is irrigated by a canal or well. Other variables are as before. Equation (22) is estimated by non-linear least squares. Estimation results for crops in the kharif and rabi seasons are presented in Table 3. Two sets of regressions are run in each case, one using the average human capital of the household, the other using only the human capital of the head and his wife. The latter set offers a less complete representation of the human capital of the household, but it is less subject to omitted variable bias if better able couples have both higher incomes and better fed, better educated children. Results indicate that farm tools, input expenditures,

21 20 and cultivation labor are good predictors of output. Estimates of the supervision parameter γ a are not significant in any of the regressions, suggesting that, if supervision costs are present, they are not large. Year and village dummies are significant, confirming that crop production varies systematically across time and space. Human capital variables are, in general, non-significant. Taller heads of household appear to achieve higher output in the kharif season; shorter wives are associated with lower output in rabi. Age, education, and Raven s test scores are non-significant in all regressions, suggesting that experience, schooling, and innate ability are not important determinants of crop output in the survey areas. These results are consistent with evidence indicating that returns to schooling are low in Third World agriculture (e.g., Rosenzweig (1980), Jolliffe (1996)), particularly in technologically backward areas. To ensure that human capital coefficient estimates are not affected by possible endogeneity bias, we reestimate equation using predicted instead of real labor and expenditures. The instrumenting equations are similar to the labor choice equations discussed below. Instruments include: family composition, owned land, livestock assets, and nonearned income. They are, in general, highly significant. IV estimates (not shown) are somewhat different for labor, land, and input expenditures, but human capital coefficients are by and large unaffected. Estimation results for livestock, off-farm, and total income are summarized in Table 4. Factors of production are significant and have the right sign in all regressions. Share parameters are in general significant, although they appear less stable. For instance, crop labor seems more productive than other labor when average human capital variables are used, but less productive when husband and wife human capital is used instead. Similar instability occurs in the livestock share coefficients. Year and district dummies often are

22 21 significant, again emphasizing the existence of systematic income differences across space and time. Regarding human capital, the strongest result concerns the effect of male education on off-farm and total income: it is positive and highly significant in all four regressions, and it is insensitive to the category of human capital variable used. One additional year of education is estimated to raise non-farm earned income by 3.6 to 3.8%. It is also estimated to raise total income by 6.4%, if average human capital is used, and by 1.5% if only the education of the head of household is used as regressor. Female education is significant and positive in the livestock income (husband and wife human capital) and in total income (average human capital) regressions, but it is not significant in other regressions and it occasionally changes sign, hence suggesting that the coefficient estimate may be subject to omitted variable bias. Male and female height and BMI are significant in several of the regressions, suggesting that better fed households achieve higher incomes. To summarize, production regressions indicate that male education has a strong positive effect on non-farm and total income, but no effect on crop output. Better fed households in general achieve higher incomes, but the effect is not present in all regressions. Experience, innate ability, and female education do not appear to have any robust effect on incomes. It is possible that our estimates of the productivity of human capital are biased because certain individual traits which are correlated positively with output are correlated negatively with education or nutrition. If, for instance, better educated males derive most of their income from non-farm activities, they may neglect farming in ways that are hard to measure, e.g., by planting or irrigating late, supervising labor less effectively, and in general applying less care to their fields. To examine whether such omitted variable bias might have affected human capital regression coefficients, we rerun equations (20) to

23 22 (20) with the predicted residuals from the labor choice regressions (see supra). The approach is similar in spirit to the use of inverse Mills ratio to control for self-selection bias (e.g., Heckman (1976), Maddala (1983)), except that the selection equation is a tobit, not a probit. This parallels work by Pitt, Rosenzweig and Hassan (1990) which uses residuals from a health production function in their analysis of intrahousehold food distribution, and Behrman, Birdsall and Deolalikar (1995) in their analysis of marriage market outcomes in India. Formally, let y i and ε i denote the ith observation of the dependent variable and the ith residual in any of the income equations, respectively. Similarly, let z i and u i denote the dependent variable and the residuals in the tobit labor choice equation, respectively. The regressors in the y i and z i are denoted x i and w i, respectively. The residuals are assumed normally distributed. Their standard deviations are written σ ε and σ u, respectively; ρ is the correlation coefficient between the two. In case z i > 0, we have: In case z i = 0, we get: E [y i x i,u i ] = β x i + E [ε i u i ] σ ε = β x i + ρ ui (21) σu E [y i x i,u i ] = β x i + E [ε i u i α w i ] (22) = β x i + α w ε i f (ε i u i ) dε i which, by application of Theorem 20.4 in Greene (1997), p.975, is equivalent to: = β x i ρσ ε φ( α w i ) σ u α w i 1 Φ( ) where φ(.) and Φ(.) denote the probability function and cumulative distribution function of a standard normal variable. All production and income regressions are reestimated σ u (23)

24 23 with selection/effort correction terms constructed by replacing u i, σ u, φ(.), and Φ(.) in equations (22) and (23) by their predicted values from the tobit labor choice regressions discussed below. Four selection/effort correction terms are constructed, corresponding to kharif, rabi, livestock, and off-farm labor. The correlation coefficient between rabi and kharif selection variables is high The other selection variables are uncorrelated. To minimize endogeneity bias, only kharif labor residuals are used for the rabi production function and vice-versa. Results (not shown) are virtually identical to those reported in Tables 3 and 4. Human capital results are essentially unchanged. As anticipated, off-farm residuals are negative and significant in kharif crop production, suggesting that households which invest more labor in off-farm than predicted by the labor choice regression spend less quality time in their fields. Similarly, households more involved in rabi cultivation than predicted by the rabi labor regression get significantly lower off-farm income. Section 4. Human Capital and Labor Use We now examine how human capital affects labor used in four activities: kharif and rabi crop production, herding, and non-farm work. We also examine total family labor supply. Equations (11) to (15) form the basis of our estimation strategy. The householdspecific shadow cost of labor w * is not observed, but it can be instrumented using factors that influence it: household size and composition; non-earned income; family background; and productive assets in other activities. 16 For kharif and rabi, the dependent variable L a is the sum of family and hired labor. In the case of herding and off-farm 16 Jacoby (1993) uses a different approach and derives shadow wages from marginal products estimated from a farm production function.

25 24 work, it consists exclusively of family labor since the hiring of labor by the household was not observed in these activities. Around 37 percent of kharif and rabi labor observations are zeroes; the corresponding percentages for herding and off-farm work are 45 percent and 38 percent, respectively. The dependent variable is thus a censored variable. Latent labor use L a * is assumed to take the form: _ L * θam a = θ a F e mσ β am Fm/F O θ ab e θ ao O I /O θ T at (24) Σ 4 B θ θ ac B c /B ab e c=1 θ G a 1 θ 1 G a 2 2 S θ as e uσ θ au S u /S Σ 12 θ ah Z h e h=1 for a = {k, r, h, n} with actual labor L a = L a * in case activity a is undertaken by the household. A similar equation is assumed to represent latent labor supply F * e ε Σ F a. The a θ s are parameters to be estimated and the disturbance term ε is assumed to be normally _ distributed. Variable Fa stand for the number of household members in different age/sex _ categories and F Σ _m; F 17 As with other share variables, the parameters of each of the m age/sex categories indicate the efficiency of that category relative to the excluded category, adult males. O is total owned land; O I is owned irrigated land; T is farm tools; B c is the total number of livestock in category c and B Σ B c; G 1 and G 2 are back- c ground variables, land owned by father and land inherited by household, respectively; S u stands for three categories of unearned income, remittances, rental income, and pensions, with S Σ S u; unearned income is expected to have a negative effect on labor supply. Z h u as before denote a set of human capital variables. We focus our discussion on specifications without reported illness days; results with illness days are briefly discussed 17 The categories are: adult males and adult females aged 20 to 65; children aged 0-5; youth aged 6-19; and old aged 66 and above.

26 25 in the text. Equation (24) is estimated for each labor use category and for total labor in log form using tobit. 18 Table 5 presents regressions of equation (24) for each of the four activities and for total family labor supply as a function of average human capital variables. Household size appears to be an important determinant of herding, off-farm labor, and total labor supply. Household demographic composition is also significant in all of the regressions; estimated coefficients show that all age/sex categories supply less labor than adult males. This result is in full agreement with the dominant role that adult males play in all market oriented activities (see Section 2). It also constitutes a rejection of the hypothesis of perfect markets: if labor markets were complete, production decisions should be separable from household characteristics affecting total labor supply (e.g., Benjamin (1992)). Higher education of adult males is associated with less herding and farm work in both kharif and rabi seasons, but more off-farm labor. This effect is strong: one additional year of schooling leads to 8%, 7.2%, and 9.4% less work in kharif, rabi, and herding, respectively, and to 13.4% more labor off the farm. There is, therefore, agreement between the labor allocation and the productivity regressions discussed in Section 3: better educated males are more productive in non-farm work; they respond to this by reallocating their time away from less productive to more productive activities. The net effect of this reallocation on total family labor is non-significant. Males with a higher BMI also spend more days doing off-farm work and height of adult males is significant and positive in herding labor. Both results are consistent with the higher productivity achieved by better fed males in off-farm work and by taller men in herding (see Section 18 To avoid losing observations each time a dependent variable or a regressor is zero, all logs are taken after adding 1 to the variable. Shares are similarly set to zero whenever their denominator is zero.