Antonio Di Paolo Universitat Autònoma de Barcelona & IEB


SCHOOL COMPOSITION EFFECTS IN SPAIN

Antonio Di Paolo
Universitat Autònoma de Barcelona & IEB

Abstract: Drawing on the PISA 2006 data set, this study examines the impact of school socio-economic composition on the science test scores of Spanish students enrolled in compulsory secondary education. We define school composition in terms of the average parental human capital of students at that same school. These contextual peer effects are estimated using a semi-parametric methodology, which enables spillovers to affect all the parameters in the educational production function. We also deal with the potential problem of students' self-selection into specific schools by using an artificial sorting mechanism, which we believe to be independent of a student's unobserved abilities. The results indicate that the association between a school's socio-economic composition and test score results is clearly positive and significantly higher when computed using a semi-parametric approach. However, we find that the endogenous sorting of students into schools also plays a fundamental role, given that spillovers are significantly reduced when this selection process is eliminated from our measure of school composition effects. Specifically, the estimations suggest that contextual peer effects are moderately positive only in those schools where the socio-economic composition is comparatively high. In addition, we find some evidence of asymmetry as to how the external effects and the sorting process actually operate, apparently affecting males and females and high and low performance students differently.

Keywords: Educational Attainment, Peer Effects, PISA, Spain
JEL Classification: I20, I21, I29

Corresponding author: Antonio Di Paolo; antonio.dipaolo@uab.cat, tel.: ; fax: Departament d'Economia Aplicada, Universitat Autònoma de Barcelona (UAB); Campus de Bellaterra, Edifici B Bellaterra (Cerdanyola), Spain. Institut d'Economia de Barcelona, Universitat de Barcelona.

1. Introduction

This paper investigates school composition effects on Spain's lower-secondary schools (Educación Secundaria Obligatoria, ESO), using PISA data from 2006, and as such is the first study to examine this question explicitly in the Spanish case. Quantifying the impact of the socio-economic composition of Spain's schools is especially important in this country because, where there is an excess of demand (as is common in large cities), the admission process in public and publicly funded (concertada) private schools is very closely related to zoning laws and school district policies. Therefore, given that school admission criteria assign greatest weight to the proximity of the student's home to the school, these educational policies are inextricably linked to the effects of the school's socio-economic mix. In fact, they may result in the direct transfer of the existing socio-economic residential segregation into the schools (Hoxby 2000, Gorard et al. 2003). Moreover, school composition effects might have gained additional relevance in Spain as a result of the significant increase in the number of immigrant students from less affluent social backgrounds in recent years and the subsequent interaction with existing zoning laws; i.e. less advantaged immigrant families tend to reside within ethnic enclaves and, as a consequence, their children inevitably tend to concentrate in schools characterised by a low socio-economic composition.

Empirically, there are many channels via which the features of an individual's schoolmates or classmates (namely, the peer effects) might influence individual attainment. In the general framework proposed by Manski (1993, 2000), the overall effects of the peer group on individual outcomes primarily involve elements of social interaction that include both endogenous and contextual (or exogenous) effects. The former are the direct effects that peers' behaviour or outcomes can have on individual outcomes; that is, students may well learn more because their school/classmates learn more. The latter are the impacts that certain exogenous characteristics of the peer group can have on a student's achievement, i.e. individual performance depends on the socio-economic composition of his/her group. In addition, the extent of peer effects might be confounded by the presence of shared environmental/school elements or individual characteristics (e.g. cognitive and non-cognitive skills) that go unobserved by the econometrician; the so-called correlated effects. Obtaining separate estimates of endogenous and contextual effects is fraught with empirical complications [1] and, moreover, is highly data-demanding. Thus, this study concerns itself solely with contextual effects, which has been a fairly common approach in the empirical literature to date. More specifically, this paper uses a broad measure of the socio-economic composition of schools, based on the average parental educational background (defined as the highest educational level completed by either one of the two parents) for each school.
Its main contribution to the existing literature consists in the implementation of a semi-parametric methodology that allows school-contextual effects to influence all parameters in the educational production function (as such, adapting the original proposal made by Raymond & Roig 2010). Intuitively, taking as a reference the most disadvantaged schools in terms of their socio-economic composition (i.e. schools in the lowest quintile of average parental background), the paper shows that moving to better endowed schools might generate a level shift as well as various potential gradient shifts. Indeed, the measure of contextual peer effects proposed here should capture the global impact of school composition on the educational production function. In addition, the flexible strategy adopted ensures that such school composition effects are nonlinear, since they are separately computed for each successive quintile of the school's average parental education.

Finally, the paper deals with the most common problem encountered in such studies, namely the self-selection of students into different schools (i.e. a specific type of correlated effect). Specifically, the presence of a sorting mechanism that allocates those students that are better endowed with unobserved characteristics into schools with higher average parental schooling might bias our measure of peer effects. Therefore, an alternative sorting mechanism is provided that can be assumed to be unrelated to an individual student's unobserved characteristics. Such reordering is based on the predicted linear score, obtained from an ordered probit model that estimates the probability of membership in each quintile of the school's average parental education. This artificial sorting is then used to reduce selection bias in the definition of reference and non-reference groups. Thus, the study is able to provide a measure of a school's composition effect that is significantly less affected by correlated effects.

With these purposes in mind, the rest of the paper is organized as follows: Section 2 contains a brief review of selected papers examining peer effects, focusing on the various estimation strategies adopted to eliminate correlated effects. Section 3 describes the empirical methodology that is used in this study and Section 4 is dedicated to a description of the data. Section 5 contains the empirical results, as well as a robustness check and an analysis of the potential asymmetries of school composition effects. Section 6 concludes.

[1] Specifically, what is commonly referred to as the reflection problem, which involves the simultaneous determination of achievement for all students within a peer group (i.e. a simultaneity bias problem).

2. Selected Contributions

Previous studies of peer effects on scholastic achievement present quite mixed findings and, to date, there is no unified evidence as to the existence or to the actual form that these effects might take. This line of research has sought to capture these potential spillovers at several points in the educational process (from primary to tertiary stages), and by considering different peer features (actual or lagged peer test scores, ethnic and socio-economic composition of the peer group, etc.). This points to the fact that the resulting spillovers will be either positive or negative (or even zero), while dependent at all times on the nature of the peer variable, and as such the final net effect becomes an empirical question.
Furthermore, governed primarily by data availability, the definition of these peer groups has been markedly heterogeneous, ranging from schools, school-by-grade and classrooms to other social peers such as roommates or friends. As a result, the findings tend to be highly case-specific and not always strictly comparable. In general, this lack of explicit comparability is attributable to i) the specific characteristics of the sample used, and ii) the (subsequent) econometric techniques adopted in identifying peer effects other than the correlated effects.

Interestingly, some studies are based on special samples in which students are assigned randomly into peer groups, thereby possibly eliminating the bias attributable to correlated effects. More specifically, such quasi-experimental studies exploit the randomized trial generated by the chance matching of students with first-year roommates in college accommodation (see Sacerdote 2001, Zimmerman 2003, Foster 2006 and Brunello et al., among others), or class assignment on the basis of surname during first-year university courses (De Paola & Scoppa 2010).

Several other papers, which focus principally on primary and secondary schools, adopt fixed effects frameworks in order to control for any potential bias in peer effect estimates. For example, McEwan (2003) controls for both school and family fixed effects when estimating peer contextual effects at the classroom level, finding a positive and slightly concave effect of the classroom mean of the mothers' education. Hanushek et al. (2003) exploit a unique panel dataset covering three successive cohorts of students; hence, they are able to control for individual, school and school-by-grade fixed effects in a value-added specification of the educational production function. They report a positive effect of mean peer achievement on improvements recorded in test scores, which remains almost constant over the test score distribution. They also found no important effects of the average family income of the peers and test score heterogeneity in the peer group. Lavy et al. (2008) exploit cohort-to-cohort and within-school changes in the proportion of low achievers (i.e. their proxy of peer ability) to identify endogenous peer effects and the mechanisms via which they impact on an individual's own achievement. They report a clear negative impact of the proportion of low achievers in the classroom, which tends to be more pronounced for students of low socio-economic background [2]. Ammermueller & Pischke (2009) consider the contextual effects in primary schools for several European countries (using PIRLS data). They assume that contextual peer effects at the classroom level are captured by the average number of books at home. These peer effects are identified by exploiting variations across the classrooms within the same grade for the same cohort of students (once established that these classes had been formed in what was a largely random manner).
Their results indicate that, in general, contextual peer effects do exist; however, they also point out that simple OLS estimations might be equally affected by selection bias as well as by measurement error in the peer variable, which tend to operate in opposite directions.

The present study is most closely related to those undertaken by Fertig (2003), Schneeweis & Winter-Ebmer (2007) and Rangvid (2007), which also draw on PISA data. Specifically, Fertig (2003) investigates the effect of reading achievement heterogeneity in US schools, which is identified through Instrumental Variables (IV) [3], namely dummies for private and selective schools and a set of variables related to the prevalence of parental caring behaviour in each school. His results indicate that attending a heterogeneous school in terms of student achievement undermines individual performance; however, the negative effect he reports appears to be excessive when estimated using IV (which raises the question about the validity of the instruments used).

The paper by Schneeweis & Winter-Ebmer (2007) explores the effects of socio-economic composition at the school-by-grade level in Austria. The authors present evidence obtained, on the one hand, from OLS estimations based on an extensive set of individual and school controls and, on the other hand, from the application of school fixed effects. They argue that, when accounting for school type (given the marked track system in Austrian lower and upper secondary schools), school fixed effects reduce the selection bias in the estimation of peer effects. Their results highlight a significant asymmetry in the peer effect on reading [4], which seems to have a more beneficial effect in the case of students of a low socio-economic background. Moreover, they also adopt a quantile regression strategy, which reveals that students in the lower part of the ability distribution are more positively affected by the socio-economic composition of their peer group.

Finally, Rangvid (2007) analyses the effects of the socio-economic composition of a school in terms of the three PISA subjects (reading, math and science) drawing on Danish data, which are complemented with administrative registers to overcome the potential problems caused by the limited sample of students within each school [5]. Given the comprehensive nature of the Danish secondary school education system, the author cannot rely on the school-fixed effect estimation as was the case in Schneeweis & Winter-Ebmer (2007); indeed, she cannot assume that individuals (and their families) who are placed in a given school of a certain track share similar unobserved characteristics [6]. Her identification strategy is instead based on controlling for a large set of individual, family and school variables, without explicitly considering the role of selection on unobservable features.

[2] Moreover, their results also suggest that the negative impact of the proportion of low achievers mainly operates via the disruptive influence it has on teachers' pedagogical practices, interaction with other students and classroom disorder.

[3] Other papers in which the identification of peer effects relies on IV strategies include those by Feinstein & Symons (1999) and Robertson & Symons (2003), where the instruments consist of location variables and teachers' assessment of a student's previous ability combined with region of birth dummies, respectively.
The results in this study suggest a clear positive effect of attending a school with a higher socio-economic composition in the middle of the test score distribution, whereas no significant effect is found for the socio-economic heterogeneity at the school level. Moreover, the quantile estimations reveal that school composition effects tend to be higher for low-ability students on the reading test score, but the author finds a U-shaped effect for science, which means that low and high ability students benefit equally from a better socio-economic school composition.

[4] By contrast, they also suggest that the apparent peer effects in mathematics, as estimated by OLS, are due only to selection effects, given that their fixed-effect estimates are not statistically significant. Additionally, in this case, peer group heterogeneity seems to play a very limited role in explaining test score attainment.

[5] As argued by Micklewright et al. (2010), the limited student sampling made by PISA can result in a measurement error in the estimation of peer effects. This would bias the effect of school composition towards zero. Unfortunately, such administrative data are not available for public use in the Spanish case; therefore, it should be borne in mind that the estimates reported in this study represent a lower bound of the true impact of school-average parental education.

[6] Notice that since the LOGSE reform of 1990, the Spanish secondary education system has been compulsory and comprehensive until the age of sixteen, which (as in the case of Denmark) makes the school-fixed effect framework unfeasible for controlling endogenous peer group selection. See Section 3 for details as to how such a problem is addressed in this paper.

3. Empirical Framework

The estimation strategy proposed in this paper represents a step forward in terms of the measurement of peer effects. Indeed, the main innovation with respect to previous studies consists, as briefly commented in the introduction, in the idea that the spillovers produced by an improvement in a school's socio-economic composition may affect not only the intercept, but all the parameters of the educational production function. This original proposal has been taken (and adapted) from the paper by Raymond & Roig (2010), in which they estimate the externality produced by the average human capital of workers in the same firm. In keeping with this externality, this paper takes as its starting point the standard educational production function,

$$T_{i,s} = \alpha_s + \beta X_i + \delta Z_s + \varepsilon_{i,s} \qquad (1)$$

where the test score $T_{i,s}$ of student $i$ in school $s$ depends on a set of individual and family characteristics ($X_i$) as well as on a set of school characteristics ($Z_s$), plus a composite error term ($\varepsilon_{i,s}$). Usually, exogenous peer effects are simply estimated by considering that the intercept term ($\alpha_s$) is not fixed, but instead dependent on an average characteristic of the peer group, i.e. in this case, the average parental education of students at that school ($\overline{PE}_s$). This means that the intercept term in (1) can be rewritten as,

$$\alpha_s = \alpha + \mu \, \overline{PE}_s \qquad (1a)$$

which indicates that a unit increase in the average parental education in the school modifies the mean test score by $\mu$ points, through a shift in the intercept term. We could also adopt a nonlinear specification, where the impact of the school composition of parental human capital is allowed to vary for each successive quintile of school-average parental education ($Q_j(\overline{PE}_s)$, $j = 1, \dots, 5$). In this case, the intercept term in (1) can be expressed as,

$$\alpha_s = \alpha_1 + \sum_{j=2}^{5} \alpha_j \, Q_j(\overline{PE}_s) \qquad (1b)$$

where the contextual peer effects are now $\alpha_j$ ($j = 1, \dots, 5$) and are allowed to be different for each quintile of average parental schooling.
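To make the quintile indicators $Q_j(\overline{PE}_s)$ concrete, the following minimal sketch computes a weighted school-average parental education and assigns each school to a quintile. It is an illustration only, not the author's code: the function names, the numeric coding of parental education and the choice to form quintiles across schools (rather than across students) are assumptions made here.

```python
from collections import defaultdict


def school_average(records):
    """Weighted mean of parental education per school.

    records: iterable of (school_id, parental_education, student_weight).
    The numeric coding of parental education is an assumption of this sketch.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for school, pe, weight in records:
        num[school] += weight * pe
        den[school] += weight
    return {school: num[school] / den[school] for school in num}


def school_quintiles(averages):
    """Rank-based quintiles (1 = lowest average) across schools.

    Assumption: quintiles are formed over schools; ties keep dictionary order.
    """
    ordered = sorted(averages, key=averages.get)
    n = len(ordered)
    return {school: rank * 5 // n + 1 for rank, school in enumerate(ordered)}
```

Schools falling in quintile 1 would then play the role of the reference group, and the dummies for quintiles 2 to 5 correspond to the indicators entering eq. (1b).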
Even in this case, the impact of the peer group characteristics is only produced by a level effect, which operates through a modification of the educational production function intercept; in fact, once the expression (1b) is substituted into equation (1) we obtain,

$$T_{i,s} = \alpha_1 + \sum_{j=2}^{5} \alpha_j \, Q_j(\overline{PE}_s) + \beta X_i + \delta Z_s + \varepsilon_{i,s}. \qquad (2)$$

This corresponds to the standard equation used in the peer effects literature, except for the nonlinear specification of the contextual peer effects. Equation (2) clearly specifies that the standard approach constrains school composition spillovers so as to affect only the intercept term and no other parameters in the educational production function (even allowing for a non-linear effect). However, there is no theoretical reason to believe that the contextual peer effects consist only of a simple level effect. For example, an improvement in the socio-economic composition of the peer group might modify the gradient of the effect of a student's family background and home environment on his/her test scores. Additionally, belonging to a good peer group in terms of average parental human capital might relax the relationship between other school characteristics and an individual's achievement.

In order to capture any potential shape effect of school composition, we consider a reference group, which consists of all the students who belong to the least-advantaged schools in terms of average parental educational background. In the present application, the least-advantaged schools are defined as those schools that appear in the first quintile of the average parental education [7] (i.e. $Q_j(\overline{PE}_s) = Q_1(\overline{PE}_s)$). Therefore, the educational production function is separately estimated for the reference category, as in equation (3):

$$T_{i,s}\big(Q_1(\overline{PE}_s)\big) = \hat{\alpha}^{Q_1} + \hat{\beta}^{Q_1} X_i + \hat{\delta}^{Q_1} Z_s + \hat{\varepsilon}^{Q_1}_{i,s} = \hat{\psi}^{Q_1} R_{i,s} + \hat{\varepsilon}^{Q_1}_{i,s} \quad \text{if } Q_j(\overline{PE}_s) = Q_1(\overline{PE}_s). \qquad (3)$$

From the obtained parameter estimates ($\hat{\psi}^{Q_1}$), we then proceed to forecast the test scores for all the individuals who do not belong to the reference group, that is,

$$\hat{T}_{i,s}\big(Q_j(\overline{PE}_s); \hat{\psi}^{Q_1}\big) = \hat{\alpha}^{Q_1} + \hat{\beta}^{Q_1} X_i + \hat{\delta}^{Q_1} Z_s \quad \forall\, i \in Q_j(\overline{PE}_s),\ j > 1. \qquad (4)$$

Finally, for each successive quintile of the school-average parental education, the measure of school composition spillovers presented here consists of the average difference between the actual and the forecasted test scores within each quintile:

$$IEX_j = \frac{1}{N_j} \sum_{i=1}^{N_j} \Big[ T_{i,s} - \hat{T}_{i,s}\big(Q_j(\overline{PE}_s); \hat{\psi}^{Q_1}\big) \Big] \quad \forall\, i \in Q_j(\overline{PE}_s),\ j > 1. \qquad (5)$$

In other words, this measure of contextual peer effects consists of counterfactual evidence, which is based on the ceteris paribus within-quintile mean differential between the observed and the predicted test scores, where the latter is obtained by using the parameters estimated for students in the least-advantaged schools. More intuitively, this methodology represents a semi-parametric approach to capture the ceteris paribus change in the test scores, produced by moving a representative student from the first quintile to successive quintiles of the school-average parental education. Note that this measure of the effect of school composition captures in a semi-parametric way the change in each parameter making up the whole educational production function (both level and shape effects), produced by incrementing the average parental schooling from the first to the higher quintiles. In this way, we are able to provide more compelling and complete evidence about the effect of school composition on individual test scores, obtained without constraining these potential spillovers of the peer group's socio-economic status to operate only through a shift in the intercept term.

[7] As noted by Raymond & Roig (2010), the definition of the reference group is always subject to some degree of arbitrariness; in their case, they define the reference group as those productive establishments in which the average workers' human capital is equal to or less than eight years of schooling. This definition follows the logic that eight years of education corresponds to the compulsory length of education under the institutional framework that was then valid for individuals in their sample; moreover, it should represent those firms that chiefly employ unskilled workers. In our case, we consider it better to define the reference group in an endogenous way, i.e. dividing the sample into quintiles and taking the first one as the reference group. This definition allows us i) to consider schools as being more heterogeneous units than firms, and ii) to maintain a sufficient number of observations in the reference and non-reference groups.

3.1 School Composition and Selection Bias

This semi-parametric methodology is not, however, exempt from the most relevant empirical problem in the estimation of contextual peer effects, represented by the self-selection of students into schools and peer groups. In this paper we seek to reduce the bias produced by the sorting mechanism that allocates those students with a greater (lesser) endowment of unobserved abilities into better (worse) peer groups, which may bias our measure of school composition effects. Indeed, were this to be the case, the test score forecast for non-reference group students from eq. (3) would present a downward bias, pointing to an overestimation of the effect of school composition.
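For concreteness, the index $IEX_j$ of eq. (5), computed before any correction for sorting, can be sketched as follows. This is a deliberately simplified illustration rather than the estimation actually carried out in the paper: it assumes a single covariate in place of the full vectors $X_i$ and $Z_s$, fits the reference-group equation by plain OLS, and ignores plausible values and survey weights. All function and variable names are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Closed-form OLS for y = a + b * x (one regressor, a simplification)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def composition_index(scores, covariate, quintile):
    """Mean of (actual - forecast) within each quintile j > 1.

    The forecast uses coefficients estimated on quintile 1 only,
    mirroring the logic of eqs. (3) to (5).
    """
    ref_x = [x for x, q in zip(covariate, quintile) if q == 1]
    ref_t = [t for t, q in zip(scores, quintile) if q == 1]
    a, b = fit_line(ref_x, ref_t)  # reference-group parameters, eq. (3)
    index = {}
    for j in sorted(set(quintile) - {1}):
        gaps = [
            t - (a + b * x)  # actual minus counterfactual forecast, eq. (4)
            for t, x, q in zip(scores, covariate, quintile)
            if q == j
        ]
        index[j] = mean(gaps)  # within-quintile average, eq. (5)
    return index
```

In this toy setting a pure level shift and a change in slope both show up in the index, which is the sense in which the measure is not restricted to the intercept.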
In other words, even if we tried to account for selection on observable variables by conditioning for a large set of individual and school controls (similar to Rangvid 2007, see Section 4), we would not be particularly confident about the conditional zero mean of the error term in the test score equation estimated for the reference group (eq. 3). In line once more with Raymond & Roig (2010), rather than using a classification of reference and non-reference groups based on actual school-average parental education, students were allocated to reference and non-reference groups on the basis of their predicted linear score obtained from an ordered probit model, which estimates the probability of membership in each of the five quintiles of the average parental education at the school level. Specifically, we computed the predicted linear score that represents a proxy of the (latent) parental human capital in each school, obtained from the following equation:

$$PE_i^{*} = \gamma W_i + \mu_i \;\Rightarrow\; \widehat{PE}_i^{*} = \tilde{\gamma} W_i \qquad (6)$$

The explanatory variables that are specifically included in the vector ($W_i$) in eq. (6) comprise a set of dummies for school availability (one, and more than one, school available) and the student's age on arrival in Spain (for immigrants), as well as region and municipality size control variables that also appear in the test score equation (to capture unobserved school characteristics that are common within regions and municipalities of similar dimensions). Subsequently, the observations are sorted according to the quintiles of the predicted linear score ($\tilde{\gamma} W_i$); this proxy of the schoolmates' parental human capital would be correlated to the school-average parental education, but at the same time it can be considered as independent of a student's unobserved abilities. Therefore, we take as our reference group those students in the first quintile of the predicted parental schooling and we estimate the test score equation for them as

$$T_{i,s}\big(Q_1^{*}(\tilde{\gamma} W_i)\big) = \hat{\alpha}^{Q_1^{*}} + \hat{\beta}^{Q_1^{*}} X_i + \hat{\delta}^{Q_1^{*}} Z_s + \hat{\varepsilon}^{Q_1^{*}}_{i,s} = \hat{\psi}^{Q_1^{*}} R_{i,s} + \hat{\varepsilon}^{Q_1^{*}}_{i,s} \quad \text{if } Q_j^{*}(\tilde{\gamma} W_i) = Q_1^{*}(\tilde{\gamma} W_i). \qquad (7)$$

It is then possible to re-compute the index of school composition spillovers in the same fashion as above, but now without such a marked effect of the self-selection of students into peer groups:

$$IEX_j^{*} = \frac{1}{N_j^{*}} \sum_{i=1}^{N_j^{*}} \Big[ T_{i,s} - \hat{T}_{i,s}\big(Q_j^{*}(\tilde{\gamma} W_i); \hat{\psi}^{Q_1^{*}}\big) \Big] \quad \forall\, i \in Q_j^{*}(\tilde{\gamma} W_i) = Q_j(\widehat{PE}_i^{*}),\ j > 1. \qquad (8)$$

Similarly to the IV estimation, we exploit the between-student variability of school availability and, in the case of immigrants, of arrival age, within municipalities of the same dimension within the same region. Again, in line with IV, a valid exclusion restriction is needed to rule out endogenous student sorting.
We consider that once controlling for parental education, socio-economic status and many other family characteristics in the test score equation (see the next section for details), we can reasonably assume that the only channel through which school availability and age on arrival might affect a student's test score is via the effect of school selection (i.e. they are independent of unobserved student characteristics). If this is true, eq. (7) is correctly estimated [8] and the measure of socio-economic school composition obtained from (8) is now cleansed of the potential endogenous selection of students into schools.

[8] Notice that the composite error term in eq. (1) may assume the general form $\varepsilon_{i,s} = \eta_i + \nu_s + \varsigma_{i,s}$, which means that apart from individual unobserved ability ($\eta_i$), unobserved school characteristics ($\nu_s$) may also cause some bias in the results. However, we are not able to deal explicitly with this problem using the PISA database. We are, therefore, forced to assume that the correlated school effects are zero once conditioned by a school's characteristics, at least in the case of the reference group. It should be borne in mind that, should this assumption prove invalid, the results presented in what follows may still be affected by the presence of some unobserved correlated school effects.
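The re-sorting step behind eqs. (6) to (8) can be illustrated with a short sketch. It assumes that the ordered probit coefficients $\tilde{\gamma}$ have already been estimated elsewhere (the estimation itself is not shown), and the function names are hypothetical. Only the construction of the predicted linear score and of the artificial quintiles is covered.

```python
def linear_score(gamma, w_row):
    """Predicted linear score for one student: the inner product of
    previously estimated coefficients and the student's W vector."""
    return sum(g * w for g, w in zip(gamma, w_row))


def assign_quintiles(values):
    """Rank-based quintiles 1..5; ties are broken by original order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    quintile = [0] * n
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // n + 1
    return quintile


def artificial_groups(gamma, w_rows):
    """Quintile of the predicted score for every student.

    Students with quintile 1 form the artificial reference group on which
    the test score equation would be re-estimated, as in eq. (7).
    """
    scores = [linear_score(gamma, row) for row in w_rows]
    return assign_quintiles(scores)
```

The resulting quintile labels would replace the actual school-composition quintiles when recomputing the within-quintile gap between observed and forecast scores.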

Nevertheless, we recognize that the choice of the exclusion restrictions (school availability and age on arrival) is not free of criticism. It is quite obvious that both variables might have an effect on the probability of being in a given quintile of the school-average parental education. What is not so immediately obvious is the belief that, having controlled for a large set of family characteristics, these variables are completely orthogonal to a student's unobserved characteristics. In order to ensure a greater degree of reliability for our results, in sub-section 5.1 we provide an intuitive falsification test for the validity of the exclusion restrictions used here, which is basically aimed at showing that these variables are not likely to contribute to school composition spillovers (but they do explain the likelihood of membership in reference and non-reference groups).

4. Data Description

As discussed above, the empirical analysis is based on Spanish data from the 2006 Programme for International Student Assessment (PISA), undertaken by the OECD (see OECD 2009 for details). PISA focuses on the acquisition of skills in reading, mathematics and science among a target population of students aged 15 to 16. The 2006 assessment was specifically concerned with the testing of science skills and as such is the only skill considered in this study [9]. In the specific case of Spain, the students interviewed were drawn from a cohort of individuals born in 1990 and enrolled in lower-secondary school (Educación Secundaria Obligatoria, ESO) during the survey year. As outlined earlier, Spanish lower-secondary education is completely comprehensive and compulsory until the age of 16. Normally, 15-year-old pupils will be enrolled in the 4th grade of lower-secondary education; however, the sample contains students from lower grades as well (3rd, 2nd and 1st grades), representing those who have repeated one or more grades. The original Spanish sample comprised 19,604 students enrolled at 686 different schools.
The PISA survey has several statistical peculiarities that must be taken into account in the estimation phase. First of all, the skills assessment was carried out using five Plausible Values for each field, which are then normalized to obtain a global average of 500 and a standard deviation of 100. This technique, derived from Item Response Theory, allows students' (latent) skills to be represented consistently when the number of submitted items is too small to represent true individual ability. Moreover, the structure of the final sample must also be taken into account, given that it is the product of a complex two-stage stratification procedure used to ensure that the entire population is represented. Specifically, the first step consists in the stratified selection of schools with 15- to 16-year-olds enrolled in their classes, with sampling probabilities that are proportional to the number of eligible students enrolled; in the second step, a given number of students are randomly selected within each sampled school (up to 35). In order to take into account the specific statistical properties of the PISA sample, all the statistics and estimations that we present in this study have been carried out with the STATA routine pv, specifically designed for PISA and similar surveys (Macdonald 2008, Lauzon 2004).

The PISA survey contains, apart from the plausible values of the test scores, an extensive (but often not exhaustive) battery of questions about a student's and his/her family's characteristics, as well as several other school characteristics. The empirical analysis has been conditioned to the information drawn from a large subset of relevant questions so as to limit the role of the unobservable variables (following Rangvid 2007 and the OLS specification of Schneeweis & Winter-Ebmer 2007, given the available variables). The whole set of control variables are reported in the complete version of this work (see Di Paolo 2010), together with the exact definition of each variable, its mean and standard deviation. In summary, the conditioning variables can be divided into individual controls (sex, grade attended, age, migration status and the language spoken at home), family controls (paternal and maternal education, family socio-economic status, maternal working situation, number of books at home and educational resources), school controls (prevalence of immigrants, girls and part-time teachers, lack of qualified teachers, school autonomy, student/teacher ratio, school size, school ownership, streaming processes, career guidance employees and presence of computers for instruction) and territorial controls (municipality size and region) [10].

[9] Despite this, the 2006 survey also contains information about reading and mathematics skills. Attention is limited here to the science domain for reasons of space. The results for the other two skills are qualitatively similar, and are available upon request from the author.
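As an illustration of how estimates based on the five plausible values are usually combined, the sketch below computes a statistic once per plausible value and then averages the results. It is not a reimplementation of the STATA pv routine; the function names are hypothetical, and the sampling variance, which PISA documentation derives from replicate weights, is deliberately left out.

```python
from statistics import mean, variance


def weighted_mean(values, weights):
    """Mean of one plausible value, using the final student weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def combine_plausible_values(estimates):
    """Combine one estimate per plausible value.

    Returns the point estimate (simple average across plausible values)
    and the imputation component of the variance, (1 + 1/M) times the
    between-plausible-value variance. Sampling variance is not included.
    """
    m = len(estimates)
    point = mean(estimates)
    imputation_component = (1 + 1 / m) * variance(estimates)
    return point, imputation_component
```

For example, a weighted mean science score would be computed separately for each of the five plausible values with `weighted_mean`, and the five results passed to `combine_plausible_values`; the same pattern applies to regression coefficients.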
As usual, we also generated indicator functions for observations with missing information for the explanatory variables, in order to control for the non-randomness of the missing values; in the case of missing information, the explanatory variables are fixed as being equal to zero.

As a measure of the socio-economic composition of the school we consider the school-average parental education, taking the highest educational level completed by one or other of the two parents [11]. Observations with missing information about the highest parental education have been discarded from the sample (2% of the total sample). Since our school composition measure consists in the school-average value, we also discarded the forty-two observations of students that are enrolled in schools with fewer than eight students. In the end, the sample used in the empirical analysis was formed by 19,164 students at 675 different schools.

[10] Notice that we also retain information about the availability of neighbouring schools within the same area and about the student's age on arrival in Spain (for first-generation immigrants). These are included in eq. (6) only.

[11] The final student weight provided in the PISA database has been used in the computation of the school composition variable. This should reduce the imprecision in the school composition measure obtained from PISA data, where (as commented above) not all the students from every school are sampled. Whatever the case, the results are insensitive to the exclusion of the final student weight in the computation of the school-average parental education. Notice also that the mean peer characteristic is usually computed without the contribution of the individual (because this might cause a reflection problem when the average value of the peer group is used as an explanatory variable). In this case, where school-average parental education is only used to define reference and non-reference groups, this complication is not necessary; in any case, the results are virtually unchanged when the average parental education does not include the individual's contribution (the results are available upon request). See Table 1A in the Appendix for more details about the school composition variable.

5. Results

5.1 Level Effects of School Composition

The starting point for this empirical analysis was an estimation of the educational production function as described by eq. (2), in which the school composition measure was allowed to be non-linear (dummies for school-average parental education quintiles), but constrained so as to produce only a level effect. As reported in Table 2 in the complete version of the paper (Di Paolo 2010), the results indicate that moving from the first quintile to the second quintile of average parental schooling at the school level had only a slightly significant impact (7 points) on the science test score. However, the ceteris paribus comparison between students in the first quintile and those in the third revealed that students in the latter group performed significantly better than the reference group, showing a positive score gap of about 14 points. This positive level effect of school composition fell somewhat when moving to the fourth quintile (11 points). Finally, the test score for students in the most-advantaged group in terms of school composition (fifth quintile) was, on average, 24 points higher than the score for students in the least-advantaged group. This means that an improvement in the school's socio-economic composition had a substantial level effect on individual test scores, and that this appears to be non-linear in the quintiles of average parental human capital at the school level.

The estimates for the remaining control variables are of independent interest, and it is worth briefly commenting on the main findings. The increase in students' age was positively associated with the test score, whereas females seemed to obtain worse results than males in science. The effect of the grade attended was as expected, given that students from lower grades than that of the fourth grade (the standard grade at age 15 and 16) performed significantly worse.
Even accounting for the language spoken at home and other family characteristics, first-generation immigrant students performed significantly worse than natives and second-generation immigrants (negative gap of 25 points). An improvement in a family's socio-economic status had a marked positive effect on the science test score, while only maternal education showed a significant and positive effect on a student's competence for the sciences. Children of working mothers performed markedly better than those whose mothers did not work, with a ceteris paribus average increase of 10 points in the test score. In addition, the number of books and a home endowment of educational resources also had a significant positive effect on the test scores.
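
As an illustration of the school composition measure that underlies the quintile dummies discussed in this subsection, the following minimal Python sketch computes a weighted school-average parental education, optionally leaving out the student's own contribution, and then assigns each student to a quintile of that average. It is a sketch under stated assumptions rather than the routine actually used in the paper: the field names (`school`, `pared`, `weight`), the function names and the choice of taking the quintile cut points over the student-level distribution are all hypothetical.

```python
import statistics
from collections import defaultdict


def school_average(records, leave_one_out=False):
    """Weighted school-average parental education, one value per student.

    records: list of dicts with hypothetical keys 'school', 'pared'
    (highest parental education) and 'weight' (final student weight).
    """
    totals = defaultdict(float)   # sum of weight * pared per school
    weights = defaultdict(float)  # sum of weights per school
    for r in records:
        totals[r["school"]] += r["weight"] * r["pared"]
        weights[r["school"]] += r["weight"]

    averages = []
    for r in records:
        num, den = totals[r["school"]], weights[r["school"]]
        if leave_one_out:
            # Remove the student's own contribution from the school mean.
            num -= r["weight"] * r["pared"]
            den -= r["weight"]
        if den == 0:
            raise ValueError("school has no other sampled students")
        averages.append(num / den)
    return averages


def quintile_groups(values):
    """Map each value to a group 1..5 using the four quintile cut points.

    Assumption: cut points are taken over the student-level distribution,
    so students with the same school average share the same group.
    """
    cuts = statistics.quantiles(values, n=5)
    return [1 + sum(v > c for c in cuts) for v in values]
```

Under this convention the first group would play the role of the reference category, and the remaining four groups would enter the level-effect specification as dummy variables.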

An analysis of school control variables revealed the usual result for PISA data, i.e. school characteristics control variables were hardly significant when explaining student test scores. Therefore, we shall only describe in brief the few variables that displayed statistically significant coefficients. We detected a positive effect of the percentage of girls attending a school, whereas the increase in the ratio of personal computers for instruction to school size had a negative impact on the science test score. After accounting for family characteristics, a school's socio-economic composition and other school characteristics, it was found that public schools performed significantly better than private and public-funded private schools. Finally, students enrolled at schools that can hire teachers autonomously seemed to achieve better results than their counterparts. The evidence obtained from the territorial control variables indicated that being schooled in a large city has a positive effect on science attainment; moreover, the coefficients associated with regional dummies (not shown here) suggested that Catalonia and the Basque Country performed significantly worse than the rest of Spain's regions.

5.2 Accounting for Shape Effects and for Selection Bias

The results obtained from the estimation of eq. (2) suggest a significant and positive effect of the school socio-economic composition. However, as previously highlighted, this result may merely represent partial or incomplete evidence, given that we implicitly constrained the impact of the school-average parental education so as to affect only average attainment (i.e. the intercept of the educational production function). In order to capture any other potential slope effect produced by an improvement in the school endowment of parental human capital, we implemented the innovative methodology described above. Panel A of Table 3 contains the estimated value of our measure of school composition spillovers (eq. 5).
We computed IEX_j separately for each quintile of the school-average parental education and also calculated the mean value for all the quintiles (except that of the first, which is the reference category). The results from the semi-parametric methodology confirmed that the effect of the parental education of the peer group was substantial and clearly non-linear. As in the previous case, moving from the least-advantaged group to the second quintile of the school socio-economic composition had almost no effect on individual test scores (almost 5 points, but not statistically different from zero), whereas the step to the third quintile produced a positive increase of about 12 points. However, the movement to higher quintiles generated substantial (and positive) slope effects, which were hidden by the implicit constraint of eq. (2). Indeed, school composition effects could be quantified into 26 additional test score points for students in the fourth quintile of the average parental schooling and up to 71 points for students in the highest quintile. Additionally, the mean value for all the non-reference groups was also statistically significant, approaching 28 test score points.

12 The estimates of the educational production function for the reference group (eq. 3 and eq. 7) are not reported here for reasons of space, but are available upon request; in general, the results are conventional and qualitatively similar to those reported in Table 2.

[TABLE 3 ABOUT HERE]

However, these results may well be biased by the fact that students with a better endowment of unobserved abilities are more likely to enrol in the better schools (in terms, that is, of their socio-economic composition). In order to reduce this potential selection bias, we first estimated eq. (6) using an ordered probit model, the dependent variable of which was the five quintiles of the actual school-average parental education. The estimates (see Table 2A in the Appendix) indicate that immigrant pupils who arrived in Spain at an earlier date are significantly more likely to be enrolled in schools where their schoolmates' parental education is higher; moreover, conditional on region and municipality size, the chance of being in better schools is also higher for those who reside closer to other schools. In general, the variables included provide a satisfactory explanation of the probability of being in each of the quintiles of the school-average parental education. Subsequently, we used the predicted linear score to obtain a proxy of the school-average parental human capital that was independent of the student's unobserved characteristics.

When students were sorted into reference and non-reference groups according to the predicted linear score, the evidence concerning school composition spillovers was markedly different. As reported in the lower panel of Table 3, the mean effect for all the non-reference groups was statistically non-existent, which is the result of a clear convexity of school composition effects with respect to the different quintiles of PE. In fact, students from the second quintile of the proxied average parental education were penalized by about 10 points with respect to pupils at the least-advantaged schools (i.e.
the reference group), and the spillover effects for students in the third quintile were not statistically different from zero. In addition, when the selective sorting of pupils into schools was accounted for, the effect of school composition was strongly reduced for students enrolled in the better endowed schools (about 15 test score points for both the fourth and the fifth quintiles). This evidence suggests that, especially for students in the highest quintile of the average parental schooling, there is a considerable sorting process in their favour with respect to less-advantaged students. Summing up, a significant contextual peer effect was still detected, but it seemed to generate a positive and modest spillover only in those schools where the average level of parental education was higher. However, the process of student sorting would seem to be even more important than the externality produced by the socio-economic origins of an individual's schoolmates.
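
The re-sorting step described in this subsection can be sketched as follows. This is only an illustrative outline under stated assumptions, not the estimation code behind Table 3: the ordered probit coefficients are taken as given (in practice they would come from estimating eq. (6)), the covariate layout and function names are hypothetical, and cutting the predicted score at its own quintiles is an assumption about how the artificial groups are formed.

```python
import statistics


def linear_score(x, beta):
    """Predicted linear index x'beta of the ordered probit (no cut points)."""
    return sum(b * v for b, v in zip(beta, x))


def resort_by_score(rows, beta):
    """Assign students to artificial groups 1..5 by predicted linear score.

    rows: one covariate vector per student (hypothetical layout).
    beta: coefficients assumed to be already estimated from eq. (6).
    Group 1 plays the role of the reference group.
    """
    scores = [linear_score(x, beta) for x in rows]
    cuts = statistics.quantiles(scores, n=5)  # four quintile cut points
    return [1 + sum(s > c for c in cuts) for s in scores]
```

Because the groups depend only on observed covariates and the estimated coefficients, a student's placement no longer reflects the school actually attended, which is the sense in which the sorting is artificial.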

5.3 Robustness Checks

The evidence presented above sought to determine whether a more favourable school composition produced better individual results in the science test scores (ceteris paribus). The baseline results seem to suggest that the exogenous characteristics of an individual's schoolmates (the contextual peer effect, here defined in terms of parental education) exert a positive externality on the individual acquisition process of competences in this field. This spillover was even higher when we considered not only the level effect, but also the whole slope effects in the educational production function. However, these results are likely to have been confounded by the presence of an endogenous sorting process that allocated the students with a better endowment of unobserved abilities to the better schools (in terms of their socio-economic composition). On attempting to reduce this potential bias, the results are markedly different: there was a small and positive effect of the school socio-economic composition only in those schools where the average level of parental education was considerably higher.

Whatever the case, these results might still be biased if the variables used as exclusion restrictions had been systematically related to a student's unobserved ability. Recall that the validity of these results is based on the assumption that, having controlled for the father's and mother's education, the family socio-economic status, migration status, language spoken at home and other family characteristics, the presence of one or more available schools and the arrival age for immigrant children are unrelated to the unobserved abilities. Unfortunately, there is no formal way to prove the validity of this assumption, given that it involves elements that are, by definition, unobservable. Even so, we have provided an intuitive falsification test, which helps us to corroborate our excludability assumption.
This test is based on the idea that if the excluded variables had had some effect on the test score equation (even via correlation with the unobservables), including them in the equation for the reference group would have modified the results obtained with our measure of peer effects. In fact, the logic behind the exclusion restrictions is that these variables only affect the test scores (for the reference group) through their effect on the probability of being in each quintile of the school-average parental education. First, we performed several statistical tests to analyse the significance of the variables excluded from the educational production function; the results (not shown here) suggest that both variables (individually and jointly) do not differ from zero at any conventional level of significance. Moreover, we gradually included the dummies for school availability and age on arrival in the test score equations (3) and (7) and, then, we recomputed the measures of school composition effects (5) and (8), without and with the endogenous sorting correction respectively. The results, reported in Table 4, showed i) the baseline measure of spillovers to be indistinguishable from the original one computed without the excluded variables; in addition, ii) the results were only slightly different (but identical in statistical terms) when students were re-sorted into reference and non-reference groups according to eq. (6) and the two variables were included in the test score equation for the reference group. In principle, if the excluded variables had had an effect on a student's test scores and/or had contributed to explain a school composition effect, we would have observed a marked alteration in the measures proposed in this paper. The fact that no significant changes are observed when school availability and age on arrival are included in the test score equations makes the excludability assumption made in this paper more reliable. Whatever the case, it should be borne in mind that, were another sorting mechanism to be operating (especially with respect to schools' unobservable characteristics), the results could still contain some bias and must be considered with caution.

6. Conclusions

Drawing on PISA 2006 data (primarily the science test scores), this paper has investigated the effect of school composition on Spanish secondary schools. A novel methodology has been implemented to measure the spillovers produced by one specific exogenous characteristic of a student's schoolmates: namely, we treated the highest level of education completed by the parents as a measure of the school socio-economic composition. The proposed methodology relaxes the implicit constraint common to any peer effect study whereby the contextual element can only affect the average outcome through an intercept shift (i.e. a level effect). When accounting for all the changes in the educational production function parameters generated by an improvement in the school socio-economic composition (level and slope effects), it was found that school composition effects are substantial and significantly higher than those obtained with the constrained specification.
More specifically, the results indicate that moving from the least-advantaged schools (those in the first quintile of the school-average parental education) to better endowed schools improves the science test score in a non-linear way, with the positive effect being particularly pronounced for pupils from top schools, i.e. those enrolled in schools where most of the parents had completed upper-secondary or tertiary education (fifth quintile of the school-average parental education). However, this preliminary evidence should not be understood as being the pure contextual effect of the school socio-economic composition, given that it might be confounded by the presence of correlated effects. This paper has explicitly attempted to deal with the endogenous selection process whereby students endowed with higher unobserved abilities are allocated to better schools (in terms of their socio-economic composition). This was achieved by re-sorting students according to a predicted linear score obtained from an ordered probit model, which estimates the probability of membership of each quintile of school-average parental education.

It is argued that, by proxying parental human capital, students can be re-sorted in a way that is uncorrelated with unobserved individual characteristics. When school-composition spillovers were recomputed on the basis of this artificial re-sorting, the evidence was significantly different. The externalities produced by the parental human capital of schoolmates were drastically reduced and they were moderately positive only when the school socio-economic composition was comparatively high (in the fourth and fifth quintiles). Moreover, additional evidence concerning the asymmetries of the effect of school composition (see Di Paolo 2010) revealed major differences between males and females and between high and low performance students. It seems that the results of male students are more closely affected by endogenous sorting than they are by the exogenous characteristics of their schoolmates; by contrast, the results of their female counterparts are more sensitive to the positive contextual effect, given that school composition effects were greater when cleaned of self-selection. Furthermore, the subgroup of low performance students appears to be positively affected by an improvement in school-average parental background, even after accounting for the presence of endogenous sorting.

Having said this, it is important to bear in mind the potential pitfalls of this study, which are linked primarily to the limitations of the database drawn upon. First, it is quite likely that the reported effects of school composition are a lower bound of the true impact, given that the limited sampling of students in PISA might cause some attenuation bias in our estimates (compare Micklewright et al. 2010). Second, in the case where selection is made on the basis of a school's unobserved characteristics (i.e., those that are not captured by the extensive list of school controls included here), the measures of school composition effect might still contain some bias.
Third, if the variables used as exclusion restrictions are in some way correlated with unobserved student abilities, the methodology adhered to here for reducing the bias generated by endogenous sorting would not be effective at all.

Whatever the case, and even taking these potential limitations into consideration, the evidence presented above makes important contributions to the on-going public debate concerning school laws and the (re)allocation of certain types of students into (other) schools. First, the relevance of endogenous student sorting raises the question as to just how equitable and efficient the zoning laws regulating access to Spain's secondary schools are. This becomes a matter of urgency when it is seen that, with the self-selection of students ruled out, the positive impact of enhancing a school's socio-economic composition is only possible when the average parental educational background is comparatively high. This result would seem to suggest that the zoning laws are actually impeding students of a low socio-economic background from benefiting from a more favourable socio-economic school environment, given that they appear to lead to the concentration of such students in disadvantaged school environments. This is because families of lower socio-economic standing tend to locate systematically in certain