Inequality in tertiary education systems: which metric should we use for measuring and benchmarking?

by Béatrice d'Hombres

Produced as background for the World Bank study "Equity of access and success in tertiary education", with funding from the Bank Netherlands Partnership Program (BNPP).

1 Introduction

There is a very large literature on income inequality as well as on some non-income dimensions of inequality such as health inequality. A variety of metrics have been proposed to operationalize inequality and to allow cross-country comparisons, each with its own advantages and disadvantages. Studies on education inequality and inequitable attainment of education across countries are much more limited, with the notable exceptions of Thomas et al (2001), Zand and Li (2002), and Barros et al (2009). Drawing upon the existing literature on income inequality, this note discusses the quantification of disparities in tertiary education and, more precisely, examines which of the common metrics could be used for benchmarking the inequity dimension of tertiary education systems. We distinguish the summary indices that are appropriate for cross-country comparisons from those that are better suited to country-specific studies.

Our review suggests that cross-country comparisons should focus on disparities across social groups that exist and can be compared across countries. This restriction limits the benchmarking exercise to measuring and comparing disparities in tertiary education by sex and/or income quintile. In addition, given that such benchmarking requires indicators that are comparable across countries and over time, efforts should first be put into compiling education indicators at the aggregated (country) level, broken down by equity groups. The simple measures of dispersion presented in section 3.1 may therefore be the most appropriate indicators of disparities in tertiary education. When social groups are defined on the basis of income groups, the analysis could be complemented by the concentration index or regression-based indices, presented in sections 3.2.1 and 3.2.7 respectively.

Country-specific studies would allow a more in-depth investigation of disparities in tertiary education. First, various equity groups could be covered by the analysis, in addition to the two groups mentioned before. It might be relevant in some countries, for instance, to scrutinize disparities in tertiary education across ethnic or religious groups. Second, the data at hand will likely come from household surveys or surveys of tertiary students, so the analysis will be carried out at the individual level. As a corollary, the datasets employed will most probably contain, in addition to the educational status of the respondents, several variables on the personal characteristics of the persons interviewed (age, family structure, province of residence, etc.). As a result, it should be possible to compute scalar summary measures of the dispersion of education across social groups while controlling for other individual characteristics that are simultaneously correlated with education and with the social groups to which the individuals belong.

The appropriate inequality indicator for such country-specific studies then depends on the characteristics of the variable used to define the equity groups. For measuring the inequality of opportunity across economic groups, it seems logical to rely on the concentration index or on regression-based indices. Similarly, sex disparities might be examined with regression-based indices. Finally, when there is no inherent ordering among social groups (religious or ethnic groups) and the groups are numerous (more than 2 ethnic or religious groups), it should be appropriate to rely on entropy indices (section 3.2.4) and/or the dissimilarity index (section 3.2.6).

The structure of the paper is as follows. Section 2 includes preliminary considerations on the concept of inequity and a discussion of the characteristics of the dataset and education indicators that should be available for empirically estimating the disparities across social groups in tertiary education. It also presents the different equity target groups and their relevance for cross-country comparisons and country-specific studies. Section 3 describes several inequality indicators. This section seeks to (i) highlight the main advantages and shortcomings of each of those indicators and (ii) stress in which context (cross-country benchmarking or within-country analysis) they could be employed. Section 4 concludes and makes some recommendations.

2 Measuring inequity in tertiary education systems: some preliminary considerations

2.1 Inequality versus Inequity

There is often confusion surrounding the concepts of inequality and inequity. Inequality refers to differences between groups without any consideration of the fairness of these differences while, in contrast, inequity presupposes an ethical judgment. In a recent report, the OECD (2008) defines equitable tertiary systems as "those that ensure that access to, participation in and outcomes of tertiary education are based only on individuals' innate ability and study effort. They ensure that the achievement of educational potential at tertiary level is not the result of personal and social circumstances, including of factors such as socio-economic status, gender, ethnic origin, immigrant status, place of residence, age or disability." In other words, although differences between individuals are interesting in themselves, if we want to develop a policy message, it is necessary to distinguish variations in educational outcomes that are driven by differences in individuals' efforts from those linked to factors that are beyond the individuals' control.

2.2 Data

2.2.1 Vertical versus horizontal dimensions of equity

Assessing the vertical dimension of equity requires information on the:
1. Admission and enrollment in tertiary education of students having completed a secondary education
2. Progression of students enrolled in tertiary education
3. Completion of tertiary studies of students enrolled in tertiary education
4. Tertiary education learning outcomes

A comprehensive understanding of the issue under study would also require information on the organization of tertiary education systems. In particular, the horizontal dimension of equity depends on the level of diversification of tertiary education systems and refers to the type of institution or programmes attended by the different equity groups as well as the associated outcomes on the labour market. In the context of building an index of equity for benchmarking, the focus should probably first be limited to the vertical dimension of equity. Country-specific studies, however, could also consider the horizontal dimension of equity, conditional on data availability.

2.2.2 Individual and aggregated data

The data used to examine the educational disparities in tertiary education will be drawn either from surveys collecting information at the individual level or from various aggregated indicators collected by institutions involved in the production of statistics.

At the individual level, the information should mainly come from household surveys or surveys of tertiary education students.

Household surveys are available in a large number of countries. Theoretically, such data could be used both for country-specific studies and cross-country comparisons. The population under consideration includes adults who had completed their education at the time of the interview. Such data will, unfortunately, give a picture not of the current level of inequity in tertiary education systems but of the past. In addition, the sample will include individuals who undertook tertiary studies at different periods of time. In terms of education-related information, we will have at most variables on the educational attainment of the respondents (number of years of completed education, or the graduation level of respondents: primary, secondary or tertiary). In other words, while household surveys might allow us to canvass the equity dimension of tertiary education systems in terms of enrollment and completion rates, information on the progression of students during tertiary studies will be more difficult to obtain. Household surveys will also contain information on the characteristics of individuals (sex, age, region of residence, etc.), and this could be taken into account to refine the analysis of inequity in tertiary education. In most cases, however, family background-related information (socioeconomic status of the parents of the respondents) will not be available. This implies that the use of such data might be problematic for cross-country comparisons. Instead, household surveys should be of wider interest for country-specific studies.

Surveys of tertiary students should offer much more information on the educational performance of the respondents, in particular assessments of higher education learning outcomes through scores obtained in different kinds of tests on key competencies. Surveys of tertiary students usually contain detailed information on the personal characteristics and the family environment of the respondents. However, such surveys are implemented in a limited number of countries. The OECD initiative (AHELO), which will eventually allow for cross-country comparisons of higher education learning outcomes, is still at an early stage. For the time being, when available, this source of information can thus only be employed for country-specific studies.

At the macro level, aggregated indicators provided by national statistical offices and/or tertiary institutions might be the most useful. The indicators on education can be aggregated at various unit levels. Such indicators are appropriate for cross-country and time comparisons if they are broken down by equity groups (e.g. country mean level of educational attainment by equity group).

2.2.3 Education variables

At the individual level, the education variables will generally take the form of discrete variables, i.e. variables which can be broken down into separate categories but where no fractions are possible. For instance, participation in tertiary education is a discrete variable since there are only two possible educational statuses for a given individual: the individual is participating or not participating in tertiary education. Similarly, the completion of tertiary studies is a discrete variable. When the number of categories is equal to 2, the discrete variable is called a binary discrete variable. When the discrete variable can take more than 2 different values, the variable is more generally considered to be a categorical variable.

The use of surveys of tertiary students could also give access to continuous educational variables, i.e. variables that can take on any value from an interval of real numbers. An education variable defined by the score obtained by tertiary students in a writing skill test is a continuous variable. However, as mentioned earlier, survey data on tertiary students are only available for a very limited number of countries, and so can only be employed in the context of country-specific studies. If we rely on individual datasets for carrying out cross-country comparisons of educational disparities, then the educational variables at hand will be discrete variables. [1]

At the macro level, the education variables will be continuous variables. In other words, if we use indicators aggregated at the country level, the educational variables employed for the empirical analysis could be, for instance, the enrollment rate in tertiary education, the completion rate of tertiary studies, or the survival rate in tertiary education, with the education variable taking on the country mean value by equity group.

[1] This is problematic for several of the inequality indicators, with the following two consequences in particular. First, as we will see later, we will not be able to measure pure inequality among individuals and, second, we will often have to work with grouped data (i.e. aggregated indicators) in order to express the education variables on a ratio scale.
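As a rough illustration of how a binary individual-level variable becomes a continuous group-level indicator, the sketch below averages a 0/1 enrollment flag within each equity group. The records, the group labels (`Q1`, `Q5`) and the function name `group_rates` are hypothetical and invented only for illustration; they do not come from any survey discussed in this note.

```python
from collections import defaultdict

# Hypothetical individual records: (equity group, enrolled in tertiary education: 1 or 0).
# These values are invented for illustration only.
records = [
    ("Q1", 0), ("Q1", 0), ("Q1", 1), ("Q1", 0),
    ("Q5", 1), ("Q5", 1), ("Q5", 0), ("Q5", 1),
]


def group_rates(rows):
    """Return the mean of the binary education variable for each equity group."""
    totals = defaultdict(int)  # number of enrolled individuals per group
    counts = defaultdict(int)  # number of individuals per group
    for group, enrolled in rows:
        totals[group] += enrolled
        counts[group] += 1
    return {g: totals[g] / counts[g] for g in counts}


rates = group_rates(records)
print(rates)
```

The resulting group means are expressed on a ratio scale, which is what the range, ratio and regression-based measures operate on when applied to grouped data.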

Table 1: Dataset, education indicators and level of analysis

| Dataset | Education variable | Type of analysis |
|---|---|---|
| Individual data: 1 - Household surveys | Discrete variable | Country-specific studies; cross-country comparisons |
| Individual data: 2 - Surveys on tertiary students | Discrete variable; continuous variable | Country-specific studies |
| Aggregated indicators | Continuous variable | Cross-country comparisons |

2.2.4 Equity groups

There are several personal and social circumstances that can lead to inequitable access to tertiary education, including economic status, gender, ethnic origin, or immigrant status. For constructing an index of inequality of opportunity in tertiary education, the equity group should be (1) comparable across countries and (2) measured with an indicator available on an international scale. The concept note underpinning this entire project, Equity and Access to Tertiary Education, considers the following 4 equity target groups: (1) individuals from the lower income groups, (2) individuals from groups with a minority status defined on the basis of their ethnic, linguistic, religious, cultural or age characteristics, (3) females, and (4) people with disabilities. The conditions above limit cross-country analysis to the first and third equity target groups. All four equity groups can be examined in the context of country-specific studies on inequity. Indeed, while it might be relevant to study ethnic, religious, or linguistic disparities in access to tertiary education in a given country, this is more problematic for international comparisons, simply because ethnic or linguistic heterogeneity is country-specific.

Table 2: Equity groups and level of analysis

| Equity groups | Type of analysis |
|---|---|
| 1 - Individuals from the lower income groups | Country-specific studies; cross-country comparisons |
| 2 - Individuals from groups with a minority status | Country-specific studies |
| 3 - Females | Country-specific studies; cross-country comparisons |
| 4 - People with disabilities | Country-specific studies |

Note that the first equity group considers the educational disparities among individuals belonging to different categories of income. The variable used to define the equity groups is thus an ordinal variable measured on an interval scale. This means that there is an inherent ranking of the income groups, which can be ordered from the poorest to the richest. This is a fundamental difference from the variables employed to define the other equity groups. Indeed, there is no inherent ordering among racial or religious groups, and thus the variables used for defining those equity groups are non-ordinal categorical variables.

2.3 Quality criteria

The legitimacy and applicability of each disparity indicator should be evaluated along the following dimensions:

Is the indicator easy to compute and understand for non-statisticians?

Is the indicator adapted to the variables used for monitoring the tertiary education systems, that is, both the variable used for measuring educational performance and the variable used for defining the equity groups?

Is the indicator sensitive to the social gradient in education? That is, does the indicator only provide a summary measure of inequality, or does it also tell us which are the most disadvantaged groups?

Does the indicator provide us with information about the whole distribution of the education variable?

Does the indicator meet the "desirable" statistical properties underlined in the literature on income inequality? Three properties are of particular interest:

Principle of transfers (Pigou-Dalton condition): in the income literature, a transfer of income from a richer to a poorer individual results in a reduction in the indicator of disparity, assuming that the income of other individuals remains unchanged and the transfer is not large enough to reverse anyone's relative position.

Scale independence: if the value of the education indicator doubles for each of the equity groups, the value of the inequality indicator does not change.

Boundedness of the indicator: the value assigned to the inequality indicator is easier to interpret if the indicator has a lower bound and an upper bound.

3 Measures of disparity

3.1 Simple measures of dispersion

Most of the studies that discuss education disparity rely on simple comparisons between specific groups, such as those presented below. The easiest way to measure education disparities is to use range and ratio measures of dispersion. Let $J$ be the number of equity groups, with $j = 1, \dots, J$, and $Ed_j$ the education variable for equity group $j$. Range measures (RM1) are based on the distance in education between pairs of equity groups as follows:

$$RM1 = Ed_j - Ed_k, \quad \text{with } j \neq k \qquad (1)$$

while ratio measures (RM2) are equal to:

$$RM2 = \frac{Ed_j}{Ed_k}, \quad \text{with } j \neq k \qquad (2)$$

Those two measures differ in the sense that RM1 is expressed in absolute terms while RM2 is a relative measure of dispersion. Very often, $Ed_j$ and $Ed_k$ are the two extreme groups. It is also possible to produce $J$ pairwise comparisons as follows:

$$RM1_j = Ed_j - \overline{Ed} \quad \text{and} \quad RM2_j = \frac{Ed_j}{\overline{Ed}}, \quad j = 1, \dots, J \qquad (3)$$

with $\overline{Ed} = \frac{1}{J}\sum_{j=1}^{J} Ed_j$ the mean educational performance of the total population. We can also replace $Ed_k$ by $Ed^{*}$, the "ideal" educational performance:

$$RM1_j = Ed_j - Ed^{*} \quad \text{and} \quad RM2_j = \frac{Ed_j}{Ed^{*}}, \quad j = 1, \dots, J \qquad (4)$$

Example: Consider that the education variable is the enrollment rate in tertiary education and the population of interest is divided by income quintile. In such a situation, RM1 would be the difference in the enrollment rate between the highest and lowest socio-economic groups, and RM2 the enrollment rate of the lowest socio-economic group expressed as a percentage of that of the highest socio-economic group.

# Pros:

Such indicators are very easy to compute and have a straightforward interpretation.

In addition, RM1 and RM2 do not impose many restrictions on the data. The variable used to define the $J$ groups does not need to be an ordinal variable. The population could, for instance, be divided according to ethnic origin, place of residence, sex, etc. This is a clear advantage with respect to some of the indicators that will be presented later on.

Statistical properties: the scale invariance property is satisfied by RM2 (but not by RM1).

# Cons:

If the two groups $j$ and $k$ are small in size (in the case where they are not defined in terms of the highest and lowest quintiles), then RM1 and RM2 would only reflect the disparity between two small groups and the results will be unstable. In general, RM1 and RM2 do not take into account the size of the groups being compared.

Pairwise comparisons such as (1) and (2) ignore a large part of the information (intermediary groups) and may conceal important heterogeneity. In order to take into account the educational performance of the other groups, we have already noted that it is possible to produce $J$ pairwise comparisons as expressed in (3) and (4). If the number of groups is large, however, it becomes tricky to summarize the information.

Not taking the population share of each equity group into account may make time and cross-country comparisons problematic.

These 2 measures provide different types of information and, if they are used for comparisons over time, they can lead to opposite conclusions. This is because RM2 does not inform about changes in absolute rates. We might observe an increase in RM1 if the enrollment rate (for instance) grows at the same proportional rate in the two groups, while RM2 remains unchanged.
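The contrast between the absolute and relative measures over time can be sketched numerically. The snippet below is a minimal illustration, assuming that "the same speed" means the same proportional growth in both groups; the enrollment rates, the growth factor and the function names are hypothetical.

```python
def range_measure(ed_j, ed_k):
    """RM1: absolute gap between two equity groups, as in equation (1)."""
    return ed_j - ed_k


def ratio_measure(ed_j, ed_k):
    """RM2: relative gap between two equity groups, as in equation (2)."""
    return ed_j / ed_k


# Hypothetical enrollment rates (in percent) for the lowest and highest income quintiles.
low_t0, high_t0 = 10.0, 40.0

# Assumption: both rates grow by the same proportion (50 percent) between two periods.
growth = 1.5
low_t1, high_t1 = low_t0 * growth, high_t0 * growth

rm1_t0 = range_measure(high_t0, low_t0)  # gap in percentage points, first period
rm1_t1 = range_measure(high_t1, low_t1)  # gap in percentage points, second period
rm2_t0 = ratio_measure(low_t0, high_t0)  # lowest relative to highest, first period
rm2_t1 = ratio_measure(low_t1, high_t1)  # lowest relative to highest, second period

print("RM1:", rm1_t0, "->", rm1_t1)
print("RM2:", rm2_t0, "->", rm2_t1)
```

With these invented figures the absolute gap widens from 30 to 45 percentage points while the ratio stays at 0.25, so a reader looking only at RM1 would report rising disparity and a reader looking only at RM2 would report no change.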

Statistical properties: the principle of transfers is not satisfied and the indicators are not bounded between two values.

In practical terms:

Cross-country comparisons: Although these measures of dispersion have several shortcomings, they could be appropriate for cross-country comparisons because they are easy to compute, not data demanding, and can be used with equity groups defined by both ordinal (income) and non-ordinal (sex) variables. RM2 is preferred to RM1.

Country-specific analysis of inequity: These summary measures of dispersion taken alone do not permit an in-depth analysis of disparities across social groups.

3.1.1 Regression based analysis

A convenient way of taking into account educational differences between the intermediary groups and of producing a summary index of dispersion is to apply a regression analysis in which educational performance is related to the equity group (SEG) of individual $i$, $i = 1, \dots, N$. The equation to be estimated is the following:

$$Ed_i = \beta_0 + \beta_1 SEG_i + \varepsilon_i \qquad (5)$$

with $\varepsilon_i$ the disturbance term of equation (5), and $\beta_0$ and $\beta_1$ the parameters to be estimated. The variable $SEG_i$ must be an ordinal variable measured on an interval scale. The estimated parameter $\hat{\beta}_1$ provides us with a measure of disparity in educational performance between SEGs.

Example: Depending on the characteristics of the left-hand-side variable of equation (5), this approach produces a relative effect index or a regression-based absolute effect. If $Ed_i$ is a binary discrete variable measured at the individual level, for instance a variable taking the value one if the individual has completed a tertiary degree and zero otherwise, while $SEG_i$ tells us which income quintile the family of individual $i$ belongs to, then equation (5) should be estimated with a probit (or logit) model, and $\beta_1$ (expressed as an odds ratio) will capture the effect of moving up to the next quintile on the probability of completing a tertiary degree. On the contrary, if we work with grouped data and $Ed_i$ is a continuous variable measuring the completion rate of a tertiary degree for group $i$, then $\beta_1$ is the absolute change in $Ed_i$ associated with a one-unit change in $SEG_i$.

# Pros:

The indicator is sensitive to the social gradient in education. Staying with our previous example, the indicator tells us the direction of the correlation (positive or negative) between the educational status of the individual and the income group to which the family of the respondent belongs.

This approach makes it possible to take into account the whole distribution of the education variable and to produce a summary measure of educational disparities.

Equation (5) can be estimated with both individual and grouped data. On grouped data, equation (5) simply becomes

$$Ed_j = \beta_0 + \beta_1 SEG_j + \varepsilon_j, \quad \text{for the equity group } j,\ j = 1, \dots, J \qquad (6)$$

The regression analysis will also produce confidence intervals for each of the estimated parameters, so it will be possible to know whether the disparities between SEGs are significantly different from zero. The precision of the estimate will depend on the number of observations.

It is possible to include additional covariates in equation (5) in order to control for other individual or group characteristics (age, location of residence, family structure, etc.) that are simultaneously correlated with $Ed_i$ and $SEG_i$. This constitutes a great advantage over other measures of dispersion, and it is particularly interesting if we work with individual data.
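To make the grouped-data specification in equation (6) concrete, the following is a minimal sketch of an unweighted least-squares fit on five income quintiles. The completion rates are hypothetical, the helper name `ols_slope_intercept` is invented for illustration, and the sketch deliberately ignores group sizes and confidence intervals, which a statistical package would normally handle.

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x with a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical grouped data: quintile rank (SEG_j) and completion rate in percent (Ed_j).
seg = [1, 2, 3, 4, 5]
ed = [4.0, 11.0, 15.0, 19.0, 26.0]

b0, b1 = ols_slope_intercept(seg, ed)
print("intercept:", round(b0, 2), "slope:", round(b1, 2))
```

Here the slope is read as the average change in the completion rate, in percentage points, associated with moving up one quintile. It summarizes all five groups in one number, unlike a comparison of the two extreme quintiles only.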

On grouped data: 1 is a valid measure of educational disparities # Cons: between social groups at one point in time and for a given country. However, it is also possible take into account changes or di erences in the distribution of equity groups if equation (6) is estimated through weighted least squares and SEG j measures the average relative ranking of the equity group j. In such a situation c 1 would tell us the e ect on education of moving from the bottom to the top of the social group distribution. 2 We need to assume that the relationship between Ed i and SEG i is linear. The groups are ranked by their SEG which means that SEG i must be an ordinal variable measured on an interval scale ( such as the socio economic status). If the equity groups are de ned by a non ordinal binary variable (gender status), then we can also compute a regressionbased summary index (of gender disparities). If we are interested, for instance, in ethnic disparities, then the equity group will be de ned by a non ordinal categorical variable. A regression based analysis can be carried out by replacing SEG i with (J 1) ethnic dummies with J being the number of ethnic groups. It would not, however, estimate one summary index of ethnic disparities with such a speci cation. # Review of the literature: Such a speci cation is adopted in most of the country-speci c studies, based on individual data, which examine the determinants of educational performances in primary and secondary schools. See, for instance, Alderman et al (1997), Filmer and Pritchett (2001), Fuchs and Woosman (2007), Pontili and Kassouf (2008). If we work with comparable household surveys and adopt a econometric 2 Note that from this estimate, it is possible to compute a relative disparity metric which is easy to interpret and comparable over time and across countries. 14

specification (equation (5)) similar for each country, it is then possible to compute an index of equality of opportunity to be used for benchmarking.

In Practical terms:

Cross-country comparisons: If we use comparable household surveys and adopt a similar econometric specification (equation (5)) for each country, it is then possible to measure disparities across economic groups or by gender. As already noted, it is also possible to use such indices with aggregated data.

Country-specific analysis of inequity: The regression-based approach is particularly useful when we carry out an in-depth country-specific study of educational disparities using individual data. It is then possible to take advantage of the main pros of regression-based indices listed above (estimation in a multivariate context, confidence bounds associated with the summary index). Such indices are not appropriate when the social group is defined by a non-ordinal categorical variable (ethnic or linguistic groups, for instance).

3.1.2 Population attributable risk

The population attributable risk (PAR) is a summary index of the difference between the educational performance of each group and that of the best-performing group. Mathematically, it can be expressed as follows:

$$ PAR\% = \frac{\sum_{j=1}^{J} p_j (Ed_j - Ed_{ref})}{\sum_{j=1}^{J} p_j (Ed_j - Ed_{ref}) + Ed_{ref}} \tag{7} $$

with $p_j$ being the share of group $j$ in the population and $Ed_{ref}$ the value of the education variable for the best-performing group.

Example: If the education variable is the enrollment rate in tertiary education and the population is divided into three ethnic groups, then (7) would be the percent improvement in the enrollment rate for the total population that

would be necessary to ensure that the three ethnic groups have an enrollment rate equal to that of the best-performing ethnic group.

# Pros:

Summary index of education inequality that is easy to compute and not data demanding.

There are no restrictions on the characteristics of the grouping variable, which can be an ordinal variable (income groups) or a non-ordinal variable (gender, ethnic or religious groups).

PAR is sensitive to the proportion of individuals in each group, which is convenient for comparisons over time.

Statistical property: the indicator is bounded between 0 and 100.

# Cons:

The indicator fails to reflect the socioeconomic dimension of inequalities in education: it does not tell us which are the most disadvantaged and the most advantaged groups.

Statistical property: PAR does not satisfy the principle of transfers and is not scale invariant.

# Review of the literature:

To the best of our knowledge, there are no studies in education using the PAR as an indicator of inequality. For a discussion of the use of this indicator in public health, see, for instance, Mackenback and Kunst (1997), Krokstad et al. (2002) or Regidor (2004).

In Practical terms:

Cross-country comparisons: The use of this indicator for cross-country comparisons is a possibility but might be problematic if the best performing

group represents only a very small proportion of the population. In that case, comparing the education performance of each equity group with that of this group does not really make sense.

Country-specific analysis of inequity: The population attributable risk indicator taken alone does not allow for an in-depth analysis of disparities across social groups.

3.1.3 Education Gini coefficient

The Gini coefficient as well as the Theil and Atkinson measures are standard metrics for measuring pure income inequality among individuals. Such indicators fundamentally require a continuous variable collected at the individual level. As discussed before, the indicators at our disposal are most probably binary variables at the individual level, which can be converted into continuous variables when one works with grouped data. In this context, the information provided by the Gini coefficient or entropy indices is about disparities across groups of individuals.

Education Lorenz Curve: The education Lorenz curve maps the cumulative educational share on the y-axis against the cumulative population share, ordered from the least educated to the most educated, on the x-axis. When the education variable is a "positive" variable such as the number of years of education (the higher the level of education, the better), the Lorenz curve will lie below the diagonal, with the diagonal representing a uniform distribution of education as shown in Figure 1. In contrast, the Lorenz curve will be above the diagonal if the education variable is a "negative" variable such as the retention ratio (the lower the retention ratio, the better). In case of perfect equality in the distribution of education, the Lorenz curve and the diagonal coincide. The larger the distance of the curve from the diagonal line, the larger the inequality.
When, for two countries A and B, the Lorenz curve of country A lies at every point below the Lorenz curve of country B, we can conclude that education disparities in country A are higher than in country B. If the two Lorenz curves cross each other, then we cannot conclude which distribution is more equitable and we need to rely on the Gini coefficient. Furthermore, while the Lorenz curve gives a graphical representation of disparity over the whole distribution of education, it is not sufficient for comparisons across many countries. Instead, the education Gini coefficient provides a summary index of education disparities that can easily be used for international comparisons.

[Figure 1: Education Lorenz Curve. y-axis: cumulative enrollment share; x-axis: cumulative percentage of the tertiary age population. Series: Lorenz curve, line of perfect equality.]

Education Gini coefficient: Adapted from Thomas et al. (2002), the education Gini coefficient can be written as follows:

$$ EG = \frac{1}{\overline{Ed}} \sum_{j=1}^{J} \sum_{k=1}^{j-1} p_j \, p_k \, |Ed_j - Ed_k| \tag{8} $$

where $p_j$ and $p_k$ are the proportions of the population that belong to equity groups $j$ and $k$ respectively, $Ed_j$ and $Ed_k$ are the values taken by the educational variable for the two corresponding equity groups, and $\overline{Ed}$ is the mean value of the education variable in the total population. The coefficient varies between 0, which reflects complete equality, and 1, which indicates complete inequality.
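Equation (8) can be checked numerically with a short sketch such as the one below. The function name `education_gini` and the three-group data are hypothetical and serve only to illustrate the double sum over pairs of groups.

```python
def education_gini(p, ed):
    """Education Gini coefficient on grouped data, following equation (8).

    p  -- population share of each equity group (shares sum to 1)
    ed -- value of the education variable for each group
    """
    mean_ed = sum(pj * ej for pj, ej in zip(p, ed))
    total = 0.0
    for j in range(len(ed)):
        for k in range(j):  # each pair of groups is counted once
            total += p[j] * p[k] * abs(ed[j] - ed[k])
    return total / mean_ed


# Hypothetical enrollment rates (%) for three groups and their population shares.
shares = [0.5, 0.3, 0.2]
enrollment = [10.0, 20.0, 40.0]
eg = education_gini(shares, enrollment)
print(round(eg, 3))
```

If every group has the same enrollment rate, every absolute difference is zero and the function returns 0, which corresponds to complete equality.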

Example. Suppose that the enrollment rate in tertiary education is known at the regional level for a given country. The Lorenz curve for that country will map the cumulative enrollment share on the y-axis against the cumulative percentage of the tertiary age population on the x-axis, region by region, with the regions sorted from the lowest to the highest enrollment rate. The Gini coefficient corresponds to twice the area between the Lorenz curve and the diagonal.

# Pros:

It is the best-known inequality metric.

Among its advantages over entropy measures, the outcome variable can include negative and zero values.

When the Gini coefficient is used with grouped data, the grouping variable can be an ordinal or a non-ordinal variable.

Statistical properties: Pigou-Dalton transfer sensitivity, scale invariant, bounded between 0 and 1.

# Cons:

The educational variable must be a continuous variable.

Using individual data, and supposing the education outcome is continuous, the Gini coefficient measures pure inequality between individuals (the overall level of inequality between individuals) but fails to provide information on the "unfair" component of inequality (the part of overall inequality due to social circumstances).

When the Gini coefficient is computed with data aggregated by social group, what matters is how the share of each social group in the population with a given education condition compares with its share in the total population. In other words, and staying with the previous example, the result is a summary index of disparities across regions, but it is not possible to identify the regions in the most (or least) favorable situation in terms of enrollment rate.

Although the level of inequality is given by the value of the education Gini coefficient, the coefficient can only be interpreted in comparative terms.

# Review of the literature:

The education Gini coefficient has been used on only a few occasions to quantify and explore cross-country variations in education inequality (Maas and Criel, 1982; Thomas et al., 2002; Zhang and Li, 2002; Sahn and Younger, 2007). Among the most cited studies are Maas and Criel (1982), who use enrollment data by province to estimate the education Gini coefficients of 16 East African countries, and Thomas et al. (2001), who rely on the schooling distribution data of Barro and Lee (1993 and 1997) and the schooling cycle data of Psacharopoulos and Arriagada (1986) to measure the education Gini coefficient based on educational attainment for 140 countries from 1960 to 1990. Similarly, Zhang and Li (2002) explore international educational inequality and convergence in educational attainment over the period 1960-2000. See also SITEAL (2005) for a very comprehensive overview of the different inequality indices with empirical examples.

In Practical terms:

Cross-country comparisons: We believe that the Gini coefficient is of limited interest for the reasons outlined before, but it could be useful if we are interested in cross-regional comparisons within a given country.

Country-specific analysis of inequity: The Gini coefficient computed from individual data can be a good starting point for measuring overall inequality among individuals. But as noted earlier, its use for measuring educational disparities between equity groups is limited.

3.1.4 Generalized Entropy indices and Atkinson index

Theil and mean logarithm deviation indices

The property of additive decomposition between and within exclusive groups,

valid for entropy indexes but not for the Gini index, has greatly contributed to the increasing use of entropy indexes in income inequality studies. Additive decomposition means that it is possible to decompose the overall inequality among individuals into two components: the first component (the between-social-groups index) is the unfair part of inequality, driven by factors beyond the control of the individual (gender, race, income group, etc.), while the second results from the individual's effort (the within-social-groups entropy indices). On individual data, the generalised entropy (GE) class of metrics can be expressed as follows:

$$ GE(\alpha) = \frac{1}{\alpha^2 - \alpha} \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \frac{Ed_i}{\overline{Ed}} \right)^{\alpha} - 1 \right] \tag{9} $$

where $N$ is the number of individuals, $Ed_i$ is the value of the education variable for individual $i$, $\overline{Ed}$ is the mean value of the education variable in the total population, and $\alpha$ represents the weight given to the distance between $Ed_i$ and $\overline{Ed}$ at different parts of the education distribution. The most common entropy indices, the Theil index and the Mean Logarithm Deviation index (MLD), correspond to (9) when $\alpha = 1$ and $\alpha = 0$ respectively.[3] The more positive the sensitivity parameter $\alpha$, the more sensitive the entropy index is to inequalities at the top of the education distribution.

[3] Note that when $\alpha = 2$, the GE index corresponds to half the squared coefficient of variation.

Neither index can be used with binary education indicators, so unless continuous education outcome data are available (through, for instance, the OECD initiative on assessing higher education learning), cross-country comparisons will use grouped data. The Theil index can be expressed mathematically, with grouped data, as follows:

$$ Theil = \sum_{j=1}^{J} p_j \, \frac{Ed_j}{\overline{Ed}} \, \ln\left( \frac{Ed_j}{\overline{Ed}} \right) \tag{10} $$

while the Mean Logarithm Deviation index is equal to:

$$ MLD = \sum_{j=1}^{J} p_j \, \ln\left( \frac{\overline{Ed}}{Ed_j} \right) \tag{11} $$

where $p_j$ is the proportion of the population that belongs to equity group $j$, $Ed_j$ is the value of education for group $j$ and $\overline{Ed}$ is the mean value of the education variable in the total population.

Atkinson Index

Atkinson's measure is another well-known inequality measure, which can be defined as follows:

$$ A_{\varepsilon} = 1 - \left[ \sum_{j=1}^{J} p_j \left( \frac{Ed_j}{\overline{Ed}} \right)^{1-\varepsilon} \right]^{\frac{1}{1-\varepsilon}}, \quad \varepsilon > 0 \tag{12} $$

The extent of disparity depends on the value of $\varepsilon$, which indicates the degree of aversion to disparity. When $\varepsilon > 0$, there is a preference for equality (i.e. an aversion to inequality). As $\varepsilon$ rises, more weight is attached to education transfers at the lower end of the education distribution and less weight to transfers at the top. The Atkinson index ranges between 0 and 1, with 0 indicating perfect equality and 1 maximum inequality.

Example. Supposing that the educational variable of interest is the enrollment rate in tertiary education and that the population is divided by income group, the Atkinson index and the two entropy indices are all functions of each group's population share and of the ratio of the enrollment rate of each social group to the enrollment rate in the total population (i.e. the distance between them).

# Pros:

Main advantage: the generalized entropy index for the entire population can be decomposed into a weighted average of each social group's generalized entropy index (the within-social-group entropy index) and a between-social-group index (the "unfair" component of inequality), as previously mentioned. In other words, if the education variable is the score

obtained in a reading test by tertiary students and the population is divided into $J$ ethnic groups, the property of additive decomposition makes it possible to measure the level of inequality between individuals belonging to the same ethnic group (i.e. within social groups) and the level of inequality between the ethnic groups (i.e. between social groups). The property of additive decomposition, however, requires continuous education outcomes defined at the individual level.

Convenient for cross-country and time comparisons.

Statistical properties: Pigou-Dalton transfer sensitivity, scale invariant.

# Cons:

See comments on the Gini coefficient.

Statistical properties: no upper bound for the two entropy indices.

# Review of the literature:

Only a few studies have used entropy indices to compare education inequalities across countries. Sahn and Younger (2007) compare world education inequality in math and science knowledge, applying entropy indices to scores on math and science achievement tests collected by the 1999 round of the Trends in International Mathematics and Science Study (TIMSS). Thomas et al. (2001) investigate cross-country inequalities in educational attainment using the education Theil index. SITEAL (2005) computes the Gini, Theil and Atkinson (for different values of $\varepsilon$) indices for Argentina and Mexico. Barros et al. (2009) have recently examined inequality of opportunity in educational achievement for 5 Latin American countries using the 2000 PISA surveys. The education variables are reading and mathematics test scores, and the inequality measure is the mean log deviation index. They find that inequality of opportunity accounts for a substantial

amount of observed education inequality in Latin America, ranging between 14% and 28% of total inequality in reading scores and between 15% and 29% of total inequality in mathematics achievement.

In Practical terms:

Cross-country comparisons: As noted above, the main advantage of these indicators is their decomposability. It would be difficult to take advantage of this property for cross-country comparisons, given that it requires individual data with an education indicator measured by a continuous variable.

Country-specific analysis of inequity: The three indicators presented above might be particularly appealing for country-specific studies relying on surveys of tertiary students.

3.1.5 Education Standard Deviation and Coefficient of Variation of Schooling between groups

When social groups are unordered, i.e. groups without an inherent ordering (such as ethnic or religious groups), the education standard deviation (ESD) and the coefficient of variation of schooling (CVS) might be two useful indices of inequality. The education standard deviation between $J$ equity groups ($j = 1, \dots, J$) is given by

$$ ESD = \sqrt{ \sum_{j=1}^{J} p_j \left( Ed_j - \overline{Ed} \right)^2 } \tag{13} $$

and the coefficient of variation of schooling corresponds to the ESD divided by $\overline{Ed}$:

$$ CVS = \frac{ \sqrt{ \sum_{j=1}^{J} p_j \left( Ed_j - \overline{Ed} \right)^2 } }{ \overline{Ed} } \tag{14} $$

with $p_j$ the share of group $j$ in the population and $\overline{Ed}$ the mean level of the education variable in the population.[4]

[4] Similarly, we could compute the variance or the absolute mean deviation index between groups.

Example. If the education variable is the completion rate of tertiary studies and the population is divided by income group, then ESD is the standard deviation of the completion rate between income groups, while the coefficient of variation expresses the standard deviation as a percentage of the population mean.

# Pros:

Both indicators are commonly used and very easy to compute.

There are no restrictions on the characteristics of the variable used to group individuals: the grouping variable does not need to be ordinal.

# Cons:

Neither indicator is sensitive to the direction of the social gradient in education: for example, it is not known whether education status increases or decreases with socioeconomic position. A given value of the CVS could correspond equally to a positive or a negative association between socioeconomic position and educational performance.

Statistical properties: The CVS is preferred to the ESD because the latter is not scale invariant. Neither indicator satisfies the principle of transfers. While both indicators have a lower limit of 0, corresponding to zero dispersion, they do not have an upper limit of 1.

# Review of the literature:

Zhang and Li (2002) and SITEAL (2005) compute and compare various indicators of education inequality, among them the ESD and CVS. Note also that Ram (1990) examines the relationship between educational expansion and schooling inequality for about 100 countries. Schooling inequality is measured by the standard deviation of educational attainment. In the robustness section, the author tests whether the results change when the coefficient of variation of schooling, instead of the standard deviation, is used to measure education inequality.

In Practical terms:

The CVS is preferred to the ESD.

Cross-country comparisons: The coefficient of variation is useful for cross-country comparisons or cross-regional comparisons within a given country, but this indicator does not convey information on the direction of the correlation between education and the grouping variable and is not as readily interpretable as other indicators presented in this note.

Country-specific analysis of inequity: The coefficient of variation taken alone does not allow for an in-depth analysis of disparities across social groups.

3.1.6 Dissimilarity index

The index of dissimilarity has been widely used in the literature on segregation. Supposing that the population is divided into $J$ groups, the index of dissimilarity can be expressed as follows:

$$ ID = \frac{1}{2} \sum_{j=1}^{J} |S_j - p_j| \tag{15} $$

where $S_j$ is the share of group $j$ in the population with a level of education equal to $Ed$ and $p_j$ is the share of group $j$ in the total population.

Example. Supposing that the education variable is the completion rate of tertiary studies and that the grouping variable refers to the sex of the individual, the index of dissimilarity tells us the proportion of all cases that needs to be redistributed across the population to ensure that, for

each sex, the male (female) group's share of the population having completed tertiary studies is equal to the male (female) group's population share.

# Pros:

The indicator is easy to compute and interpret.

The social groups can be ordered or unordered, i.e. they can be defined by ordinal or non-ordinal variables.

Statistical properties: bounded between 0 and 1.

# Cons:

The index of dissimilarity is not sensitive to the socioeconomic dimension of inequalities in education. What matters is how each socioeconomic group's share of the population's education compares with its population share, not how this disparity compares with the group's socioeconomic status.

Statistical properties: not scale invariant, does not satisfy the principle of transfers.

# Review of the literature:

Recently, Barros et al. (2009) have used a version of the dissimilarity index to analyze children's inequality of opportunity in education, electricity, and improved water and sanitation in 19 Latin American and Caribbean countries. For education, they use the probability of having completed the sixth grade on time for children aged 12 to 16 and school attendance for children aged 10 to 14. Social groups are defined by parents' education, family per capita income, gender, age, family structure and area of residence. Their results suggest that, on average, 11% of education, as measured with the first indicator, needs to be reallocated in order to remove differences between the social groups. The estimated value of the dissimilarity index for the second education indicator is less than 5%.
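To make equation (15) concrete, the sketch below computes the index for two groups defined by sex. The helper names `graduate_shares` and `dissimilarity_index`, and the completion rates, are assumptions introduced for this illustration; they are not taken from Barros et al. (2009).

```python
def graduate_shares(p, rate):
    """Each group's share S_j of all graduates, from population shares and completion rates."""
    total = sum(pj * rj for pj, rj in zip(p, rate))
    return [pj * rj / total for pj, rj in zip(p, rate)]


def dissimilarity_index(s, p):
    """Index of dissimilarity, equation (15): half the sum of |S_j - p_j|."""
    return 0.5 * sum(abs(sj - pj) for sj, pj in zip(s, p))


# Hypothetical data: two groups of equal size with completion rates of 12% and 18%.
pop_share = [0.5, 0.5]
completion_rate = [0.12, 0.18]
s = graduate_shares(pop_share, completion_rate)
id_value = dissimilarity_index(s, pop_share)

# Swapping the two completion rates leaves the index unchanged.
s_swapped = graduate_shares(pop_share, completion_rate[::-1])
id_swapped = dissimilarity_index(s_swapped, pop_share)
print(round(id_value, 3), round(id_swapped, 3))
```

The two values coincide, which illustrates the limitation listed under Cons: the index measures how far graduate shares depart from population shares, not which group is disadvantaged.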

In Practical terms:

Cross-country comparisons: The index of dissimilarity is an attractive metric for cross-country comparisons, but it is not sensitive to the social gradient in education. Staying with the example given above, this implies that a given value of the indicator could correspond to a situation where females are disadvantaged in terms of the completion rate of tertiary studies or to a situation where males are the ones facing inequitable opportunities.

Country-specific analysis of inequity: This index would not be sufficient for an in-depth country-specific study of educational disparities.

3.1.7 Concentration index

The concentration curve and concentration index are derived from the bivariate distribution of education and the social group ranking. The concentration curve plots the cumulative percentage of the education variable on the y-axis against the cumulative percentage, on the x-axis, of individuals ordered by socioeconomic status, beginning with the poorest and ending with the richest. The concentration curve differs from the Lorenz curve in that the x-axis of the Lorenz curve represents the cumulative percentage of individuals ordered by educational level. If socioeconomic status has no effect on the probability of enrolling at university, the concentration curve will coincide with the diagonal line, as shown in Figure 2. When the education variable corresponds to grade attainment, test scores, or enrollment ratios, the concentration curve will lie below the diagonal; when the education variable is the retention ratio, the concentration curve will lie above the diagonal. The area between the diagonal and the concentration curve represents the extent of disparities across socioeconomic groups.

[Figure 2: Education Concentration Curve. y-axis: cumulative enrollment share; x-axis: cumulative population ranked by SES. Series: concentration curve, line of perfect equality.]

Like the Lorenz curve, the concentration curve (CC) is not a summary measure of the magnitude of inequality, so it is not useful for comparisons of socioeconomic-related education inequalities among many countries.[5] Instead, the concentration index (CI) is preferred.

[5] In addition, if the CCs of two countries A and B cross each other, it is no longer possible to compare the degree of inequality in those two countries, given that neither distribution dominates the other.

The concentration index is based on the CC in the same way as the Gini coefficient is based on the Lorenz curve. The index is negative if the CC lies above the diagonal and positive when the curve lies below the diagonal. On grouped data, with $j = 1, \dots, J$ equity groups, the concentration index is defined as follows:

$$ CI = \frac{2}{\overline{Ed}} \sum_{j=1}^{J} p_j \, Ed_j \, r_j - 1, \quad \text{with } r_j = \sum_{k=1}^{j-1} p_k + \frac{p_j}{2} \tag{16} $$

where $p_j$ is the proportion of the $j$-th group in the total population, $Ed_j$ is the average of the education variable in the $j$-th group and $r_j$ is the fractional

rank of the $j$-th group.[6] The index ranges between -1 and 1. The concentration index can also be defined in terms of the covariance between the education variable and the rank in the living standards distribution, as follows:

$$ CI = \frac{2}{\overline{Ed}} \, cov(Ed, r) \tag{17} $$

Example. In plotting the cumulative percentage of university enrollment against the cumulative percentage of the population ranked from poorest to richest, the concentration curve would tell us, for instance, that the poorest 25% of the population account for x% of the students enrolled in tertiary education. The concentration index is equal to twice the area between the concentration curve and the diagonal.

# Pros:

The CI is a good measure of socio-economic inequalities in education.

Given equation (17), it can be shown that the CI can be obtained, on individual data ($i = 1, \dots, N$), from the estimation of the following equation:

$$ 2 \sigma_r^2 \, \frac{Ed_i}{\overline{Ed}} = c + \beta r_i + \gamma X_i + \varepsilon_i \tag{18} $$

with $\hat{\beta}$ an estimate of the CI, $\sigma_r^2$ the variance of the fractional rank and $X_i$ a set of covariates to control for potential confounding effects (i.e. if $X_i$ is simultaneously correlated with $Ed_i$ and $r_i$ and not included in equation (18), the estimated coefficient $\hat{\beta}$ will capture the effect of socioeconomic status on education but also the impact of the other covariates $X_i$). In addition, a confidence interval for the estimated concentration index can easily be computed.

The value of the index does not change if the living standard variable changes without affecting the rank.

Convenient for time and cross-country comparisons.

[6] Wagstaff (2002) has proposed an extended CI capturing different levels of aversion to inequality.
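The grouped-data definition in equation (16) can be sketched in a few lines. The function name `concentration_index` and the quintile enrollment rates are hypothetical and used purely for illustration; the groups are assumed to be listed from poorest to richest.

```python
def concentration_index(p, ed):
    """Concentration index on grouped data, following equation (16).

    p  -- population share of each group, ordered from poorest to richest
    ed -- average of the education variable in each group, in the same order
    """
    mean_ed = sum(pj * ej for pj, ej in zip(p, ed))
    cumulative = 0.0   # population share of all poorer groups
    weighted = 0.0
    for pj, ej in zip(p, ed):
        rank = cumulative + pj / 2.0   # fractional rank r_j
        weighted += pj * ej * rank
        cumulative += pj
    return 2.0 * weighted / mean_ed - 1.0


# Hypothetical enrollment rates (%) by income quintile, poorest first.
shares = [0.2, 0.2, 0.2, 0.2, 0.2]
enrollment = [5.0, 10.0, 15.0, 25.0, 45.0]
ci_pro_rich = concentration_index(shares, enrollment)
ci_pro_poor = concentration_index(shares, enrollment[::-1])
ci_equal = concentration_index(shares, [20.0] * 5)
print(round(ci_pro_rich, 2), round(ci_pro_poor, 2), round(ci_equal, 2))
```

In this made-up example the index is positive when enrollment is concentrated among richer quintiles, changes sign when the ordering of rates is reversed, and equals zero when all quintiles have the same rate.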