Peer Group Choice and Chief Executive Officer Compensation


David F. Larcker
Graduate School of Business, Rock Center for Corporate Governance, Stanford University

Charles McClure
Booth School of Business, University of Chicago

Christina Zhu
The Wharton School, University of Pennsylvania

Draft: October 23, 2018

Abstract. We examine the peer group selection that boards of directors use when setting the level of CEO compensation. This choice is controversial because it is difficult to ascertain whether peer groups are selected to (i) attract and retain top executive talent or (ii) enable rent extraction by inappropriately increasing CEO compensation. In contrast to prior research, our analysis utilizes the degree to which the observed portfolio compensation level of peers is unusual relative to all potential portfolios of peers the board of directors could have reasonably selected. Using a sample of 12,894 firm-year observations covering the time period from 2008 to 2014, we estimate roughly 68% of firms appear to be engaging in rent extraction, while the remaining 32% seem to be selecting peers to attract and retain CEO talent. Relative to firms that appear to select peers for aspirational reasons, we find rent extraction firms have more realized negative governance outcomes, but we find more structural governance differences between aspirational firms and the smallest-sized rent extraction firms.

Keywords: CEO Compensation; Peer Groups; Agency Problems; CEO Labor Market
JEL Classification: M12, M52, G30, J33

We gratefully acknowledge the support of the Stanford Rock Center for Corporate Governance and the Stanford Graduate School of Business Centers & Initiatives for Research, Curriculum and Learning Experiences (CIRCLE). Charles McClure is grateful for the support from University of Chicago, Booth School of Business. Christina Zhu is grateful for the support from The Wharton School, University of Pennsylvania. We thank Nathan Atkinson, Kurt Gee, Ivan Marinovic, Venky Nagar, and workshop participants at Stanford University and Rice University for helpful suggestions. We thank Christopher Wiley for assistance with the portfolio approach algorithm. We thank several compensation consultants for sharing their institutional insight.

1. Introduction

An important factor in setting executive compensation is the choice of peer firms used to develop a benchmark for the CEO's market wage. One controversial question in this process is whether it is appropriate for firms to select highly paid peer firms that are larger than themselves. Firms aspiring to invest in executive talent may select larger peers to attract and hire executives from firms that command higher pay. Similarly, if firms lose their top executive talent to larger firms, they may rationally change their peer selection to include those firms. For example, in 2015 American Axle & Manufacturing Holdings Inc. (AAM) had a peer group of 20 firms and notes that this group "includes companies that compete with AAM for executive talent [...] The Committee believes that this approach reflects a generally accepted benchmark of external competitiveness and supports our ability to attract and retain key executives."[1]

In contrast to the rationale cited by boards of directors for selecting larger peers, governance activists and proxy advisory firms believe some firms select peers that are larger and/or have higher compensation levels simply to justify a high level of CEO compensation. For example, Glass Lewis criticized Omnicare for this practice:

"We note the following concerns with the structure of the Company's compensation programs: Peer Group Concerns. A company's choice of a peer group can have a significant impact on the size and structure of compensation. Shareholders need to be satisfied that the peer group is appropriate and not cherry-picked for the purpose of justifying or inflating pay. In general, we believe a peer group should range from 0.5 to 2 times the market capitalization of the Company. In this case, Glass Lewis has identified 23 peers outside this range, which represents approximately 79.4% of the peer group." (Meeting date May 24, 2011, quote from Ertimur et al., 2013)

[1] AAM DEF 14A filed on 03/24/2016, p

Similarly, Institutional Shareholder Services (ISS) includes an assessment of Peer Group Benchmarking as a key evaluation focus in its recommendations for say-on-pay votes.[2]

[2] See Institutional Shareholder Services (2015) for a general discussion of proxy advisory firms' evaluation of peer group selection by firms.

Given the controversy surrounding the choice of peer groups, it is important to understand whether peer group selection is an appropriate labor market benchmarking analysis by the board of directors or a mechanism for rent extraction by the CEO. Prior research provides mixed insights into the determinants and consequences of peer group selection (Faulkender and Yang, 2010; Bizjak et al., 2011; Albuquerque et al., 2013; Cadman and Carter, 2013; Francis et al., 2016). This observation mirrors the generally mixed scholarly views on executive compensation. For example, some researchers view high executive compensation as rent extraction (Bebchuk and Fried, 2005), whereas others interpret high wages as a natural outcome produced by labor market competition for executive talent (Edmans and Gabaix, 2016). This study addresses a variety of conceptual and methodological issues in prior research and provides new insights into the choice of peer groups.

We make several contributions to the existing literature on peer group selection. First, we show serious limitations of the matching methodology used in prior research. The standard methodology essentially follows a procedure of propensity score matching each selected peer with a similar non-peer firm, and then assessing whether the central tendency of CEO compensation in the selected peer group differs from (e.g., is higher than) the central tendency of CEO compensation for the matched group of peers. While this methodological approach may seem reasonable at first glance, we show that a large fraction of the matches produced by propensity score approaches lead to matched firms that are not similar to the selected peers in terms of industry, size, or other traditional selection criteria used by
compensation committees. Thus, the resulting matched-pairs analyses are confounded and likely to provide biased inferences about the board's objectives when selecting peer groups.

Second, we shift the unit of analysis from the selection of individual peer firms to the selection of a portfolio of peer firms. Although compensation committees clearly assess each individual firm for inclusion in the peer group, they are ultimately selecting a portfolio of firms that informs or justifies their choice for the level of CEO compensation.[3] The CEO compensation benchmark is often set at the median pay of the selected portfolio of peer firms. From the compensation committee's perspective, the desirability of each peer firm will depend on the other firms already included in the peer group. Since matching methods such as propensity score approaches ignore this dependence, the results produced by prior research are likely to be misleading.

In order to address this concern, we develop a new measure for assessing peer groups, denoted as Peer Portfolio Percentile (PPP), which mimics the board of directors' actual peer group selection process. Specifically, we compare the median compensation for the selected peer group to the distribution of median compensation for all alternative peer groups that could have been reasonably selected by the board of directors using traditional selection benchmarks such as firm size and industry. This measure enables us to assess whether the selected peer group produces a CEO compensation benchmark that is at the 1st, 50th, 99th, or any other percentile of the distribution of plausible peer groups. This distributional measure better captures whether the peer group choice by the board of directors results in an unusually high or low compensation benchmark.

[3] To provide some confirmation of this assumption, we interviewed six senior compensation consultants at top consulting firms. These conversations clearly support the idea that firms and consultants assess the applicability of individual firms (i.e., some are clear inclusions and others are clear exclusions), but ultimately they are concerned about the median compensation benchmark produced by a portfolio of peer firms.
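The percentile comparison behind PPP can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the alternative peer groups are approximated by random draws of k firms from a candidate pool (rather than full enumeration), that ties count as at or below the selected median, and that the function name, arguments, and default draw count are hypothetical.

```python
import random
import statistics


def peer_portfolio_percentile(selected_pay, candidate_pay, n_draws=10_000, seed=0):
    """Percentile rank (0-100) of the selected peers' median pay among the
    medians of randomly drawn same-size portfolios from the candidate pool.

    Assumption: random sampling stands in for "all alternative peer groups".
    """
    k = len(selected_pay)
    if k == 0 or k > len(candidate_pay):
        raise ValueError("need 1 <= k <= number of candidate peers")
    rng = random.Random(seed)
    selected_median = statistics.median(selected_pay)
    # Median pay of each simulated alternative portfolio of k firms.
    draws = [
        statistics.median(rng.sample(candidate_pay, k)) for _ in range(n_draws)
    ]
    # Share of alternative portfolios whose median is at or below the selected one.
    at_or_below = sum(m <= selected_median for m in draws)
    return 100.0 * at_or_below / n_draws
```

Under these assumptions, a value near 50 would mean the selected group yields a benchmark typical of plausible alternatives, while a value near 100 would mean almost no alternative portfolio supports a higher median.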

Third, we assess whether peer group selection is associated with future firm performance. If firms choose a highly paid peer group (i.e., a high PPP) to attract more talented executives who require higher compensation, we should find that the high PPP selection is related to higher future firm performance. Conversely, if higher peer group pay reflects rent extraction, we should observe a negative relation between peer group pay and future operating performance. We expect both aspirational and agency motivations to be observed in a large cross-section of firms.

Fourth, our methodological approach allows for the sample to contain a mixture of firms where the estimated coefficient linking peer group choice with future firm performance can be positive or negative for different subsets of firms. The typical pooled estimation approach used in prior research does not easily allow for this type of heterogeneity. Thus, we use Latent Class Analysis (LCA) to place firms into different homogeneous clusters depending on the sign and statistical significance of the association between peer group choice and future firm performance.[4] Once these clusters are identified, we can uncover the distinguishing factors associated with the observations in each cluster. For example, we expect firms with a negative association between peer group pay and future operating performance to exhibit characteristics associated with weak corporate governance. In contrast, we expect firms with a positive association to have strong corporate governance.

[4] LCA has been used in empirical research in several disciplines, including biostatistics, medicine, psychology, and marketing. For example, the marketing literature has used LCA for the purpose of consumer segmentation, and researchers in the health sciences have used LCA to identify phenotypes of different diseases and disorders. For a review of applications of LCA, see Hagenaars and McCutcheon (2002). For an accounting application, refer to Larcker and Richardson (2004).

Finally, we analyze a larger sample of firms and a longer time period than the research to date. Prior literature typically analyzes only two or three years of data for a subset of firms (Albuquerque et al., 2013; Faulkender and Yang, 2010, 2013; Bizjak et al., 2011). Specifically,
we examine a sample of 12,894 firm-year observations from 2008 to 2014. Our sample covers approximately the Russell 3000 firms during this period and is much more comprehensive than samples studied in prior research.

Across our sample of firms, the mean (median) PPP is 72.5 (87.3), which indicates that boards of directors generally pick a peer group that supports higher compensation than the median portfolio of firms that could have been reasonably selected. Using LCA, we find consistent evidence that our sample has a mixture of three clusters, where roughly 68% of observations have a negative relation between future performance and PPP ("rent extraction" firm clusters) and roughly 32% of observations have a positive relation ("aspirational" firm cluster). There are two distinct clusters that comprise the firms with a negative relation between future performance and PPP. The smaller cluster of rent extraction firms, composed of 6% of observations, exhibits lower CEO talent and weaker corporate governance structures relative to the aspirational firm cluster. This cluster consists of smaller firms with lower operating performance. The larger cluster of rent extraction firms (about 62% of all observations) is composed of larger firms and exhibits higher CEO talent with slightly weaker governance structures relative to the aspirational cluster. Both rent extraction clusters have more realized negative governance outcomes than the aspirational cluster.

Overall, our results are consistent with the claims of governance activists and proxy advisory firms that peer group choice is related to rent extraction for a majority of firms. However, our results differ from the prior conclusions of Faulkender and Yang (2010), Bizjak et al. (2011), Albuquerque et al. (2013), and Francis et al. (2016), which conclude that peer group choice is characterized by the desire to attract and retain CEO talent. In addition, we offer an explanation for prior studies' conclusions that firms with governance issues do not select more highly paid
peers. While both rent extraction clusters have more realized negative governance outcomes than the aspirational cluster, we find weaker structural governance systems primarily in the small-sized rent extraction firms. Thus, prior research's inconclusive findings may be due to the focus on larger firms and ex-ante structural governance measures.

To help quantify the magnitude of peer group rent extraction, if we use the 50th percentile of the distribution of compensation medians as a benchmark, the firms in the rent extraction clusters earn approximately $12 billion in excess compensation from 2008 to 2014. This amount corresponds to an overpayment of approximately 40% of CEO compensation. Thus, the economic significance of peer group choice for the majority of firms is nontrivial.

The remainder of the paper proceeds as follows. Section 2 provides a review and assessment of the prior literature examining peer group selection. Section 3 describes our sample selection and provides descriptive statistics for the sample. Section 4 illustrates the limitations of the typical matching approaches used in prior literature. Section 5 provides the conceptual justification and computation steps for the PPP measure. The econometric approach used to distinguish between rent extraction and labor market explanations for peer group selection is developed in Section 6. The primary results are reported in Section 7 and associated sensitivity analyses are discussed in Section 8. Section 9 provides a summary and concluding remarks.

2. Review of Prior Literature on Peer Group Selection

2.1 Institutional Background

Beginning in 2006, the Securities and Exchange Commission (SEC) required each firm to disclose in its annual proxy statement whether the company engaged in any benchmarking of total compensation or any material element of compensation, identifying the benchmark and, if
applicable, its components (including component companies).[5] As described in proxy statements, most firms use a combination of industry and size along with other criteria, such as profitability, talent flows, and geographic location, to create a peer set of firms. Over 95% of the S&P 500 disclosed a peer group for fiscal year 2015 (Equilar, 2016) and approximately 80% of Russell 3000 firms benchmark their compensation against outside peers (Audit Analytics, 2015).

[5] See SEC final rule a, Item 402(b)(2)(xiv). The full rule can be accessed at

The typical disclosure is illustrated in the 2016 proxy (DEF14A filed on 04/21/2016, page 23) of Alliance Data Systems (ADS):

In 2013, the compensation committee, with the assistance of Meridian, undertook a comprehensive review of the companies comprising the proxy peer group. At that time, the compensation committee was presented an initial pool of 100 possible peer companies based on a representative mix of our core business competencies, including marketing, data, digital, card services and specialty finance, whose general revenue size ranged from 0.3x to 3.0x of our revenue and also sought to include high-performing companies that had achieved a minimum 5% revenue growth and 8% EBITDA growth over the prior year From this analysis, the compensation committee selected a total of 16 proxy peer companies for

ADS's primary screening criteria using industry and size are representative of most firms. Equilar (2016) reports that, of the 477 S&P 500 firms which disclosed peers in 2015, the two most common criteria were similar industry (441 firms) and revenue (363 firms). Interestingly, only 39 S&P 500 firms include profitability measures, such as EBITDA, in their selection criteria. Compensation consultants confirm that firms should use size and industry criteria. For example, Pay Governance recommends that firms choose peers that are within the same industry and comparable in size, business operations, and geographic presence (Bout, 2011). To assess the appropriateness of the selected peers, proxy advisory firms also create their own peer groups.

ISS bases its selection of peer firms on industry (GICS) and size (revenue and market capitalization) (Institutional Shareholder Services, 2015). Firms typically calibrate their pay to the median of their selected peer group (Equilar, 2016).[6] This practice of benchmarking to the median may be due to proxy advisory companies' comparison of compensation to the median of a selected group of peers. For example, ISS criticized Allegheny Technologies Inc. (meeting date May 11, 2012) in its proxy analysis and vote recommendation (dated April 24, 2012):

"As noted in the Pay for Performance discussion, CEO pay was 1.9 times the median of ISS selected peers. The Company Selected Peer Group chart shows that three companies, Alcoa, Nucor and United States Steel, are significantly above two times the company's revenue." (emphasis added)

ISS formed its own set of peers for Allegheny Technologies, and it found that the firm's CEO pay, benchmarked against a company-selected peer group, was much higher than the median of the ISS-selected peer group. As a result, ISS recommended voting against the company's executive compensation.

[6] For our sample of 12,894 firm years (described in Section 3), we confirm that firms do benchmark to the median compensation of the selected peer group. Specifically, the median compensation paid to the CEO is $3,454,045 and the median of the 50th percentile of compensation for the associated peer group is $3,814,396, or a difference of $360,351. The cross-sectional Pearson (Spearman) correlation between CEO compensation and median compensation for the peer group is (0.764).

2.2 Prior Literature

An extensive prior literature on executive compensation proposes two main views. The rent extraction view is that managers seek to maximize their own compensation rather than shareholder value (e.g., Bebchuk and Fried, 2005). This theory assumes that corporate governance mechanisms are insufficient to mitigate agency problems between executives and shareholders or even between the board of directors and shareholders. In contrast, the shareholder value view is that boards of directors select compensation schemes to increase
shareholder value (e.g., Gabaix and Landier, 2008). These two views represent the considerable controversy over whether CEO compensation is best characterized as the result of agency problems or optimal contracting.

Prior research provides mixed insights into the motivation of peer group selection for establishing CEO compensation. This ambiguity mirrors the contrasting perspectives on compensation more broadly, as described above. Faulkender and Yang (2010) find that firms with governance concerns select more highly paid peers. Bizjak et al. (2011) find that firms seem to manipulate peer group compensation upward, but they find no consistent evidence that governance concerns influence peer group choice. Albuquerque et al. (2013) conclude that the selection of highly paid peers is related to the desire to attract and retain CEO talent, rather than the result of governance problems producing rent extraction. Finally, Cadman and Carter (2013) find that the method used by researchers to select potential peers substantially influences whether peer group selection is consistent with self-serving behavior or rational labor market benchmarking.

3. Sample Selection

The sample used in prior research consists of only large firms and is limited to the two or three years immediately following the 2006 disclosure requirement for peer groups. Specifically, Bizjak et al. (2011) and Faulkender and Yang (2010, 2013) consider S&P 500 and S&P 400 firms, and Albuquerque et al. (2013) analyze only firms with ExecuComp data, which are mostly S&P 1500 firms. If larger firms face greater scrutiny when making compensation decisions, it may be difficult to observe rent extraction activity in this restricted sample. To mitigate this concern, our sample selection process begins with all firms in the Equilar database that disclose a peer group in their proxy statement. This sample corresponds roughly to the Russell 3000 firms.

We also restrict our analysis to the time period from 2008 to 2014. Our sample period addresses the timing concerns of Faulkender and Yang (2013) and SEC staff comments regarding the initial confusion about which types of compensation benchmarks required disclosure as some firms considered the identity of their peer group to be proprietary information.[7] We require each firm in our sample to have strictly more firms in its industry and size caliper than the number of firms it selected for its peer group. In order to construct PPP (discussed below) we require firms to have enough potential peers from which to select multiple portfolios of size k, where k is the number of actual peers they select. To eliminate outliers in CEO compensation, we drop firms with CEO total compensation below $100,000. Finally, we remove firm-years with one-year-ahead ROA less than -20% or greater than 20%. We remove these firm-years because next year's ROA may not be an appropriate performance measure for firms with extreme ROA.[8] Our final sample consists of 12,894 firm-year observations (2,888 unique firms) and covers 54.3% of the total market capitalization of NYSE/NASDAQ/Amex.

We compare each firm's financial and compensation data in year t to its potential peer data in year t-1. We make this adjustment to account for a well-known timing concern in the compensation industry. In particular, when a board of directors decides compensation, it typically knows the firm's data for t; however, it only has publicly available data for potential peers from the prior year (i.e., t-1).

[7] See

[8] For example, many of the firms with extreme future ROA are young firms in the pharmaceutical or technology industries with little to no revenue and extreme losses, who are waiting for FDA approval or waiting for adoption of the firm's products or services. The market value of these companies is based on expectations of future growth well beyond one year. For example, three observations with the lowest future ROA are Amarin Corporation in 2010, Omeros Corporation in 2014, and Virnetx Holding Corporation. These are all development-stage technology companies that had market values of $132.6, $550.7 and $44.7 million but revenue of $0, $0.523, and $0.026 million. Removing the 1,849 firm-years with absolute value of future ROA greater than 20% removes 633 unique firms from the sample. Figure 1 presents a histogram of ROA, which shows the long left tail of future ROA values.
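The three sample screens described in Section 3 can be written as a simple row filter. The sketch below is illustrative only: the field names (`n_candidate_peers`, `n_selected_peers`, `ceo_total_comp`, `roa_next_year`) are hypothetical, and the handling of values exactly at a threshold is an assumption inferred from the wording ("below $100,000", "less than -20% or greater than 20%").

```python
def keep_firm_year(row):
    """Return True if a firm-year passes the three sample screens."""
    # Strictly more candidate peers in the industry/size caliper than peers chosen.
    enough_candidates = row["n_candidate_peers"] > row["n_selected_peers"]
    # Drop CEO total compensation below $100,000.
    pay_ok = row["ceo_total_comp"] >= 100_000
    # Drop one-year-ahead ROA below -20% or above 20% (ROA stored as a fraction).
    roa_ok = -0.20 <= row["roa_next_year"] <= 0.20
    return enough_candidates and pay_ok and roa_ok


def filter_sample(rows):
    """Keep only the firm-years that satisfy every screen."""
    return [row for row in rows if keep_firm_year(row)]
```

Applying the screens jointly rather than sequentially gives the same retained set, although the number of observations attributed to each screen would depend on the order in which they are counted.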

Table 1 reports descriptive statistics for our sample. Panel A compares the industry composition of our sample to the industry composition of Compustat. We report the percentage of our sample with each two-digit GICS industry classification, along with the corresponding percentage with that classification in Compustat at both the beginning and end of our sample (i.e., 2008 and 2014). In general, our sample composition closely approximates the industry composition across all firms covered by Compustat.

Panel B of Table 1 reports sample summary statistics for selected financial and compensation measures. All variables are defined in Appendix A. The mean (median) market capitalization is $5,000 million ($1,361 million), total sales is $4,204 million ($1,047 million), and return on assets (ROA) is 2.2% (2.9%). The mean (median) log of the standard deviation for the last 5 years' ROA is (0.023). We require at least two previous years of ROA data to compute this measure, which results in 12,894 observations with available data. Panel B also reports compensation measures and shows the mean (median) CEO in our sample earns $5.3 ($3.5) million and has 15 (15) firms in the selected peer group.

4. Assessment of Matching Procedures Used in Prior Research

As discussed in Section 2, the methodology used in prior research involves matching each individual firm in the selected peer group with a similar (non-peer) firm using propensity score matching (PSM). These studies then test whether there is a statistically significant difference in the central tendency of CEO compensation between the actual peer group and the matched peer group. Although PSM can be a useful matching approach when there are many explanatory covariates, it is necessary to demonstrate that the two groups exhibit covariate balance. Unfortunately, the covariate balance across the variables used in the first stage probit model is
typically not reported (e.g., Faulkender and Yang, 2010, 2013; Albuquerque et al., 2013).[9] Without covariate balance, it is problematic to attribute any observed differences in CEO compensation between the selected peers and the matched peer group to the selection process of the board of directors. Rather, the PSM process might have chosen firms with different economic fundamentals than the firms selected by the board.

[9] The one exception is Bizjak et al. (2011), which only reports the differences in size and performance across the matched pairs. While they find balance along operating performance, they find both economically and statistically significant differences in size between groups, so they urge some caution in interpreting this result.

In order to assess the covariate balance in prior research, we first estimate the following probit model:

$$\text{Peer}_{ijt} = \theta_1 + \sum_{m} \theta_m \, \text{Firm Characteristics}_{mijt} + \varepsilon_{ijt}, \qquad (1)$$

where $\text{Peer}_{ijt}$ is an indicator variable that is set to one if firm i uses firm j as a peer in year t. $\text{Firm Characteristics}_{mijt}$ is a vector of M firm attributes that capture important similarities between firm i and firm j. We consider two specifications of equation (1). The first specification follows the model of Faulkender and Yang (2010) and the second ("full") specification combines aspects of the specifications from Faulkender and Yang (2010) and Bizjak et al. (2011), as well as elements found in practitioner guidelines discussed by Equilar and ISS (Institutional Shareholder Services, 2015; Equilar, 2016).[10]

[10] We do not exactly follow the Faulkender and Yang (2010) specification. Specifically, we omit two variables: an indicator for whether both firms are in the Dow Jones Industrial Average index and the number of peers selected by the firm. We drop the Dow indicator because, in our sample with coverage comprising the Russell 3000, there are very few instances when this variable is non-zero (which makes convergence difficult). We also omit the number of peers, because including this variable would be equivalent to including a group fixed effect. Prior literature finds bias in finite-sample nonlinear fixed effects models (Greene, 2004). None of the remaining variables is fixed across all firm-peer pairs in a given firm-year.

The probit estimation results for both models are reported in Table 2. We find that comparable firm size (i.e., sales, assets, and market capitalization), both firms being included in popular market indices, similar industry, and both firms having CEO duality are primary
determinants of peer group selection. We also find that firms select peers based on whether they are losing management talent to specific firms (i.e., firm j has managers previously employed by firm i). In the full specification (column 3 of Table 2), we also find that comparable operating performance (ROA), location in the same metropolitan statistical area (MSA), and similar business complexity measured using geographical segments and business segments are related to peer group choice. Finally, sharing the same compensation consultant and being in the peer group for the firm being considered (a type of reciprocity where firm j uses firm i in its peer group) increases the probability of firm i selecting firm j as a peer. These results are broadly consistent with Bizjak et al. (2011), Faulkender and Yang (2010), and professional discussions regarding compensation peer firms.

Based on these models, we can match each peer for firm i to another firm with the same selection probability. The results of this PSM approach are presented in Table 3. Panel A presents results for the Faulkender and Yang (2010) specification and Panel B presents results for the full model specification. The first row of each panel reports the mean and median total compensation for the selected peers and the matched sample of peers. The mean (median) peer compensation is $6.08 ($4.00) million. The matched firms using the Faulkender and Yang (2010) specification have a mean (median) compensation of $5.32 ($3.15) million, while the full model matched firms have a mean (median) of $5.13 ($3.16) million. These matched pair differences for both specifications are statistically significant at conventional levels. These results are consistent with those of prior studies, which interpreted this statistically significant difference as evidence that boards of directors strategically choose peers with higher CEO compensation.

However, the problem with the PSM approach in prior research is that there is little covariate balance between the matched pairs, and this imbalance confounds the interpretation of the
15 compensation comparisons. This problem is clearly observed in Table 3. For example, we find that the selected peers are larger and more profitable than the matched peers, so any difference in compensation can easily be a result of differences in these firm characteristics that are known to be correlated with the level of CEO compensation. One way to quantify the quality of the matches is to use reasonable calipers around the observed measures for each peer chosen and calculate the fraction of matched firms that fall within a particular caliper. Specifically, we compute the fraction of matched firms within 50%-200% of the chosen peer s revenue, and we find that only 57% of matched firms satisfy this criterion. Using other proxies for size, we find that 58% and 50% of matched firms are in the 50%-200% range of total assets and market capitalization, respectively. Finally, we find that matched firms share the same two-digit (threedigit) GICS code in only 81% (73%) of the matches. One reason for the poor covariate balance in PSM is that the choice of potential peers is only based on the overall probability score. For example, consider Alliance Data Systems (ADS, which was discussed in Section 2) and one of its peers, Discover Financial Services (DFS). DFS is within ADS market capitalization caliper of 50%-200% and ROA caliper of ±3%. Like ADS, DFS is also in the S&P 500 and has multiple business segments. However, DFS is a financial services company, and it does not share the same two-digit GICS as ADS, an information technology company. While the companies are classified into different two-digit GICS codes, they are comparable in their lines of business and end customers, a similarity which is not captured by GICS. Another firm with the exact same propensity score as DFS is a petroleum and natural gas company, Chesapeake Energy. Like DFS, Chesapeake satisfies the size and ROA calipers and is also in the S&P 500 with multiple business segments. 
While neither Chesapeake Energy nor DFS has a two-digit GICS that matches that of ADS, Chesapeake Energy is a highly implausible peer firm for ADS. Even if Chesapeake Energy satisfied all of the calipers, given that the two businesses are vastly different, it is unlikely that ADS would have ever considered Chesapeake Energy as a viable peer. The PSM approach used in prior research does not allow for the important practical setting where a firm matches on many variables that are less important (e.g., geographical location), but fails to match on the most critical variables (e.g., size and industry).[11]

This example suggests that a sizable fraction of propensity score matched (non-selected) peer firms are poor substitutes for the chosen peers. This notion is further reinforced by the absence of covariate balance shown in Table 3. Overall, there are serious concerns about the appropriateness of, and the conclusions based on, the matching procedure used in prior research.

5. Peer Portfolio Percentile (PPP) Measure

In order to address the problems with propensity score matching, we shift the unit of analysis from the selection of individual firms to the portfolio of peer firms. This focus on the portfolio also mimics the decision process by the board of directors. Although compensation committees clearly assess individual firms for inclusion in the peer group, they are ultimately selecting a portfolio of firms that informs or justifies their choice for the level of CEO compensation.[12]

[11] This incongruity raises an important concern with PSM. This method implicitly assumes that certain values of some variables can offset the poor matching of other variables. However, in this setting, certain variables are more important than others and a significant difference along a critical dimension may cause the potential firm to never be considered. The possibility that some firms are never considered is described in mathematics as non-archimedean geometry. Consider a discrete choice model with two covariates: $P = f(\beta_1 X_1 + \beta_2 X_2) + \varepsilon$. Assume that $X_1$ is a measure of similarity in industry between the potential peer and the firm in question and $X_2$ is a measure of the similarity in size between the potential peer and the firm in question. If we assume that both $\beta_1$ and $\beta_2$ are positive, a low value of $X_1$ can be offset by a high value of $X_2$ when computing $f(\beta_1 X_1 + \beta_2 X_2)$. However, if $X_1$ is so small that the potential peer is never considered, similarity in $X_2$ cannot overcome the deficiency. This practical hierarchical ordering is not captured by traditional PSM, and unusual (and likely inappropriate) matches can be produced.

[12] This portfolio perspective implies that the probability of a firm being selected as a peer is a function of the characteristics of both the firm being evaluated (e.g., size, industry, CEO compensation level, etc.) and the compensation levels for other firms already included in the peer group. In this setting, the estimated probabilities in PSM from the first-stage probit that ignore the role of other firms in the peer group will be biased and inefficient, which raises additional concerns about using PSM (Arpino and Mealli, 2011; Arpino et al., 2016).
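The offsetting behavior described in footnote 11 can be illustrated with a small numerical sketch. The code below is an illustration only and is not part of the original analysis: the function names, the unit coefficients, the similarity scores, and the screening threshold `min_x1` are all hypothetical, and the standard normal CDF is assumed for f. It shows two candidates that receive an identical compensatory score even though one of them fails badly on industry similarity, and how a hierarchical screen would exclude that candidate.

```python
import math

def probit_index(x1: float, x2: float, b1: float = 1.0, b2: float = 1.0) -> float:
    """Compensatory score f(b1*x1 + b2*x2), with f assumed to be the standard normal CDF."""
    z = b1 * x1 + b2 * x2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def screened_score(x1: float, x2: float, min_x1: float = -1.0) -> float:
    """Hierarchical (non-compensatory) score: a candidate whose industry
    similarity x1 falls below the hypothetical floor min_x1 is never considered."""
    if x1 < min_x1:
        return 0.0
    return probit_index(x1, x2)

# Hypothetical standardized similarity scores (x1 = industry, x2 = size).
similar_industry = (1.0, 0.0)     # close in industry, average size match
unrelated_industry = (-3.0, 4.0)  # very different industry, very close in size

for label, (x1, x2) in [("similar industry", similar_industry),
                        ("unrelated industry", unrelated_industry)]:
    print(label, round(probit_index(x1, x2), 4), round(screened_score(x1, x2), 4))
```

Both candidates have the same index value of 1.0, so a score-based match cannot tell them apart, while the screened version assigns zero to the candidate from the unrelated industry.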

We implement this notion using a measure that compares the median compensation for the selected peer group to the distribution of medians for all alternative peer groups that could have been reasonably selected. We denote this measure by Peer Portfolio Percentile (PPP). PPP enables us to assess whether the selected peer group produces a CEO compensation benchmark that is at the 1st, 50th, 99th, or any other percentile of the distribution of plausible peer groups. This distributional measure captures whether the peer group choice by the board of directors results in an unusually high or low compensation benchmark for the CEO.

In our PPP computation, we use the total dollar amount of the peer firm CEO's compensation (Total Compensation) as reported in the SEC-required summary compensation table (DEF 14A). This measure includes cash compensation, bonus pay, payouts from long-term incentive plans, and the valuation of option grants. This amount is frequently used by proxy advisory firms and commonly reported in the financial press.

To construct PPP, we first identify the universe of plausible peers for a given firm. We apply an industry filter based on two-digit GICS or the Hoberg-Phillips text-based industry classification and a size caliper based on the restriction that a potential peer must have either revenue or a market capitalization between 50% and 200% of the firm in question.[13] We also include all firms with any talent flows to or from the firm in question. We measure talent flows using BoardEx data to determine whether any officer or senior manager at the potential peer or the firm has ever been employed as an officer or senior manager at the other company. As discussed in Section 2.1, we choose these calipers because industry, revenue, talent, and market cap are cited as the most common peer selection criteria by companies, and 0.5x and 2.0x are common cutoffs for the revenue criterion (Equilar, 2016).
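The filters just described can be sketched in code. The sketch below is one possible reading and not the authors' implementation: it assumes a candidate qualifies if it shares the two-digit GICS code and passes the size caliper on revenue or market capitalization, or if it has a talent flow with the focal firm. The Hoberg-Phillips classification is omitted, and the `Firm` fields, function names, and example firms are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Firm:
    name: str
    gics2: str         # two-digit GICS sector code
    revenue: float     # any common currency unit
    market_cap: float

def within_caliper(candidate: float, focal: float, lo: float = 0.5, hi: float = 2.0) -> bool:
    """True if the candidate's value lies between 50% and 200% of the focal firm's value."""
    return lo * focal <= candidate <= hi * focal

def plausible_peers(focal: Firm, universe: list[Firm], talent_links: frozenset = frozenset()) -> list[Firm]:
    """Assumed rule: (same industry AND size caliper on revenue or market cap) OR talent flow."""
    peers = []
    for firm in universe:
        if firm.name == focal.name:
            continue
        same_industry = firm.gics2 == focal.gics2
        size_ok = (within_caliper(firm.revenue, focal.revenue)
                   or within_caliper(firm.market_cap, focal.market_cap))
        if (same_industry and size_ok) or firm.name in talent_links:
            peers.append(firm)
    return peers

# Hypothetical firms for illustration only.
focal = Firm("Focal", "25", revenue=1000.0, market_cap=800.0)
universe = [
    Firm("A", "25", revenue=600.0, market_cap=3000.0),   # same industry, revenue in caliper
    Firm("B", "25", revenue=400.0, market_cap=5000.0),   # same industry, fails both size calipers
    Firm("C", "10", revenue=1000.0, market_cap=800.0),   # size matches, different industry
    Firm("D", "40", revenue=50.0, market_cap=60.0),      # included only through a talent flow
]
print([f.name for f in plausible_peers(focal, universe, frozenset({"D"}))])
```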
[13] We provide sensitivity analyses in Section 8 to assess whether our results are substantively affected by the choice of filters to identify the universe of plausible peers.

While the probit results reported in Table 3 show that the sales caliper has a greater elasticity than our other two size proxies (i.e., market cap and assets), we also include a market cap caliper because, for firms such as growth firms, market cap is a more suitable measure of size.[14] The probit results also indicate that common industry membership has a higher elasticity than most other firm characteristics.

We then calculate the empirical distribution of median CEO compensation for all possible peer groups of the same size as the number of firms selected. Next, we determine the percentile ranking of the actual selected median pay, relative to this empirical distribution. Figure 2 demonstrates the calculation of PPP for American Axle & Manufacturing Holdings (AAM), which reported 20 compensation peers in fiscal year [...]. Based on the talent flows, industry, and size calipers, AAM had 240 potential peers. Based on the combinatorics of selecting 20 from a group of 240, there are [...] possible sets of 20 peers. Panel A of Figure 2 presents the histogram of these possible median CEO compensation amounts and their corresponding probabilities. AAM's chosen peer group had a median CEO compensation of $5.73 million, which is shaded in red. Panel B of Figure 2 maps this chosen median to the empirical distribution function of all potential medians and shows that it is at the 69.6th percentile of this distribution. Because the chosen median is at the 69.6th percentile, AAM has a PPP of 69.6 in [...].[15]

Naïvely calculating the medians of all possible combinations of firms is computationally infeasible. We circumvent this limitation by observing that the upper bound on the number of possible medians is far lower than the number of combinations of peers that produce these medians.
For example, if the number of peers is odd, the median must be the compensation of one of the selected firms, so there are at most n possible medians, whereas the number of combinations that can produce a median is much higher (i.e., n choose k). Specifically, we develop an algorithm that is computationally feasible and identifies the percentile of the median chosen relative to the distribution of all possible medians from the firm's potential peer set. We outline this algorithm in Appendix B.

[14] The mean (median) number of potential peers generated by this approach is 194 (197) firms, which is considerably larger than the mean (median) peer group size of 15 (14). This caliper also selects a mean (median) of 65% (69%) of the peers actually selected by a firm.

[15] It may be the case that a firm selects peers that are not in the set of plausible peers (e.g., our example firm in Section 4, ADS, selected Discover Financial Services, which is outside of its two-digit GICS code). Our PPP measure can reasonably be interpreted as comparing the median of the selected peer group to medians of all potential peer groups, where all potential peers satisfy the caliper restrictions.

The distribution of PPP for our sample of firm-years is presented in Panel A of Figure 3. This measure ranges from 0 to 100, where 0 (100) indicates that the firm chose a median compensation, based on its peer group of size k, that was the lowest (highest) compensation relative to all possible combinations of k peers it could have selected from its plausible peer set. Panel B of Figure 3 reports descriptive statistics for PPP. The mean (median) PPP is 72.7 (87.4). Assuming our calipers accurately capture the set of potential peers, an average PPP that is greater than 50 suggests firms systematically choose relatively more highly paid peers than random selection from their potential peer set. Table 4 compares the decile of PPP in year t to the decile of PPP in year t+1 and finds that PPP exhibits persistence over time. This persistence is one validation for our measure because we would expect the board of directors' objectives when selecting a peer group to be relatively stable over time.

Figure 3 shows that the distribution of PPP is relatively uniform over most of the range of values. However, the frequency of firms substantially increases at larger values of PPP, with a mass of firms having a PPP at, or just below, 100. Approximately 47% of our sample has a PPP greater than 90, and it is important to understand whether this feature of the distribution is a reasonable outcome or something that is induced by a weakness in our measurement approach.
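To make the counting idea concrete, the following sketch computes a PPP-style percentile without enumerating peer groups. It is not the Appendix B algorithm, which is not reproduced here. It assumes an odd group size k and defines the percentile as the share of all k-subsets of the potential peer pool whose median is strictly below the chosen median. The function names are hypothetical. The key observation is that a pool member with r lower-ranked and n-1-r higher-ranked members is the median of exactly C(r, m) x C(n-1-r, m) subsets, where m = (k-1)/2.

```python
from itertools import combinations
from math import comb
from statistics import median

def ppp_exact(pool, chosen_median, k):
    """Share (in percent) of k-subsets of `pool` whose median is strictly below
    `chosen_median`, counted in closed form. Assumes k is odd."""
    if k % 2 == 0:
        raise ValueError("this sketch only handles odd peer group sizes")
    values = sorted(pool)
    n, m = len(values), k // 2
    below = 0
    for r, value in enumerate(values):
        if value < chosen_median:
            # Choose m members ranked below position r and m ranked above it.
            below += comb(r, m) * comb(n - 1 - r, m)
    return 100.0 * below / comb(n, k)

def ppp_brute_force(pool, chosen_median, k):
    """Same quantity by full enumeration; only feasible for tiny pools."""
    medians = [median(group) for group in combinations(pool, k)]
    return 100.0 * sum(value < chosen_median for value in medians) / len(medians)

# Cross-check on a tiny hypothetical pool of pay levels.
small_pool = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.3]
print(ppp_exact(small_pool, 4.0, 3), ppp_brute_force(small_pool, 4.0, 3))

# Hypothetical pool of 100 distinct pay levels; 81 is the 20th highest value.
print(round(ppp_exact(range(1, 101), 81, 11), 1))
```

With the hypothetical pool of 100 distinct pay levels, a group size of 11, and a chosen median equal to the 20th highest value, the function returns roughly 99.2 under this strict-inequality definition.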
The mass of firms with large values of PPP occurs because the combinatorics of selecting a median for a subset of peers exacerbates any deviation from the median of all potential peers. To make this combinatorics problem more concrete, suppose a firm has 100 potential peers and selects 11 of these to be in its peer group. Assume the firm selects a peer group with peer P as its chosen median, where P is the 20th highest paid peer of the 100. There are [...] possible peer groups of size 11 from this set of 100 (i.e., 100 choose 11). However, there are only [...] sets with a median larger than peer P. This substantial difference occurs because it is far more likely that a random sample of 11 peers from this 100 will have at least 6 firms with pay less than P (and hence, have a median less than P). Therefore, despite choosing a median peer that is only at the 80th percentile of all potential peers (i.e., peer P is the 20th highest paid of the 100), the PPP in this example would be close to the maximum possible PPP (i.e., PPP = 99.2).[16]

Due to this feature of the potential peer set, when firms choose a set of peers with higher median compensation than a random draw from the potential peer set, the PPP is often very high. Most of the potential medians will be below the median chosen, which explains why our sample includes a large proportion of high PPP firms.

Still, we want to ensure that our distribution of PPP is not induced by our measurement approach. For example, PPP might exhibit a predictable pattern with firm size or when the number of peers chosen by the firm (k) is small relative to the number of potential peers (n). To examine these concerns, we assess whether firm characteristics vary across the different values of PPP (Table 5). Panel A reports mean values and Panel B reports medians. Column 2 (3) finds that observations with the largest market capitalization (revenue) are concentrated at values of PPP=0, but there is a non-monotonic relation between this variable and PPP.
This grouping of large firms with small PPP values seems to occur because column 4 shows these observations have 33 potential peers compared to the average of 194 for the entire sample. Therefore, our caliper may be ill-suited for the largest firms, which may be more likely to look outside their industry for potential peers. Column 5 shows that the average (median) number of peers selected by the firms, with the exception of low PPP firms, is relatively constant across values of PPP with means (medians) in the range [...] (14-16). Except for observations with PPP=0, over 50% of the selected peers are captured by our caliper. For observations with PPP=0, only 31% are captured by our caliper. Again, we interpret this low figure as an indication that our measure may be ill-suited for the largest firms, which select peers outside of their industry. Although the mean and median descriptive statistics are not identical across PPP groupings, the differences are substantively modest and do not suggest that the distribution of PPP is induced by our computational approach.

[16] If firms choose peers outside of the potential set of peers with compensation that is higher than most (or all) potential peers, PPP will be even higher than the resulting PPP from this example. In such a case, it is even less likely that the selected median could be constructed from randomly drawing firms from the set of potential peers.

6. Econometric Approach

Similar to prior research on compensation more broadly, we hypothesize that there are two competing influences on peer group selection: (i) attraction and retention of top executive talent and (ii) executive rent extraction (i.e., agency concerns). If firms choose a high paying peer group to attract more talented executives who command higher compensation, we should find that the selection of more highly paid peers is positively associated with future firm performance.[17]

[17] Although they do not examine the compensation benchmark produced by peer group selection, Francis et al. (2016) find that firms that select peers with greater managerial ability exhibit larger stock returns and operating performance. Their result is consistent with the talent attraction and retention story.

Conversely, if higher peer group pay reflects rent extraction, we should observe a negative relation between peer group pay and future performance. Thus, similar to the approach used by Albuquerque et al. (2013), we use the sign and statistical significance of the association
between our measure of peer group pay and future firm performance to distinguish between these two hypotheses.

Both talent and agency motivations are likely to be observed in a large cross-section of firms. Therefore, it is important for the econometric approach to allow for a mixture of firms where the estimated coefficient linking PPP with future firm performance can be positive or negative for different subsets of firms.[18] We incorporate mixture features by employing Latent Class Analysis (LCA). Specifically, we place firms into different homogeneous clusters depending on the sign and statistical significance of the association between peer group choice and future firm performance. Once these clusters are identified, we can uncover the distinguishing factors associated with the observations in each cluster. For example, we would expect firms with a negative association between peer group choice and future operating performance to exhibit characteristics associated with poor corporate governance. In contrast, we would expect firms with a positive association to have more talented CEOs.[19]

[18] The previous literature has also assumed that the same econometric model is applicable to all firms. However, if different firms select peer groups for different reasons, namely talent attraction versus rent extraction, this heterogeneity should be part of the econometric model. For example, consider a setting where there is a large cluster of firms that chooses peers based on talent objectives and a smaller cluster of firms that selects peers for rent extraction. If one model is used to characterize all firms, it will be very difficult to detect the rent extraction motivation because the results will be dominated by the large cluster of firms selecting peers based on talent objectives. Alternatively, if the rent extraction and aspirational clusters are of a similar size, but have coefficients of different signs, a pooled analysis is likely to find no association between PPP and future performance.

[19] The typical pooled regression approach used in prior research does not easily allow for this type of heterogeneity. For example, heterogeneity might be accommodated using interaction terms to see how the relation between PPP and future performance varies with another variable. However, interactions are subject to multicollinearity concerns because the main effects and associated interactions are included in the same regression model. Moreover, it is necessary for the researcher to know ex-ante which variables should be used in the interactions. For example, if corporate governance is assumed to be an important interactive effect, it is unclear which of the many corporate governance variables should be used (e.g., board structure, ownership, compensation plan design, anti-takeover provisions, etc.).

The LCA model assumes that the observed data can be characterized by P clusters, each with a different set of coefficient values. These subpopulations are called latent classes
because each observation's class membership is not directly observed. We assume the dependent variable is distributed as a finite mixture of normal distributions so the likelihood expression is

$$L = \prod_{i=1}^{N} \left[ \sum_{p=1}^{P} \lambda_p \left(2\pi\sigma_p^2\right)^{-1/2} \exp\left( -\frac{(y_i - X_i B_p)^2}{2\sigma_p^2} \right) \right] \qquad (2)$$

where $\lambda_p$ is the unknown proportion of the sample that is contained in cluster $p$, $\sigma_p$ is the standard deviation of the error term in cluster $p$, and $N$ is the sample size. $B_p$ represents the coefficients of the linear model for cluster $p$. We assign firms to different clusters to maximize the likelihood function in equation (2). We utilize hard clustering, where each observation is assigned to the cluster with the greatest posterior probability.[20]

To determine the optimal number of clusters, we examine the fit statistic represented by the Bayesian Information Criterion (BIC) (Nylund et al., 2007). The BIC is computed as $-2\log(L) + P\log(N)$, where $L$ is the likelihood, $P$ is the number of classes, and $N$ is the number of observations. To determine the optimal number of classes that describes our data, we increase $P$ until it no longer leads to an appreciable improvement in the BIC. Once the number of clusters is determined, we can probabilistically assign each observation to each of the $P$ clusters.

[20] In untabulated analyses, we also repeat our analyses using soft clustering, where each observation is weighted in a weighted least squares regression based on the posterior probability that it belongs in a given cluster.

The fundamental relationship of interest is the statistical association between peer group choice (PPP) and future firm performance, after controlling for other variables shown in prior literature to be related to future performance. Both accounting and stock return performance are candidates for measuring firm performance. However, if the stock market fully incorporates future cash flow implications of peer choice into price, we are unlikely to detect an effect on
future stock returns. Thus, we focus on one-year-ahead annual return on assets (ROA) as our measure of future performance.[21] The basic regression equation of interest is:

$$ROA_{t+1} = \beta_0 + \beta_1 PPP_t + \beta_2 LogSales_t + \beta_3 Log(StdROA_t) + \beta_k Industry_k + \beta_j Year_j + \upsilon \qquad (3)$$

where $ROA_{t+1}$ is Compustat IB divided by Compustat AT in year t+1, multiplied by 100. Following Core et al. (1999) and Albuquerque et al. (2013), our control variables consist of LogSales and Log(StdROA), where LogSales is the natural logarithm of one plus Compustat REVT, and Log(StdROA) is the natural logarithm of the standard deviation of the last five fiscal years' ROA. Industry fixed effects, where $Industry_k$ denotes two-digit GICS, eliminate the common industry trends of $ROA_{t+1}$. Year fixed effects ($Year_j$) eliminate the common time trend across firms.[22]

Because of our focus on the $\beta_1$ coefficient, we want the LCA to reveal whether there are different clusters for this coefficient of interest, rather than clusters produced by differences in the coefficients on the control variables. In order to adjust for the control variables, we first estimate first-stage regressions to obtain residuals for ROA and PPP:

$$ROA_{t+1} = \gamma_1 LogSales_t + \gamma_2 Log(StdROA_t) + \gamma_k Industry_k + \gamma_j Year_j + \varepsilon_{ROA} \qquad (4a)$$

$$PPP_t = \delta_1 LogSales_t + \delta_2 Log(StdROA_t) + \delta_k Industry_k + \delta_j Year_j + \varepsilon_{PPP} \qquad (4b)$$

We represent this procedure as a linear model equivalent to equation (3):

$$\varepsilon_{ROA} = \beta_0 + \beta_1 \varepsilon_{PPP} + \upsilon \qquad (5)$$

[21] As we report in Section 8, our results are robust to extending operating performance to the three-year-average return on assets.

[22] There may be industry-specific or year-specific components in the relation between ROA and PPP that should not be removed using fixed effects (e.g., firms in a given industry might choose PPP in a way that is systematically related to future performance, and we want to capture this fixed component). Therefore, we repeat our analyses excluding year and industry fixed effects. Removing these fixed effects does not substantively change our results.
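The equivalence between the full regression and the residual-on-residual regression can be checked numerically. The sketch below is an illustration on simulated data, not the paper's estimation code: all numbers, variable names, and helper functions are hypothetical, the fixed effects are omitted for brevity, and a constant is included among the controls. It residualizes simulated ROA and PPP on the controls and compares the resulting slope with the PPP coefficient from the full regression.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(y, columns):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)

def residuals(y, columns):
    coefs = ols(y, columns)
    return [yi - sum(c * col[i] for c, col in zip(coefs, columns)) for i, yi in enumerate(y)]

def compare_slopes(n=200, seed=1):
    """Return the PPP slope from the full regression and from the two-step residual regression."""
    rng = random.Random(seed)
    const = [1.0] * n
    log_sales = [rng.gauss(0.0, 1.5) for _ in range(n)]      # simulated, centered controls
    log_std_roa = [rng.gauss(0.0, 0.8) for _ in range(n)]
    ppp = [60.0 + 4.0 * s + rng.gauss(0.0, 15.0) for s in log_sales]
    roa = [2.0 - 0.02 * p + 0.8 * s - 0.5 * v + rng.gauss(0.0, 3.0)
           for p, s, v in zip(ppp, log_sales, log_std_roa)]
    controls = [const, log_sales, log_std_roa]
    beta_full = ols(roa, [const, ppp, log_sales, log_std_roa])[1]
    e_roa, e_ppp = residuals(roa, controls), residuals(ppp, controls)
    beta_two_step = sum(a * b for a, b in zip(e_ppp, e_roa)) / sum(a * a for a in e_ppp)
    return beta_full, beta_two_step

print(compare_slopes())
```

The two slopes agree up to floating-point error, which is the standard partialling-out (Frisch-Waugh-Lovell) result for OLS.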

The coefficient estimate $\beta_1$ and its standard error are the same in equations (3) and (5). The partial correlation coefficient, $\beta_1$, measures the correlation between two variables after removing the effect of a set of control variables. We use the univariate regression equation (5) as the regression model for the LCA. Once the number of clusters is determined using the BIC criterion, we probabilistically assign each observation to a cluster by computing the estimated posterior probability from the likelihood function associated with the finite mixture of normal distributions.

After assigning observations to clusters, model (3) can be estimated separately for each cluster, and we can test for differences in estimated coefficients across clusters. For example, if there are two distinct clusters (P = 2), we are interested in whether cluster one has a positive value for $\beta_1$ (suggestive of an aspirational interpretation) and cluster two has a negative value for $\beta_1$ (suggestive of a rent extraction interpretation).[23] Similarly, if there are more than two distinct clusters, we might expect that some clusters have a positive value and others have a negative value for $\beta_1$ where the magnitudes vary across clusters. To confirm these interpretations, we then determine whether clusters with a negative coefficient have governance attributes commonly associated with weak oversight and clusters with a positive coefficient have attributes commonly associated with the search for executive talent.

[23] In this example, one cluster consists of observations where increases in PPP are associated with increases in future operating performance, and this raises the question of why all firms do not increase PPP to the maximum value of 100. In the aspirational cluster, the endogenous choice of PPP is a function of the talent level desired by each firm (which is a function of various exogenous variables such as the level of competition, capital stock, intellectual property, and other value drivers of the firm). Each firm in this cluster would presumably select a talent level that maximizes firm value given the relevant exogenous variables. Since these exogenous variables will vary across firms, we would not expect all firms to select the same talent level, and thus we should observe different choices for PPP in this cluster. A similar argument can be made for the second ("rent extraction") cluster, which consists of observations where increases in PPP are associated with decreases in future operating performance. There are likely to be exogenous differences in self-interest and corporate governance across firms, and thus we should observe a range of PPP choices for this cluster of firms.
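The estimation steps described above (a finite mixture of normal regressions, posterior-based hard assignment, and BIC comparison) can be sketched with a basic EM routine. This is an illustrative sketch on simulated data rather than the authors' LCA software: the function names, the slope-spreading initialization heuristic, the simulated slopes of -1 and +1, and the BIC penalty based on the count of free parameters (4P - 1 in this univariate setting) are all assumptions.

```python
import math
import random

def wls(x, y, w):
    """Weighted least squares of y on [1, x]; returns (intercept, slope, sigma)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return intercept, slope, max(math.sqrt(sse / sw), 1e-6)

def e_step(x, y, lam, params):
    """Posterior class probabilities and mixture log-likelihood, computed in log space."""
    resp, loglik = [], 0.0
    for xi, yi in zip(x, y):
        logs = []
        for weight, (a, b, s) in zip(lam, params):
            e = yi - a - b * xi
            logs.append(math.log(max(weight, 1e-12))
                        - 0.5 * math.log(2 * math.pi * s * s) - e * e / (2 * s * s))
        top = max(logs)
        log_total = top + math.log(sum(math.exp(v - top) for v in logs))
        resp.append([math.exp(v - log_total) for v in logs])
        loglik += log_total
    return resp, loglik

def fit_mixture(x, y, n_classes, n_iter=200):
    """EM for a mixture of simple regressions; returns (params, shares, hard labels, BIC)."""
    n = len(x)
    _, b0, s0 = wls(x, y, [1.0] * n)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    params = []
    for p in range(n_classes):
        # Heuristic start (an assumption): spread initial slopes around the pooled slope.
        offset = 0.0 if n_classes == 1 else p / (n_classes - 1) - 0.5
        slope = b0 + offset * sd_y / sd_x
        params.append((mean_y - slope * mean_x, slope, s0))
    lam = [1.0 / n_classes] * n_classes
    for _ in range(n_iter):
        resp, _ = e_step(x, y, lam, params)
        for p in range(n_classes):
            w = [r[p] for r in resp]
            lam[p] = sum(w) / n
            params[p] = wls(x, y, w)
    resp, loglik = e_step(x, y, lam, params)
    labels = [max(range(n_classes), key=lambda p: r[p]) for r in resp]  # hard clustering
    bic = -2.0 * loglik + (4 * n_classes - 1) * math.log(n)
    return params, lam, labels, bic

def simulate(n_per_class=200, seed=0):
    """Simulated residual pairs from two classes with opposite-signed slopes."""
    rng = random.Random(seed)
    x, y = [], []
    for slope in (-1.0, 1.0):
        for _ in range(n_per_class):
            xi = rng.uniform(-10.0, 10.0)
            x.append(xi)
            y.append(slope * xi + rng.gauss(0.0, 0.5))
    return x, y

x_sim, y_sim = simulate()
for n_classes in (1, 2):
    fitted, shares, _, bic = fit_mixture(x_sim, y_sim, n_classes)
    print(n_classes, [round(b, 2) for _, b, _ in fitted], round(bic, 1))
```

In this simulated setting the pooled slope is close to zero while the two-class fit separates a negative-slope and a positive-slope class, which mirrors the motivation for using a mixture model instead of a single pooled regression.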

7. Results

7.1 Relation between PPP and Future Performance

The estimates for equation (3) and the equivalent equation (5) used in the LCA are reported in Table 6, Columns 1 and 2. The coefficients $\beta_1$ in both columns are negative and statistically significant. These findings contrast with those from prior research, suggesting that in the pooled sample there is a negative association between higher-paying peers and future operating performance.

When we apply LCA, we find that three clusters emerge when we use the BIC criterion. Columns 3, 4 and 5 of Table 6 report results from estimating equation (3) for these three clusters. In the smallest cluster (Column 3), the coefficient on PPP is negative and significant (-0.021, p < 0.01). In the largest cluster (Column 4), the coefficient on PPP is also negative and significant (-0.019, p < 0.01). However, in the third cluster (Column 5), the coefficient is positive and significant (0.010, p < 0.01).[24]

[24] The adjusted $R^2$ in Column 5 is 87%. Removing the industry and year fixed effects, the adjusted $R^2$ drops to 53%. Removing the control variables from the regression, in addition to removing fixed effects, decreases the adjusted $R^2$ to 11%.

In Columns 6 and 7, we test whether $\beta_1$ is significantly different across clusters by pooling the sample and interacting cluster membership with our variable of interest, $\varepsilon_{PPP}$. The insignificance of the coefficients on Cluster 2 × $\varepsilon_{PPP}$ in Column 6 and Cluster 1 × $\varepsilon_{PPP}$ in Column 7 implies that $\beta_1$ in Column 3 is not statistically different from $\beta_1$ in Column 4. However, the coefficient on Cluster 3 × $\varepsilon_{PPP}$ is statistically significant (p < 0.01) in both Column 6 (comparing Cluster 1 to 3) and Column 7 (comparing Cluster 2 to 3), which suggests that the positive coefficient is significantly different from both negative coefficients.

As we describe in further detail in Table 7, the first (second) cluster is comprised of smaller (larger) firms, so we refer to it as the "small-sized rent extraction" ("large-sized rent extraction")
cluster. We label these two clusters as rent extraction clusters because of the negative association between PPP and future operating performance. As reported by the summary statistic Fraction of Firms in Table 6, the small-sized rent extraction cluster contains just 6% of firm-years, whereas the large-sized rent extraction cluster contains 62% of firm-years. We refer to Cluster 3, the cluster with a positive coefficient on PPP, as the aspirational cluster. The aspirational cluster contains 32% of firm-years.

As expected, we find that cluster membership is relatively stable within firm (untabulated). It is unlikely many firms can substantially shift their compensation program from one year to the next. Specifically, 66% of aspirational cluster firms remain in the aspirational cluster in the next year, 50% of small-sized rent extraction firms remain in this cluster in the next year, and 78% of large-sized rent extraction firms remain in this cluster in the next year. At the firm level, 12% of the 2,888 unique firms are always in the aspirational cluster, and the remaining 88% of firms are assigned into one of the two rent extraction clusters for at least one year. In addition, the aspirational cluster percentages range from 25% (in 2008) to 34% (in 2011), which indicates that cluster membership is not concentrated in any single year.

7.2 Validation of Measure and Cluster Characteristics

Although we label clusters as rent extraction and aspirational based on the sign of the coefficient linking PPP to future operating performance, it is necessary to examine the attributes of the firms in these clusters to validate our PPP measure. Furthermore, assessing cluster differences provides new insights into attributes that distinguish rent extraction and aspirational firms. In Table 7, we assess differences between the rent extraction and aspirational clusters in five categories of attributes: (i) firm characteristics, (ii) compensation characteristics, (iii) CEO talent measures, (iv) entrenchment and board structure measures, and
(v) realized governance measures. We report the mean values for the three clusters in Columns 1-3 and the pairwise differences between clusters in Columns [...].[25]

[25] All tests of significance for differences between clusters are derived from the empirical distribution of bootstrapped samples. We determine significance with bootstrapped samples because firms appear multiple times in our sample (i.e., they are present in multiple years). As many of our descriptive measures are autocorrelated, traditional test statistics would tend to be inflated. To construct the empirical distribution, we randomly assign, with replacement, observations into the three clusters, preserving the proportion of observations in each cluster. We then compute the relevant statistic for the bootstrapped clusters and repeat this entire process 1,000 times. We then compare the actual statistic to the empirical distribution of all bootstrap observations to determine significance.

Category (i) provides summary measures of firm characteristics. Our three proxies for size are market capitalization (Market Cap), assets (Assets), and sales (Sales). We also examine differences in profitability (ROA). Smaller firms are less visible and are subject to less outside scrutiny, making them more likely to extract rents. Similarly, we expect less profitable firms to be extracting rents from shareholders. Consistent with these expectations, we find that Cluster 1 is comprised of smaller and less profitable firms than the aspirational cluster (Column 3). Interestingly, the second rent extraction cluster (Column 2) is composed of the largest and most profitable firms in our sample (Columns 4-6). This finding suggests that rent extraction is not confined to smaller, more obscure firms that are not performing well.

The second category of attributes we examine consists of CEO pay characteristics. We examine total compensation across clusters (Total Compensation) and acknowledge that compensation is affected by other firm characteristics such as firm size. Category (ii) also includes the fraction of positive say-on-pay votes (ISS For SOP). Ertimur et al. (2013) show that proxy advisory firms can act as information intermediaries for institutional shareholders by gathering and processing information related to executive pay. Therefore, we expect ISS say-on-pay support to be lower for rent extraction clusters. The first two rows of Category (ii) report the comparison for compensation and ISS support for say-on-pay. The aspirational firms have significantly higher total compensation than the small-sized rent extraction firms (difference of
$1.06, p<0.01) and significantly lower total compensation than the large-sized rent extraction firms ($0.72, p<0.01). These differences are likely due to inherent differences in firm size as reported in Category (i). The small-sized rent extraction cluster has 9% lower ISS support than the aspirational cluster (p<0.01), but we find no difference in ISS support between the aspirational cluster and the large-sized rent extraction firms. The results are consistent with ISS's focus on poorly performing firms (Ertimur et al., 2013).

In Category (ii) we also examine whether there are differences in the number of peers (Number of Peers) across clusters. Because PPP is determined from a set of reasonable potential peers, we predict rent extraction firms will choose fewer peers from this reasonable set (Chosen in Caliper). In row 3 of Category (ii), we report that both rent extraction clusters have significantly fewer peers, but the magnitudes of these differences are economically small as all clusters have a mean of roughly 15. Both the small- and large-sized rent extraction clusters are less likely to choose peers in their potential peer set caliper (-0.07, p<0.01; -0.07, p<0.01), which validates that these firms are less likely to select reasonable peers. We also find that both rent extraction clusters have a significantly higher PPP.[26]

[26] A difference in PPP between clusters does not necessarily imply that some clusters are engaged in greater rent extraction than others. Rather, it is the relation between PPP and future performance that identifies rent extraction, and Table 6 shows the rent extraction clusters have a negative association with future performance. This negative relation, combined with the finding that the rent extraction clusters have larger PPP values, suggests these clusters are engaged in more rent extraction than the aspirational cluster.

Peer groups are often chosen with input from compensation consultants. Therefore, Category (ii) also includes two measures related to compensation consultants. We report the fraction of observations that use a top 10 compensation consultant (Top 10 Comp Consultant) and whether management retains its own consultant (Separate Comp Consultants). Armstrong et al. (2012) find that firms with weaker governance are more likely to use consultants to justify
30 compensation choices. We also examine whether or not the board uses a different compensation consultant than management, which might result in more independence and lower pay. Prior literature finds mixed results when examining separate compensation consultants (Murphy and Sandino, 2010). We find that both rent extraction clusters are more likely to use a large compensation consultant. The fraction of firm-years with a Top 10 Comp Consultant in Clusters 1, 2, and 3 are 0.82, 0.75, and 0.72, respectively. We find the board and CEO of firms in the small-sized rent extraction cluster are less likely to use separate compensation consultants and this difference is marginally significant. In Category (iii), our three proxies for CEO talent closely follow those in Albuquerque et al. (2013). We include characteristics of firms at which the CEO was previously employed in any position. These variables are abnormal ROA (CEO ROA), abnormal returns (CEO Returns), and log of firm size (CEO Firm Size). More talented CEOs are those that were previously at larger or better performing firms. Across all three CEO talent variables, the small-sized (large-sized) rent extraction cluster is comprised of less (more) talented CEOs relative to the aspirational cluster. Our results suggest that, while CEOs with low talent are extracting rents in small-sized firms, in the majority of rent extraction firms, CEOs actually have higher talent than those in the aspirational cluster. Our fourth category includes ex-ante measures of entrenchment and board structure. We include eleven different proxies for these structural governance characteristics, because it is difficult to measure governance quality ex-ante (e.g., Daines et al., 2010). Prior literature provides mixed evidence on the characteristics, and we make few predictions for these variables. Prior literature has used CEO tenure (CEO Tenure) as a proxy for the power of the CEO (Baker and Gompers, 2003; Coles et al., 2008). 
However, shorter CEO tenure could also suggest increased firm turmoil (Gilson and Vetsuypens, 1993). Prior literature has also studied the NEO/CEO pay ratio (Pay Ratio), CEO-Chairman duality (Is Chairman), and founder involvement (Founder Involved), with mixed results on the impact of these measures (O'Reilly et al., 1988; Willard et al., 1992; Main et al., 1993; Begley, 1995; Brickley et al., 1997; Goyal and Park, 2002; Bebchuk et al., 2011). We also examine whether the firm has a staggered board (Staggered Board). This characteristic may be a manifestation of poor governance (Bebchuk and Cohen, 2005) or can promote value creation by helping firms undertake long-term investments (Cremers et al., 2017). We study the fraction of the board that is composed of outside directors (Percent Outside Dir) and the fraction that is CEO-appointed (Fraction CEO Appoint). Prior literature has found mixed evidence on whether director independence is optimal for shareholders (e.g., Adams and Ferreira, 2007). Core et al. (1999) find that a higher fraction of CEO-appointed directors is symptomatic of poor governance. Busy boards (Busy Board) have been shown to be associated with weak governance, but the tests in prior literature may have low power (Core et al., 1999; Fich and Shivdasani, 2006; Ferris et al., 2003). We examine whether the firm is a dual class firm with unequal voting rights across multiple share classes (Dual Class). Gompers et al. (2009) and Masulis et al. (2009) find that dual class firms have agency problems related to the separation of ownership and control. We include insider ownership (Insider Ownership) and note that prior literature documents a non-monotonic relation with firm value (McConnell and Servaes, 1990). We also examine whether proxy advisory support for directors (ISS For Directors) is systematically different across clusters and expect lower support in rent extraction clusters.

Table 7, Category (iv) reports results for these structural governance measures.
We find that the small-sized rent extraction firms have lower CEO tenure than the aspirational firms ( years, p<0.01), which is consistent with shorter tenure being associated with increased turmoil. The average NEO/CEO pay ratio for the small-sized rent extraction cluster is higher than that of the aspirational cluster (0.04, p<0.01). Compared to the aspirational firms, the small-sized rent extraction firms are less likely to have a CEO who is also the Chairman (-0.13, p<0.01), and more likely to have a founder that is still involved in the firm (0.10, p<0.01). In addition, the small-sized rent extraction firms are 5% more likely (p<0.01) to have a staggered board than the aspirational firms. There are no significant differences in the percent outside directors and fraction of the board that is CEO appointed. Small-sized rent extraction firms are more likely to have a busy board (0.02, p<0.01), less likely to have a dual class structure (-0.01, p<0.1), and tend to have higher insider ownership than the aspirational cluster (0.03, p<0.01). Relative to the aspirational cluster, we find that small-sized rent extraction firms are less likely to have a positive ISS recommendation for directors (-0.04, p<0.05). Overall, we find evidence that small-sized rent extraction firms have weaker structural governance measures, compared to aspirational firms.

Comparing the structural governance measures of the large-sized rent extraction firms to those of the aspirational firms, we find fewer differences between these clusters. The large-sized rent extraction firms have lower CEO tenure and a lower NEO/CEO pay ratio than the aspirational cluster (-0.46, p<0.01; -0.02, p<0.01). We find no difference in CEO-chairman duality but observe they are more likely to have a founder that is still involved in the firm (0.04, p<0.01). We find no significant differences in the staggered board measure, percent outside directors, and fraction CEO appointed between the large-sized rent extraction and aspirational firms.
While the large-sized firms are more likely to have a busy board (0.03, p<0.01) and have higher insider ownership (0.01, p<0.01) than the aspirational firms, there are no significant differences in dual class structure or ISS support. Overall, category (iv) shows significant differences in structural governance between the small-sized rent extraction cluster and the aspirational cluster but finds weaker evidence of differences with the large-sized rent extraction cluster.

As poor governance is difficult to infer based on ex-ante measures, our final category in Table 7 compares differences in realized negative firm outcomes across the three clusters. Prior research finds that a higher probability of restatements (Accounting Restatements), internal control weaknesses (ICW), SEC enforcement actions (SEC Enforcement), and shareholder lawsuits (Shareholder Lawsuits) is associated with weak governance (e.g., Dechow et al., 1996; Agrawal and Chadha, 2005; Zhang et al., 2007; Larcker et al., 2007). Thus, we expect these realized measures to be higher in rent extraction clusters. We also include environmental, social, and governance (ESG) measures, as ESG concerns are associated with bad corporate behavior (e.g., Walls et al., 2012). Specifically, we include concerns related to products (Product Concerns), diversity (Diversity Concerns), employee relations (Employee Relations Concerns), and the environment (Environmental Concerns). We expect these concerns to be more prevalent in the rent extraction clusters.

The results reported in category (v) show that small-sized rent extraction firms have a 1.9% and 3.5% higher likelihood (p<0.1, p<0.01) of an accounting restatement and an internal control weakness, respectively. The small-sized rent extraction cluster has a significantly greater likelihood of a shareholder lawsuit (0.088, p<0.01), but fewer product concerns (-0.49, p<0.01). Finally, the small-sized rent extraction firms have more diversity (0.066, p<0.01) and employee relations concerns (0.067, p<0.01) than the aspirational cluster.

The overall tenor of results in category (v) is similar for large-sized rent extraction firms. While we observe no differences in the likelihood of an accounting restatement and internal control weakness between large-sized rent extraction firms and aspirational firms, we do find that large-sized rent extraction firms have nearly twice the likelihood of an SEC enforcement action as the firms in the aspirational cluster (p<0.01). They also have a significantly greater likelihood of a shareholder lawsuit (0.068, p<0.01). With respect to ESG concerns, the large-sized rent extraction firms have significantly more product (0.042, p<0.01), diversity (0.024, p<0.05), employee relations (0.063, p<0.01) and environmental concerns (0.015, p<0.05) than the aspirational cluster.

In sum, we find the small-sized rent extraction firms can be distinguished along several dimensions, while the large-sized rent extraction firms are more difficult to distinguish from the aspirational firms. Our small-sized rent extraction firms have lower CEO talent, weaker structural governance measures, and more negative realized outcomes than our aspirational firms. Our large-sized rent extraction firms have higher CEO talent than our aspirational firms. In contrast to our results comparing the small-sized rent extraction firms to the aspirational cluster, we do not find substantial differences in structural governance measures between the large-sized rent extraction firms and the aspirational firms. This weaker finding may be a consequence of larger firms facing pressures to conform with structural governance measures. However, when we examine realized governance measures, we find strong evidence that the large-sized rent extraction firms exhibit significantly worse outcomes than the aspirational firms. Our results cast doubt on the use of ex-ante structural governance measures as accurate indicators of rent extraction, particularly for larger firms under more scrutiny.
As the sample used in prior studies consists of large firms, our results provide an explanation for prior studies' findings that weak governance doesn't seem related to higher peer group pay (e.g., Bizjak et al., 2011; Albuquerque et al., 2013). Instead, we find that realized negative outcomes, including ESG measures, are strongly associated with rent extraction.

7.3 Excess Pay from Rent Extraction

As we document in previous tables, rent extraction firms appear to select peers opportunistically. To quantify the egregiousness of peer group selection for these firms, we define excess peer pay as the difference between the chosen median compensation and the counterfactual median at PPP=50. Row 8, Column 1 (4) of Table 8 reports that small-sized (large-sized) rent extraction firms have an average of $1.44 ($1.49) million in excess peer pay. Column 2 (5) reports this average as 53% (40%) of total compensation, to highlight the magnitude of excess peer pay for small-sized (large-sized) rent extraction firms. Combining these two clusters, Column 8 shows this excess accounts for 41% of the actual pay. Column 9 aggregates the excess pay and reports that aggregate excess peer pay is over $12 billion for all rent extraction firms. Overall, Table 8 demonstrates the sizable magnitude of expropriation by firms in our rent extraction clusters.

8. Sensitivity Analyses

In this section, we show that our results are robust to different specifications of the potential peer set, definitions of future performance, and sample restrictions. The first sensitivity check expands both our revenue and market capitalization calipers from [50%, 200%] to [30%, 300%]. In untabulated results, we again find that three clusters describe the data. Similarly, 7% (53%) of observations are assigned into the small-sized (large-sized) rent extraction cluster with a β1 of ( 0.016). The remaining 40% are assigned into an aspirational cluster with β1 =

We also find qualitatively similar results when we compare firm and governance characteristics across clusters. Consistent with our main results, the small-sized rent extraction cluster has less talented CEOs and significantly weaker structural and ex-post governance measures than the aspirational cluster. Relative to the small-sized rent extraction cluster, the large-sized rent extraction cluster has fewer entrenchment and board structure differences with the aspirational cluster but has similarly poor ex-post governance measures.

Our second robustness test uses average three-year future ROA as the dependent variable instead of one-year ahead ROA. In untabulated results, we find three clusters best describe the data, and 3% (70%) of the observations are in the small-sized (large-sized) rent extraction cluster with β1 of (-0.016). 27 The remaining 27% are in an aspirational cluster with β1 = Comparing firm characteristics across clusters, we find qualitatively similar, but slightly stronger, results than reported in Table 7. We find that both rent extraction clusters have more structural governance variables that are significantly different from those of the aspirational cluster. In particular, we find the large-sized rent extraction cluster now has fewer outside directors, is less likely to have a staggered board, and is more likely to have a dual class structure. Consistent with our main results, we find realized governance measures to be significantly different between rent extraction and aspirational clusters.

As discussed in Section 5, our size caliper may be problematic for the largest and smallest firms in our sample, because these firms may have an uneven distribution of potential peers within their size caliper. For example, the largest firms will only be able to select peers out of a potential peer set that includes firms mostly smaller than themselves.
Therefore, we repeat our analyses but remove the largest and smallest 5% of firms, where size is based on market capitalization. In untabulated results, we again find that three clusters describe the data and 6% (57%) of observations are assigned into the small-sized (large-sized) rent extraction cluster with a significant coefficient of β1 = (= 0.023). The remaining 37% are in an aspirational cluster with β1 = When we compare the sample characteristics, we find results largely consistent with those reported in Table 7. These sensitivity checks assure us that the results are not driven by our choice of caliper, future performance measure, or the largest and smallest firms in our sample.

27 While the large-sized rent extraction cluster is significant at the 1% level, the small-sized rent extraction cluster is marginally significant (p=0.12).

9. Summary and Conclusions

The vast majority of firms use a peer group as an important factor in setting CEO compensation. Although the CEO compensation of similar firms can provide the board of directors with valuable information about the market wage, this benchmarking exercise is the subject of considerable controversy. For example, governance activists and proxy advisory firms claim that boards of directors select peers that are larger and have higher compensation levels to justify a high level of CEO compensation. However, firms respond that their peer group reflects a competitive labor market outcome. They assert that their choice of peer groups reflects aspirations to invest in higher executive talent and prevents them from losing executives to larger firms with high pay levels.

Given the important role played by peer groups in setting CEO compensation, our study examines whether this choice by the board of directors is better characterized as rational attraction and retention of executive talent or as a process enabling the CEO to engage in rent extraction. Prior research provides a mixed assessment of peer group choice but has several important shortcomings related to its use of propensity score matching methods. We address the limitations in prior research by developing a new measure for assessing peer groups, denoted as Peer Portfolio Percentile (PPP), which mimics the actual peer group selection process used by boards of directors. Specifically, we compare the median compensation for the selected peer group to the distribution of medians for all alternative peer groups that could have been selected using traditional selection benchmarks such as firm size and industry. In addition, because both aspirational and rent extraction motivations are likely to be observed in a large cross-section of firms, we use Latent Class Analysis (LCA) as our primary econometric approach.

Using a comprehensive sample of 12,894 firm-year observations from 2008 to 2014, we find that the peer group rent extraction problems conjectured by governance activists and proxy advisory firms appear to affect the majority of firms. On average, in our rent extraction clusters, the excess pay resulting from egregious peer group selection accounts for 41% of actual pay or $12 billion in aggregate. Our results indicate that only one-third of firms seem to use peer group choice to reflect the competitive labor market for CEO talent.

Our analyses also provide insights into attributes that distinguish rent extraction firms from aspirational firms. Specifically, our methodology results in two rent extraction clusters, one with small-sized firms and one with large-sized firms. While the small-sized rent extraction firms have lower CEO talent and weaker structural governance measures than aspirational firms, our large-sized rent extraction firms have higher CEO talent and fewer structural governance differences. Interestingly, both sets of rent extraction firms have worse realized outcomes, including SEC enforcement actions, shareholder lawsuits, and ESG concerns. By allowing for a mixture of firms, our study suggests the majority of firms exhibit rent extraction tendencies in selecting peer groups and shows that the largest of these firms, which may face more scrutiny to conform to certain structural governance forms, still experience negative realized governance outcomes.
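As a rough illustration of how a PPP-style percentile and the excess peer pay from Section 7.3 could be computed, the sketch below draws random peer portfolios from a potential peer set. It is not the authors' implementation: the function names, the random sampling of equal-sized portfolios without replacement, the default of 5,000 draws, and the "at or below" percentile convention are all assumptions made for the example. The actual construction is the one described in Section 5.

```python
import random
from statistics import median


def simulate_portfolio_medians(potential_pay, n_peers, n_draws=5000, seed=0):
    """Median pay of randomly drawn peer portfolios (hypothetical helper)."""
    if n_peers > len(potential_pay):
        raise ValueError("portfolio size exceeds the number of potential peers")
    rng = random.Random(seed)
    # Each draw is one alternative peer group of the same size as the chosen one.
    return [median(rng.sample(potential_pay, n_peers)) for _ in range(n_draws)]


def ppp_and_excess_pay(chosen_pay, potential_pay, n_draws=5000, seed=0):
    """Return (PPP in percent, excess peer pay) for one firm-year.

    PPP: share of simulated portfolio medians at or below the chosen median.
    Excess peer pay: chosen median minus the simulated median at PPP = 50.
    """
    chosen_median = median(chosen_pay)
    simulated = simulate_portfolio_medians(
        potential_pay, len(chosen_pay), n_draws, seed
    )
    at_or_below = sum(1 for m in simulated if m <= chosen_median)
    ppp = 100.0 * at_or_below / len(simulated)
    counterfactual_median = median(simulated)  # portfolio median at PPP = 50
    return ppp, chosen_median - counterfactual_median


if __name__ == "__main__":
    # Toy pay levels in $ millions, not data from the paper.
    potential = [float(x) for x in range(1, 21)]
    highest_paid_five = [16.0, 17.0, 18.0, 19.0, 20.0]
    print(ppp_and_excess_pay(highest_paid_five, potential))
```

A firm that picks the five highest-paid potential peers lands at the top of the simulated distribution and shows positive excess peer pay, while a firm whose chosen median sits near the middle of the distribution would show excess peer pay close to zero.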

References

Adams, R.B., Ferreira, D., A theory of friendly boards. Journal of Finance 62,
Agrawal, A., Chadha, S., Corporate governance and accounting scandals. Journal of Law and Economics 48,
Albuquerque, A.M., De Franco, G., Verdi, R.S., Peer choice in CEO compensation. Journal of Financial Economics 108,
Arpino, B., Benedictis, L.D., Mattei, A., Implementing propensity score matching with network data: the effect of the General Agreement on Tariffs and Trade on bilateral trade. Journal of the Royal Statistical Society: Series C (Applied Statistics).
Arpino, B., Mealli, F., The specification of the propensity score in multilevel observational studies. Computational Statistics & Data Analysis 55,
Armstrong, C.S., Ittner, C.D., Larcker, D.F., Corporate governance, compensation consultants, and CEO pay levels. Review of Accounting Studies 17(2),
Audit Analytics, Peer Benchmarking and Trends in Executive Compensation.
Baker, M. and Gompers, P.A., The determinants of board structure at the initial public offering. Journal of Law and Economics 46(2),
Bebchuk, L.A., Cremers, K.M., Peyer, U.C., The CEO pay slice. Journal of Financial Economics 102(1),
Bebchuk, L., Cohen, A., The costs of entrenched boards. Journal of Financial Economics 78,
Bebchuk, L.A., Fried, J.M., Pay without performance: Overview of the issues. Journal of Applied Corporate Finance 17,

Begley, T.M., Using founder status, age of firm, and company growth rate as the basis for distinguishing entrepreneurs from managers of smaller businesses. Journal of Business Venturing 10(3),
Bizjak, J., Lemmon, M., Nguyen, T., Are all CEOs above average? An empirical analysis of compensation peer groups and pay design. Journal of Financial Economics 100,
Bout, A.B.L., Smart Compensation: Developing and Using Peer Groups, Workspan. Washington D.C.: Sheridan Press,
Brickley, J.A., Coles, J.L. and Jarrell, G., Leadership structure: Separating the CEO and chairman of the board. Journal of Corporate Finance 3(3),
Cadman, B., Carter, M.E., Compensation peer groups and their relation with CEO pay. Journal of Management Accounting Research 26,
Coles, J.L., Daniel, N.D. and Naveen, L., Boards: Does one size fit all? Journal of Financial Economics 87(2),
Core, J.E., Holthausen, R.W., Larcker, D.F., Corporate governance, chief executive officer compensation, and firm performance. Journal of Financial Economics 51,
Cremers, K.M., Litov, L.P., Sepe, S.M., Staggered boards and long-term firm value, revisited. Journal of Financial Economics 126(2),
Daines, R.M., Gow, I.D., Larcker, D.F., Rating the ratings: How good are commercial governance ratings? Journal of Financial Economics 98(3),
Dechow, P.M., Sloan, R.G., Sweeney, A.P., Causes and consequences of earnings manipulation: An analysis of firms subject to enforcement actions by the SEC. Contemporary Accounting Research 13,
Edmans, A., Gabaix, X., Executive compensation: A modern primer. Journal of Economic Literature 54,
Equilar, Peer Group Composition and Benchmarking.

Ertimur, Y., Ferri, F., Oesch, D., Shareholder votes and proxy advisors: Evidence from say on pay. Journal of Accounting Research 51,
Faulkender, M., Yang, J., Inside the black box: The role and composition of compensation peer groups. Journal of Financial Economics 96,
Faulkender, M., Yang, J., Is disclosure an effective cleansing mechanism? The dynamics of compensation peer benchmarking. Review of Financial Studies 26,
Ferris, S.P., Jagannathan, M., Pritchard, A.C., Too busy to mind the business? Monitoring by directors with multiple board appointments. Journal of Finance 58,
Fich, E.M., Shivdasani, A., Are busy boards effective monitors? Journal of Finance 61,
Francis, B., Hasan, I., Mani, S., Ye, P., Relative peer quality and firm performance. Journal of Financial Economics 122,
Gabaix, X., Landier, A., Why has CEO pay increased so much? The Quarterly Journal of Economics 123,
Gilson, S.C. and Vetsuypens, M.R., CEO compensation in financially distressed firms: An empirical analysis. The Journal of Finance 48(2),
Gompers, P.A., Ishii, J. and Metrick, A., Extreme governance: An analysis of dual-class firms in the United States. The Review of Financial Studies 23(3),
Goyal, V.K. and Park, C.W., Board leadership structure and CEO turnover. Journal of Corporate Finance 8(1),
Greene, W., The behaviour of the maximum likelihood estimator of limited dependent variable models in the presence of fixed effects. The Econometrics Journal 7(1),
Hagenaars, J.A., McCutcheon, A.L., Applied latent class analysis: Cambridge University Press.

Institutional Shareholder Services, U.S. Peer Group Selection Methodology and Issuer Submission Process (Frequently Asked Questions).
Larcker, D.F., Richardson, S.A., Fees paid to audit firms, accrual choices, and corporate governance. Journal of Accounting Research 42,
Larcker, D.F., Richardson, S.A. and Tuna, I., Corporate governance, accounting outcomes, and organizational performance. The Accounting Review 82(4),
Main, B.G., O'Reilly III, C.A. and Wade, J., Top executive pay: Tournament or teamwork? Journal of Labor Economics 11(4),
Masulis, R.W., Wang, C. and Xie, F., Agency problems at dual class companies. Journal of Finance 64(4),
McConnell, J.J. and Servaes, H., Additional evidence on equity ownership and corporate value. Journal of Financial Economics 27(2),
Murphy, K.J. and Sandino, T., Executive pay and independent compensation consultants. Journal of Accounting and Economics 49(3),
Nylund, K.L., Asparouhov, T., Muthén, B.O., Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling 14,
O'Reilly III, C.A., Main, B.G. and Crystal, G.S., CEO compensation as tournament and social comparison: A tale of two theories. Administrative Science Quarterly,
Walls, J.L., Berrone, P. and Phan, P.H., Corporate governance and environmental performance: Is there really a link? Strategic Management Journal 33(8),
Willard, G.E., Krueger, D.A. and Feeser, H.R., In order to grow, must the founder go: A comparison of performance between founder and non-founder managed high-growth manufacturing firms. Journal of Business Venturing 7(3),

Zhang, Y., Zhou, J., Zhou, N., Audit committee quality, auditor independence, and internal control weaknesses. Journal of Accounting and Public Policy 26,

Appendix A: Variable Descriptions

Variable | Description | Source
Accounting Restatements | Indicator variable set to 1 if the firm has an accounting restatement (i.e., accounting rule application failure) in that fiscal year. | Audit Analytics
Assets | Total assets, Compustat AT. | Compustat
Assets Caliper | Indicator variable set to 1 if the potential peer has total assets within 50%-200% of the firm's | Compustat
Both are Chairmen | Indicator variable set to 1 if both the firm's CEO and the potential peer's CEO are also the chairmen | Equilar
Both are Not Chairmen | Indicator variable set to 1 if neither the firm's CEO nor the potential peer's CEO is a Chairman of the Board | Equilar
Both in SP 400 | Indicator variable set to 1 if the firm and potential peer are both in the S&P 400 | CRSP
Both in SP 500 | Indicator variable set to 1 if the firm and potential peer are both in the S&P 500 | CRSP
Busy Board | Fraction of outside directors that have more than 2 board appointments in the same fiscal year | Equilar
CEO Duality | Indicator variable set to 1 when the CEO is also the Chairman of the Board. | Equilar
CEO Firm Size | Arithmetic mean of the log of Compustat PRCC_F*CSHO of all companies at which the firm-year's CEO was a named executive officer in the last three years, defined by fiscal year end between 1 and 36 months prior to the firm-year's fiscal year end. | Compustat, Equilar
CEO Return | Arithmetic mean of the annual excess return of all companies at which the firm-year's CEO was a named executive officer in the last three years, defined by fiscal year end between 1 and 36 months prior to the firm-year's fiscal year end. Annual excess return is the sum of the 12 monthly excess returns, calculated as the monthly return in excess of the size decile monthly return, from CRSP. | CRSP, Equilar
CEO ROA | Arithmetic mean of the industry-adjusted ROA of all companies at which the firm-year's CEO was a named executive officer in the last three years, defined by fiscal year end between 1 and 36 months prior to the firm-year's fiscal year end. Industry-adjusted ROA is calculated as the difference between ROA and median ROA for the constituents of the same two-digit SIC code in that fiscal year. | Compustat, Equilar
CEO Tenure | Number of years the CEO has been at the firm. | Equilar
Chosen / Potential | Number of peers chosen divided by the number of potential peers satisfying the size and industry caliper or talent flows condition. | Compustat, Equilar, BoardEx
Diversity Concerns | Number of diversity concerns. These concerns could include nonrepresentation of minorities in senior positions within the company and major controversies on affirmative action issues. | KLD
Dual Class | Indicator variable set to 1 if the firm has a dual class structure with unequal voting rights across its multiple share classes. Potential dual class companies were identified based on either: 1) identifying firms that disclose they are relying on the controlled company exemption, which allows controlled companies to avoid certain corporate governance listing standards (controlled companies are firms where more than 50% of the voting power for director elections is held by a single person, entity, or group), or 2) identifying firms from SharkRepellent unequal voting rights historical data. After identifying these potential dual class companies, we confirm whether or not the firm has unequal voting rights by reading the proxy statements and extracting voting and ownership information. | SharkRepellent, SEC Form DEF 14A
Employee Relations Concerns | Number of employee concerns. These concerns could include bad union relations, a poor safety record, and a poorly funded pension plan. | KLD

Environmental Concerns | Number of environmental concerns. These concerns could include hazardous waste and environmentally unfriendly products. | KLD
Even Number of Peers | Indicator variable set to 1 if the firm selected an even number of peers | Equilar
Firm is a Peer | Indicator variable set to 1 if the potential peer chose the firm in question as a peer | Equilar
Founder Involved | Indicator variable set to 1 if the company founder is a part of the firm's management | Equilar
Fraction CEO Appoint | Fraction of outside directors that began their tenure after the CEO started at the firm. | Equilar
Fraction of Peers in Caliper | The fraction of selected peers that satisfy the size and industry caliper or talent flows condition. | Compustat, Equilar, BoardEx
Has Multiple Business Segments | Indicator variable set to 1 if the firm reports multiple business segments | Compustat
Has Multiple Geo. Segments | Indicator variable set to 1 if the firm reports multiple geographic segments | Compustat
ICW | Indicator variable set to 1 if there is at least 1 internal control weakness for that fiscal year. | Audit Analytics
Insider Ownership | Fraction of common shares held by insiders, where total shares outstanding is from CRSP. Insiders include officers, directors, members of advisory committees, and beneficial owners. Insiders' total common holdings include direct and indirect holdings. | Thomson Reuters, CRSP
Industry | First two digits of Global Industry Classification Standard (GICS) industry code. | Compustat
ISS For Directors | Indicator variable set to 1 if ISS supported all of the proposed directors. | ISS Voting Analytics
ISS For SOP | Indicator variable set to 1 if ISS recommended voting For the executive's compensation package | ISS Voting Analytics
LogSales | Log of (1 + Compustat REVT). | Compustat
Log(StdROA) | Log of the standard deviation of the last 5 fiscal years' ROA. | Compustat
Market Cap | Fiscal year end market cap in millions, calculated as Compustat PRCC_F*CSHO. | Compustat
Mkt Cap Caliper | Indicates whether the potential peer has end-of-year market capitalization within 50%-200% of the firm's | Compustat
Multiple Business Segments | Indicator variable set to 1 if the firm and the potential peer both report multiple business segments | Compustat
Multiple Geographic Segments | Indicator variable set to 1 if the firm and the potential peer both report multiple geographic segments | Compustat
Number of Peers | Number of peers selected by the firm for compensation benchmarking. | Equilar
Pay Ratio (Avg. NEO/CEO) | Average compensation of the named executive officers, excluding the CEO, divided by the CEO compensation | Equilar
Peer Portfolio Percentile (PPP) | The percentile of the median peer compensation relative to the empirical distribution of all possible medians which could have been chosen based on the size and industry calipers. For details on variable construction, refer to Section 5. | Equilar, Compustat
Percent Outside Dir | Fraction of the directors on the board that are outside directors. | Equilar
Potential Peers | Number of firms satisfying the size and industry caliper or talent flows condition. Specifically, we apply an industry filter based on two-digit GICS or the Hoberg-Phillips text-based industry classification and a size caliper based on the restriction that a potential peer must have either revenue or a market capitalization between 50% and 200% of the firm in question. We also include all firms with any talent flows to or from the firm in question. | Compustat, BoardEx, Hoberg-Phillips data library
Product Concerns | The number of product concerns over the calendar year. These concerns could include poor product safety, controversies over product advertising, and other product-related community concerns. | KLD
ROA | Compustat IB divided by Compustat AT, multiplied by 100. | Compustat

46 Variable Description Source ROA Caliper Indicator variable set to 1 if the potential peer has ROA within ±0.03 of the Compustat firm Sales Compustat REVT Compustat Sales Caliper Indicator variable set to 1 if the potential peer has sales within 50%-200% Compustat of the firm s sales. Same Consultant Indicator variable set to 1 if both firms engage the same compensation Equilar consultant in a given year. Same MSA Indicates whether the firm and potential peer are headquartered in the same Metropolitan Statistical Area (MSA) Compustat, Department of Labor S&P 400 Membership Indicator variable set to 1 if the firm is in the S&P 400. CRSP S&P 500 Membership Indicator variable set to 1 if the firm is in the S&P 500. CRSP SEC Inquiries Indicator variable set to 1 if the firm received any SEC inquiries in a given Capital IQ year. Separate Comp Consultants Indicator variable set to 1 if the firm employed more than one Equilar compensation consultant Share 2-digit GICS Indicator variable set to 1 if both firms have the same two-digit Global Compustat Industry Classification Standard (GICS) industry code. Share 3-digit GICS Indicator variable set to 1 if both firms have the same three-digit Global Compustat Industry Classification Standard (GICS) industry code. Share 2-digit SIC Indicator variable set to 1 if both firms have the same two-digit Standard Compustat Industry Classification (SIC) industry code. Share 3-digit SIC Indicator variable set to 1 if both firms have the same three-digit Standard Compustat Industry Classification (SIC) industry code. 
Shareholder Lawsuits Indicator variable set to 1 if the firm experienced a lawsuit during the fiscal Capital IQ year Single Business Segment Indicator variable set to 1 if if the firm and the potential peer report only Compustat one business segment Single Geographic Segment Indicates whether the firm and the potential peer report only one Compustat geographic segment Staggered Board Indicator variable set to 1 if the firm has a staggered board. Equilar Talent Flows Indicator variable set to 1 if any officer or senior manager at the potential BoardEx peer or the firm has ever been employed as an officer or senior manager at the other company Top 10 Comp Consultant Indicator variable set to 1 if a compensation consultant employed by the Equilar company is a top ten consultant, as measured by number of engagements for the fiscal year Total Compensation Total compensation of the CEO including: cash, bonus, stock, options, long-term incentive plans, and all other compensation ($ millions). Equilar 45
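The Potential Peers definition in the table above combines an industry filter, a size caliper, and a talent-flow override. The following is a minimal sketch of that rule, not the authors' code: the `Firm` class, its field names, and the `same_tnic` and `talent_flow` flags are hypothetical stand-ins for data that would come from Compustat, the Hoberg-Phillips data library, and BoardEx. The caliper bounds are treated as inclusive, following the [50%, 200%] notation used in the figure notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firm:
    """Hypothetical container for the fields the filter needs."""
    name: str
    gics2: str         # two-digit GICS code
    revenue: float     # e.g., Compustat REVT
    market_cap: float  # e.g., Compustat PRCC_F * CSHO


def within_caliper(peer_value, firm_value, low=0.5, high=2.0):
    """True if peer_value lies within [50%, 200%] of firm_value."""
    return low * firm_value <= peer_value <= high * firm_value


def is_potential_peer(firm, peer, same_tnic=False, talent_flow=False):
    """Apply the industry filter, the size caliper, and the talent-flow rule.

    same_tnic and talent_flow are assumed to be looked up elsewhere
    (text-based industry match and officer movements, respectively).
    """
    same_industry = firm.gics2 == peer.gics2 or same_tnic
    similar_size = (within_caliper(peer.revenue, firm.revenue)
                    or within_caliper(peer.market_cap, firm.market_cap))
    # Talent flows admit a peer regardless of industry or size.
    return talent_flow or (same_industry and similar_size)
```

Counting the firms for which `is_potential_peer` returns true would give the Potential Peers variable under these assumptions.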

Appendix B: Algorithm to create the empirical distribution of medians

In this appendix, we describe our algorithm, which makes the calculation of the distribution of peer medians computationally feasible. The number of potential peers is denoted by n. The implementation of this algorithm varies depending on whether the number of peers chosen by the firm, k, is even or odd.

When the firm chooses an odd number of peers

Suppose we have 10 potential peers and the firm wants to choose 3 peers (i.e., k = 3). To implement our algorithm, we first sort the set of 10 potential peers by the total compensation of their CEO. In this example, we will assume the potential peer compensation takes the values 1 through 10. Since the firm chooses an odd number of peers, the median compensation for any peer group will be the compensation for one of the selected firms. For instance, with k = 3, the only way in which the firm can pick a median pay of 2 is to have the median firm be the 2nd lowest paid potential peer (which has a pay of 2). Thus, the firm must select the two peers with the 1st and 2nd lowest pay in the potential peer set. The remaining peer can be any one of the eight peers with pay 3 to 10. Therefore, there are 8 possible ways (8 choose 1) in which the firm can arrive at a median pay of 2. We can perform this analysis for all 10 possible medians when k = 3, and the numbers of combinations are:

Median Pay Chosen | # Below (i) | # Above (j) | Number of Ways to Obtain Median (i × j)
1 | 0 | 9 | 0
2 | 1 | 8 | 8
3 | 2 | 7 | 14
4 | 3 | 6 | 18
5 | 4 | 5 | 20
6 | 5 | 4 | 20
7 | 6 | 3 | 18
8 | 7 | 2 | 14
9 | 8 | 1 | 8
10 | 9 | 0 | 0

For k > 1, it is not possible for the lowest or the highest potential peer to ever be selected as the median. Therefore, for k = 3, we only need to make 8 computations to construct the distribution of medians. The general algorithm to calculate the frequencies is:

\[
\mathrm{Freq}(x \text{ is median}) = \binom{i}{\frac{k-1}{2}} \binom{j}{\frac{k-1}{2}}
\]

where x = median of selected peer group, k = number of peers chosen, i = number of possible peers with compensation < x, j = number of possible peers with compensation > x. These frequencies for each median provide the true distribution of medians for possible peer groups. Using this distribution and the associated selected median peer CEO compensation, it is straightforward to compute the percentile of peer CEO compensation by converting these frequencies into an empirical distribution.

When the firm chooses an even number of peers

A similar process is used when the firm selects an even number of peers. For an even set of peers, as in the case of k = 4, the median will be the average of the two middle values in the group of k peers. We denote these two values as x and y, where x < y. Since we calculate the median as the average of x and y, any potential peer with a value z such that x < z < y cannot be selected. For concreteness, we will use the same n = 10 as above but now set k = 4 (i.e., the firm chooses 4 peers from a set of 10 potential peers). Suppose we want to know the frequency of situations when x = 2 and y = 5 are averaged to form a median. In order for these potential peers to form the two elements of the median, one firm with compensation less than x must be selected (i.e., potential peer 1) and one firm with compensation greater than y (i.e., one of potential peers 6, ..., 10). This can occur in 1 × 5 = 5 possible ways. The general formula for this algorithm is:

\[
\mathrm{Freq}\left(\frac{x+y}{2} \text{ is median}\right) = \binom{i}{\frac{k-2}{2}} \binom{j}{\frac{k-2}{2}}
\]

where k = number of peers chosen, x = smaller compensation number comprising median of selected peer group, y = larger compensation number comprising median of selected peer group, i = number of possible peers with compensation < x, j = number of possible peers with compensation > y. This requires us to calculate fewer than \(\binom{n}{2} = \frac{n(n-1)}{2}\) frequencies and averages of two numbers, which is again simple to implement.
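The odd and even cases above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, pay values are indexed by sorted position (so ties, if any, are broken by position), and the percentile is assumed to be the share of peer groups whose median is at or below the chosen median, a convention the appendix does not spell out. Because different (x, y) pairs can share the same average in the even case, their frequencies are summed. A brute-force enumeration is included only to check the closed-form counts on small inputs.

```python
from collections import Counter
from itertools import combinations
from math import comb
from statistics import median


def median_frequencies(pay, k):
    """Map each attainable median to the number of k-peer groups producing it."""
    pay = sorted(pay)
    n = len(pay)
    freq = Counter()
    if k % 2 == 1:
        h = (k - 1) // 2  # peers needed on each side of the median firm
        for idx, x in enumerate(pay):
            ways = comb(idx, h) * comb(n - idx - 1, h)
            if ways:
                freq[x] += ways
    else:
        h = (k - 2) // 2  # peers needed below x and above y
        for a in range(n):
            for b in range(a + 1, n):
                ways = comb(a, h) * comb(n - b - 1, h)
                if ways:
                    # Pairs with the same average accumulate in one bucket.
                    freq[(pay[a] + pay[b]) / 2] += ways
    return freq


def brute_force_frequencies(pay, k):
    """Enumerate every peer group; feasible only for small n (used as a check)."""
    return Counter(median(group) for group in combinations(pay, k))


def peer_portfolio_percentile(freq, chosen_median):
    """Percent of possible peer groups with median <= chosen_median (assumed convention)."""
    total = sum(freq.values())
    at_or_below = sum(c for m, c in freq.items() if m <= chosen_median)
    return 100 * at_or_below / total


pay = list(range(1, 11))  # the appendix example: pay levels 1 through 10
odd = median_frequencies(pay, 3)
even = median_frequencies(pay, 4)
print(odd[2])  # 8, matching the "8 choose 1" count in the text
print(odd == brute_force_frequencies(pay, 3), even == brute_force_frequencies(pay, 4))
```

The closed-form version needs on the order of n (odd k) or n(n-1)/2 (even k) binomial evaluations, whereas the brute-force check enumerates all n-choose-k groups, which is infeasible for peer sets of realistic size.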

FIGURE 1
Distribution of Future ROA

This figure reports the distribution of ROA in t+1 for the sample prior to the requirement that observations must have ROA in t+1 between -20% and

FIGURE 2
PPP Example with American Axle & Manufacturing

Panel A: Empirical Density of Median Compensation
Panel B: Empirical Distribution of Median Compensation

Panel A depicts the empirical density of medians from peer groups of size 20 that American Axle & Manufacturing Holdings (AAM) could have selected from its potential peer set of 240 firms in fiscal year The potential peer set of 240 firms is determined based on either: i) any talent flows between the two firms or ii) being in the same industry and of a similar size as AAM. Talent flows are defined as any officer or senior manager at the potential peer ever being employed as an officer or senior manager at AAM and vice versa. Being in the same industry is defined as sharing a 2-digit GICS industry code with AAM or being a TNIC-3 peer of AAM (Hoberg-Phillips text-based industry classification system). Similar size is defined as revenue or market cap within [50%, 200%] of AAM's. The group of 20 peers had a median CEO compensation of $5.73 million. Panel B plots the empirical distribution of PPP (in percentage points) and shows that the selected median CEO compensation is at the 69.6th percentile relative to all potential peer groups of size 20 from the 240 firms in the plausible universe of peers.

FIGURE 3
Distribution of PPP

Panel A: Distribution of PPP
Panel B: PPP Summary Statistics

N | Mean | Std Dev | Min | P25 | Median | P75 | Max
PPP 12,

Panel A presents a histogram for the distribution of chosen peer median pay relative to all other potential medians, which we label the Peer Portfolio Percentile (PPP). As described in Section 5, the potential peers are those that i) have any talent flows with the firm or ii) are in the same industry and of a similar size as the firm. Industry is based on 2-digit GICS, and size is determined by the potential peer being within [50%, 200%] of either the firm's revenue or its market capitalization. Panel B reports summary statistics of PPP.