Asymmetric Information, Adverse Selection and Seller Revelation on eBay Motors

Gregory Lewis
Department of Economics, University of Michigan
December 28, 2006

Abstract

Since the pioneering work of Akerlof (1970), economists have been aware of the adverse selection problem that asymmetric information can create in durable goods markets. The success of eBay Motors, an online auction market for used cars, thus poses something of a puzzle. I argue that the key to resolving this puzzle is the auction webpage, which allows sellers to reduce information asymmetries by revealing credible signals about car quality in the form of text, photos and graphics. To validate my claim, I develop a new model of a common value eBay auction with costly seller revelation and test its implications using a unique panel dataset of car auctions from eBay Motors. Precise testing requires a structural estimation approach, so I develop a new pseudo-maximum likelihood estimator for this model, accounting for the endogeneity of information with a semiparametric control function. I find that bidders adjust their valuations in response to seller revelations, as the theory predicts. To quantify how important these revelations are in reducing information asymmetries, I simulate a counterfactual in which sellers cannot post information to the auction webpage and show that in this case a significant gap between vehicle value and expected price can arise. Since this gap is dramatically narrowed by seller revelations, I conclude that information asymmetry and adverse selection on eBay Motors are endogenously reduced by information transmitted through the auction webpage.

I would like to thank the members of my thesis committee, Jim Levinsohn, Scott Page, Lones Smith and especially Pat Bajari for their helpful comments. I have also benefited from discussions with Jim Adams, Tilman Börgers, Han Hong, Kai-Uwe Kühn, Serena Ng, Nicola Persico, Dan Silverman, Jagadeesh Sivadasan, Doug Smith, Charles Taragin and seminar participants at the Universities of Michigan and Minnesota. All remaining errors are my own. gmlewis@umich.edu

1 Introduction

eBay Motors is, at first glance, an improbable success story. Ever since Akerlof's classic paper, the adverse selection problems created by asymmetric information have been linked with their canonical example, the used car market. One might think that an online used car market, where buyers usually don't see the car in person until after purchase, couldn't really exist. And yet it does - and it prospers. eBay Motors is the largest used car market in America, selling over cars in a typical month. This poses a question: How does this market limit asymmetric information problems?

The answer, this paper argues, is that the seller may credibly reveal his private information through the auction webpage. Consider Figure 1, which shows screenshots from an eBay Motors auction for a Corvette Lingenfelter. The seller has provided a detailed text description of the features of the car and taken many photos of both the interior and the exterior. He has in addition taken photographs of original documentation relating to the car, including a series of invoices for vehicle modifications and an independent analysis of the car's performance. This level of detail appears exceptional, but it is in fact typical in most eBay car auctions for sellers to post many photos, a full text description of the car's history and features, and sometimes graphics showing the car's condition.

I find strong empirical evidence that buyers respond to these seller revelations. The data are consistent with a model of costly revelation, in which sellers weigh costs and benefits in deciding how much detail about the car to provide. An important prediction of this model is that bidders' prior valuations should be increasing in the quantity of information posted on the webpage. I find that this is the case. Using the estimated structural parameters, I simulate a counterfactual in which sellers cannot reveal their private information on the auction webpage. In that case a significant gap arises between the value of the vehicle and the expected price. I quantify this price-value gap, showing that seller revelations dramatically narrow it. Since adverse selection occurs when sellers with high quality cars are unable to receive a fair price for their vehicle, I conclude that these revelations substantially limit asymmetric information and thus the potential for adverse selection.

My empirical analysis of this market proceeds through the following stages. Initially, I run a series of hedonic regressions to demonstrate that there is a large, significant and positive relationship between auction prices and the number of photos and bytes of text on the auction webpage. To explain this finding, I propose a stylized model of a common values auction on eBay with costly revelation by sellers, and subsequently prove that in equilibrium, there is a positive relationship between the number of signals revealed by the seller and the prior

Figure 1: Information on an auction webpage. On this auction webpage, the seller has provided many different forms of information about the Corvette he is selling. These include the standardized eBay description (top panel), his own full description of the car's options (middle panel), and many photos (two examples are given in the bottom panel). The right photo is of the results of a car performance analysis done on this vehicle.

valuation held by bidders. In the third stage, I test this prediction. A precise test requires that I structurally estimate my eBay auction model, and I develop a new pseudo-maximum likelihood estimation approach to do so. I consider the possibility that my information measure is endogenous, providing an instrument and a semiparametric control function solution for endogeneity. The final step is to use the estimated parameters of the structural model to simulate a counterfactual and quantify the impact of public seller revelations in reducing the potential for adverse selection.

The first of these stages requires compiling a large data set of eBay Motors listings for a 6 month period, containing variables relating to item, seller and auction characteristics. I consider two measures of the quantity of information on an auction webpage: the number of photographs, and the number of bytes of text. Through running a series of hedonic regressions, I show that these measures are significantly and positively correlated with price. The estimated coefficients are extremely large, and this result proves robust to controls for marketing effects, seller size and seller feedback. This suggests that it is the webpage content itself rather than an outside factor that affects prices, and that the text and photos are therefore genuinely informative. It leaves open the question of why the quantity of information is positively correlated with price. I hypothesize that this is an artifact of selection, where sellers with good cars put up more photos and text than those selling lemons.

To formally model this, in the second stage I propose a stylized model of an eBay auction. Bidders are assumed to have a common value for the car, since the common values framework allows for bidder uncertainty, and such uncertainty is natural in the presence of asymmetric information.¹ Sellers have access to their own private information, captured in a vector of private signals, and may choose which of these to reveal. The seller's optimal strategy is characterized by a vector of cutoff values, so that a signal is revealed if and only if it is sufficiently favorable. In equilibrium, the model predicts that bidders' prior valuation for the car being sold is increasing in the number of signals revealed by the seller.

¹ I also show that this assumption finds support in the data. In an extension, to be found in the appendix, I test the symmetric common values framework used here against a symmetric private values alternative. I reject the private values framework at the 5% level.

I test this prediction in the third stage by structurally estimating the common value auction model. It is necessary to proceed structurally because the prediction is in terms of a latent variable - the bidder's prior value - and is thus only directly testable in a structural model. Strategic considerations arising from the Winner's Curse preclude an indirect test based on the relationship between price and information measures. In estimating the model, I propose a new pseudo-maximum likelihood estimation approach that addresses two of the common practical concerns associated with such estimation. One of the concerns is that in an eBay context it is

difficult to determine which bids actually reflect the bidder's underlying valuations, and which bids are spurious. Using arguments similar to those made by Song (2004) for the independent private values (IPV) case, I show that in equilibrium the final auction price is in fact equal to the valuation of the bidder with second most favorable private information. I obtain a robust estimator by maximizing a function based on the observed distribution of prices, rather than the full set of observed bids. The second concern relates to the computational cost of computing the moments implied by the structural model, and then maximizing the resulting objective function. I show that by choosing a pseudo-maximum likelihood approach, rather than the Bayesian techniques of Bajari and Hortaçsu (2003) or the quantile regression approach of Haile, Hong, and Shum (2003), one is able to use a nested loop procedure to maximize the objective function that limits the curse of dimensionality. This approach also provides transparent conditions for the identification of the model.

The theory predicts that bidders respond to the information content of the auction webpage, but this is unobserved by the econometrician. Moreover, under the theory, this omitted content is certainly correlated with quantitative measures of information such as the number of photos and number of bytes of text. To address this selection problem I adopt a semiparametric control function approach, forming a proxy for the unobserved webpage content using residuals from a nonparametric regression of the quantity of information on exogenous variables and an instrument. I exploit the panel data to obtain an instrument, using the quantity of information provided for the previous car sold by that seller. As the theory predicts, I find that the quantity of information is endogenous and positively correlated with the prior mean valuation of bidders.

My results suggest that the information revealed on the auction webpage reduces the information asymmetry between the sellers and the bidders. One would thus expect that seller revelations limit the potential for adverse selection. My final step is to investigate this conjecture with the aid of a counterfactual simulation. I simulate the bidding functions and compute expected prices for the observed regime and a counterfactual in which sellers cannot reveal their private information through the auction webpage. In the counterfactual regime, sellers with a peach - a high quality car - cannot demonstrate this to bidders except through private communications. The impact of removing this channel for information revelation is significant, driving a wedge between the value of the car and the expected price paid by buyers. I find that a car worth 10% more than the average car with its characteristics will sell for only 6% more in the absence of seller revelation. This gap disappears when sellers can demonstrate the quality of their vehicle by posting text, photos and graphics to the auction webpage.

I make a number of contributions to the existing literature. Akerlof (1970) showed that in a market with information asymmetries and no means of credible revelation, adverse selection

can occur. Grossman (1981) and Milgrom (1981) argued that adverse selection can be avoided when the seller can make ex-post verifiable and costless statements about the product he is selling. I consider the intermediate case where the seller must pay a cost for credibly revealing his private information and provide structural analysis of a market where this is the case. Estimation and simulation allow me to quantify by how much credible seller revelation reduces information asymmetries and thus the potential for adverse selection. This contributes to the sparse empirical literature on adverse selection, which has until now focused mainly on insurance markets (Cohen and Einav (2006), Chiappori and Salanié (2000)).

In the auctions literature Milgrom and Weber (1982) show that in a common value auction sellers should commit a priori to revealing their private information about product quality if they can do so. But since private sellers typically sell a single car on eBay Motors, there is no reasonable commitment device and it is better to model sellers as first observing their signals and then deciding whether or not to reveal. This sort of game is tackled in the literature on strategic information revelation. The predictions of such models depend on whether revelation is costless or not: in games with costless revelation, Milgrom and Roberts (1986) have shown that an unravelling result obtains whereby sellers reveal all their private information. By contrast, Jovanovic (1982) shows that with revelation costs it is only the sellers with sufficiently favorable private information that will reveal. Shin (1994) obtains a similar partial revelation result in a setting without revelation costs by introducing buyer uncertainty about the quality of the seller's information. My model applies these insights in an auction setting, showing that with revelation costs, equilibrium is characterized by partial revelation by sellers. I also extend to the case of multidimensional private information, showing that the number of signals revealed by sellers is on average positively related to the value of the object. This relates the quantity of information to its content.

The literature on eBay has tended to focus on the seller reputation mechanism. Various authors have found that prices respond to feedback ratings, and have suggested that this is due to the effect of these ratings in reducing uncertainty about seller type (Resnick and Zeckhauser (2002), Melnik and Alm (2002), Houser and Wooders (2006)). I find that on eBay Motors the size of the relationship between price and information measures that track webpage content is far larger than that between price and feedback ratings. This suggests that in analyzing eBay markets with large information asymmetries, it may also be important to consider the role of webpage content in reducing product uncertainty. I contribute to the literature here by providing simple information measures, and showing that they explain a surprisingly large amount of the underlying price variation in this market.

Finally, I make a number of methodological contributions with regard to auction estimation.

Bajari and Hortaçsu (2003) provide a clear treatment of eBay auctions, providing a model which can explain late bidding and a Bayesian estimation strategy for the common values case. Yet their model implies that an eBay auction is strategically equivalent to a second price sealed bid auction, and that all bids may be interpreted as values. I incorporate the insights of Song (2004) in paying more careful attention to bid timing and proxy bidding, and show that only the final price in an auction can be cleanly linked to the distribution of bidder valuations. I provide a pseudo-maximum likelihood estimation strategy that uses only the final price from each auction, and is therefore considerably more robust. There are also computational advantages to my common values estimation approach relative to that of Bajari and Hortaçsu (2003) and the quantile regression approach of Hong and Shum (2002). One important potential problem with all these estimators lies in not allowing for unobserved heterogeneity and endogeneity. For the independent private values case, Krasnokutskaya (2002) and Athey, Levin, and Seira (2004) have provided approaches for dealing with unobserved heterogeneity. Here I develop conditions under which a semiparametric control function may be used to deal with endogeneity provided an appropriate instrument can be found. In sum, I provide a robust and computable estimator for common value eBay auctions. I also provide an implementation of the test for common values suggested by Athey and Haile (2002) in the appendix, using the subsampling approach of Haile, Hong, and Shum (2003).

This research leads me to three primary conclusions. First, webpage content in online auctions informs bidders and is an important determinant of price. Bidder behavior is consistent with a model of endogenous seller revelation, where sellers only reveal information if it is sufficiently favorable, and thus bidders bid less on objects with uninformative webpages. Second, quantitative measures of information serve as good proxies for content observed by buyers, but not the econometrician. This follows from the theory which yields a formal link between the number of signals revealed by sellers and the content of those signals. Finally, and most important, the auction webpage allows sellers to credibly reveal the quality of their vehicles. Although revelation costs prevent full revelation, information asymmetries are substantively limited by this institutional feature and the gap between values and prices is considerably smaller than it would be if such revelations were not possible. I conclude that seller revelations through the auction webpage reduce the potential for adverse selection in this market.

The paper proceeds as follows. Section 2 outlines the market, and section 3 presents the reduced form empirical analysis. Section 4 introduces the auction model with costly information revelation, while section 5 describes the estimation approach. The estimation results follow in section 6. Section 7 details the counterfactual simulation and section 8 concludes.²

² Proofs of all propositions, as well as most estimation results, are to be found in the appendix.
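The control function step described in the introduction can be illustrated with a deliberately simplified sketch. It is not the paper's estimator: the first stage here is linear rather than nonparametric, the second stage is ordinary least squares rather than pseudo-maximum likelihood, and the data are simulated. All names (`simulate`, `control_function_estimate`, `naive_estimate`) and all parameter values are assumptions made for illustration only. The variable `x` stands in for the quantity of information, `z` for the instrument (the quantity of information on the seller's previous listing), and `v` for the webpage content that the econometrician does not observe.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n=5000, beta=1.0, seed=0):
    """Hypothetical data: x is endogenous because v enters both x and y."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.gauss(0, 1)   # instrument
        v = rng.gauss(0, 1)   # unobserved content
        e = rng.gauss(0, 1)   # idiosyncratic noise
        x = z + v             # observed quantity of information
        y = beta * x + v + e  # outcome (a stand-in for the bidders' prior mean)
        data.append((z, x, y))
    return data


def naive_estimate(data):
    """Regress y on x alone, ignoring the endogeneity of x."""
    return ols([[1.0, x] for _, x, _ in data], [y for _, _, y in data])[1]


def control_function_estimate(data):
    """Two steps: first-stage residuals proxy for v, then enter the outcome equation."""
    g = ols([[1.0, z] for z, _, _ in data], [x for _, x, _ in data])
    resid = [x - (g[0] + g[1] * z) for z, x, _ in data]
    b = ols([[1.0, x, r] for (_, x, _), r in zip(data, resid)],
            [y for _, _, y in data])
    return b[1], b[2]  # coefficient on x, coefficient on the control function


data = simulate()
print("naive:", naive_estimate(data))
print("control function:", control_function_estimate(data))
```

In this simulated design the naive coefficient is biased upwards because `x` and the omitted `v` move together, while including the first-stage residual as an extra regressor removes that bias. A significant coefficient on the residual is the usual informal indication that the regressor is endogenous.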

2 eBay Motors

eBay Motors is the automobile arm of online auctions giant eBay. It is the largest automotive site on the Internet, with over $1 billion in revenues in 2005 alone. Every month, approximately passenger vehicles are listed on eBay, and about 15% of these will be sold on their first listing.³ This trading volume dwarfs those of its online competitors, the classified services cars.com, autobytel.com and Autotrader.com. In contrast to these sites, most of the sellers on eBay Motors are private individuals, although dealers still account for around 30% of the listings.

³ The remainder are either re-listed (35%), sold elsewhere or not traded. For the car models in my data, sales are more prevalent, with about 30% sold on their first listing.

Listing a car on eBay Motors is straightforward. For a fixed fee of $40, a seller may post a webpage with photos, a standardized description of the car, and a more detailed description that can include text and graphics. The direct costs of posting photos, graphics and text are negligible: text and graphics are free, while each additional photo costs $0.15. But the opportunity costs are considerably higher, as it is time-consuming to take, select and upload photos, write the description, generate graphics, and fill in the forms required to post all of these to the auction webpage.⁴

⁴ Professional car dealers will typically use advanced listing management software to limit these costs. Such software is offered by CARad (a subsidiary of eBay), eBizAutos and Auction123.

Figure 2 shows part of the listing by a private seller who used eBay's standard listing template. In addition to the standardized information provided in the table at the top of the listing, which includes model, year, mileage, color and other features, the seller may also provide his own detailed description. In this case the seller has done poorly, providing just over three lines of text. He also provided just three photos. This car, a 1998 Ford Mustang with miles on the odometer, received only four bids, with the highest bid being $1225.

Figure 3 shows another 1998 Ford Mustang, in this case with only miles. The seller is a professional car dealership, and has used proprietary software to create a customized auction webpage. This seller has provided a number of pieces of useful information for buyers: a text based description of the history of the car, a full itemization of the car features in a table, a free vehicle history report through CARFAX, and a description of the seller's dealership. This listing also included 28 photos of the vehicle. The car was sold for $6875 after 37 bids were placed on it.

The difference between these outcomes is striking. The car listed with a detailed and informative webpage sold for $1775 more than the Kelley Blue Book private party value, while that with a poor webpage sold for $3475 less than the corresponding Kelley Blue Book value. This

Figure 2: Example of a poor auction webpage. This webpage consists mainly of the standardized information required for a listing on eBay. The seller has provided little additional information about either the car or himself. Only three photos of the car were provided. In the auction, the seller received only four bids, with the highest bid being $1225. The Kelley Blue Book private party value, for comparison, is $4700.

Figure 3: Example of a good auction webpage. The dealer selling this vehicle has used proprietary software to create a professional and detailed listing. The item description, seen on the right, contains information on the car, a list of all the options included, warranty information, and a free CARFAX report on the vehicle history. The dealer also posted 28 photos of the car. This car received 37 bids, and was sold for $6875. The Kelley Blue Book retail price, for comparison, is $7100, while the private party value is $

Figure 4: Additional information. The left panel shows a graphic that details the exterior condition of the vehicle. The right panel shows the Kelley Blue Book information for the model-year of vehicle being auctioned.

suggests that differences in the quality of the webpage can be important for auction prices.

How can webpages differ? Important sources of differences include the content of the text-based description, the number of photos provided, the information displayed in graphics, and information from outside companies such as CARFAX. Examples of the latter two follow in Figure 4. In the left panel I show a graphic showing the condition of the exterior of the car. In the right panel, I show the Kelley Blue Book information posted by a seller in order to inform buyers about the retail value of the vehicle.

It is clear that many of these pieces of information, such as photos and information from outside institutions, are hard to fake. Moreover, potential buyers have a number of opportunities to verify this information. Potential buyers may also choose to acquire vehicle history reports through CARFAX or Autocheck.com, or purchase a third party vehicle inspection from an online service for about $100. Bidders may query the seller on particular details through direct communication via email or telephone. Finally, blatant misrepresentation by sellers is subject to prosecution as fraud. In view of these institutional details, I will treat seller revelations as credible throughout this paper.

All of these sources of information - the text, photos, graphics and information from outside institutions, as well as any private information acquired - act to limit the asymmetric information problem faced by bidders. The question I wish to examine is how important these sources of information are for auction prices. This question can only be answered empirically, and so it is to the data that I now turn.

3 Empirical Regularities

3.1 Data and Variables

I collected data from eBay Motors auctions over a six month period.⁵ I drop observations with nonstandard or missing data, re-listings of cars, and those auctions which received fewer than two bids.⁶ I also drop auctions in which the webpage was created using proprietary software.⁷ The resulting dataset consists of over observations of 18 models of vehicle. The models of vehicle may be grouped into three main types: those which are high volume Japanese cars (e.g. Honda Accord, Toyota Corolla), a group of vintage and newer muscle cars (e.g. Corvette, Mustang), and most major models of pickup truck (e.g. Ford F-series, Dodge Ram).

⁵ I downloaded the auction webpages for the car models of interest every twenty days, recovered the URLs for the bid history, and downloaded the history. I implemented a pattern matching algorithm in Python to parse the HTML code and obtain the auction variables.

⁶ I do this because in order to estimate my structural auction model, I need at least two bids per auction. Examining the data, I find that those auctions with fewer than two bids are characterized by extremely high starting bids (effectively reserves), but are otherwise similar to the rest of the data. One respect in which there is a significant difference is that in these auctions the seller has provided less text and fewer photos. This is in accordance with the logic detailed in the rest of the paper.

⁷ Webpages created using proprietary software are often based on standardized templates, and some of the text of the item description is not specific to the item being listed. Since one of my information measures is the number of bytes of text, comparisons of standard eBay listings with those generated by advanced software are not meaningful.

The data contain a number of item characteristics including model, year, mileage, transmission and title information. I also observe whether the vehicle had an existing warranty, and whether the seller has certified the car as inspected.⁸ In addition, I have data on the seller's eBay feedback and whether they provided a phone number and address in their listing. I can also observe whether the seller has paid an additional amount to have the car listing featured, which means that it will be given priority in searches and highlighted.

⁸ This certification indicates that the seller has hired an outside company to perform a car inspection. The results of the inspection can be requested by potential buyers.

Two of the most important types of information that can be provided about the vehicle by the seller are the vehicle description and the photos. I consider two simple quantitative measures of these forms of information content. The first measure is the number of bytes of text in the vehicle description provided by the seller. The second measure is the number of photos posted on the webpage. I use both of these variables in the hedonic regressions in the section below.
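The two information measures defined above, bytes of description text and number of photos, can be computed from a saved listing page along the following lines. This is only a sketch under stated assumptions, not the parsing code used for the paper: the page layout is hypothetical (the seller's description is assumed to sit in a `div` with `id="description"`, and every `img` tag on the page is counted as a photo), and the names `ListingInfoParser` and `information_measures` are invented for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ListingInfoParser(HTMLParser):
    """Collect the seller's description text and count photos on one listing page."""

    def __init__(self, description_id="description"):
        super().__init__()
        self.description_id = description_id  # hypothetical container id
        self.depth = 0         # > 0 while inside the description container
        self.text_parts = []   # text fragments found inside the description
        self.photo_count = 0   # assumption: every <img> is a photo of the car

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.photo_count += 1
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif tag == "div" and dict(attrs).get("id") == self.description_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.text_parts.append(data)


def information_measures(html, description_id="description"):
    """Return the two quantity-of-information measures for one listing page."""
    parser = ListingInfoParser(description_id)
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so that page formatting does not inflate the count.
    text = " ".join("".join(parser.text_parts).split())
    return {"text_bytes": len(text.encode("utf-8")), "photos": parser.photo_count}


sample = ('<html><body><div id="description"><p>Clean title, <br>one owner.</p>'
          '</div><img src="a.jpg"><img src="b.jpg"></body></html>')
print(information_measures(sample))
```

Counting bytes after collapsing whitespace is a design choice of this sketch; the paper does not say how whitespace or markup was treated when measuring the amount of text.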

3.2 Relating Prices and Information

In order to determine the relationship between the amount of text and photos and the final auction price, I run a number of hedonic regressions. Each specification has the standard form:

\[ \log(p_j) = z_j \beta + \varepsilon \tag{1} \]

where z_j is a vector of covariates.⁹ I estimate the relationship via ordinary least squares (OLS), including model and title fixed effects.¹⁰ I report the results in Table 1, suppressing the fixed effects. In the first specification, the vector of covariates includes standard variables, such as age, mileage and transmission, as well as the log number of bytes of text and the number of photos. The coefficients generally have the expected sign and all are highly significant.¹¹ Of particular interest is the sheer magnitude of the coefficients on the amount of text and photos. I find that postings with twice the amount of text sell for 8.7% more, which for the average car in the dataset is over $800 more. Likewise, each additional photo on the webpage is associated with a selling price that is 2.6% higher - a huge effect.

⁹ I use the log of price instead of price as the dependent variable so that it ranges over the entire real line.

¹⁰ The seller may report the title of the car as clean, salvage, other or unspecified. The buyer may check this information through CARFAX.

¹¹ One might have expected manual transmission to enter with a negative coefficient, but I have a large number of convertible cars and pickups in my dataset, and for these, a manual transmission may be preferable.

I am not asserting a causal relationship between the amount of text and photos and the auction price. My claim is that buyers are using the content of the text and photos to make an informed decision as to the car's value. The content of this information includes the car's options, the condition of the exterior of the vehicle, and vehicle history and usage, all of which are strong determinants of car value. Moreover, as I will show in the structural model below, buyers will in equilibrium interpret the absence of information as a bad signal about vehicle quality, and adjust their bids downwards accordingly. I will also show that the amount of information is an effective proxy for the content of that information. Thus my interpretation of this regression is that the large coefficients on text and photos arise because these measures are proxying for attributes of the vehicle that are unobserved by the econometrician.

In the next four regressions, I explore alternate explanations. The first alternative is that this is simply a marketing effect, whereby slick webpages with a lot of text and photos attract more bidders and thus the cars sell for higher prices. I therefore control for the number of bidders, and also add a dummy for whether the car was a featured listing. I find that adding these controls does reduce the information coefficients, but they still remain very large. A

Table 1: Hedonic Regressions

                                  (1)        (2)        (3)        (4)        (5)
Age                            (0.0019)   (0.0018)   (0.0018)   (0.0018)   (0.0020)
Age Squared                    (0.0000)   (0.0000)   (0.0000)   (0.0000)   (0.0000)
Log of Miles                   (0.0041)   (0.0040)   (0.0040)   (0.0040)   (0.0041)
Log of Text Size (in bytes)    (0.0073)   (0.0072)   (0.0075)   (0.0076)   (0.0085)
Number of Photos               (0.0009)   (0.0009)   (0.0009)   (0.0009)   (0.0010)
Manual Transmission            (0.0124)   (0.0122)   (0.0122)   (0.0122)   (0.0121)
Featured Listing                          (0.0211)   (0.0210)   (0.0211)   (0.0210)
Number of Bidders                         (0.0015)   (0.0015)   (0.0015)   (0.0015)
Log Feedback                                         (0.0033)   (0.0033)   (0.0033)
Percentage Negative Feedback                         (0.0015)   (0.0015)   (0.0015)
Total Number of Listings                                        (0.0015)   (0.0015)
Phone Number Provided                                           (0.0123)   (0.0122)
Address Provided                                                (0.0183)   (0.0182)
Warranty                                                                   (0.1638)
Warranty*Logtext                                                           (0.0248)
Warranty*Photos                                                            (0.0031)
Inspection                                                                 (0.1166)
Inspection*Logtext                                                         (0.0176)
Inspection*Photos                                                          (0.0022)
Constant                       (0.0995)   (0.0980)   (0.0980)   (0.0980)   (0.1025)

Estimated standard errors are given in parentheses. The model fits well, with the R² ranging from 0.58 in (1) to 0.61 in (5). The nested specifications (2)-(4) include controls for marketing effects, seller feedback and dealer status. Specification (5) includes warranty and inspection dummies and their interactions with the information measures.
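A specification of the form of equation (1) can be sketched on synthetic data as follows. This is a minimal illustration under stated assumptions, not the paper's regression: the data are generated without noise, the model and title fixed effects are omitted, and the coefficient values in `TRUE_BETA` are hypothetical numbers chosen only to be of a plausible order of magnitude. The helper names `ols`, `pct_effect_of_doubling` and `pct_effect_of_one_unit` are likewise invented here. The last two helpers show how coefficients in a log-price regression translate into percentage statements such as "twice the amount of text" or "each additional photo".

```python
import math


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def pct_effect_of_doubling(beta_log_x):
    """Proportional price change when a regressor entered in logs doubles."""
    return 2.0 ** beta_log_x - 1.0


def pct_effect_of_one_unit(beta_x):
    """Proportional price change when a regressor entered in levels rises by one."""
    return math.exp(beta_x) - 1.0


# Hypothetical coefficients: constant, age, log text bytes, number of photos.
TRUE_BETA = [9.0, -0.10, 0.12, 0.026]

rows = []
for i in range(40):
    age = 1 + i % 7
    log_text = math.log(200 + 150 * (i % 5) + 10 * i)
    photos = (3 * i) % 11
    rows.append([1.0, age, log_text, photos])

# Noise-free log prices, so OLS recovers TRUE_BETA up to rounding error.
log_price = [sum(b * x for b, x in zip(TRUE_BETA, row)) for row in rows]
beta = ols(rows, log_price)

print("estimated coefficients:", beta)
print("effect of doubling text:", pct_effect_of_doubling(beta[2]))
print("effect of one more photo:", pct_effect_of_one_unit(beta[3]))
```

Because the dependent variable is the log of price, a coefficient on a log regressor is an elasticity and a coefficient on a level regressor is approximately a proportional effect per unit, which is why the text can quote percentage differences in price directly.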

14 second explanation is that the amount of text and photos are somehow correlated with seller feedback ratings, which are often significant determinants of prices on ebay. I find not only that including these controls has little effect on the information coefficients, but that for this specific market the effects of seller reputation are extremely weak. The percentage negative feedback has a very small negative effect on price, while the coefficient on total log feedback is negative, which is the opposite of what one would expect. A possible reason for these results is that for the used car market, the volume of transactions for any particular seller is small and this makes seller feedback a weak measure of seller reputation. Another problem is that feedback is acquired through all forms of transaction, and potential buyers may only be interested on feedback resulting from car auctions. Returning to the question of alternative explanations, a third possibility is that frequent sellers such as car dealerships tend to produce better webpages, since they have stronger incentives to develop a good template than less active sellers. Then if buyers prefer to buy from professional car dealers, I may be picking up this preference rather than the effects of information content. To control for this, I include the total number of listings by the seller in my dataset as a covariate, as well as whether they posted a phone number and an address. All of these measures track whether the seller is a professional car dealer or not. I report the results in column (4). I find that although posting a phone number is valuable (perhaps because it facilitates communication), both the number of listings and the address have negative coefficients. That is, there is little evidence that buyers are willing to pay more to buy from a frequent lister or a firm that posts its address. Again, the coefficients on text and photos remain large and significant. 
Finally, I try to test my hypothesis that text and photos matter because they inform potential buyers about car attributes that are unobserved by the econometrician. To do this, I include dummies for whether the car is under warranty and for whether it has been certified by the seller as inspected. Both of these variables should reduce the value of information, because the details of car condition and history become less important if one knows that the car is under warranty or has already been certified as being in excellent condition. In line with this idea, I include interaction terms between the warranty and inspection dummies and the amount of text and photos. The results in column (5) indicate that the interaction terms are significantly negative as expected, while both inspection and warranty significantly raise the expected price of the vehicle. For cars that have neither been inspected nor are under warranty, the coefficients on text and photos increase.

There are problems with a simple hedonic regression, and in order to deal with the problem of bids being strategic and different from the underlying valuations, I will need a structural

model. I will also need to demonstrate that my measures of the quantity of information are effective proxies for the content of that information. Yet the reduced form makes the point that the amounts of text and photos are positively related to auction price, and that the most compelling explanation for this is that their content gives valuable information to potential buyers.

4 Theory

Modeling demand on ebay is not a trivial task. The auction framework has a bewildering array of institutional features, such as proxy bidding, secret reserves and sniping software. With this in mind, it is critical to focus on the motivating question: how does seller revelation of information through the auction webpage affect bidder beliefs about car value, and thereby behavior and prices? I make three important modeling choices.

The first is to model an auction on ebay Motors as a symmetric common value auction. In this model, agents have identical preferences, but are all uncertain as to the value of the object upon which they are bidding. The bidders begin with a common prior distribution over the object's value, but then draw different private signals, which generates differences in bidding behavior. This lies in contrast to a private values model, in which agents are certain of their private valuation, and differences in bidding behavior result from differences in these private valuations. I believe that the common value framework is better suited to the case of ebay Motors than the private values framework for a number of reasons. First, it allows bidders to be uncertain about the value of the item they are bidding on, and this uncertainty is necessary for any model that incorporates asymmetric information and the potential for adverse selection. Second, the data support the choice of a common values model.
I show this in the appendix, where I implement the Athey and Haile (2002) test of symmetric private values against symmetric common values, and reject the private value null hypothesis at the 5% level.[12] Third, it provides a clean framework within which one may think clearly about the role of the auction webpage in this market. The auction webpage is the starting point for all bidders in an online auction, and can thus be modeled as the source of the common prior that bidders have for the object's value. I can identify the role of information on ebay Motors by examining its effect on the distribution of the common prior.

[12] The test is based on the idea that the Winner's Curse is present in common values auctions, but not in private values auctions. This motivates a test based on stochastic dominance relationships between bid distributions as the number of bidders n, and thus the Winner's Curse, increases.

Finally, though market participants on ebay Motors

will in general have different preferences, it may be reasonable to assume that those bidders who have selected into any particular auction have similar preferences. This is captured in the common values assumption.[13]

The second modeling choice is to use a modified version of the Bajari and Hortaçsu (2003) model of late bidding on ebay. On ebay Motors, I observe many bids in the early stages of the auction that seem speculative and unreflective of underlying valuations, while in the late stages of the auction, I observe more realistic bids. With late bidding, bidders often do not have time to best respond to their opponents' play. Sniping software can be set to bid on a player's behalf up to one second before the end of the auction, and bidders cannot best respond to these late bids. Bajari and Hortaçsu (2003) capture these ideas in a two stage model, in which bidders may observe each other's bids in the early stage, but play the second stage as though it were a sealed bid second price auction. I vary from their model by taking account of the timing of bids in the late stage, since a bid will only be recorded if it is above the current price.[14]

Accounting for timing is important, as an example will make clear. Suppose that during the early stage of an auction bidder A makes a speculative bid of $2000. As the price rises and it becomes obvious this bid will not suffice, she programs her sniping software to bid her valuation of $5600 in the last five seconds of an auction. Now suppose that bidder B bids $6000 an hour before the end of the auction, and so when bidder A's software attempts to make a bid on her behalf at the end of the auction, the bid is below the standing price and hence not recorded. The highest bid by bidder A observed during the auction is thus $2000. An econometrician who treats an ebay auction as simply being a second price sealed bid auction may assume that bidder A's valuation of the object is $2000, rather than $5600.
This will lead to inconsistent parameter estimates. I show that regardless of bid timing, the final price in the auction with late bidding is the valuation of the bidder with the second highest private signal. This allows me to construct a robust estimation procedure based on auction prices, rather than the full distribution of bids.

[13] One could consider a more general model of affiliated values, incorporating both common and private value elements. Unfortunately, without strong assumptions, such a model will tend to be unidentified.

[14] Song (2004) also makes this observation, and uses it to argue that in the independent private values context (IPV), the only value that will be recorded with certainty is that of the second highest bidder.

Lastly, I allow for asymmetric information, and let the seller privately know the realizations of a number of signals affiliated with the value of the car. One may think of these signals as describing the car history, condition and features. The seller may choose which of these signals to reveal on the auction webpage, but since writing detailed webpages and taking photos is costly, such revelations are each associated with a revelation cost. In line with my earlier description of the institutional details of ebay Motors, all revelations made by the seller are

modeled as being credible. This assumption prevents the model from being a signaling model in the style of Spence (1973).[15] Instead, this setup is similar to the models of Grossman (1981) and Milgrom (1981), in which a seller strategically and costlessly makes ex-post verifiable statements about product quality to influence the decision of a buyer. In these models, an unravelling result obtains, whereby buyers are skeptical and assume the worst in the absence of information, and consequently sellers reveal all their private information. My model differs in three important respects from these benchmarks. First, I allow for revelation costs. This limits the incentives for sellers to reveal information, and thus the unravelling argument no longer holds. Second, the parties receiving information in my model are bidders, who then proceed to participate in an auction, inducing a multi-stage game. Finally, I allow the seller's private information to be multidimensional. While each of these modifications has received treatment elsewhere,[16] there is no unified treatment of revelation games with all these features, and attempting a full analysis of the resulting multi-stage game is beyond the scope of this paper. Instead I provide results for the case in which the relationship between the seller's private information and the car's value is additively separable in the signals. This assumption will be highlighted in the formal analysis below.

4.1 An Auction Model with Late Bidding

Consider a symmetric common values auction model, as in Wilson (1977). There are $N$ symmetric bidders, bidding for an object of unknown common value $V$. Bidders have a common prior distribution for $V$, denoted $F$. Each bidder $i$ is endowed with a private signal $X_i$, and these signals are conditionally i.i.d., with common conditional distribution $G|v$. The variables $V$ and $X = (X_1, X_2, \ldots, X_n)$ are affiliated.
[17] The realization of the common value is denoted $v$, and the realization of each private signal is denoted $x_i$. The realized number of bidders $n$ is common knowledge.[18] One may think of the bidders' common prior as being based on the content of the auction webpage.

[15] In those games, the sender may manipulate the signals she sends in order to convince the receiver that she is of high type. Here the sender is constrained to either truthfully reveal the signal realization, or to not reveal it at all.

[16] See for example Jovanovic (1982) for costly revelation, Okuno-Fujiwara, Postlewaite, and Suzumura (1990) for revelations in a multi-stage game and Milgrom and Roberts (1986) for multidimensional signal revelation.

[17] This property is equivalent to the log supermodularity of the joint density of $V$ and $X$. A definition of log supermodularity is provided in the appendix.

[18] This assumption is somewhat unrealistic, as bidders in an ebay auction can never be sure of how many other bidders they are competing with. The alternative is to let auction participants form beliefs about $n$, where those beliefs are based on a first stage entry game, as in Levin and Smith (1994). But dealing with endogenous entry in this way has certain problems in an ebay context (see Song (2004)), and thus I choose to go with the simpler model and make $n$ common knowledge.

Their private signals may come from a variety of sources

such as private communications with the seller through e-mail or phone and the results of inspections they have contracted for or performed in person.

Bidding takes place over a fixed auction time period $[0, \tau]$. At any time $t$ during the auction, the standing price $p_t$ is posted on the auction website. The standing price at $t$ is given by the second highest bid submitted during the period $[0, t)$. Bidding takes place in two stages. In the early stage $[0, \tau - \epsilon)$, the auction takes the form of an open exit ascending auction. Bidders may observe the bidding history, and in particular observe the prices at which individuals drop out of the early stage. In the late stage $[\tau - \epsilon, \tau]$, all bidders are able to bid again. This time, however, they don't have time to observe each other's bidding behavior. They may each submit a single bid, and the timing of their submission is assumed to be randomly distributed over the time interval $[\tau - \epsilon, \tau]$.[19] If their bid $b_{it}$ is lower than the standing price $p_t$, their bid is not recorded. The following proposition summarizes the relevant equilibrium behavior in this model:

Proposition 1 (Equilibrium without Revelation) There exists a symmetric Bayes-Nash equilibrium in which:
(a) All bidders bid zero during the early stage of the auction.
(b) All bidders bid their pseudo value during the late stage, where their pseudo value is defined by

    $v(x, x; n) = E[V \mid X_i = x, Y = x, N = n]$    (2)

and $Y = \max_{j \neq i} X_j$ is the maximum of the other bidders' signals.
(c) The final price $p_\tau$ is given by

    $p_\tau = v(x^{(n-1:n)}, x^{(n-1:n)}; n)$    (3)

where $x^{(n-1:n)}$ denotes the second highest realization of the signals $\{X_i\}$.

The idea is that bidders have no incentive to bid a positive amount during the early stage of the auction, as this may reveal their private information to the other bidders. Thus they bid only in the late stage, bidding as in a second price sealed bid auction.
The second highest bid will certainly be recorded, and I obtain an explicit formula for the final price in the auction as the bidding function evaluated at the second highest signal. This formula will form the basis of my estimation strategy later in the paper. In the model thus far, though, I have not allowed the seller to reveal his private information. I extend the model to accommodate information asymmetry and seller revelation in the next section.

[19] Allowing bidders to choose the timing of their bid would not change equilibrium behavior, and would unnecessarily complicate the argument.
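The recording rule for late bids can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the estimation procedure of the paper: the function names are hypothetical, the bids for A and B are taken from the example in the text, and a third bidder C with a bid of $7000 is an added assumption, needed so that the standing price can rise above A's late bid. The sketch enumerates every arrival order of the late bids and applies the rule that a bid is recorded only if it exceeds the standing price (the second highest recorded bid).

```python
from itertools import permutations

def standing_price(recorded):
    """Second-highest recorded bid, or 0 if fewer than two bids are recorded."""
    bids = sorted(recorded.values(), reverse=True)
    return bids[1] if len(bids) > 1 else 0

def run_late_stage(early_bids, late_bids, arrival_order):
    """Apply the recording rule: a late bid counts only if it beats the standing price."""
    recorded = dict(early_bids)  # highest recorded bid per bidder so far
    for bidder in arrival_order:
        bid = late_bids[bidder]
        if bid > standing_price(recorded):
            recorded[bidder] = max(bid, recorded.get(bidder, 0))
    return recorded, standing_price(recorded)

# A's speculative early bid and the late bids. Bidder C is a hypothetical addition.
early_bids = {"A": 2000}
late_bids = {"A": 5600, "B": 6000, "C": 7000}

prices = set()       # final prices across all arrival orders
a_recorded = set()   # A's highest recorded bid across all arrival orders
for order in permutations(late_bids):
    recorded, price = run_late_stage(early_bids, late_bids, order)
    prices.add(price)
    a_recorded.add(recorded["A"])

print("final prices:", prices)
print("recorded bids for A:", a_recorded)
```

In this sketch the final price equals the second highest late bid in every arrival order, while A's recorded bid is $2000 in some orders and $5600 in others. This is the sense in which prices, unlike the full set of recorded bids, are robust to bid timing.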

4.2 Information Revelation

Let the seller privately observe a vector of real-valued signals $S = (S_1, \ldots, S_m)$ with common bounded support $[\underline{s}, \overline{s}]$. As with the bidders' private information, these signals are conditionally i.i.d. with common distribution $H|v$ and are conditionally independent of the bidders' private signals $X_1, \ldots, X_n$. Let $V$ and $S = (S_1, \ldots, S_m)$ be affiliated. The seller may report his signals to the bidders prior to the start of the auction, and bidders treat all such revelations as credible. The seller's objective is to maximize his expected revenue from the auction. A (pure strategy) reporting policy for the seller is a mapping $R : \mathbb{R}^m \to \{0, 1\}^m$, where reporting the signal is denoted 1 and not reporting is denoted 0. Letting $s = (s_1, \ldots, s_m)$ be a realization of the signal vector $S$, define $I = \#\{i : R_i(s) = 1\}$ as the number of signals reported by the seller. Finally, let the seller face a fixed revelation cost $c_i$ for revealing the signal $S_i$, and let the associated cost vector be $c$. Revelation costs are common knowledge.

This setup generates a multi-stage game in which the seller first makes credible and public statements about the object to be sold, and then bidders update their priors and participate in an auction. Since each revelation by the seller is posted to the auction webpage, there is a natural link between the number of signals $I$ and my empirical measures of information content, the number of photos and bytes of text on the webpage. Define the updated pseudo value

    $w(x, y, s; n) = E[V \mid X = x, Y = y, S = s, N = n]$

where $Y = \max_{j \neq i} X_j$ is the maximum of the other bidders' signals.
Then in the absence of revelation costs, the equilibrium reporting policy for the seller is to reveal all his information:

Proposition 2 (Unravelling) For $c = 0$, there exists a sequential equilibrium in which the seller reports all his signals ($R(s) = \mathbf{1}$ for all $s$), bidders bid zero in the early stage of the auction, and bid their updated pseudo value $w(x, x, s; n)$ in the late stage.

This result seems counter-intuitive. It implies that sellers with lemons - cars with a bad history or in poor condition - will disclose this information to buyers, which runs counter to the intuition behind models of adverse selection. It also implies that the quantity of information revealed on the auction webpage should be unrelated to the value of the car, since sellers will fully disclose, regardless of the value of the car. Since I know from the hedonic regressions that price and information measures are positively related, it suggests that the costless revelation model is ill suited to explaining behavior in this market. Revealing credible information is a costly activity for the seller, and thus sellers with lemons have few incentives to do so. I can account for this by considering the case with strictly positive revelation costs, but this generates a non-trivial decision problem for the seller. In order to make the equilibrium analysis tractable, I make a simplifying assumption.

Assumption 1 (Constant Differences) The function $w(x, y, s; n)$ exhibits constant differences in $s_i$ for all $(x, y, s_{-i})$. That is, there exist functions $\Delta_i(s_1, s_0; n)$, $i = 1, \ldots, m$, such that:

    $\Delta_i(s_1, s_0; n) = w(x, y, s_{-i}, s_1; n) - w(x, y, s_{-i}, s_0; n) \quad \forall (x, y, s_{-i})$

This assumption says that from the bidder's point of view, the implied value of each signal revealed by the seller is independent of all other signals the bidder observes.[20] The extent to which this assumption is reasonable depends on the extent to which car features, history and condition are substitutes or complements for value. Under this assumption, one may obtain a clean characterization of equilibrium.

Proposition 3 (Costly Information Revelation) For $c > 0$, there exists a sequential equilibrium in which:
(a) The reporting policy is characterized by a vector of cutoffs $t = (t_1, \ldots, t_m)$ with

    $R_i(s) = 1$ if $s_i \geq t_i$, and $R_i(s) = 0$ otherwise,

where $t_i = \inf \{t : E[\Delta_i(t, t'; n) \mid t' < t] \geq c_i\}$.
(b) Bidders bid zero during the early stage of the auction, and bid

    $v_R(x, x, s_R; n) = E[w(x, x, s; n) \mid \{s_i < t_i\}_{i \in U}]$

during the late stage, where $s_R$ is the vector of reported signals and $U = \{i : R_i(s) = 0\}$ is the set of unreported signals.
(c) The final auction price is $p_\tau = v_R(x^{(n-1:n)}, x^{(n-1:n)}, s_R; n)$.

In contrast to the result with costless revelation, this equilibrium is characterized by partial revelation, where sellers choose to reveal only sufficiently favorable private information. Bidders (correctly) interpret the absence of information as a bad signal about the value of the car, and adjust their bids accordingly. Then the number of signals $I$ revealed by the seller is positively correlated with the value of the car.

Corollary 1 (Expected Monotonicity) $E[V \mid I]$ is increasing in $I$.

I prove the result by showing that the seller's equilibrium strategy induces affiliation between $V$ and $I$.
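The cutoff rule and the monotonicity result can be illustrated in a deliberately simple special case. The following is a sketch under assumptions that are not part of the model above: the seller's signals are drawn independently from Uniform[0, 1], the car's value is simply the sum of the signals, so that the value of revealing a signal $t$ rather than being pooled with a lower signal $t'$ is $t - t'$, and bidder signals are ignored. In that case $E[t - t' \mid t' < t] = t/2$, so the cutoff is $2c_i$ (capped at 1), and with zero costs the cutoff is 0 and every signal is revealed. All function names and cost values are hypothetical.

```python
import random
from statistics import mean

def cutoff(cost):
    # Toy case: S ~ Uniform[0, 1] and Delta(t, t') = t - t', so
    # E[Delta(t, t') | t' < t] = t / 2 and the smallest t with t / 2 >= cost is 2 * cost.
    return min(2.0 * cost, 1.0)

def simulate(costs, draws, seed=0):
    """Average car value conditional on the number of revealed signals I."""
    rng = random.Random(seed)
    cutoffs = [cutoff(c) for c in costs]
    values_by_count = {}
    for _ in range(draws):
        signals = [rng.random() for _ in costs]
        value = sum(signals)  # assumed additively separable value
        revealed = sum(1 for s_i, t_i in zip(signals, cutoffs) if s_i >= t_i)
        values_by_count.setdefault(revealed, []).append(value)
    return {k: mean(v) for k, v in sorted(values_by_count.items())}

costs = [0.25, 0.25, 0.25]  # hypothetical revelation costs, giving cutoffs of 0.5
avg_value = simulate(costs, draws=20000)
for count, value in avg_value.items():
    print(f"I = {count}: average value {value:.3f}")

# With zero costs every signal is revealed, as in the unravelling result.
no_cost = simulate([0.0, 0.0, 0.0], draws=1000)
print("revealed counts with zero costs:", list(no_cost))
```

In this toy case a revealed signal has mean 0.75 and an unrevealed one has mean 0.25, so the average value conditional on $I$ revealed signals is $0.75 + 0.5 I$, which is increasing in $I$. The simulated averages should lie close to these values.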
[20] A sufficient condition for the constant differences assumption to hold is that the conditional mean $E[V \mid X, S]$ is additively separable in the signals $S$, so that $E[V \mid X = x, S = s] = \sum_{j=1}^{m} f_j(s_j) + g(x)$ for some increasing real-valued functions $f_j$ and $g$.

This result is useful for my empirical analysis, as it establishes a formal link between