Revealing a Preference for Mixing: An Experimental Study of Risk


Paul Feldman, UC San Diego
John Rehbeck, Ohio State University

This Version: November 27, 2018

Abstract

Behavioral theories can be distinguished by their different attitudes towards mixing between lotteries. Using a revealed preference approach, we conduct an individual choice experiment in which subjects choose from different linear budgets covering the space of three-outcome lotteries. Using the experimental choices, we determine whether behavior can be explained by any utility model that is increasing and concave in the probabilities, and whether mixing behavior is consistent with a preference for randomization. The main finding is pervasive evidence of a preference for non-degenerate mixing over lotteries, and these choices can be organized according to revealed preferences. Further, we relate this result to previous experimental evidence of a preference for randomization by connecting non-constant choices in repeated binary decisions to mixing behavior in budgets. Moreover, we test the out-of-sample predictive accuracy of various models on our data. The overall results lend clear support for the convexity of risk preferences, which has implications for a range of topics, from incentive design to game theory.

JEL classification: C91, D81, D91
Keywords: Probability Weighting, Cumulative Prospect Theory, Risk Preferences

We are very grateful to Mark Machina and Charlie Sprenger for all their excellent suggestions. We would also like to thank Marina Agranov, Jim Andreoni, David Blau, Zach Breig, Mehmet Caner, Mauricio Fernández-Duque, P.J. Healy, Kirby Nielsen, and Joel Sobel for their thoughtful comments. Excellent research assistance was provided by Jacqueline Vallon. pfeldman@ucsd.edu, rehbeck.7@osu.edu.

1 Introduction

Identifying risk preferences is critical to our understanding of decision-making. The foundation on which most of modern economic theory rests is expected utility (EU). As elegant, parsimonious, and normatively appealing as the theory might be, it has faced clear challenges when describing actual behavior.[1] More recently, EU has also been questioned on descriptive grounds, since it appears individuals strictly prefer mixtures, which implies a preference for randomization (Agranov and Ortoleva, 2017; Dwenger, Kübler and Weizsäcker, 2018).

Behavioral theories deviate from the EU framework, often by relaxing the independence axiom (Machina, 1982). Frequent among the relaxations is to allow for nonlinearities in the probabilities. Depending on the nature of the nonlinearities, agents will react either positively or negatively to mixing between two different lotteries. Linear preferences like expected utility imply that mixing only occurs under indifference, because linear combinations of lotteries are only indifferent when their component lotteries are indifferent.[2]

To contribute to our understanding of risk preferences, we run an experiment identifying whether agents strictly prefer to mix. We implement an experimental design specifically using convex budgets over lotteries[3][4] that elicits attitudes toward mixtures over final-stage lotteries. Individuals are faced with a safer lottery and a riskier lottery and choose any convex combination of the two. As in other areas of experimental research, the decision environment calls for a much richer individual-level data set of choices over distributions than standard designs, which elicit preferences, or more precisely risk attitudes, via binary choices.

In our within-subject design, we examine choices made on seventy-nine different convex budgets over lotteries with three monetary outcomes. We chose our budgets in a way that resembles the standard consumer problem, with linear prices, so that we can examine standard objects such as the price-offer curve, but now in the domain of lotteries.

Footnote 1: The most famous example is the paradox introduced in Allais (1953).
Footnote 2: Formally, let p and q denote two lotteries and let ∼ denote indifference. If p ∼ λp + (1 − λ)q for some λ ∈ (0, 1), then p ∼ q. This result follows from applying the independence axiom twice, or its weaker version, betweenness.
Footnote 3: E.g. Sopher and Narramore (2000); Andreoni and Miller (2002); Choi, Fisman, Gale and Kariv (2007); Andreoni and Sprenger (2012); Burghart, Epper and Fehr (2015).
Footnote 4: Formally, given two lotteries p and q, the convex budget is the set of lotteries on the probability simplex Δ, i.e. {r ∈ Δ : r = λp + (1 − λ)q, λ ∈ [0, 1]}.

If individuals are expected utility maximizers, there is nothing interesting to learn from enlarging the choice set from binary to convex, as responses will be corner solutions almost surely. This might help clarify why most previous studies restricted themselves to binary choices. If, however, individuals consistently prefer to mix between different lotteries, then we can infer meaningful information about the shape of their preferences. For instance, anyone with a utility function that is concave in the probabilities, and consequently has quasiconcave indifference curves, has a preference for mixing. In contrast, linear and strictly quasiconvex indifference curves imply that mixing occurs only under indifference or never, respectively. The separating prediction between various behavioral theories is thus predicated on attitudes towards mixtures.

We exploit this novel data set and test for rationality using a revealed preference approach to show that observed choices are not purely random; that is, we test whether an increasing (consistent with first order stochastic dominance) and concave in the probabilities utility function can represent preferences. One concern, when trying to elicit a preference for mixtures, is that individuals may be choosing at random so that we are only collecting noise. We show that this is not the case by examining revealed preference tests of rationality. In particular, we find that individual behavior is significantly closer to rational behavior (most of their choices can be rationalized) than to random choices. Thus, we conclude this is indeed a preference for a mixture of lotteries.

Another difficulty is that mixing between lotteries may appear to be driven by a desire to randomize (subjects mix to maximize a preference) but may in fact be driven by unstable preferences or mistakes.[5] To explore this hypothesis, subjects faced repeated binary decisions between two boundary lotteries with advance knowledge that the repetition would take place. In the cases where randomization is desired, we would expect non-constant choices. By relating behavior between these known repetitions and the analogous convex budgets, we test for consistency between the tasks and provide further evidence of this desire.

Footnote 5: Machina (1985) elaborates on this distinction between deterministic (a desire to randomize) and stochastic choice.

Under our hypothesis, mixing in the convex task should lead to non-constant choice in the binary task, in an effort to mimic the former.

To overcome potential shortcomings of previous experiments (such as a primary focus on parametric identification and a failure to control sufficiently for individual heterogeneity) we primarily use non-parametric tests which are not aimed at refuting a specific model. In addition, we use a within-subject design, collect a detailed amount of individual data, and conduct most of our analysis at the individual level. To deal with known confounds, we also design payments so that they are incentive compatible.

One hundred and forty-four University of California, San Diego undergraduates participated in our experiment. Each participant completed the convex tasks, in different orders, and six additional binary tasks in around an hour. For each task, subjects were presented with a menu of possible final-stage distributions over three monetary outcomes: $2, $10, and $30. All lotteries were represented visually by interactive pie charts. The design is aimed at inducing subjects to select their preferred lottery, potentially identifying their preferred mixture, from each convex menu. Foresight in the binary tasks leads subjects to select their preferred mixture over the known repetitions. We explore mixing preferences descriptively, non-parametrically, and parametrically, and find echoes of the next few results in all these methods.

This paper adds to the debate on preferences for mixtures in several ways. One, people mix a lot: 94% of subjects mix in at least one task, and the average percentage of subjects that mix in a task is 44%. Two, this behavior is systematically different from random choices: using a standard benchmark for rationality, we find that 98.6% of our subjects mix in a nonrandom way, and a binary preference relation can rationalize 91.6% of each subject's choices (on average). Three, the elicitation method is not driving this, since repeated binary choice is close to the selected mixtures: of the individuals that give non-constant answers in the binary-choice tasks, 73% choose a mixture in the corresponding convex tasks. Thus, our main result is that mixing attitudes are prevalent, consistent with rationality, and reflect a desire for randomization.

The central finding from our convex budget analysis is that choices are consistent with a preference for mixtures. Our results indicate that mixing behavior is widespread and pervasive, consistent with rationality, and that our assumption of a deterministic model is justified. These models are often interpreted as featuring purposeful randomization (Machina, 1985; Cerreia-Vioglio, Dillenberger, Ortoleva and Riella, 2017). We bolster this interpretation with the evidence from the additional binary-choice tasks and find that subjects who gave non-constant answers were disproportionately consistent with an apparent preference for mixtures in the convex tasks. Therefore, akin to the standard assumption of convexity of preferences in consumer theory, we find our subjects' choices reveal a preference for mixtures between lotteries. These results carry over to our different analyses (whether parametric, non-parametric, or descriptive) and are robust to different specifications and measures.

A natural question is: what quasiconcave behavioral alternative could best explain the data? In an effort to sort between different alternative non-expected utility theories, and rank the alternatives according to their predictive accuracy, we perform a cross-validation exercise (Hastie, Tibshirani and Friedman, 2009; Arlot, Celisse et al., 2010). The models under consideration are expected utility (Von Neumann and Morgenstern, 1944), disappointment aversion (Gul, 1991), cautious expected utility (Cerreia-Vioglio, Dillenberger and Ortoleva, 2015), and rank-dependent utility (Quiggin, 1982).[6] By basing our model-selection exercise on out-of-sample prediction, as opposed to in-sample fit, we discriminate between models based on whether they predict individual choice behavior and not on whether they might have some unsensible implications. This procedure also punishes models for overfitting when the better fit has no explanatory power. Estimated models of stochastic reference dependence with loss-tolerant preferences yield the greatest predictive power.

Footnote 6: I.e. various probability weighting functions with the rank-dependent fix. Rank dependence weights the cumulative distribution so that the implied utility functions are consistent with monotonicity. As this class is quite broad, we restricted ourselves to specific functional forms. The functional forms we explored are a version of stochastic reference dependence (Koszegi and Rabin, 2007), cumulative prospect theory (Tversky and Kahneman, 1992), and power weighting (Hey and Orme, 1994).

Our reduced form results indicate that a quasiconcave specification has the potential to explain observed behavior. Given that we also find that choices are consistent with an Allais-type effect, i.e. that mixing will favor safer lotteries at certainty, concave probability weights are foreshadowed.[7]

Our exercise has several antecedents in the prior literature. Sopher and Narramore (2000) were the first to examine choice behavior on convex lottery sets, and found that choice behavior is mostly constant across repetitions and permutations of these sets. Agranov and Ortoleva (2017) found that choice behavior is non-constant for binary-choice tasks, whether repetitions are known or not, and that this non-constant choice is correlated with the willingness to purchase randomization devices which allow individuals to select mixtures when faced with binary choices. Lastly, Dwenger et al. (2018) found extensive evidence, in both the field and the lab, that individuals prefer randomization devices. We contribute to this literature by correlating mixing behavior in a convex-choice task with non-constant behavior in a related binary-choice task. In the related binary-choice task, subjects faced known repetitions of the component lotteries of the convex task. Therefore, the related binary task can induce a subset of the possible mixtures from the convex task. This provides correlational evidence that non-constant choice in the repeated binary tasks (evidence of a desire to randomize) could be driven by a preference for mixtures.

Our results could reconcile a few empirical puzzles. The first puzzle, due to Markowitz (1952), is that the market provision of lotteries with multiple prizes is contradicted by expected utility, and in fact by any linear or quasiconvex specification. Under expected utility, adding more than two outcomes to a lottery is never lucrative.[8] If agents favor mixtures, then they have a preference for specific multiple-outcome lotteries. Quiggin (1991) discusses how to extend some of the behavioral alternatives identified in this paper to the design of optimal lotteries.

Footnote 7: Wakker (1994) shows that the quasiconcavity of indifference curves is equivalent to concave probability weighting functions under rank dependence. Further, it can easily be shown that under the convention of ranking outcomes from best to worst, concave probability weighting functions also imply an Allais-type effect, as discussed in Appendix 9.8. Further, Masatlioglu and Raymond (2016) prove the equivalence between the stochastic reference dependent model, under choice acclimating personal equilibrium and loss tolerance, and a rank-dependent representation with a concave probability weighting function.
Footnote 8: Consider a firm that wants to design a lottery to extract maximum revenue from a representative consumer. It can only offer a fixed set of monetary outcomes X ∪ {0} and wants to keep (expected) costs fixed at k. Therefore, the firm solves max_{p ∈ Δ(X ∪ {0})} Σ_i p_i u_i subject to Σ_i p_i x_i = k, with u_i = u(x_i). Given that this problem is linear in the probabilities, for both the utility function and the constraint, Dantzig's (1963) simplex algorithm guarantees that if optimal solutions exist then one is at a corner. Therefore there exists an optimal solution with at most two outcomes.
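The corner-solution logic in footnote 8 can be checked numerically. Below is a minimal sketch, not taken from the paper, that solves the firm's problem with scipy's linear-programming routine; the prize grid (the experiment's prizes plus a zero outcome), the square-root utility, and the cost target k = 12 are illustrative assumptions of ours.

```python
# A minimal numerical illustration of footnote 8, assuming outcomes
# X ∪ {0} = {0, 2, 10, 30}, a concave utility u(x) = sqrt(x), and an
# expected-cost target k = 12; these specific values are illustrative.
import numpy as np
from scipy.optimize import linprog

outcomes = np.array([0.0, 2.0, 10.0, 30.0])
u = np.sqrt(outcomes)          # any increasing utility gives the same corner logic
k = 12.0                       # expected cost the firm is willing to pay

# maximize sum_i p_i u_i  subject to  sum_i p_i x_i = k,  sum_i p_i = 1,  p >= 0.
# linprog minimizes, so negate the objective.
res = linprog(
    c=-u,
    A_eq=np.vstack([outcomes, np.ones_like(outcomes)]),
    b_eq=np.array([k, 1.0]),
    bounds=[(0.0, 1.0)] * len(outcomes),
)

p_star = res.x
print("optimal lottery:", np.round(p_star, 4))
print("outcomes with positive probability:", np.count_nonzero(p_star > 1e-9))
```

Because the solver returns a vertex of the feasible polytope, the printed lottery places positive probability on at most two outcomes, matching the argument in the footnote.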

The second puzzle is the lack of empirical support for the frequencies of play implied by mixed strategy Nash equilibria (Levitt, List and Reiley, 2010). Again, under a preference for mixtures, existence is maintained (Crawford, 1990); nevertheless, both the procedure to find them and the implied frequencies can be different. In particular, indifference between pure strategies is no longer a prerequisite for agents willingly mixing between strategies. Further, casual introspection will confirm we often favor small-risk mixtures that afford us small chances of extremely long shots, even when they are hardly justified by their cost.

The paper proceeds as follows: Section 2 describes the theory on risk preferences; Section 3 describes the experimental procedures; Section 4 provides a descriptive analysis of the data; Section 5 provides a revealed preference test of individual-level data and evidence that this mixing behavior is consistent with a desire to randomize; Section 6 performs a model selection exercise on predictive accuracy restricted to preferences that are convex; Section 7 elaborates how our paper fits within the broader experimental context; and Section 8 gives our final remarks.

2 Theoretical Preliminaries

Before describing the experiment, we discuss some standard features of preferences. In this paper, we study preferences over lotteries when there are three distinct monetary prizes: a low prize (x_L), a middle prize (x_M), and a high prize (x_H), where x_L < x_M < x_H. For the experiment, we set the low, middle, and high prizes at $2, $10, and $30 respectively. We refer to the low and high prizes as tail prizes, while we refer to the middle prize as the centered prize.

Lotteries over the prizes are denoted by the vector p = (p_L, p_M, p_H), where p_L is the probability of receiving the low prize, p_M is the probability of receiving the middle prize, and p_H is the probability of receiving the high prize. Since the vector p is a lottery, the entries of p are non-negative and sum to one.

Within the paper, we use a standard Marschak-Machina (MM) triangle (Marschak, 1950; Machina, 1982) to represent the space of lotteries and indifference curves. An example of the MM triangle can be seen in Figure 1. A key insight of the MM triangle is that a lottery over n prizes can be represented in an (n − 1)-dimensional space, since probabilities sum to one. In the case of three outcomes, we represent the probability of receiving the high prize on the vertical axis and the probability of receiving the low prize on the horizontal axis. Therefore, the point (0, 0) represents the middle prize with certainty, (1, 0) represents the low prize with certainty, and (0, 1) represents the high prize with certainty.

[Figure 1: Marschak-Machina (MM) Triangle. The figure shows the triangle with p_L on the horizontal axis (x_L at that corner) and p_H on the vertical axis (x_H at that corner), iso-p_M lines, the region of FOSD lotteries, an endowment lottery p^E, a spread lottery p^S, budget lines, and the direction of increasing preferences.]

We now highlight some features of the MM triangle that can be seen in Figure 1. First, the dashed lines in Figure 1 have the same probability of receiving the middle prize, and we refer to them as iso-p_M lines. As the lines move northeast, the level of p_M decreases. Next, first order stochastic dominance is easily represented in the MM triangle: any movement north or west of a lottery weakly decreases the probability of the lowest prize, weakly increases the probability of the highest prize, and therefore first order stochastically dominates the original lottery. Thus, the set of points that first order stochastically dominate a lottery is given by the points that are northwest of the lottery. An example of first order stochastically dominating (FOSD) lotteries is shown in Figure 1.
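These coordinate and dominance conventions are easy to mechanize. The sketch below is our own illustration, not part of the paper: it maps a lottery to its MM-triangle point and checks the northwest dominance condition; the helper names are hypothetical.

```python
# A minimal sketch of the MM-triangle representation used in Section 2.
# Lotteries are tuples (pL, pM, pH) over the prizes $2, $10, $30; the
# triangle plots pL on the horizontal axis and pH on the vertical axis.
from typing import Tuple

Lottery = Tuple[float, float, float]  # (pL, pM, pH), nonnegative, sums to one

def mm_coordinates(p: Lottery) -> Tuple[float, float]:
    """Return the (pL, pH) point representing the lottery in the MM triangle."""
    pL, pM, pH = p
    assert abs(pL + pM + pH - 1.0) < 1e-9 and min(p) >= 0.0
    return pL, pH

def fosd_dominates(q: Lottery, p: Lottery) -> bool:
    """True if q first-order stochastically dominates p: weakly lower pL and
    weakly higher pH, with at least one strict, i.e. q lies northwest of p."""
    qL, qH = mm_coordinates(q)
    pL, pH = mm_coordinates(p)
    return qL <= pL and qH >= pH and (qL < pL or qH > pH)

# Shifting probability from the low prize to the high prize dominates:
print(fosd_dominates((0.2, 0.5, 0.3), (0.3, 0.5, 0.2)))  # True
# The certain middle prize does not dominate this lottery (nor vice versa):
print(fosd_dominates((0.0, 1.0, 0.0), (0.3, 0.5, 0.2)))  # False
```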

Indifference curves can have a variety of shapes in the MM triangle. Assuming first order stochastic dominance, more preferred bundles lie to the northwest, as indicated by the arrow pointing in the direction of increasing preferences in Figure 1. Recall that risk averse/tolerant behavior, under EU, is derived from the slope of an indifference curve. As the curve gets steeper, an individual becomes more risk averse (less risk tolerant). The intuition is that steeper indifference curves imply that a larger increase in the likelihood of the highest outcome is necessary to compensate for an increase in the likelihood of the lowest outcome. Note that under more general shapes, we can still derive a local measure of risk aversion, by looking at the steepness of the indifference curve at a single point, and a global measure of risk aversion, by looking at steepness across all points in the triangle.

[Figure 2: Price Variation. The figure shows the MM triangle with a fixed endowment lottery p^E, several spread lotteries p^S, the induced budget lines, and the direction of increasing prices.]

For the main analysis of the experiment, we elicit preferences using convex budgets. Each individual faces budgets generated by the line connecting two reference lotteries: a spread lottery and an endowment lottery.[9] The spread lottery is denoted p^S and places probability exclusively on the tail prizes.

Footnote 9: In terms of previous studies, one can think of the spread lottery as a riskier lottery and the endowment lottery as a safer lottery.

The endowment lottery, denoted p^E, is a lottery that endows the individual with some probability of getting the centered prize. Example budgets are shown in Figure 1. For the experiment, an individual is asked to choose their most preferred lottery from several budget lines between an endowment and a spread lottery. Importantly, the budget allows the individual to express a preference for a mixture between the two lotteries. This is important because many non-expected utility theories allow an individual to prefer mixtures over some region of the MM triangle. Moreover, if an individual has a preference for mixtures between some pair of lotteries, then it is difficult to elicit the optimal mixture without using a convex budget. Notwithstanding, when eliciting mixing behavior from a budget, a subject's decision is always over final-stage lotteries: choices are over reduced single-stage lotteries.

The MM triangle also relates the intuition about risk preferences to standard consumer choice. For example, suppose one considers choices of an individual while holding the endowment lottery, p^E, fixed as in Figure 2, with budgets induced from different spread lotteries. Rather than thinking of this as a choice between probability shares of x_L, x_M, and x_H, one can instead think of this as a choice between the endowment lottery and the spread lottery. For this setting, we interpret the slope of the budget line as the price of the endowment lottery. Note that as one increases the probability of receiving the high prize of the spread lottery, the slope of the budget line increases, and the endowment lottery becomes less attractive. Therefore, the slope of the budget line, given by r̂ = (p^S_H − p^E_H)/(p^S_L − p^E_L), represents the relative trade-off rate between the endowment and spread lotteries.

For the experiment, we look at varying trade-offs of the endowment and spread lottery at different levels of the endowment lottery. This allows us to look at several interesting, understudied comparative statics of risk preferences. First, we are able to look at standard demand curves for the endowment lottery and check if they are downward sloping as theory predicts. Second, we can examine choices from different regions of the MM triangle and examine if there is any behavior that is systematic to a region.
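For concreteness, the relative price r̂ can be computed directly from the two reference lotteries. The helper below is our own illustration; the example uses the endowment and spread lotteries of one of the binary tasks described later in Section 3.2.

```python
# A small helper (our own naming) for the relative price defined in the text:
# r_hat = (pS_H - pE_H) / (pS_L - pE_L), the slope of the budget line through
# the spread lottery pS and the endowment lottery pE in (pL, pH) coordinates.
def budget_price(p_spread, p_endow):
    """Relative trade-off rate between the endowment and spread lotteries."""
    sL, _, sH = p_spread
    eL, _, eH = p_endow
    return (sH - eH) / (sL - eL)

# Example, written as (pL, pM, pH): endowment (0, 0.9, 0.1) and spread
# (0.65, 0, 0.35) give r_hat = 0.25 / 0.65 ≈ 0.38.
print(round(budget_price((0.65, 0.0, 0.35), (0.0, 0.9, 0.1)), 2))  # 0.38
```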

For example, the corners of the MM triangle are well studied because of the Allais paradox. To the best of our knowledge, less is known about behavior in the other regions of the MM triangle.

3 Experimental Design and Implementation

3.1 Design

[Figure 3: Example of a Task, showing the three outcomes ($2.00, $10.00, $30.00) with their percentages.]

We employed convex budgets[10] in which subjects had to select their preferred distribution over the monetary prizes x_L = $2, x_M = $10, and x_H = $30. Each budget constraint consisted of all the distributions that can be constructed using convex combinations of an endowment lottery and a spread lottery. In more detail, the distributions that could be chosen satisfied p(k) = ((100 − k)/100) p^S + (k/100) p^E, where k ∈ {0, ..., 100}. Thus, p^S corresponds to k = 0, while p^E corresponds to k = 100.

Footnote 10: The convex-choice tasks are related to those of Figure 2, but use a more intuitive interface pioneered by Sopher and Narramore (2000).
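The budget parametrization above maps directly into code. The sketch below is our own illustration of the slider-to-lottery map; the example lottery pair is again the first binary task's endowment and spread lotteries, used here only for concreteness.

```python
# A sketch of the slider-to-lottery map from the design: position k in
# {0, ..., 100} yields p(k) = ((100 - k)/100) * pS + (k/100) * pE.
import numpy as np

def mixed_lottery(k: int, p_spread, p_endow) -> np.ndarray:
    """Lottery induced by slider position k (k = 0 gives pS, k = 100 gives pE)."""
    assert 0 <= k <= 100
    pS, pE = np.asarray(p_spread, float), np.asarray(p_endow, float)
    return (100 - k) / 100 * pS + k / 100 * pE

# The full convex budget is the set {p(k) : k = 0, ..., 100}.
budget = [mixed_lottery(k, (0.65, 0.0, 0.35), (0.0, 0.9, 0.1)) for k in range(101)]
print(budget[0], budget[50], budget[100])
```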

We implemented these convex budgets by having a subject move a slider between zero and one hundred, as displayed in Figure 3. As the subjects moved their slider, the pie chart beneath the slider was updated to reflect the lottery induced by mixing the endowment and spread lottery for the task. Choices are presented as single-stage lotteries to avoid possible confounds that could arise if the assumption of reduction of compound lotteries fails. Outcomes were always displayed at the top and color coded to correspond with the assigned colors on the pie chart. Additionally, we placed a small box next to the slider in which subjects had to confirm their choices by manually typing them in. This was to eliminate the possibility that we registered accidental choices.

3.2 Implementation

All subjects faced the same set of 79 budgets, but the order of the budgets was randomized for each subject. This simplifies comparing the rationality measures and individual mixing results between subjects. Our budget lines all have strictly positive slopes so that within a budget no choices are ordered by first order stochastic dominance. This allows us to focus on violations resulting from mixing without worrying about individuals satisfying monotonicity within a budget. This is similar to other revealed preference experiments that restrict individuals from expressing satiation, e.g. Andreoni and Miller (2002); Choi et al. (2007); Andreoni and Sprenger (2012).

Figure 4 summarizes the 79 different convex budgets subjects faced. This graph was not disclosed to subjects. Budgets were chosen in order to guarantee high enough incentives, have somewhat simple odds, and cover a wide range of the MM triangle. Importantly, the set of budgets was selected so that it had plenty of price variation for different endowments. Therefore, the budgets can elicit individual and aggregate demand curves for mixtures. We used eight different endowment (E) lotteries to generate our budgets. They are numbered from worst to best and summarized in Table 1.

[Figure 4: Budget Lines in the MM Triangle, showing the budgets generated by the eight endowment lotteries E_1 through E_8 and spread lotteries ranging between (0.95, 0.05) and (0.05, 0.95).]

[Table 1: Endowments. For each endowment E_1 through E_8, the table reports the probabilities of the $2, $10, and $30 prizes.]

There are several important things to note. The slider was positioned at 50 at the start of each task; however, the pie graph is not displayed, nor can subjects proceed to the next task, until they interact with the slider. We use this to reduce the possibility that subjects use the initial position as a reference point. Moreover, the experimental design reduces compound lotteries so that subjects cannot misperceive the mixed distribution they are choosing. Lastly, the locations of p^S on the left and p^E on the right are fixed throughout the experiment. We suspect that this makes it easier for an individual to discover the lotteries used to generate the budget. Thus, if an individual had expected utility preferences, they could easily find their preferred choice from p^S or p^E.

Additionally, our subjects completed six binary-choice tasks. These tasks were completed at the end by each subject. Each subject faced a binary choice problem between a spread lottery and an endowment lottery used in a previous convex budget. In particular, these tasks were D_1 = {(0, 0.9, 0.1), (0.65, 0, 0.35)} and D_2 = {(0.5, 0.5, 0), (0.95, 0, 0.05)}, faced in that order. For each, subjects knew they would face the binary choice three consecutive times. The tasks were designed to examine whether subjects try to convexify the budget when not explicitly given a convex budget. In detail, a subject can mix (convexify) their budget by choosing each lottery from the binary budget in some set combination. The convex price (r̂) of the endowment lottery for the first budget favors the endowment lottery (r̂_{D1} ≈ .38), while the price of the second budget favors the spread lottery (r̂_{D2} ≈ 2.1). These prices are relatively extreme, so we expect little mixing in general, but we can compare choices from the binary and convex budgets.

One hundred and forty-four undergraduates from UC San Diego participated in this study. Six sessions were conducted on January 22-23. The tasks were conducted on internet-enabled laptops through a web browser, and the design was coded in oTree (Chen, Schonger and Wickens, 2016). The experimental explanations were separated into three sections: examples, 79 convex-choice tasks, and six binary-choice tasks. Separate instructions were provided before each section and read aloud by the experimenter. The full set of instructions can be found in the Appendix. On average, subjects earned $23.15 and spent an average of 15.6 seconds per task with a standard deviation of 11.6 seconds. In order to provide incentives compatible with truthful revelation, subjects were paid according to one randomly chosen decision in the experiment.[11]

Footnote 11: A live version of the experiment can be found here. Google Chrome and patience are required.

3.3 Incentive Compatibility

Our incentive compatibility is predicated on the isolation effect, i.e. that subjects maximize preferences over individual tasks and not across the whole experiment. In order to provide incentives compatible with truthful revelation, subjects were paid according to one randomly chosen decision in the experiment (Allais, 1953). Azrieli, Chambers and Healy (2012) and Healy, Azrieli and Chambers (2016) provide the theoretical justification: as long as individuals do not reduce compound lotteries across choice tasks, this procedure is compatible with truthful revelation for the convex tasks. In our context (non-EU preferences), our design is incentive compatible if we have both increasing preferences with respect to money and failure of reduction of compound lotteries. If reduction holds and preferences are convex, then subjects would be randomizing across multiple budgets instead of within a budget. Sopher and Narramore's (2000) main result suggests that isolation is present, as the elicited mixing behavior is constant across multiple repetitions for convex budgets, and our interface is very similar to theirs. Further, Starmer and Sugden (1991) and Cubitt, Starmer and Sugden (1998) were the first to document the isolation effect and emphasize that it holds for simple (final-stage) lotteries. In order to justify incentive compatibility for the repeated binary tasks, we require that isolation hold across different repetitions but not within them. This possibility is provided by Brown and Healy (2018), who find evidence that presenting decisions together may trigger reduction-like behavior while separating decisions has the opposite effect. We are certain isolation fails within the repeated tasks, as we observe non-constant choices, and we exploit their result across the different repeated tasks.

4 Descriptive Results

We begin by presenting results that highlight our main findings. First, we provide an intuitive characterization of the different patterns of individual choice we observed, summarizing heterogeneity.

Second, we present aggregate results on mixing behavior, summarizing the pervasiveness of mixing behavior. Lastly, we analyze aggregate demand behavior, emphasizing the sensibility of the behavior we observed and its consistency with previous studies. These three types of descriptive results suggest the incompatibility of choice behavior with expected utility, show that behavior is sensible, and generalize the Allais certainty effect (increased risk aversion at certainty) to convex-choice tasks. In Section 5.1 we provide further evidence that behavior is not random and is consistent with a preference for mixtures.

4.1 Behavior Patterns and Individual Heterogeneity

Within the dataset gathered from the experiment, we find that there is substantial heterogeneity in individual behavior. Most of the individual behavior matches one of the six heuristic types given in Figure 5: expected utility, price responsive, middle prize thresholding, low prize thresholding, a combination of behaviors, and a random chooser. The example behaviors in Figure 5 correspond to different subjects; each panel is a different subject in our study.

In more detail, we say an individual is expected utility if the subject chooses the endowment or spread lottery almost exclusively, and there are only a few switches that are incompatible with EU. For example, as the slope of the budget line increases, we should not observe subjects that have already chosen the spread lottery, at a lower slope, choosing the endowment lottery. A price responsive individual is one who often chooses extreme points, but there is some region where, as the slope of the budget line increases, the subject begins to choose mixtures. Thus, the subject is lured away from the endowment lottery as the relative price increases. While the previous two types are most closely related to standard theory, we also find a number of heuristic thresholding rules. For example, we say an individual is a middle-prize thresholder if there is a cloud of choices around an iso-p_M line. In other words, it appears as if an individual increases the probability of receiving the high prize while guaranteeing a fixed chance of the middle prize. Similarly, we find low thresholding behavior where there are clouds of choices around an iso-p_L line. In other words, it seems like individuals accept some maximal chance of receiving the low prize.

There are also combinations of these various types that can occur within a single individual. The subject in Figure 32c displays behavior consistent with a middle prize thresholder, but is also responsive to prices in other regions. Finally, there are random choosers that have no discernible pattern to their decisions.

[Figure 5: Example Types of Individual Choice Behavior for Endowments 1, 2 and 5. Panels: (a) Expected Utility, (b) Middle Thresholding, (c) Combination, (d) Price Responsive, (e) Low Thresholding, (f) Random.]

The types in Figure 5 were chosen by an eyeball test and only display a subset of choices. We present figures of all choices for each of these subjects in Appendix 9.12, as well as the same three endowments for all subjects in Appendix 9.13. While this classification is not precise, it gives a broad picture of the different types of behavior observed in the experiment. In Section 6, we show how to formally classify individuals using the data collected from the experiment. However, instead of using different heuristics, we examine several models of risk preferences and examine which model best predicts the choices of each individual.

4.2 Mixing Behavior

Mixing is pervasive across subjects: 94% of them mix at least once, and on average each subject mixes on 44.6% of budgets, with a standard deviation of 28.1%. Figure 6 provides a full histogram of the number of mixtures across subjects out of the 79 convex budgets. Mixing is also pervasive across budgets, as the average share of subjects mixing in a budget is approximately 44%. Therefore, a preference for mixtures is prevalent in our subjects. Further, 91% of subjects mix in a way that is inconsistent with EU, i.e. on at least two differently-sloped budget lines.

[Figure 6: Histogram of Mixing Behavior. The horizontal axis is the number of interior choices out of 79; the vertical axis is the frequency of subjects.]

To examine how behavior changes as the slope of the budget line increases, Figure 7 shows the percentage of choices for the endowment lottery, the spread lottery, and non-degenerate mixtures of both as relative prices change.[12] The left panel plots the changes only for endowment 2 (full certainty of the middle outcome) and the right panel incorporates all endowments. The percentage of subjects who mix for a given price and endowment oscillates between 16% and 62%. One interesting comparative static is that mixing first increases as the endowment lottery becomes less desirable according to first order stochastic dominance, and then decreases as the spread lottery further improves. We also find that most of the mixing occurs when the relative price of the endowment lottery is close to one. In other words, subjects mix more when the cost of the two reference lotteries is nearly the same, even when making choices that give the individual the possibility of a certain prize of $10.

Footnote 12: Appendix 9.9 examines how mixing behavior changes with each endowment.
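The two mixing statistics just reported (the share of interior choices, and whether interior choices appear on differently sloped budgets, which is incompatible with EU) can be computed subject by subject as in the sketch below. The data layout and the example numbers are assumptions of ours, not the experiment's actual data format.

```python
# A sketch of the two mixing statistics, assuming a subject's data is a list
# of (slope, k) pairs: the budget-line slope r_hat and the chosen slider
# position k in {0, ..., 100}. The data below is made up for illustration.
def mixing_summary(choices):
    interior = [(slope, k) for slope, k in choices if 0 < k < 100]
    share_mixed = len(interior) / len(choices)
    distinct_slopes = {round(slope, 6) for slope, _ in interior}
    # Interior choices on two differently sloped budgets cannot all be
    # EU-indifference points, so flag them as inconsistent with EU.
    inconsistent_with_eu = len(distinct_slopes) >= 2
    return share_mixed, inconsistent_with_eu

fake_subject = [(0.4, 0), (0.8, 35), (1.0, 50), (1.6, 100), (2.1, 60)]
print(mixing_summary(fake_subject))  # (0.6, True)
```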

The right panel suggests similar amounts of mixing regardless of the endowment. This provides some evidence that individual behavior responds more to relative prices than to the level of the endowment.

[Figure 7: Relative Prices and Mixing Behavior. Both panels plot the fraction of subjects choosing the endowment lottery, the spread lottery, and mixtures of the two against the log of relative prices; panel (a) covers endowment 2 and panel (b) covers all endowments.]

4.3 Demand Analysis

As eloquently expressed in Machina (1985), under the assumption that preferences are quasiconcave in the probabilities, the deterministic preference model of stochastic choice "possesses behavioral implications which are exactly analogous in both their strength and nature to the behavioral implications of standard consumer choice and demand theory." Therefore, aggregate demand schedules provide an intuitive summary of our main findings. First, subjects have a preference for mixtures that cannot be rationalized by models that are linear in the probabilities. Linearity in the probabilities implies a discontinuous jump in demand, as choices for non-degenerate mixtures can only occur at one price. Heterogeneity in these discontinuities could generate a downward-sloping aggregate demand schedule without a single jump; nevertheless, this plausible hypothesis is ruled out by our previous finding that subjects mix across many different tasks.

Second, even if the observed behavior were inconsistent with any rational model, we can still use these modified demand schedules to model behavior if a law of demand holds (Becker, 1962).[13]

[Figure 8: Relative Prices and Behavior. Both panels plot mixtures over the endowment lottery against relative prices; the left panel shows endowment 2 with +/- 1.96 S.E. bands and a linear interpolation, and the right panel shows endowment 2 together with two additional endowments (including endowment 1).]

Figure 8 graphs the mixtures over the endowment lottery as a function of the relative prices. The left panel plots only endowment 2 (the certain middle outcome) and the right panel plots two additional endowments. These mixture demands are downward sloping, consistent with a law of demand. EU is consistent with a law of demand (marginal rates of substitution do not change), but its demands should be step functions. Other models, as long as their substitution effects are greater than their income effects, will also be consistent. Increases in the degree of quasiconcavity lead to increases in the substitution effect, so preference-for-mixture models can also accommodate a law of demand. Because the demand schedules are also decreasing at a decreasing rate, we have further suggestive evidence in favor of convexity of preferences.

Footnote 13: See Appendix 9.1 for a formal treatment of this law of demand.

5 Non-parametric Tests

5.1 Revealed Preference Tests

In the previous section, we examined simple descriptive measures of whether or not the choices made by subjects were consistent with some deliberate mixing model. Our main result was that preferences for mixtures were pervasive. In this section, we are primarily concerned with formally refuting the hypothesis that this preference for mixing is generated by purely random behavior. Revealed preference gives us a way to refute this hypothesis and to determine how close behavior is to being rationalizable.

Revealed preference analysis was introduced by Samuelson (1938) and popularized by Varian (1982, 1983).[14] Revealed preference analysis allows us to ascertain the consistency of individual choice behavior on convex budgets with any increasing concave utility model. It also allows us to explicitly test whether our data was randomly generated. From the work of Heufer (2013), it is known that one can test whether a finite dataset of lotteries chosen from abstract budget sets is consistent with some concave specification using revealed preference methods.[15] In more detail, there exists a concave model that can describe the decisions made if and only if a modified version of the generalized axiom of revealed preference (GARP) holds.[16] Recall that GARP requires that there are no observations that could induce a cycle of preferences.

While the choices which are not allowed by the revealed preference condition are difficult to see in general, it is easy to see what is not allowed for pairs of observations. Recall that a violation of the revealed preference condition for two observations is known as a violation of the weak axiom of revealed preference (WARP). First, notice that if a lottery lies to the southeast of the budget set, then (assuming monotonicity) it is less preferred to the lottery chosen from that budget.[17]

Footnote 14: An excellent reference for this extensive literature is Chambers and Echenique (2016).
Footnote 15: Heufer (2013) examines more generally the hypothesis of utility maximization in the simplex. However, the results are analogous to Afriat (1967), so if choices satisfy the variation of GARP then preferences can be represented by an appropriate concave utility function.
Footnote 16: See Appendix 9.3 for details on how we implement various tests of revealed preference.
Footnote 17: This follows since any point to the north or west of a point to the southeast of the budget line is strictly preferred, assuming monotonicity. Making enough movements north or west will eventually lead to a point on the budget line. Finally, the point on the budget line is weakly less preferred to the chosen point, assuming transitive preferences.
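The cycle-detection logic behind GARP is compact. The paper applies Heufer's (2013) modification for budgets in the simplex; the sketch below shows the same transitive-closure test in the standard price-quantity setting of Varian (1982), purely as an illustration of the logic rather than the exact test run on our data.

```python
# The paper tests a modified GARP for budgets in the probability simplex
# (Heufer, 2013); the cycle-detection idea is the same as in the standard
# Varian (1982) test, sketched here for generic price-quantity data.
import numpy as np

def satisfies_garp(prices: np.ndarray, choices: np.ndarray) -> bool:
    """prices, choices: T x n arrays; row t is the price vector and the chosen
    bundle of observation t. Returns True if the data satisfy GARP."""
    T = len(prices)
    spent = np.einsum("ti,ti->t", prices, choices)   # p^t . x^t
    cross = prices @ choices.T                       # cross[t, s] = p^t . x^s
    weak = cross <= spent[:, None]                   # x^t directly revealed preferred to x^s
    strict = cross < spent[:, None]                  # strictly directly revealed preferred
    reach = weak.copy()
    for k in range(T):                               # Warshall transitive closure
        reach |= reach[:, [k]] & reach[[k], :]
    # GARP: x^t revealed preferred (possibly via a chain) to x^s must never
    # coexist with x^s strictly directly revealed preferred to x^t.
    return not np.any(reach & strict.T)
```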

In our context, violations of WARP occur when an individual chooses two mixtures below the intersection point between two budget lines. This is a violation of WARP, as two choices cannot be strictly revealed preferred to each other. This situation is illustrated in Figure 9.

[Figure 9: Violation of the Weak Axiom of Revealed Preferences. The MM triangle shows two crossing budget lines and the two chosen lotteries, along with the direction of increasing preferences.]

The main issue with revealed preference tests is that either all choices made by an individual can be represented by a well behaved preference relation or they cannot. As an illustration, only approximately 10% of our subjects are fully consistent with rationality. However, we want to know whether individual choices are close to some concave specification and whether behavior can be distinguished from purely random choice. For example, there could be sources of measurement error that would otherwise lead us to refute that an individual is rational.[18] Therefore, we examine several measures of rationality developed to measure how close individuals are to some convex risk preference. Thus, the focus of this analysis is on how close individuals are to a convex preference, rather than checking whether behavior is fully rational.

The Houtman-Maks index (HMI, Houtman and Maks (1985)) measures the largest acyclic set that can be generated by a dataset, i.e. the largest transitive chain that does not contain a cycle.

Footnote 18: For example, individuals may tremble when making decisions, the grid of convex combinations may not be fine enough to discern preferences, or there could be rounding error in the computation of revealed preferences.

We focus on this measure for two reasons. First, it has a very natural interpretation in our context. Given the mean HMI of approximately 72 for our subjects, it tells us that we can find some increasing concave utility function, for each subject, that could explain, on average, 72 out of their 79 choices. Second, it is less susceptible to small perturbations of the design and definitions when compared to other measures: e.g., the rationality measures that gauge by how much budgets would have to be relaxed to remove all violations are sensitive to our definition of income shifts. There is no natural way to measure income in this space. In contrast, in the standard consumer setup income is always measured in money. In our context it can be measured as increases in the likelihoods of better outcomes. Notwithstanding, we report the full set of measures in Appendix 9.3.

With any of these measures of rationality, there is no natural benchmark for when to classify an individual as being described by a concave utility model. Moreover, as one increases the number of observations by increasing the number of budgets, all measures become weakly worse. Ultimately, exogenous cutoffs for rationality are arbitrary since they depend on the dataset and design. Consequently, we compare the choices made by the subjects to a distribution of irrational consumers following the method of Choi et al. (2007). Following the ideas of Becker (1962) and Bronars (1987), we say an individual is irrational if they make their decisions following a random choice rule.[19] Our random choice rule is just 79 draws from a uniform distribution between 0 and 100 percent. Each number drawn represents the preferred mixture for the endowment lottery in a budget. We considered binary {0, 100}, discrete {0, 1, 2, ..., 100}, and continuous [0, 100] uniformly distributed choices. The binary rule corresponds to a random EU agent, the discrete rule mimics our design most closely, and the uniform continuous rule is also sensible. If an individual is choosing randomly, then he is not following a maximization principle, so this is an endogenous benchmark. Using this idea, we statistically test the null hypothesis that the distribution of each measure of rationality generated by the experimental subjects is equal to the distribution generated by a population of irrational decision makers.

Footnote 19: Even though there is now a large literature on random choice, we use the term irrational to remain consistent with Becker (1962) and Bronars (1987).
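The random-choice benchmark is straightforward to simulate. The sketch below, with our own variable names, draws 10,000 simulated subjects under each of the three uniform rules; each simulated row would then be scored with the same rationality measures as the real subjects.

```python
# A sketch of the Bronars-style benchmark described above: simulate subjects
# who pick a slider position at random on each of the 79 budgets. The three
# rules below match the binary, continuous, and discrete uniform benchmarks.
import numpy as np

rng = np.random.default_rng(0)
N_SIM, N_BUDGETS = 10_000, 79

random_choices = {
    "unif{0,100}":       rng.choice([0, 100], size=(N_SIM, N_BUDGETS)),
    "unif[0,100]":       rng.uniform(0.0, 100.0, size=(N_SIM, N_BUDGETS)),
    "unif{0,1,...,100}": rng.integers(0, 101, size=(N_SIM, N_BUDGETS)),
}
# Each simulated row would then be run through the same HMI computation as
# the experimental subjects to form the comparison distributions of Table 2.
```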

When the distributions are not equal, we classify individual behavior as generated by a concave specification when the measure of rationality is greater than that of 95% of the irrational decision makers. We present results when irrational behavior is generated from various uniform distributions over the convex budget.

[Table 2: Means and Differences for HMI. Columns: experimental choices, unif{0,100}, unif[0,100], unif{0,1,2,...,100}. Rows: Max. Acyclicity (HMI), with standard errors (0.46), (0.03), (0.03), (0.03), and the differences between the experimental mean and each random-choice rule: 13.58***, 18.05***, 17.86***. Number of subjects = 144; *** significant at p < .01; n = 10,000 simulated choices per rule.]

Table 2 summarizes our rationality results as measured by the HMI and the random-choice rules. In all cases we can refute that the means from the random rules are the same as our experimental means. Likewise, we reject that the HMI distributions for the experiment and for the irrational decision makers are the same, for all random-choice rules (p < .001, Wilcoxon-Mann-Whitney and Kolmogorov-Smirnov). We also find that at least 98.6% of our subjects are closer to rationality than 95% of the pure randomizers according to the HMI. All measures here are aggregated; we report comprehensive individual measures in Appendix 9.3.

Figure 10 provides visual evidence of these rationality results in the left panel and contrasts these results with a linear HMI specification in the right panel. Both panels plot two histograms, one for the HMI measures of our subjects and another for 10,000 uniform random choosers. The panel on the left shows that the distribution of HMI for our subjects is different from the one generated by random uniform continuous choices. In particular, our subjects do much better than the benchmark, as more of their choices could be rationalized by a concave specification. The right panel shows a different HMI computed for a strictly linear specification, i.e. if we restricted ourselves to a linear increasing utility model over the probabilities, what is the largest number of choices we could rationalize?[20] Note that linear specifications nest both EU and all models consistent with the betweenness axiom, i.e. models with indifference curves that are linear but not necessarily parallel in the MM triangle.

Footnote 20: A detailed explanation of how we computed this measure appears in the appendix.
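The distributional comparisons reported here use standard two-sample tests. A minimal sketch, assuming the per-subject and simulated HMI scores are available as plain arrays (the file names and variable names are hypothetical):

```python
# A sketch of the distributional tests reported above, assuming hmi_subjects
# and hmi_random are one-dimensional arrays of HMI scores for the 144 subjects
# and for the simulated random choosers.
import numpy as np
from scipy.stats import mannwhitneyu, ks_2samp

hmi_subjects = np.loadtxt("hmi_subjects.txt")   # hypothetical input files
hmi_random = np.loadtxt("hmi_random.txt")

u_stat, u_p = mannwhitneyu(hmi_subjects, hmi_random, alternative="two-sided")
ks_stat, ks_p = ks_2samp(hmi_subjects, hmi_random)
print(f"Wilcoxon-Mann-Whitney p = {u_p:.4f}, Kolmogorov-Smirnov p = {ks_p:.4f}")

# Share of subjects closer to rationality than 95% of the random choosers:
cutoff = np.percentile(hmi_random, 95)
print("share above the 95% random benchmark:", np.mean(hmi_subjects > cutoff))
```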

In order to provide a suitable benchmark, we use the binary uniform choice rule, as interior choices are almost entirely ruled out by linearity. In this instance, the mean HMI for the experiment is approximately 48, and only 54.2% of subjects are closer to being rationalized by a linear specification than the pure randomizers. Therefore, the majority of our subjects can be clearly distinguished from pure chance; however, the majority of their choices are better rationalized by a strictly concave specification.

[Figure 10: Comparing HMI from our Subjects to Uniform Random Choice. Both panels plot the density of Houtman-Maks scores for the experiment and for uniform random choosers: (a) rationalizability by a concave specification, (b) rationalizability by a linear specification.]

To our knowledge, Choi et al. (2007) was the first paper to fully exploit the revealed preference approach in the context of risk. Their approach is complementary to ours: they track how choices change as outcomes with fixed likelihoods change, while we look at choices within different distributions with fixed outcomes. Andreoni and Harbaugh (2009) examine the trade-off between an outcome for a given distribution and the change in a likelihood for a fixed set of outcomes. Both studies, like ours, find individual choices significantly distinct from random choice, in favor of being partially rationalized by some specification, and behavior that cannot be explained by the EU model.

5.2 Mixing Behavior and a Desire for Randomization

[Figure 11: Binary-Choice Tasks and Implied Mixtures. The MM triangle shows the budgets D_1 and D_2 with their endowment lotteries p^E and spread lotteries p^S, the direction of increasing preferences, and dots marking the mixtures implied by the repeated binary choices.]

In this section, we further examine when individuals choose mixtures from convex budgets and compare choices from binary-choice tasks to their convex-choice task equivalents. These two tasks were designed to correlate a preference for deliberate randomization, as uncovered by Agranov and Ortoleva (2017), with a preference for mixtures. If risk preferences are convex, then individuals with such preferences will have a desire for randomization. Even if the repeated binary choices could be perceived differently (individuals might not reduce compound lotteries), under a preference for randomization this type of behavior should be correlated.

After our subjects finished the convex-choice tasks, they were presented with a pair of binary-choice tasks. In each of these tasks, subjects chose three times in a row between a fixed endowment and a fixed spread lottery, and the repetitions were known in advance. This afforded them the opportunity to replicate, or at least come close to, their preferred mixtures in the analogous convex budget task, i.e. the one where the mixture was over the same spread and endowment lottery. Figure 11 visually depicts the implied mixtures the binary-choice tasks could induce. The dots represent the implied mixtures, as subjects could choose the endowment lottery three times, twice, once, or never.
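Mapping the repeated binary choices into implied mixtures is mechanical: with three known repetitions, the implied weight on the endowment lottery is the fraction of times it was chosen. The sketch below, with made-up data and our own encoding, shows this map and the correlation with the chosen mixtures on the matching convex budgets.

```python
# A sketch of how the repeated binary choices map into implied mixtures:
# choosing the endowment lottery 0, 1, 2, or 3 times out of three known
# repetitions implies a mixture weight of 0, 1/3, 2/3, or 1 on the endowment.
import numpy as np

def implied_mixture(binary_choices) -> float:
    """binary_choices: three entries, 'E' for the endowment lottery and
    'S' for the spread lottery (our own encoding)."""
    assert len(binary_choices) == 3
    return sum(c == "E" for c in binary_choices) / 3

# Correlating implied mixtures with the mixtures chosen on the matching
# convex budget (both as weights on the endowment lottery, made-up data):
implied = np.array([implied_mixture(c) for c in [("E", "E", "S"), ("E", "E", "E"), ("S", "S", "S")]])
chosen = np.array([0.55, 1.00, 0.10])
print(np.corrcoef(implied, chosen)[0, 1])
```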

The flatter budget was chosen as a non-mixing task, while the steeper budget was chosen as a mixing task. Our choice of labels is justified by the percentage of subjects making interior allocations: in the mixing task, 55.6% of subjects made strictly interior allocations, while only 16% made interior allocations in the non-mixing task. Across both tasks, our main finding is that 73% of the subjects with non-constant choices in the binary-choice tasks also mixed, and 95% of subjects that did not mix had constant choices in the corresponding binary-choice tasks. This suggests subjects' choices were deliberate, as most of them mix when expected to and do not mix when not expected to. Our evidence suggests behavior in both tasks is related, as we find that mixing in these tasks is positively correlated with mixing in the analogous convex budgets. This result would extend Agranov and Ortoleva's (2017) results on deliberate randomization in binary-choice tasks, since it provides evidence that the elicitation between tasks is the same. The rest of this section presents additional information linking the choices in both tasks.

Table 3: Desire to Randomize and Convex Preferences

D_1: Mixing Task
               interior   non-interior
non-constant    13.19%        2.78%
constant        42.36%       41.67%

D_2: Non-Mixing Task
               interior   non-interior
non-constant     3.47%        3.47%
constant        12.50%       80.56%

Table 3 summarizes these discrete relationships between the interior allocations and the non-constant choices for both tasks. Unfortunately, for these tasks we did not find a lot of non-constant choices (16% and 6.9%, respectively). The striking result is that of the subjects that had non-constant choices, a majority also chose interior allocations (82.6% and 50%, respectively). Further, of the subjects that had corner allocations, almost all of them had constant choices (93.8% and 95.9%, respectively). These results are indicative of the relationship between mixing and non-constant behavior; nevertheless, they provide only a partial glimpse: constant choices could be indicative of a mixture that is near one of the component lotteries.

[Figure 12: Raw Choices for Mixtures. Scatterplots of implied mixtures on the binary-choice tasks against chosen mixtures on the convex-choice tasks: (a) mixing task, (b) non-mixing task.]

Our stronger result is the finding that subjects are consistent (assuming their preferences are convex) irrespective of the elicitation method, as their choices imply similar preferred distributions. To convince the skeptical reader that our raw results corroborate this, we present in Figure 12 scatterplots of all choices for both tasks. As can be seen at a glance, the correlation is positive. Further, the plotted line represents a linear regression with no intercept and is close to the 45-degree line. The correlations between mixtures and implicit mixtures are .42 and .33, respectively. Therefore, not only could the desire to randomize be driven by convexity, its extent is also consistent with the implied optimal mixture.

6 Model Selection

In this paper, we also attempt to sort between several models of convex risk preferences and rank them according to their within-subject predictive accuracy. By examining out-of-sample prediction, we punish models that fit the data better but lack any explanatory power. We perform model selection tests on the following six models: expected utility (EU), disappointment aversion (DA), cautious expected utility (CEU), rank-dependent utility with power weighting (RDU), cumulative prospect theory (CPT), and stochastic reference dependence (SRD).

We chose these models based partially on their parsimony, salience, and popularity. The following paragraphs give a brief outline of each model; more in-depth coverage appears in Appendix 9.2.

Expected Utility

The canonical model of risk preference is expected utility theory. With three prizes, the lottery p = (p_{x_L}, p_{x_M}, p_{x_H}) is evaluated according to

V_EU(p) = u(x_H) p_{x_H} + u(x_M) p_{x_M} + u(x_L) p_{x_L}.

Expected utility is equivalent over affine transformations, so we can set u(x_L) = 0, u(x_H) = 1, and u(x_M) ∈ (0, 1). As seen in Figure 13, expected utility gives linear indifference curves in the MM triangle, and so it is a convex risk preference. We note that expected utility has some strong predictions that we can examine in individual choice data. One such prediction is that an individual can only choose a lottery on the interior of a budget when they are indifferent. Therefore, if an individual is an expected utility maximizer, then we would only observe interior choices on budgets of a single slope, namely the slope of the indifference curves.

Disappointment Aversion

The model of disappointment aversion developed by Gul (1991) is a one-parameter relaxation of expected utility theory. Given a three-prize lottery p and the fact that all models considered are equivalent over affine transformations, we can evaluate p according to

V_DA(p; u, β) = (p_H + (1 + β) p_M u) / (1 + β(1 − p_H))   if ce(p) ≥ x_M,
V_DA(p; u, β) = (p_H + p_M u) / (1 + β(1 − p_H − p_M))     if ce(p) < x_M,

where β ≥ 0 denotes the degree of disappointment aversion (or, if β < 0, the degree of elation seeking), u denotes u(x_M), and ce(p) is the certainty equivalent, which depends on the preferences.
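A minimal sketch of evaluating a three-outcome lottery under the two specifications above, using the normalization u(x_L) = 0, u(x_H) = 1, and writing u for u(x_M); the case logic for disappointment aversion simply checks which of the two branches above is self-consistent.

```python
# A sketch of the EU and DA evaluations for a lottery p = (pL, pM, pH),
# with the normalization u(xL) = 0, u(xH) = 1, and u = u(xM) in (0, 1).
def v_eu(p, u):
    pL, pM, pH = p
    return pH + pM * u

def v_da(p, u, beta):
    """Gul (1991) disappointment aversion, resolving the certainty-equivalent
    case by checking whether the first branch is self-consistent
    (ce(p) >= xM exactly when the value is at least u)."""
    pL, pM, pH = p
    v_mid_disappointing = (pH + (1 + beta) * pM * u) / (1 + beta * (1 - pH))
    if v_mid_disappointing >= u:        # middle prize is (weakly) disappointing
        return v_mid_disappointing
    return (pH + pM * u) / (1 + beta * (1 - pH - pM))

# With beta = 0 both coincide with expected utility:
print(v_eu((0.2, 0.5, 0.3), 0.5), v_da((0.2, 0.5, 0.3), 0.5, 0.0))
```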

Disappointment aversion also places a strong prediction on the data, as it is a model of the betweenness class (Dekel, 1986). These models have linear indifference curves, as seen in Figure 13, but they need not be parallel. In this case, if an individual is disappointment averse, we can only observe interior choices on budgets that do not cross in the MM triangle.

[Figure 13: Six Models of Risk Preferences. Each panel plots example indifference curves in the MM triangle (P(high) against P(low)) for one model: Expected Utility, Disappointment Aversion, Cautious Expected Utility, Stochastic Reference Dependence, Cumulative Prospect Theory, and Rank-Dependent Utility: Power Weighting.]

Cautious Expected Utility

The recent model of cautious expected utility (CEU) was introduced by Cerreia-Vioglio et al. (2015). Given a family of utility functions U, we can evaluate p according to

V_CEU(p; U) = min_{u ∈ U} {ce_u(p)},

where ce_u(p) denotes the certainty equivalent of lottery p under the utility function u.
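Given any concrete family of utility functions, the CEU value is the minimum of the implied certainty equivalents. The sketch below uses a small family of power utilities purely for illustration; this family is our own assumption and not the parametric family used in the paper's estimation.

```python
# A sketch of the cautious-expected-utility evaluation for a three-outcome
# lottery, assuming (purely for illustration) a finite family of power
# utilities u(x) = x**r with inverses y**(1/r).
def certainty_equivalent(p, u, u_inv):
    pL, pM, pH = p
    xL, xM, xH = 2.0, 10.0, 30.0            # the experiment's prizes
    eu = pL * u(xL) + pM * u(xM) + pH * u(xH)
    return u_inv(eu)

def v_ceu(p, utility_family):
    """Minimum, over the family, of the certainty equivalent of p."""
    return min(certainty_equivalent(p, u, u_inv) for u, u_inv in utility_family)

family = [(lambda x, r=r: x**r, lambda y, r=r: y**(1.0 / r)) for r in (0.3, 0.5, 0.9)]
print(round(v_ceu((0.2, 0.5, 0.3), family), 2))
```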

Cautious expected utility also places several restrictions on preferences. It implies a preference for mixtures, and that preferences are linear and the most risk averse at the certain middle outcome, as depicted in Figure 13. Cerreia-Vioglio, Dillenberger and Ortoleva (2018) prove that it nests the disappointment averse version, but not the elation seeking version, of disappointment aversion.

Stochastic Reference Dependence

The popular model of stochastic reference dependence was first introduced by Koszegi and Rabin (2006). Their model takes various forms depending on which version of personal equilibrium we restrict the model to; these personal equilibria serve as a way of disciplining personal beliefs with respect to the referent. We restrict ourselves to the choice acclimating personal equilibrium variant (Koszegi and Rabin, 2007), as other refinements can induce intransitive choices (Freeman, 2013). Under these assumptions, with a linear loss-aversion function characterized by λ and normalizing utility, we can evaluate p according to

V_SRD(p; u, λ) = p_H + p_M u + (1 − λ)(p_M p_L (u − 0) + p_H p_L (1 − 0) + p_H p_M (1 − u)).

If we restrict ourselves to λ ≤ 1 (loss tolerance), then this model, as depicted in Figure 13, is consistent with a preference for mixtures.
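A minimal sketch of the evaluation above, with the same normalization as before; the formula is quadratic in the probabilities, and setting λ = 1 recovers expected utility.

```python
# A sketch of the stochastic-reference-dependence (choice acclimating
# personal equilibrium) evaluation, with u(xL) = 0, u(xH) = 1, u = u(xM).
def v_srd(p, u, lam):
    pL, pM, pH = p
    gain_loss = pM * pL * (u - 0.0) + pH * pL * (1.0 - 0.0) + pH * pM * (1.0 - u)
    return pH + pM * u + (1.0 - lam) * gain_loss

# lam <= 1 (loss tolerance) makes the quadratic term nonnegative, the case
# consistent with a preference for mixtures; lam = 1 reduces to EU.
print(v_srd((0.2, 0.5, 0.3), 0.5, 0.8), v_srd((0.2, 0.5, 0.3), 0.5, 1.0))
```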

Cumulative Prospect Theory

This is a variation of the canonical behavioral model, prospect theory (Kahneman and Tversky, 1979), which was introduced initially by Quiggin (1982) and later adopted by Tversky and Kahneman (1992). In our context, this model is essentially expected utility, but probabilities are weighted in a way that depends on the order of the prizes and does not violate dominance. Normalizing utility, we can evaluate p according to

V_CPT(p; u, π) = g(p_H) + g(p_M) u,

where g(p_M) = π(p_M + p_H) − π(p_H) and g(p_H) = π(p_H) is the rank-dependence innovation that ensures consistency with dominance, and π(r) = r^γ / (r^γ + (1 − r)^γ)^{1/γ} is the canonical inverse-S probability weighting function. Because this function is first concave and then convex, it yields indifference curves that are quasiconcave (consistent with a preference for mixtures) and then quasiconvex (consistent with mixture aversion). Using commonly identified parameters (Andreoni, Feldman and Sprenger, 2017), Figure 13 illustrates this model.

Rank-Dependent Utility: Probability Weighting

This is a variation of the previous function used by Hey and Orme (1994) and Quiggin (1982), where the probability weighting function is a power function, i.e. π(r) = r^γ. Again normalizing, we can evaluate p according to

V_RDU(p; u, π) = (p_H)^γ + ((p_M + p_H)^γ − (p_H)^γ) u.

Restricting γ ∈ (0, 1), this model, showcased in Figure 13, is consistent with a preference for mixtures.

Our models fit into four broad classes, classified by their functional forms and axioms: betweenness, rank dependent, cautious expected utility, and quadratic. The betweenness class is composed of all models that satisfy the betweenness axiom, i.e. given a preference relation between two lotteries, their convex mixture lies between them (Dekel, 1986). Note that betweenness implies linear indifference curves. The rank dependent class is additively separable in the probability weights, where the weights affect the cumulative distribution of outcomes and not individual probabilities (Quiggin, 1982). The cautious expected utility class is characterized by the negative certainty independence axiom,

i.e. if a lottery is preferred to a degenerate (certain) lottery, then mixing both with a third lottery will preserve this relationship (Dillenberger, 2010).[21] The quadratic class assumes the functional form V(P) = Σ_x Σ_y φ(x, y) p_x p_y, with φ : X × X → R continuous (Machina, 1982).

[Figure 14: Relationships between Models. The diagram relates the broad classes (Betweenness, Rank Dependent, Quadratic, Cautious EU) and the restriction to convex preferences to the models considered: EU, Disappointment Aversion, CEU (expo-power and Pareto), Stochastic Reference Dependence, Cumulative Prospect Theory, and RDU with power weighting.]

Figure 14 summarizes the relationships between the different sets of models we consider and these broad classes. We can also see how the restriction to convex preferences affects the sets we consider. For the models we consider: disappointment aversion is in the betweenness class; as mentioned previously, part of that set is also nested under cautious expected utility. Cautious expected utility belongs to its own class. Further, both the rank-dependent power-weighting model and cumulative prospect theory are part of the rank dependent class. As all these classes generalize EU, it is within all classes. Less obvious, perhaps, is that stochastic reference dependence under the choice acclimating personal equilibrium rule is both in the quadratic class and in the rank-dependent class (Masatlioglu and Raymond, 2016).

We perform a model selection exercise on individual-level choices made from the convex budgets in the experiment. Specifically, we use 10-fold cross validation to evaluate which models

Footnote 21: The appendix contains more details on this class.