Strategic Reasoning in Persuasion Games: An Experiment


Yingxue Li
University of California, Davis
November 14, 2016

Abstract

We experimentally study persuasion games, in which a sender (e.g., a seller) with private information provides verifiable but potentially vague information (e.g., about the quality of a product) to a receiver (e.g., a buyer). Various solution concepts such as sequential equilibrium or iterated admissibility predict unraveling. In our experiment we find that 59% of the sellers and 62% of the buyers are consistent with the highest possible level of reasoning when allowing for mistakes in 10% of the rounds. Iterated admissibility also predicts that the number of levels of reasoning required for unraveling increases in the number of quality levels of the good. We find more unraveling in a game with two quality levels than in a game with four quality levels, but the difference is not statistically significant. We reject the hypothesis that participants learn unraveling by successively playing persuasion games with more quality levels. Instead, participants do not seem to transfer learning across persuasion games. Finally, participants with higher cognitive ability scores on Raven's progressive matrices test also display significantly higher levels of strategic reasoning.

Keywords: Persuasion games, verifiable information, communication, disclosure, unraveling, iterated admissibility, prudent rationalizability, common strong cautious belief in rationality, experiments, cognitive ability.

JEL-Classification: C72; C91; D82.

Department of Economics, University of California, Davis, One Shields Avenue, Davis, CA 95616, yxli@ucdavis.edu

1 Introduction

Communication is at the heart of strategic interaction. In many contexts in economics, politics, and law, players have asymmetric information and communication is restricted to verifiable disclosures. Such situations, for instance between a seller and a buyer, between political parties and voters, in contracting relationships, or in financial markets, have been studied theoretically with persuasion games, starting with the seminal work of Grossman (1981), Milgrom (1981), Grossman and Hart (1980), and Milgrom and Roberts (1986) (see Milgrom, 2008, for a review). The central result of this theoretical literature is the unraveling of information. Surprisingly, the experimental literature testing this prediction is almost nonexistent. Our work aims to fill this gap.

To gain some intuition about persuasion games, consider a seller and a buyer. The seller has private information about the quality of her good, which takes one of a finite number of quality levels. The buyer's optimal number of units to purchase depends on the quality. Assume that the higher the quality, the more units he would like to purchase. Before any trade, the seller can provide information to the buyer. This information must be truthful but may be vague. We assume that the seller can disclose a set of quality levels with the provision that the true quality level is contained in this set. In real life this may correspond to a quality certificate. The unraveling argument goes as follows: If the buyer receives the message that the good is of highest quality, then he knows that it has the highest quality because the seller sent a precise message and is not allowed to lie. Thus, a seller who possesses the highest quality good strictly prefers to disclose it because otherwise she runs the risk of the buyer buying a lower amount.
If a buyer receives information that the good is of highest or second highest quality, then he knows that it is of second highest quality because otherwise the seller would have happily disclosed that it is of highest quality only. Thus, a seller with a good of second highest quality strictly prefers to disclose that it is of highest or second highest quality, because if she also mentions lower quality levels in her message, she runs the risk of the buyer buying less. This argument continues inductively. The punch line is that the seller will disclose the true quality (and possibly higher quality levels) and the buyer understands that the good is of the lowest quality among the quality levels disclosed by the seller. Thus, information unravels.

Originally, persuasion games have been solved using sequential equilibrium (e.g., Milgrom and Roberts, 1986). Yet, they can be solved more transparently level-by-level using iterated admissibility and related rationalizability procedures (see Battigalli, 2006, Heifetz et al., 2011). This has the advantage of providing predictions for every finite level of mutual cautious belief in rationality. In our experiments it offers us a window to partially observe the strategic reasoning of players. Analogous to experimental studies on level-k thinking, it allows us to study which levels of reasoning are consistent with the behavior of sellers and buyers in the experiment, without the disadvantage of having to fix level-0 behavior to some more or less arbitrary benchmark. (In fact, it is not really

clear to us what the level-0 benchmark should be in persuasion games.) Surprisingly, we find that behavior is frequently consistent with relatively high levels of reasoning. The iterative solution concept also allows us to study how the number of quality levels affects strategic reasoning and unraveling. It predicts that the number of levels of mutual cautious belief in rationality required for unraveling increases in the number of quality levels.

Our experimental design is as follows: Participants play 30 rounds of persuasion games with random rematching after each round. There are two treatments. In treatment 2-4, the first 15 rounds consist of persuasion games with only two quality levels, while in the last 15 rounds participants play persuasion games with four quality levels. In treatment 4-4, all rounds consist of persuasion games with four quality levels. Participants are randomly assigned to the role of either the seller or the buyer before each round. After the play, participants complete a Raven's progressive matrices task to evaluate their cognitive abilities. Finally, they complete a questionnaire about demographics. Our design allows us to address the following questions:

1. What is the empirical distribution of levels of strategic reasoning? Is it relatively low as in previous experimental studies using different games and level-k reasoning as the solution concept (e.g., Stahl and Wilson 1995, Costa-Gomes, Crawford, and Broseta 2001, Costa-Gomes and Crawford 2006)? Or is it relatively high so as to facilitate unraveling of information in persuasion games? As mentioned earlier, we find relatively high levels of reasoning among participants in our experiment.

2. Is unraveling more easily attained in persuasion games with fewer quality levels than in games with more quality levels? We find more unraveling in persuasion games with two quality levels than with four quality levels, in line with the direction of the theoretical prediction, but the effect is not statistically significant.
3. Can participants learn to unravel? In particular, can they learn better when first playing persuasion games with a smaller number of quality levels and only then persuasion games with more quality levels? We reject this hypothesis. In particular, participants seem to have difficulties transferring learning from the persuasion game with 2 quality levels to the persuasion game with 4 quality levels.

4. How are levels of strategic reasoning correlated with cognitive ability as, for instance, evaluated with a Raven's progressive matrices task? We find a positive correlation.

Despite the large body of theoretical work on persuasion games, they have received surprisingly little attention from experimentalists. Forsythe, Isaac, and Palfrey (1989) test a series of experimental markets where sellers have better information about the good and decide whether to reveal this information to buyers. Their game has multiple Nash equilibria among which there is a sequential equilibrium with unraveling that the

participants converge to over the course of the experiment. In a more recent study, Benndorf, Kübler, and Normann (2015) test the voluntary revelation of private information in a labor-market experiment where workers can reveal their productivity at a cost. Subjects take only the role of workers (the analogue of sellers in persuasion games) but not that of buyers. They find that subjects play unraveling frequently but less often than predicted, especially low-productivity workers. Jin, Luca, and Martin (2016) run an experiment about product quality disclosure in which they conclude that the unraveling principle fails because senders do not always disclose their private information, especially lower-type senders. They attribute this failure to receivers being insufficiently skeptical. As in our experiment, they require the seller's message to be truthful; however, we also allow the seller's message to be vague in the sense that the true quality level is merely included in the message, while in their experiment the seller only has the option to reveal or hide the true quality. Hagenbach and Perez-Richet (2016) investigate experimentally a number of sender-receiver games with verifiable information in which the sender's payoff function is not necessarily monotonic in the state. In our experiment, as in the previous literature, the seller's payoff function is increasing in the quality level. Our experiment differs from theirs in that we focus on testing unraveling and players' strategic reasoning, while their focus is mostly on the actions of players facing more complex incentives.

Our work is also related to the experimental literature on forward induction. Forward induction is required for unraveling in persuasion games. To see this, note that the buyer must infer from not receiving the highest quality signal from the seller that the seller does not have the good of highest quality (because otherwise she would have happily told the buyer).
Inductively, for any quality level k, the buyer must infer from not having received the message that the good is of quality k or higher that the seller must have a good of quality lower than k. The evidence for forward induction in the experimental literature is mixed. Balkenborg and Nagel (2016), Blume and Gneezy (2010), Brandts and MacLeod (1995), Cachon and Camerer (1996), Cooper et al. (1992, 1993), Dufwenberg et al. (2016), Muller and Sadanand (2003), and Huck and Müller (2005) find only weak support in experimental data. Brandts and Holt (1995) find no evidence. Blume, Kriss, and Weber (2016), Brandts, Cabrales, and Charness (2007), Evdokimov and Rustichini (2016), and Shahriar (2013, 2014) find strong evidence for forward induction. All these experiments use games different from ours.

Finally, our work is related to the literature on level-k reasoning. Starting with Nagel (1995) and Stahl and Wilson (1995), many experimental studies have been devoted to inferring levels of strategic reasoning from behavior in experimental games using the level-k model or cognitive hierarchies. Level-k reasoning models differ from iterated admissibility or prudent rationalizability. Under prudent rationalizability, any strategy is level-0 rationalizable. At level 1, a strategy is prudent rationalizable if there exists a full support belief over the opponents' strategies with which this strategy is optimal. A strategy is level-2 prudent rationalizable if there exists a full support belief over the opponents' level-1 prudent rationalizable strategies with which this strategy is optimal, etc. For level-k reasoning, the level-0 strategy is fixed to some more or less ad hoc strategy. A level-1 player best responds to this level-0 strategy. A level-2 player best responds to the level-1 players (or some mixture of level-0 and level-1 players), etc. Experimental studies fit empirical distributions of level-k types to behavior. While the level-k model has mostly been applied to simultaneous move games (e.g., Nagel 1995, Stahl and Wilson 1995, Costa-Gomes, Crawford, and Broseta 2001, Camerer, Ho, and Chong 2004, Costa-Gomes and Crawford 2006, Arad and Rubinstein 2012), Crawford (2003) applied it to sender-receiver games (without verifiable information) and explained over-communication and systematic deception. Wang, Spezio, and Camerer (2010) use the level-k model and report experimental results on sender-receiver games with over-communication by the sender. Ho and Su (2013) apply the level-k model in a dynamic game where players choose rules based on their best guesses of others' rules and use the history of play to improve their guesses. In previous studies the level frequencies are either estimated or calibrated; the frequency of level-0 is usually zero or small, and most of the mass is on level-1 and level-2.[1]

There is also a small but growing literature that applies cognitive tests to analyze the relationship between cognitive ability and strategic behavior (e.g., Burnham et al. 2009, Oechssler et al. 2009, Brañas-Garza et al. 2012, and Benito-Ostolaza et al. 2016). To measure each subject's cognitive abilities, we use Raven's standard progressive matrices test (Raven et al. 2000). It is a test of cognitive abilities that can be easily administered. It is nonverbal and requires no reading, writing, or mathematical training, which makes it feasible to use in a wide variety of contexts and makes the results comparable between subjects of different backgrounds.
In economics, Raven test scores have been found to correlate positively with fewer Bayesian updating errors (Charness et al., 2011) and with more accurate beliefs (Burks et al., 2009). In a recent study, Gill and Prowse (2016) apply a level-k framework to analyze individuals' behavior in a p-beauty contest game and find a positive relationship between cognitive abilities and subjects' levels of reasoning.

The paper is organized as follows: Section 2 presents the model used in the experiment; Section 3 describes the experimental design; Section 4 states the hypotheses that are tested in the experiment; Section 5 analyzes the results regarding the hypotheses raised in Section 4; and further discussion is included in Section 6.

[1] See Crawford (2013) for a comprehensive survey on applications of the level-k model.

2 Model

2.1 The Experimental Persuasion Game

Our game is phrased as a game between a seller and a buyer. The seller has a good. This good can be of quality $q \in Q = \{1, 2, 3, 4\}$. Nature moves first and selects the quality $q \in Q$. The quality is observed by the seller but not by the buyer. After observing the quality, the seller can provide exactly one message to the buyer. The message specifies a nonempty subset of qualities with the provision that the true quality is contained in the message. In this sense, the message contains verifiable information (or certified information) although it may be vague. This also distinguishes the model from cheap talk games. More formally, upon observing quality $q \in Q$, the seller's set of messages is $M(q) := \{M \in 2^Q : q \in M\}$. Since there are four quality levels, for each quality level the seller has exactly eight different messages available. For example, if the true quality is $q = 2$, then the set of the seller's messages is {{2}, {1, 2}, {2, 3}, {2, 4}, {1, 2, 3}, {1, 2, 4}, {2, 3, 4}, {1, 2, 3, 4}}.

After receiving the message from the seller, the buyer decides on the quantity he wants to buy. We assume that the quantity is $x \in X := \{1, 2, 3, 4\}$. We assume that each unit costs 4. Since we abstract from any cost of production or selling, the seller's revenue and profit is $4x$. Clearly, the seller wants the buyer to purchase as many units as possible.

The buyer's payoff depends both on the quality of the good and the quantity purchased. It is given by

$$12 - |x - q|^{1/2} + 6 x^{2/3} q^{1/3} - 4x.$$

We interpret 12 as the buyer's initial wealth. The term $|x - q|^{1/2}$ can be viewed as a penalty for incorrectly guessing the quality. The term $6 x^{2/3} q^{1/3}$ is the payoff from obtaining $x$ units of the object with quality $q$. Finally, $4x$ is the cost of purchasing. The payoff function is set up such that for each possible actual quality $q$, the buyer's payoff is uniquely maximized when $x = q$. For any quantity different from the optimal level, the buyer is strictly worse off. The marginal payoff to the buyer from an increase in $x$ is increasing in the quality $q$ of the good.

[Table 1: The Buyer's Payoffs in the Experiment. Rows: units purchased x = 1, 2, 3, 4; columns: quality q = 1, 2, 3, 4.]
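The message sets and the payoff function described above can be checked by direct enumeration. The following is a minimal sketch, assuming the buyer's payoff has the form $12 - |x - q|^{1/2} + 6x^{2/3}q^{1/3} - 4x$ suggested by the term-by-term description in the text; the function names (`messages`, `buyer_payoff`, `seller_payoff`, `best_quantity`) are illustrative and not taken from the paper.

```python
from itertools import combinations

QUALITIES = (1, 2, 3, 4)   # Q in the text
QUANTITIES = (1, 2, 3, 4)  # X in the text
PRICE = 4                  # cost per unit, as stated in the text


def messages(q, qualities=QUALITIES):
    """Return M(q): all nonempty subsets of `qualities` that contain q."""
    others = [r for r in qualities if r != q]
    return [frozenset((q,) + extra)
            for size in range(len(others) + 1)
            for extra in combinations(others, size)]


def buyer_payoff(x, q):
    """Buyer's payoff under the assumed functional form (see lead-in)."""
    return 12 - abs(x - q) ** 0.5 + 6 * x ** (2 / 3) * q ** (1 / 3) - PRICE * x


def seller_payoff(x):
    """Seller's revenue and profit: 4 per unit, no production cost."""
    return PRICE * x


def best_quantity(q):
    """Quantity in X that maximizes the buyer's payoff when quality is q."""
    return max(QUANTITIES, key=lambda x: buyer_payoff(x, q))


if __name__ == "__main__":
    for q in QUALITIES:
        row = [round(buyer_payoff(x, q)) for x in QUANTITIES]
        print(f"q={q}: {len(messages(q))} messages, "
              f"rounded payoffs for x=1..4: {row}, best x: {best_quantity(q)}")
```

Under this assumed form, the enumeration yields eight messages for every quality level and an optimal quantity of $x = q$, in line with the statements in the text. The rounded payoffs it prints are computed from the assumed formula and should not be read as the entries of Table 1.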
In particular, this means that if the quality is higher, then the buyer's optimal choice of x will also be higher. The buyer's payoff function can easily be summarized in a payoff table that shows the payoff to the buyer for each quantity-quality pair (see Table 1). Rows refer to quantities; columns refer to quality levels. The number in each cell of the table is rounded to the nearest integer. In the experiment, we will also consider a version of the persuasion game with just

two quality levels in which Q = {2, 3}.

2.2 Prudent Rationalizability

It is well known that iterated elimination of strictly dominated strategies is characterized by rationalizability (Pearce, 1984). Similarly, iterated elimination of weakly dominated strategies (or iterated admissibility) is characterized by prudent rationalizability, in which for each level, players have full support beliefs over the one-step lower level strategies of opponents (i.e., cautious beliefs). Iterated admissibility is a solution concept for strategic games. Yet, our game is an extensive-form game. Thus, we use the extensive-form analogue to iterated admissibility (or more precisely, the extensive-form prudent rationalizability analogue). It is equivalent to iterated admissibility in the associated normal-form game (see Heifetz, Meier, and Schipper, 2011, and Meier and Schipper, 2014).

In extensive-form games, strategies map information sets into actions available at those information sets. For the seller, the quality levels selected by nature represent the information sets. Let $\sigma_s : Q \to \bigcup_{q \in Q} M(q)$ such that $\sigma_s(q) \in M(q)$ for all $q \in Q$ denote a strategy of the seller. That is, if the quality level selected by nature is $q$, the actions available to the seller are $M(q)$. A strategy of the seller assigns to each quality level observed by the seller a subset of quality levels that contains the observed quality level, which is exactly what the verifiable information paradigm requires. The information sets of the buyer are identified with the messages sent by the seller. Thus, the buyer's strategy is a map $\sigma_b : 2^Q \setminus \{\emptyset\} \to X$. For each player $i \in \{b, s\}$, we denote by $\Sigma_i$ player $i$'s set of strategies. We say that a move of nature $q \in Q$ and a strategy $\sigma_s$ of the seller reach the information set $Q' \in 2^Q \setminus \{\emptyset\}$ of the buyer if $\sigma_s(q) = Q'$. For any finite set $Y$, denote by $\Delta(Y)$ the set of probability measures on $Y$.
Players form beliefs about the strategies of the other player and, in the case of the buyer, also about moves of nature. A belief system of the seller is a profile of beliefs $\beta_s = (\beta_s(q))_{q \in Q} \in (\Delta(\Sigma_b))^{Q}$, one for each move of nature. A belief system of the buyer is a profile of beliefs $\beta_b = (\beta_b(Q'))_{Q' \in 2^Q \setminus \{\emptyset\}} \in \prod_{Q' \in 2^Q \setminus \{\emptyset\}} \Delta(Q \times \Sigma_s)$ such that $\beta_b(Q')$ assigns probability 1 to the subset of pairs $(q, \sigma_s)$ such that $q \in Q'$ and $(\sigma_s)^{-1}(Q') = q$. That is, when observing $Q'$, the buyer is certain that a move of nature $q$ obtained and the seller chose a strategy $\sigma_s$ that together reach $Q'$.

We say that a strategy $\sigma_i$ of player $i$ is rational with belief system $\beta_i$ at an information set of hers if $\sigma_i$ maximizes expected payoffs with respect to the belief prescribed by $\beta_i$ at that information set.

Prudent rationalizability is now defined inductively. For each player $i \in \{b, s\}$, $\Sigma_i^0 := \Sigma_i$.

For each $k \geq 1$,

$$B_s^k := \left\{ \beta_s = (\beta_s(q))_{q \in Q} : \beta_s(q) \in \Delta(\Sigma_b^{k-1}) \right\}, \qquad (1)$$

$$\Sigma_s^k := \left\{ \sigma_s \in \Sigma_s^{k-1} : \text{there exists } \beta_s^k \in B_s^k \text{ for which } \sigma_s \text{ is rational at any move of nature } q \in Q \right\}, \qquad (2)$$

$$B_b^k := \left\{ \beta_b = (\beta_b(Q'))_{Q' \in 2^Q \setminus \{\emptyset\}} : \begin{array}{l} \text{for every } Q' \in 2^Q \setminus \{\emptyset\}, \text{ if there exists a move of nature } q \in Q \text{ and a seller's strategy } \sigma_s \in \Sigma_s^{k-1} \\ \text{such that } (q, \sigma_s) \text{ reaches } Q', \text{ then the support of } \beta_b(Q') \text{ is the set of profiles of quality levels} \\ \text{and seller's strategies } (q, \sigma_s) \in Q \times \Sigma_s^{k-1} \text{ such that } (q, \sigma_s) \text{ reaches } Q' \end{array} \right\}, \qquad (3)$$

$$\Sigma_b^k := \left\{ \sigma_b \in \Sigma_b^{k-1} : \text{there exists } \beta_b^k \in B_b^k \text{ for which } \sigma_b \text{ is rational at any } Q' \in 2^Q \setminus \{\emptyset\} \right\}. \qquad (4)$$

The set of prudent rationalizable strategies of player $i \in \{b, s\}$ is $\Sigma_i^{\infty} = \bigcap_{k=1}^{\infty} \Sigma_i^k$.

A strategy survives level k of the prudent rationalizability procedure if there exists a full support belief on the remaining strategies of the other player (and, in the case of the buyer, on feasible moves of nature) for which the strategy is rational at every information set of the player. The prudence or cautiousness is captured by full support beliefs. It means that at each level, a player does not completely exclude any of the opponent's remaining strategies and feasible moves of nature. In the case of the buyer, this includes the worst possible quality levels consistent with the message observed. Hence, skepticism about messages comes for free through prudence/cautiousness.

Heifetz, Meier, and Schipper (2011) and Meier and Schipper (2012) provide further discussion of the solution concept, including an existence proof for finite extensive-form games and the proof of equivalence to iterated admissibility. Prudent rationalizability can be viewed as a rationalizability concept that corresponds to iterated admissibility in the same way that rationalizability corresponds to iterated elimination of strictly dominated actions.

Proposition 1 In our experimental persuasion game with Q = {1, 2, 3, 4}, the prudent rationalizable strategies of the seller and the buyer are given by Tables 2 and 3, respectively.
The proof is contained in Appendix A.

Table 2: Prudent Rationalizable Strategies of the Seller
[Columns: messages {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}. Rows: quality q = 1, 2, 3, 4 within each of the level blocks "0 and 1", "2 and 3", "4 and 5", and "6 and higher".]

Table 3: Prudent Rationalizable Strategies of the Buyer
[Columns: the same 15 messages as in Table 2. Rows: levels 0, "1 and 2", "3 and 4", "5 and 6", and "7 and higher"; cells list the quantities that survive at that level.
Level 0: 1,2,3,4 for every message.
Levels 1 and 2, multi-quantity entries in column order: 1,2; 1,3; 1,4; 2,3; 2,4; 3,4; 1,2,3; 1,2,4; 1,3,4; 2,3,4; 1,2,3,4.
Levels 3 and 4, multi-quantity entries in column order: 1,2; 1,2; 1,3; 2,3; 1,2,3.
Levels 5 and 6, multi-quantity entry: 1,2.
Level 7 and higher: no multi-quantity entries.]

Note that for unraveling to be the prudent rationalizable outcome for every quality level chosen by nature, we require 7 levels of strategic reasoning. In any outcome surviving 7 levels of the procedure, the seller, after observing $q \in Q$, sends a message in which $q$ is the lowest quality level. Moreover, the buyer's optimal quantity corresponds to $q$.

For the persuasion game with Q = {2, 3} that is used in our experiment, the prudent rationalizability predictions for the seller and the buyer are shown in Table 4 and Table 5. With two quality levels, we require 3 levels of strategic reasoning for unraveling to be reached for either quality level under prudent rationalizability.

Table 4: Prudent Rationalizable Strategies of the Seller for Q = {2, 3}
[Columns: messages {2}, {3}, {2, 3}. Rows: quality q = 2, 3 within each of the level blocks "0 and 1" and "2 and higher".]

Table 5: Prudent Rationalizable Strategies of the Buyer for Q = {2, 3}
[Columns: messages {2}, {3}, {2, 3}. Rows: levels 0, "1 and 2", and "3 and higher". Level 0: 1,2,3,4 for every message. Levels 1 and 2: 2,3 for message {2, 3}.]

2.3 Sequential Equilibrium

Milgrom and Roberts (1986) observe that there is a sequential equilibrium in the persuasion game such that the buyer believes at information set $Q' \in 2^Q \setminus \{\emptyset\}$ that the quality selected by nature is $\min Q'$. Thus, the buyer is skeptical about the message received. Consequently, his equilibrium action at information set $Q'$ is $x = \min Q'$. The seller's equilibrium strategies are such that if the quality level selected by nature is $q$, then $\sigma_s(q)$ satisfies $q = \min \sigma_s(q)$. Thus, sequential equilibrium predicts unraveling, but it does not allow us to obtain predictions for every finite level of reasoning.

Proposition 2 The sequential equilibrium of the experimental persuasion game is given by Table 6.

The proof is in Appendix A. The sequential equilibrium of the persuasion game with Q = {2, 3} is given by Table 7.
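The skeptical-buyer equilibrium described above can be illustrated with a short enumeration. This is a sketch under the assumptions stated in the text (price 4 per unit, buyer purchases $x = \min Q'$, seller messages must contain the true quality); the helper names are hypothetical and not part of the paper.

```python
from itertools import combinations

QUALITIES = (1, 2, 3, 4)
PRICE = 4  # revenue per unit sold, as in the text


def messages(q, qualities=QUALITIES):
    """All verifiable messages for quality q: subsets of qualities containing q."""
    others = [r for r in qualities if r != q]
    return [frozenset((q,) + extra)
            for size in range(len(others) + 1)
            for extra in combinations(others, size)]


def skeptical_buyer(message):
    """Buyer's equilibrium action: buy the lowest quality named in the message."""
    return min(message)


def equilibrium_messages(q, qualities=QUALITIES):
    """Seller messages consistent with equilibrium: q is the minimum of the message."""
    return [m for m in messages(q, qualities) if min(m) == q]


def seller_revenue(message):
    """Seller's revenue when facing the skeptical buyer."""
    return PRICE * skeptical_buyer(message)


if __name__ == "__main__":
    for q in QUALITIES:
        eq = equilibrium_messages(q)
        deviations = [m for m in messages(q) if m not in eq]
        print(f"q={q}: {len(eq)} equilibrium messages, "
              f"{len(deviations)} messages that also name a lower quality")
```

Against the skeptical buyer, every message whose minimum is the true quality $q$ earns the seller $4q$, while any message that also names a lower quality earns strictly less. This is why the seller has no incentive to add lower quality levels to her message, although she is indifferent about adding higher ones.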

Table 6: Sequential Equilibrium
[Columns: Quality; Seller's Action (Message) with one column for each of the messages {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}; Buyer's Action (Quantity). Rows: q = 1 with x = 1; q = 2 with x = 2; q = 3 with x = 3; q = 4 with x = 4.]

Table 7: Sequential Equilibrium for Q = {2, 3}
[Columns: Quality; Seller's Action (Message) with one column for each of the messages {2}, {3}, {2, 3}; Buyer's Action (Quantity). Rows: q = 2 with x = 2; q = 3 with x = 3.]

3 Experimental Design

The experiment was programmed in z-Tree (Fischbacher, 2007) except for a questionnaire on demographics, which was paper-based and distributed only after the experimental session involving the computerized persuasion games had been completed. Subjects were recruited on campus using the ORSEE recruitment system by Greiner (2004). Upon arrival in the lab, participants received written instructions for the experiment (see Appendix B). They were given time to read the instructions. Afterwards, the experimenter went over the written instructions in front of the participants. At the end of this first phase, participants were able to ask questions about the instructions and the experiment. These questions were answered by the experimenter in public.

Each session ran one of the following two treatments:

Treatment 2-4: Participants played 15 rounds of a persuasion game with only two quality levels, {2, 3}, followed by 15 rounds of a persuasion game with four quality levels, {1, 2, 3, 4}.

Treatment 4-4: Participants played a persuasion game with four quality levels, {1, 2, 3, 4}, in all 30 rounds.

Each participant was allowed to participate in one session only. Thus, each participant was assigned to either treatment 2-4 or treatment 4-4. Hence, we conducted a between-subject experiment.

At the beginning of each session, participants were randomly assigned to matching groups of six and stayed in the same matching group throughout the session. Participants were unaware of the matching groups. At the beginning of each round, each participant was randomly matched with another participant from the same matching group. The resulting pair played the persuasion game together in this round, with one participant randomly assigned to the role of the seller and the other to the role of the buyer. Participants were randomly rematched within their matching group after each round. The random rematching should to a large extent prevent repeated-game effects.
The possible change of roles should facilitate interactive reasoning, as participants should find it easier to reason about the other player once they have been in a similar role.

In each round, the seller first received information about the quality level selected by nature and decided what message to send to the buyer (see the seller's screenshot in Appendix C). The quality level selected by nature was private information of the seller and was not observed by the buyer. After receiving the message from the seller, the buyer decided on the quantity to buy (see the buyer's screenshot in Appendix C). Finally, both the buyer and the seller were informed about their own payoffs (see the screenshots of payoffs in Appendix C).

After playing 30 rounds of persuasion games, participants completed 30 rounds of the Raven's progressive matrices test (Raven et al., 2000) on the computer. This

test consists of 30 questions. Each question is a graphic pattern with one piece missing. Participants need to select the missing piece out of 8 options in order to complete the pattern. After the Raven's progressive matrices test, participants received a paper-based questionnaire about demographics from the experimenter.

At the end of the session, participants were paid a show-up fee plus earnings from the persuasion games. For each participant, one persuasion game was randomly and independently selected for payment. The Raven's progressive matrices test and the questionnaire were not incentivized. This was known to the participants upfront.

4 Hypotheses

Our experimental design allows us to address the following hypotheses:

Hypothesis 1 According to theoretical predictions (see Section 2), information should unravel in each persuasion game of each treatment.

As outlined in Section 2, both prudent rationalizability (and thus iterated admissibility) and sequential equilibrium predict unraveling in each persuasion game of each treatment. Despite this clear theoretical prediction, our prior is tilted towards rejecting this hypothesis. This is because the prior literature on level-k reasoning rarely observes more than four levels, which is not sufficient for unraveling in our games with four quality levels. Although this literature is based on experiments using different games, different samples, and also a different solution concept, we nevertheless believe it is relevant here if the claim of relatively low levels of reasoning observed in this literature is to be generally valid.

Having a lower number of qualities requires fewer levels of reasoning for unraveling according to the theoretical prediction based on prudent rationalizability in Section 2. This motivates the following hypothesis.

Hypothesis 2 Unraveling is easier to attain in the first 15 rounds of treatment 2-4 than in the first 15 rounds of treatment 4-4.
Although participants may not display unraveling in the first few rounds of the experiment, they may be able to learn over time. Since our experiment lasts 30 rounds, we should be able to observe some learning. This should apply not only to learning to unravel but to the level of sophistication of reasoning more generally.

Hypothesis 3 The cumulative distribution over levels of reasoning in the last 15 rounds first-order stochastically dominates the cumulative distribution over levels of reasoning in the first 15 rounds in treatment

Intuition suggests to us that it might be easier to learn sophisticated reasoning and unraveling when first being trained in a simpler but similar problem, like the persuasion game with exactly two quality levels, and only then playing the persuasion game with four quality levels.

Hypothesis 4 The cumulative distribution over levels of reasoning in the last 15 rounds of treatment 2-4 first-order stochastically dominates the cumulative distribution over levels of reasoning in the last 15 rounds in treatment 4-4.

Finally, we expect that the level of sophistication of reasoning is positively correlated with cognitive abilities (as, for instance, measured with the Raven's progressive matrices test). This is also suggested by the literature on level-k reasoning. Gill and Prowse (2016) find that subjects with higher Raven test scores are more likely to choose equilibrium actions, converge more frequently to equilibrium play, and earn more even as behavior approaches the equilibrium prediction. Since cognition affects behavior and learning in strategic games, players with high cognitive abilities are expected to play actions that are consistent with higher levels in the strategic reasoning hierarchy.

Hypothesis 5 The mean level of reasoning of a participant is positively correlated with her/his score on the Raven's progressive matrices test.

5 Results

The experiment was conducted in a computer lab of the University of California, Davis during spring quarter. There were 20 experimental sessions, 10 for each treatment. A total of 372 participants joined the experiment, 186 in each treatment. Table 8 summarizes demographic information of our sample. The average payment including the show-up fee was $17.48 with a maximum of $25 and a minimum of $6. Each experimental session lasted about 1 hour and 15 minutes.
5.1 Unraveling

Unraveling is attained when the buyer is skeptical, believes that the true quality is the minimum quality contained in the seller's message, and purchases the quantity that maximizes his payoff at that minimum quality, while the seller sends a message whose smallest element is the true quality. The outcome is then the same as if the buyer were fully informed. More formally, we define unraveling in our experiment as an outcome that satisfies the following conditions:

i. For each quality $q \in Q$, the quantity purchased is $x = q$;
ii. For each quality $q \in Q$, the seller sends a verifiable message $M \in \mathcal{M}(q)$ such that $q = \min M$;
iii. For each message received $M \in 2^Q \setminus \{\emptyset\}$, the buyer purchases a quantity $x \in X$ such that $x = \min M$.

Table 8: Demographics of Our Sample
Variable Number Mean Std. Dev.
Subjects 372
Female Age White Asian American Black or African Native Hawaiian or Other Pacific Islander Mixed or Others GPA Math All Sciences Engineering Economics Other Social Sciences Humanities and Arts
Note: The major classification is taken from majors by college from The sum of the means of majors is greater than 1 since we take double majors into account.

Following this definition, for each round we use a binary variable that equals 1 if the outcome of the game in that round achieves unraveling and 0 otherwise.

Does information unravel in the persuasion games in our experiment? The subjects' choices show a high percentage of unraveling in both treatment groups. In the 4-4 treatment group, the average unraveling over all 30 rounds is percent with standard deviation of In the persuasion games with four quality levels, if there is no information unraveling and players simply make random choices, then the chance of unraveling is 15 out of 128, where 15 is the number of unraveling outcomes and 128 is the total number of outcomes in the persuasion games with four quality levels. In the 4-4 treatment, the 30-round unraveling average is percent, significantly higher than the 15/128 expected under random choice. In the first 15 rounds of 2-4 treatment when the persuasion games

are played with two quality levels, the average unraveling percentage is also significantly higher than 3/16, the chance of unraveling if outcomes are chosen at random. Table 9 shows the average unraveling in percent for both treatment groups, separated into the first 15 rounds and the last 15 rounds. The unraveling averages in both the first and the last 15 rounds of the 4-4 treatment and in the last 15 rounds of the 2-4 treatment are all greater than 65 percent and significantly higher than 15/128. In the 4-4 treatment group, there is more unraveling in the last 15 rounds than in the first 15 rounds. Although average unraveling in the 2-4 treatment group drops by about 5 percentage points from the first 15 rounds to the last 15 rounds, it remains high.

Table 9: Average Unraveling
Round 4-4 Treatment 2-4 Treatment Difference (3.55) Difference (2.31) (2.51) (4.23)
Note: We use robust standard errors clustered at group level. Significance levels: *10%, **5%, and ***1%.

One might ask what role learning plays in information unraveling, since subjects play the same game for 15 rounds in the 2-4 treatment and for 30 rounds in the 4-4 treatment. The high unraveling might arise because subjects become more sophisticated and learn to play the unraveling outcome during the last few rounds of the experiment. To explore a possible learning effect, we look at the average unraveling among all subjects round by round for both treatments. In Figure 1, the average unraveling in the 4-4 treatment shows an increasing trend through the 30 rounds of the persuasion games with four quality levels. However, the overall increase in the average unraveling is from about percent in the first couple of rounds to about percent in the last few rounds, so the average increase is only 1 percent per round. It is quite possible that the increase comes from the same small group of subjects, while the large majority of subjects always unravel.
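The chance benchmarks 15/128 and 3/16 quoted above can be reproduced by enumerating outcomes. The following is a minimal sketch, not the paper's own procedure: the function names are hypothetical, and it assumes that the buyer's quantity set has four elements in both games, an assumption chosen because it reproduces the totals of 128 and 16 outcomes stated in the text.

```python
from fractions import Fraction
from itertools import combinations


def messages(quality, qualities):
    """All verifiable messages for `quality`: subsets of `qualities` containing it."""
    others = [q for q in qualities if q != quality]
    return [frozenset((quality,) + extra)
            for size in range(len(others) + 1)
            for extra in combinations(others, size)]


def is_unraveling(quality, message, quantity):
    """Conditions i-iii: true quality is the message minimum and the buyer buys it."""
    return quality == min(message) and quantity == min(message)


def chance_of_unraveling(qualities, quantities):
    """Return (number of unraveling outcomes, total number of outcomes)."""
    outcomes = [(q, m, x)
                for q in qualities
                for m in messages(q, qualities)
                for x in quantities]
    hits = sum(is_unraveling(q, m, x) for q, m, x in outcomes)
    return hits, len(outcomes)


# Four quality levels; quantity set assumed to be {1, 2, 3, 4}.
four = chance_of_unraveling([1, 2, 3, 4], [1, 2, 3, 4])
# Two quality levels; quantity set again ASSUMED to be {1, 2, 3, 4}.
two = chance_of_unraveling([1, 2], [1, 2, 3, 4])

print(four, Fraction(*four))  # (15, 128) 15/128
print(two, Fraction(*two))    # (3, 16) 3/16
```

Under these assumptions the enumeration matches the benchmarks used in the text for both games.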
In the 2-4 treatment, average unraveling increases with the number of rounds played in both the first 15 rounds and the last 15 rounds, but the increment per round is still only between 1 and 2 percent. Therefore, even though we cannot exclude learning completely, its effect on information unraveling is relatively small. The high average unraveling is still mainly attributable to information unraveling itself. Although previous studies on strategic reasoning led us to expect low unraveling in the persuasion games in our experiment, our results show the opposite. Due to the high average unraveling in the persuasion

games in both treatments, we can confirm Hypothesis 1 that information does unravel in the persuasion games in our experiment.

Figure 1: Average Unraveling: Round 1-15 vs. Round 16-30 in Both Treatments

Is unraveling easier to attain in the first 15 rounds of the 2-4 treatment than in the first 15 rounds of the 4-4 treatment? A simple comparison of the sample means in the first 15 rounds of the two treatments in Table 9 shows more, but not significantly more, unraveling in the persuasion games with two quality levels than in those with four quality levels. The round-by-round variation in average unraveling follows the same pattern. In Figure 1, the whole series for the first 15 rounds of the 2-4 treatment lies just above the series for the first 15 rounds of the 4-4 treatment, but the vertical distance is about 5 percentage points at most. Before drawing a conclusion, we exploit the binary unraveling variable to analyze the difference between the two treatments further. We run both probit and logit regressions on the 2-4 Treatment dummy, the Round dummy, the interaction of those two, and demographic variables. The results are given in Table 10. The coefficient of 2-4 Treatment indicates the treatment effect for the first 15 rounds between the two treatments. The coefficient, , implies that playing the persuasion games with two quality levels increases the probability of producing an unraveling outcome by , significant at 5%. Therefore, after we control for the interaction between Treatment and Round and the demographics, the treatment effect becomes significant,

but the increase in probability is , close to the difference between the average unraveling in the two treatments when the choices are made by nature, 3/16 − 15/128 = 9/128 ≈ 0.0703. The t-test on the marginal effect of 2-4 Treatment being larger than shows a t-value of 1.72 and P-value of Therefore, we cannot conclude that the average unraveling in the first 15 rounds of the 2-4 treatment exceeds that in the first 15 rounds of the 4-4 treatment by more than 7 percent.

Table 10: Binary Regression Results of Unraveling
Dependent variable: Unraveling. Columns: Probit, Probit, Logit, Logit.
2-4 Treatment (0.1029) (0.0977) (0.1708) (0.1632)
Round (0.0783) (0.0798) (0.1386) (0.1415)
2-4 Treatment × Round (0.1052) (0.1080) (0.1806) (0.1868)
Demographics, Majors and GPA No Yes No Yes
Number of Observations
Pseudo R²
Note: We use robust standard errors clustered at group level. Significance levels: *10%, **5%, and ***1%.

The coefficient estimate of 2-4 Treatment in the logit regression confirms our findings in the probit regression. The coefficient estimate of 0.4025, statistically significant at 5%, implies a 0.4025 increase in the log-odds of unraveling for the 2-4 Treatment dummy. Equivalently, the ratio of the odds of attaining unraveling in the first 15 rounds of the 2-4 treatment to the odds of attaining unraveling in the first 15 rounds of the 4-4 treatment is exp(0.4025) ≈ 1.4956. If the outcomes are randomly chosen, the odds of attaining unraveling in the first 15 rounds of the 2-4 treatment are 3/13 and the odds of attaining unraveling in the first 15 rounds of the 4-4 treatment are 15/113, which leads to an odds ratio between the two treatments of (3/13)/(15/113) ≈ 1.7385, which is significantly larger than The results from the logit regression actually demonstrate the opposite of the claim in Hypothesis 2. Thus, all evidence so far shows that we cannot accept with confidence the claim in Hypothesis 2 that unraveling is easier to attain in the first 15 rounds of the 2-4 treatment than in the first 15 rounds of the 4-4 treatment.
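The arithmetic behind the chance benchmarks in this comparison can be checked directly. The sketch below uses only the quantities quoted in the text (3/16, 15/128, and the logit coefficient 0.4025); the variable and function names are hypothetical.

```python
import math
from fractions import Fraction


def odds(p):
    """Convert a probability into odds p / (1 - p)."""
    return p / (1 - p)


p_two = Fraction(3, 16)     # chance of unraveling, two quality levels
p_four = Fraction(15, 128)  # chance of unraveling, four quality levels

chance_gap = p_two - p_four                     # difference under random play
chance_odds_ratio = odds(p_two) / odds(p_four)  # (3/13) / (15/113)
estimated_odds_ratio = math.exp(0.4025)         # from the logit coefficient

print(chance_gap, round(float(chance_gap), 4))
print(chance_odds_ratio, round(float(chance_odds_ratio), 4))
print(round(estimated_odds_ratio, 4))
```

The exact fractions make it easy to see that the odds ratio under random play exceeds the estimated odds ratio, which is the comparison the text relies on.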
5.2 Level of Reasoning

The procedure of prudent rationalizability iteratively eliminates players' strategies, moving from the least rationalizable strategies to the most rationalizable ones. Its predictions are nested: the higher-level strategies are a subset of the lower-level strategies for both the seller and the buyer. For example, prudent rationalizability predicts that, given a true quality of 3, a Level-0/1 seller's message could be any message that contains 3; a Level-2/3 seller's message is {3, 4}, {1, 3, 4}, {2, 3, 4}, or {1, 2, 3, 4}; and a Level-4/5 or Level-6/higher seller's message is {3, 4}. For each state, the strategies at the highest level of reasoning also qualify as strategies at the lowest level, which makes it difficult to isolate the depth of players' strategic thinking.

We apply the idea of marginal best rationalization in order to identify players' levels. Battigalli (1996) defines the best rationalization principle such that a player always believes her opponents are implementing one of the most rational strategies that are consistent with her information set. In our identification, we assume players are implementing the best rationalization at each information set. So if a message for a given quality level is consistent with multiple levels, we count the highest level as that player's identified level of reasoning. For example, on the seller side, if for a given quality q = 3 in a certain round the seller's message is {1, 3, 4}, the seller could be either a Level-0/1 or a Level-2/3 seller according to the model prediction (see Table 2); in our identification we treat the seller as a Level-2/3 seller, since players are presumed to be implementing the best rationalization. Here is another example for the buyer side: if for a given received message {1, 2, 3} the buyer purchases 2 units, the buyer could be a Level-0, Level-1/2, or Level-3/4 buyer according to the model prediction in Table 3; in our identification we treat the buyer as a Level-3/4 buyer. In each round we assign each player a level of strategic reasoning, based on marginal best rationalization, according to their message or quantity choice.
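The per-round identification rule can be sketched for the seller example with true quality 3, using only the strategy sets listed above. The sketch also covers the aggregation over rounds that the text defines next (the strict and forgiving classifications). The function names are hypothetical, and the forgiving rule shown here, which discards the lowest 10 percent of rounds before taking the minimum, is one possible reading of "allowed to choose inconsistently ten percent of the time", not necessarily the procedure in Appendix D.

```python
# Seller strategy sets for true quality q = 3, copied from the example in the text.
# Levels are listed from lowest to highest.
ALL_WITH_3 = [{3}, {1, 3}, {2, 3}, {3, 4},
              {1, 2, 3}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}]
SELLER_LEVELS_Q3 = [
    ("Level-0/1", ALL_WITH_3),
    ("Level-2/3", [{3, 4}, {1, 3, 4}, {2, 3, 4}, {1, 2, 3, 4}]),
    ("Level-4/5", [{3, 4}]),
    ("Level-6/higher", [{3, 4}]),
]


def round_level(message, table=SELLER_LEVELS_Q3):
    """Index of the HIGHEST level whose strategy set contains the message."""
    consistent = [i for i, (_, allowed) in enumerate(table)
                  if set(message) in allowed]
    if not consistent:
        raise ValueError("message is not verifiable for this quality")
    return max(consistent)


def player_level(round_levels, tolerance_percent=0):
    """Lowest level reached after discarding the tolerated share of lowest rounds.

    tolerance_percent=0 gives the strict classification; 10 is the ASSUMED
    reading of the forgiving classification.
    """
    discarded = len(round_levels) * tolerance_percent // 100
    return sorted(round_levels)[discarded]


print(SELLER_LEVELS_Q3[round_level({1, 3, 4})][0])  # Level-2/3
print(SELLER_LEVELS_Q3[round_level({3, 4})][0])     # Level-6/higher

# Hypothetical 30-round history: 27 top-level rounds and 3 Level-0/1 rounds.
history = [3] * 27 + [0] * 3
print(player_level(history), player_level(history, tolerance_percent=10))  # 0 3
```

With the hypothetical history, the strict rule assigns the lowest level, while the forgiving rule tolerates the three deviating rounds and assigns the highest level.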
Each player's level of reasoning is then identified as the lowest level reached over all the rounds played, either as the seller or as the buyer. The seller's and the buyer's level identification strategies are provided in Appendix D. We consider two types of players: strict players and forgiving players. The level of a strict player is determined by the lowest level that is consistent with all choices that player has made throughout the experiment, while a forgiving player is allowed to choose inconsistently ten percent of the time. We assume players' types are fixed and their choices follow a certain type throughout the whole experiment. In terms of level identification, the strict classification sets a rigid yet precise standard on levels, while the forgiving classification is a looser yet more robust way of analyzing subjects' behavior. We use the forgiving classification to analyze players' levels in the following discussion. As a robustness check, we include the results using the strict classification in Appendix E.

We identify that percent of the sellers are Level-6/higher and percent of the buyers are Level-7/higher in the 4-4 treatment, if we assume each player's level of reasoning is constant over all 30 rounds of the persuasion games. This result differs from previous studies, in which most subjects are found to be associated with relatively low levels of reasoning. In order to study the first-order stochastic dominance in Hypothesis 3 and Hypothesis 4, we assume in both treatments each player's level of reasoning is

fixed at a certain level in the first 15 rounds and fixed at another level in the last 15 rounds.

Does the level distribution in the last 15 rounds of the 4-4 treatment first-order stochastically dominate that of the first 15 rounds of the 4-4 treatment? If it does, the dominance would imply a higher average level in the last 15 rounds than in the first 15 rounds of the 4-4 treatment. Figure 2 and Figure 3 show the cumulative distributions of identified levels in the first 15 rounds and the last 15 rounds of the 4-4 treatment for the seller and the buyer respectively. Notice that the levels on the horizontal axis are in reverse order. The cumulative distribution of identified levels in the last 15 rounds of the 4-4 treatment stays above the cumulative distribution of identified levels in the first 15 rounds. Table 15 and Table 16 provide the exact frequency of each level for the seller and the buyer respectively. In the seller's level distributions, over 60 percent of the subjects are at the highest level, Level-6/higher. The probabilities for the other levels are all lower in the last 15 rounds than in the first 15 rounds. On the buyer side, as on the seller side, the highest level, Level-7/higher, shows the largest frequency but with a 30 substantial increase from the first 15 rounds to the last 15 rounds, while the frequencies of all other levels are lower in the last 15 rounds than in the first 15 rounds. We apply the two-sample Kolmogorov-Smirnov test to both the seller's and the buyer's two distributions of levels. On the seller side, the two-sample Kolmogorov-Smirnov test shows an insignificant P-value of for the hypothesis that the first 15 rounds of the 4-4 treatment contain lower levels than the last 15 rounds. There is no significant evidence of a difference between the seller's distributions.
On the buyer side, the two-sample Kolmogorov-Smirnov test confirms that levels in the first 15 rounds of the 4-4 treatment are significantly lower than in the last 15 rounds, and hence that the two distributions are significantly different. Based on this evidence, we can confirm the first-order stochastic dominance in Hypothesis 3 only for the buyer side, not for the seller side. If Hypothesis 3 were confirmed, it would indicate that there is learning in the first 15 rounds of the 4-4 treatment, leading to higher levels in the last 15 rounds than in the first 15 rounds. However, we are able to demonstrate learning only on the buyer side, not on the seller side. One explanation of this result is that it is relatively more difficult for the seller to learn in persuasion games. In the extensive-form game, the seller moves before the buyer, and the seller has to think about the buyer's strategies when choosing her own action. The seller can learn better only if there is more learning by the buyer, but not the other way around. That would explain the relatively higher level of strategic reasoning on the buyer side in our experiment.
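The dominance comparison discussed above can be sketched with empirical distribution functions. This is an illustration only: the counts below are hypothetical and are not data from the experiment, the function names are invented for the example, and the code computes the Kolmogorov-Smirnov distance without a P-value. Levels are ordered from lowest to highest, so the dominating distribution has the lower cumulative distribution function; in the paper's figures the axis is reversed, so the dominating curve lies above.

```python
from fractions import Fraction
from itertools import accumulate


def cdf(counts):
    """Empirical CDF over levels ordered from lowest to highest."""
    total = sum(counts)
    return [Fraction(c, total) for c in accumulate(counts)]


def first_order_dominates(counts_a, counts_b):
    """True if A dominates B: A's CDF is never above B's and is below it somewhere."""
    pairs = list(zip(cdf(counts_a), cdf(counts_b)))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


def ks_distance(counts_a, counts_b):
    """Largest absolute gap between the two empirical CDFs."""
    return max(abs(a - b) for a, b in zip(cdf(counts_a), cdf(counts_b)))


# HYPOTHETICAL numbers of players per level (lowest to highest level).
first_half = [10, 20, 20, 50]
second_half = [5, 15, 15, 65]

print(first_order_dominates(second_half, first_half))  # True
print(first_order_dominates(first_half, second_half))  # False
print(ks_distance(first_half, second_half))            # 3/20
```

A statistical package would be needed to attach a P-value to the distance; the sketch only shows what the dominance relation and the test statistic measure.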

Figure 2: Seller's Distribution of Levels: Rd 1-15 vs. Rd 16-30 in 4-4

Figure 3: Buyer's Distribution of Levels: Rd 1-15 vs. Rd 16-30 in 4-4

Table 11: Level Distribution of the Seller
Columns: 4-4 Treatment (Round 1-15, Round 16-30), 2-4 Treatment (Round 16-30)
Rows: Level 0 and 1, 2 and 3, 4 and 5, 6 and higher

Table 12: Level Distribution of the Buyer
Columns: 4-4 Treatment (Round 1-15, Round 16-30), 2-4 Treatment (Round 16-30)
Rows: Level 0, 1 and 2, 3 and 4, 5 and 6, 7 and higher

Does the level distribution in the last 15 rounds of the 2-4 treatment first-order stochastically dominate that of the last 15 rounds of the 4-4 treatment? The diagrams of the cumulative distributions of levels (see Figure 4 and Figure 5) suggest the reverse: the level distribution in the last 15 rounds of the 4-4 treatment first-order stochastically dominates the level distribution in the last 15 rounds of the 2-4 treatment for both the seller and the buyer, contrary to our expectation. The exact frequency of each level in the last 15 rounds of the 2-4 treatment (see Table 4 and Table 5) is similar to that in the first 15 rounds of the 4-4 treatment when compared with the last 15 rounds of the 4-4 treatment. For the seller's level distribution, the two-sample Kolmogorov-Smirnov test shows that

the last 15 rounds of the 2-4 treatment do not contain significantly lower levels than the last 15 rounds of the 4-4 treatment. Although the figure shows clear first-order stochastic dominance of the last 15 rounds of the 4-4 treatment, we cannot conclude that the two distributions differ significantly. On the buyer side, however, the two-sample Kolmogorov-Smirnov test shows that levels in the last 15 rounds of the 2-4 treatment are significantly lower than in the last 15 rounds of the 4-4 treatment.

Figure 4: Seller's Distribution of Levels: Rd 16-30 in 2-4 vs. Rd 16-30 in 4-4

Figure 5: Buyer's Distribution of Levels: Rd 16-30 in 2-4 vs. Rd 16-30 in 4-4

We therefore reject Hypothesis 4. The result points in the opposite direction. Even though the seller's cumulative distributions of levels are not significantly different, we can confirm on the buyer side that the level distribution in the last 15 rounds of the 4-4 treatment first-order stochastically dominates the level distribution in the last 15 rounds of the 2-4 treatment. The purpose of Hypothesis 4 is to test our expectation that, compared with learning in the same persuasion games, learning in relatively easier persuasion games leads to even higher levels in the subsequent, more difficult persuasion games. However, not only do we find no transfer of learning from the easier to the more difficult persuasion games for the seller, but, quite against our expectation, the buyer's levels are substantially lower in the last 15 rounds of the 2-4 treatment than in the last 15 rounds of the 4-4 treatment. The argument is similar to the one for why less learning appears on the seller side. The seller seems to be less sensitive than the buyer to the difference between the two treatments.
In this case, when playing the persuasion games with four quality levels after having played persuasion games with two quality levels, the buyer's levels of reasoning are significantly lower than when the preceding training consisted of the same four-quality games, while the seller's levels of reasoning are not significantly affected.

5.3 Cognitive Ability and Reasoning

The average test score is out of 30 for the Raven's Progressive Matrices test. We also record the answering time each subject has left for each test question. For every recording, more time left means that the subject used less time to answer that question. The average time left is seconds out of a total possible time of 930 seconds. The correlation between the test score and answering time left is implying that subjects with higher test scores tend to spend more time on answering questions. Figure 6 shows the distribution of Raven's test scores of all 372 subjects. In the histogram, there is a large increase between scores of 11 and 12. There are 45 subjects whose test scores are less than or equal to 11 points, compared with 25 subjects who score exactly 12 points. The reason for this sudden increase in frequency is not clear. One explanation might be that subjects who score only a few points tend to click through most of the test questions. For the 45 subjects who score 11 points or less, the average time spent on answering each question is seconds, and for the 7 subjects who score 4 points or less, the average time for answering each question is only 3 seconds. Since subjects are not paid for answering the test questions, they are not strongly motivated to complete the test with full effort, and some subjects chose to click through most of the test questions. Therefore, some of the test scores do not truly reflect the cognitive ability of those subjects. Although we are not able to determine the cognitive ability of those subjects from their test scores, it is reasonable to believe that their test scores are on average lower than they should be. Hence, the positive correlation between subjects' performance and their cognitive ability is underestimated if we use the test score to represent subjects' cognitive ability.
Figure 6: Raven's Progressive Matrices Test Score Distribution

Is the mean level of reasoning positively correlated with the Raven's test score? To examine the relationship between cognitive ability and reasoning, we run OLS and ordered-probit regressions of a subject's level on the Raven score, a Buyer dummy (equal to 1 if the subject is assigned as a buyer in a given round), the 2-4 Treatment dummy, the Round dummy, the interaction of the 2-4 Treatment dummy and the Round dummy, and demographics. Since the levels are discrete and OLS regression presumes a continuous, normally distributed error term, we present the ordered-probit regression along with the OLS results. The results are reported in Table 13. We also considered OLS and ordered-probit regressions over all observations with a Buyer dummy; however, those results are not robust enough to show the effect of Round on the seller's level of reasoning.

Table 13: Regression Results of Level of Reasoning
Dependent variable: Level of Reasoning. Columns: Seller (OLS, OLS, O-P), Buyer (OLS, OLS, O-P).
Raven Test (0.0122) (0.0120) (0.0085) (0.0117) (0.0114) (0.0090)
2-4 Treatment (0.0911) (0.0912) (0.0797) (0.1202) (0.1141) (0.1046)
Round (0.0999) (0.1013) (0.0854) (0.0905) (0.0915) (0.0858)
2-4 Treatment × Round 16-30 (0.1339) (0.1358) (0.1157) (0.1509) (0.1524) (0.1486)
Time of Raven (0.0001) (0.0001) (0.0001) (0.0001) (0.0001) (0.0001)
GPA (0.0944) (0.0676) (0.0900) (0.0713)
Demographics and Majors Yes Yes Yes Yes Yes Yes
Number of Obs
(Pseudo) R²
Note: We use robust standard errors clustered at group level. Significance levels: *10%, **5%, and ***1%.

In Table 13, we report the results of OLS regressions with and without the explanatory variable GPA, since GPA might work the same way as the Raven's test score. However, including GPA does not change the coefficient estimate for Raven Test much for either the seller or the buyer. The correlation coefficient between Raven Test and GPA is , which indicates only a weak relationship between those two variables.
The Raven's test score is statistically significant at 1% in the regressions for both the seller and the buyer. In the OLS regressions for the seller and the buyer, a 1 point increase in the Raven's test score is associated with and increase in level of reasoning

for the seller and the buyer respectively. The coefficient estimate for the time spent on completing the Raven test is very small in value and statistically insignificant in all four regressions. In both the OLS and ordered-probit regressions, the coefficient estimates in the regression for the seller are similar to those in the regression for the buyer, except that the coefficient estimate for the Round dummy for the seller is insignificant. That coefficient represents the difference in the mean of levels between the first 15 rounds and the last 15 rounds in the 4-4 treatment for the seller. This result is in line with the insignificant difference in cumulative distributions of levels between the first 15 rounds and the last 15 rounds in the 4-4 treatment reported in Section 5.2. For comparison, we also look at the significance of the sum of the coefficient estimates of 2-4 Treatment and 2-4 Treatment × Round 16-30, which represents the difference in the mean of levels between the last 15 rounds in the 2-4 treatment and the last 15 rounds in the 4-4 treatment. The t-test of = 0 has a P-value of 0.455, which also confirms the result in Section 5.2 that the seller's cumulative distributions of levels in the last 15 rounds of the 2-4 treatment and the last 15 rounds of the 4-4 treatment are not significantly different. On the buyer side, the coefficient estimate of 2-4 Treatment is statistically significant, and we further confirm that the sum of the coefficient estimates of 2-4 Treatment and 2-4 Treatment × Round is significant at 1% with a P-value of

In the persuasion games with four quality levels in our experiment, we identify level-0/1, level-2/3, level-4/5, and level-6/higher for the seller, and level-0, level-1/2, level-3/4, level-5/6, and level-7/higher for the buyer. In the persuasion games with two quality levels, we identify level-0/1 and level-2/higher for the seller, and level-0, level-1/2, and level-3/higher for the buyer.
So in the first 15 rounds of the 2-4 treatment, the levels are on average much lower than in other rounds. That explains the relatively large and negative coefficient estimate of 2-4 Treatment. Table 14 gives the average marginal effects of the Raven test score on the level of reasoning. Each entry represents the change in the probability of the corresponding level for a 1 point increase in the Raven's test score. On average, subjects with a 1 point higher Raven test score are 1.02 and 0.97 percentage points more likely to be at the highest level of reasoning as the seller and the buyer respectively. Notice that the marginal effects for the other levels are all negative and the sum of all marginal effects is zero. This is because over 60 percent of sample observations have attained the highest level of reasoning for both the seller and the buyer. Therefore, a 1 point increase in the Raven's test score shifts the distribution slightly to the right, and this shift moves some mass out of the lower levels. Based on the above evidence, we conclude that the mean level of reasoning is positively correlated with the Raven's test score. Since the Raven's test has been shown to be an indicator of cognitive ability, we can confirm that there is a positive relationship between cognitive ability and the mean level of reasoning in prudent rationalizability.
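The statement that the marginal effects sum to zero holds for any ordered-probit model, because the level probabilities always sum to one. A short derivation is sketched below in standard textbook notation, which is an assumption of this sketch and not taken from the paper: levels are indexed $j = 0, \dots, J$, $\mu_j$ are the estimated thresholds, $\beta_k$ is the coefficient on the Raven score $x_k$, and $\Phi$ and $\phi$ are the standard normal distribution and density functions.

```latex
\begin{align*}
\Pr(y = j \mid x) &= \Phi(\mu_j - x'\beta) - \Phi(\mu_{j-1} - x'\beta),
  \qquad \mu_{-1} = -\infty,\ \mu_J = +\infty, \\
\frac{\partial \Pr(y = j \mid x)}{\partial x_k}
  &= \bigl[\phi(\mu_{j-1} - x'\beta) - \phi(\mu_j - x'\beta)\bigr]\,\beta_k, \\
\sum_{j=0}^{J} \frac{\partial \Pr(y = j \mid x)}{\partial x_k}
  &= \bigl[\phi(\mu_{-1} - x'\beta) - \phi(\mu_J - x'\beta)\bigr]\,\beta_k = 0 .
\end{align*}
```

The sum telescopes, and the density vanishes at both infinite thresholds. With $\beta_k > 0$, the effect on the highest level equals $\phi(\mu_{J-1} - x'\beta)\,\beta_k > 0$ and the effect on the lowest level equals $-\phi(\mu_0 - x'\beta)\,\beta_k < 0$, which agrees with the pattern of signs described for Table 14.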

Table 14: Average Marginal Effects of Raven Test
Columns: Level, dy/dx (standard errors in parentheses)
Seller: 0 and 1 (0.0015); 2 and 3 (0.0011); 4 and 5 (0.0001); 6 and higher (0.0025)
Buyer: 0 (0.0006); 1 and 2 (0.0005); 3 and 4 (0.0012); 5 and 6 (0.0001); 7 and higher (0.0024)
Note: We use robust standard errors clustered at group level. Significance levels: *10%, **5%, and ***1%.

6 Discussions

The set of prudent rationalizable strategies is a proper subset of the set of sequential equilibrium strategies. The strategies that are sequential equilibrium but not prudent rationalizable are such that, when q = 1, 2, 3, the seller sends a single-element message that contains only the true quality and the buyer then chooses x = q as the quantity to purchase. In the sample, percent of the outcomes are at sequential equilibrium while percent of the outcomes are prudent rationalizable percent of the outcomes are those at the sequential equilibrium but not prudent rationalizable. When those outcomes are randomly chosen, the probability is 3/128, which is approximately equal to 2.34 percent. These outcomes would explain an additional 4.52 percent of the outcomes. So we do not consider sequential equilibrium to predict strategic behavior in persuasion games better than prudent rationalizability.

The level-k model has been widely used in studying strategic behavior in games. However, we find that the level-k model does not quite explain the strategic behavior in the persuasion games in our experiment. For a level-k model, level-0 has to be assumed to be either random or naive as a starting point. Because the information revealed in persuasion games has to be truthful but may be vague, no matter which level-0 is chosen, one needs to impose assumptions on a player's belief about the opponent's strategies at each information set at which that player moves. If level-0 is random, the level-k model lacks a consistent belief system that works for the persuasion games.
It seems rather convincing for level-0 to be truth telling for the seller. If level-0 is assumed to be truthful, however, the best-responding level-1 buyer would not be able to proceed when confronted with messages that are not fully revealing, since the buyer's belief is that the seller must be truth telling. The level-k model then has to contain an inconsistent belief system in order to go on to higher levels.

The most surprising finding in our experiment is that the subjects do not transfer learning from the persuasion games with two quality levels to the persuasion games with four quality levels. They rather treat the persuasion games with four quality levels as a new game instead of generalizing from the same but simpler game. This result contradicts the common belief that players can generalize from the most basic games. In our experiment, the two types of persuasion games differ only in the number of quality levels. It would be interesting to know how the results would differ if even more complicated games were studied. This finding also demonstrates the importance of behavioral experiments for the application of theoretical models.

Levels of reasoning are found to be mostly consistent with the highest level in persuasion games. We are not aware of similar research on strategic reasoning in persuasion games, so there is little to compare our results to. However, previous literature on strategic choices in games reports rather low average levels among subjects. We hope that future research on this topic will improve our understanding of the application of persuasion games and of players' strategic behavior in such settings.

Our approach to level identification is quite different from traditional logit estimation, since the logit method does not fit level identification in prudent rationalizability predictions either conceptually or empirically. Conceptually, if a choice falls outside a certain level, prudent rationalizability counts it as a lower-level choice instead of an error. Empirically, since prudent rationalizable strategies start at level-0 with all strategies available and the set of possible strategies shrinks as the level of reasoning increases, logit estimation would generate the maximum likelihood at level-0, where all strategies are correct.
Even if we exclude level-0, the maximum likelihood would fall on the next lowest level, or on multiple levels with the same maximum likelihood.

One alternative way of thinking about strategic reasoning in persuasion games is lying aversion. With lying aversion, the seller's utility function would be adapted to accommodate this feature alongside the payoff motivation. Tests could be used to see whether lying aversion shows on the seller side and how it changes the outcomes of the persuasion games. Future research providing evidence on the impact of lying aversion on strategic reasoning would enrich our understanding of human behavior.

References

[1] Arad, Ayala and Ariel Rubinstein (2012). The money request game: a level-k reasoning study. American Economic Review 102(7):
[2] Balkenborg, Dieter and Rosemarie Nagel (2016). An experiment on forward vs. backward induction: how fairness and level k reasoning matter. German Economic Review 17(3):

[3] Battigalli, Pierpaolo (1996). Strategic rationality orderings and the best rationalization principle. Games and Economic Behavior 13:
[4] Benito-Ostolaza, Juan M., Penélope Hernández, and Juan A. Sanchis-Llopis (2016). Do individuals with higher cognitive ability play more strategically? Journal of Behavioral and Experimental Economics 64:
[5] Benndorf, Volker, Dorothea Kübler, and Hans-Theo Normann (2015). Privacy concerns, voluntary disclosure of information, and unraveling: an experiment. European Economic Review 75:
[6] Blume, Andreas and Uri Gneezy (2010). Cognitive forward induction and coordination without common knowledge: an experimental study. Games and Economic Behavior 68:
[7] Blume, Andreas, Peter H. Kriss, and Roberto A. Weber (2016). Pre-play communication with forgone costly messages: experimental evidence on forward induction. Experimental Economics, forthcoming.
[8] Brañas-Garza, Pablo, Teresa García-Muñoz, and Roberto Hernán González (2012). Cognitive effort in the beauty contest game. Journal of Economic Behavior and Organization 83(2):
[9] Brandts, Jordi, Antonio Cabrales, and Gary Charness (2007). Forward induction and entry deterrence: an experiment. Economic Theory 33:
[10] Brandts, Jordi and Charles A. Holt (1995). Limitations of dominance and forward induction: experimental evidence. Economics Letters 49(4):
[11] Brandts, Jordi and W. Bentley MacLeod (1995). Equilibrium selection in experimental games with recommended play. Games and Economic Behavior 11:
[12] Burks, Stephen V., Jeffrey P. Carpenter, Lorenz Goette, and Aldo Rustichini (2009). Cognitive skills affect economic preferences, strategic behavior, and job attachment. Proceedings of the National Academy of Sciences 106(19):
[13] Burnham, Terence C., David Cesarini, Magnus Johannesson, Paul Lichtenstein, and Björn Wallace (2009). Higher cognitive ability is associated with lower entries in a p-beauty contest.
Journal of Economic Behavior and Organization 72(1): [14] Cachon, Gérard P. and Colin F. Camerer (1996). Loss-avoidance and and forward induction in experimental coordination games. The Quarterly Journal of Economics 111(1): [15] Camerer, Colin F., Teck-Hua Ho, and Juin-Kuan Chong (2004). A cognitive hierarchy model of games. Quarterly Journal of Economics 119(3):

29 [16] Charness, Gary, Rustichini, Aldo, and Jeroen van de Ven (2011). Self-confidence and strategic deterrence. Mimeo, UCSB. [17] Cooper, Russell, Douglas J. DeJong, Robert Forsythe, and Thomas W. Ross (1992). Communication in coordination games. The Quarterly Journal of Economics 107(2): [18] Cooper, Russell, Douglas J. DeJong, Robert Forsythe, and Thomas W. Ross (1993). Forward induction in the battle-of-the-sexes games. American Economic Review 83(5): [19] Costa-Gomes, Miguel A. and Vincent P. Crawford (2006). Cognition and behavior in two-person guessing games: an experimental study. American Economic Review 96(5): [20] Costa-Gomes, Miguel A., Vincent P. Crawford, and Bruno Broseta (2001). Cognition and behavior in normal-form games: an experimental study. Econometrica 69(5): [21] Crawford, Vincent P. (2003). Lying for strategic advantage: rational and boundedly rational misrepresentation of intentions. American Economic Review 93(1): [22] Crawford, Vincent P., Miguel A. Costa-Gomes, and Nagore Iriberri (2013). Structural models of Nonequilibrium strategic thinking: theory, evidence, and applications. Journal of Economic Literature 51(1): [23] Dufwenberg, Martin, Gunnar Köhlin, Peter Martinsson, and Haieselassie Medhin (2016). Thanks but no thanks: a new policy to reduce land conflict. Journal of Environmental Economics and Management 77: [24] Evdokimov, Piotr and Aldo Rustichini (2016). Forward induction: thinking and behavior. Journal of Economic Behavior and Organization 128: [25] Frederick, Shane (2005). Cognitive reflection and decision Making. Journal of Economic Perspectives 19(4): [26] Fischbacher, Urs (2007). z-tree: Zurich toolbox for ready-made economic experiments. Experimental Economics 10(2): [27] Forsythe, Robert, Mark Isaac, and Thomas R. Palfrey (1989). Theories and tests of blind bidding in sealed-bid auctions. The RAND Journal of Economics 20(2): [28] Gill, David and Victoria Prowse (2016). 
Cognitive ability, character skills, and learning to play equilibrium: a level-k analysis. Journal of Political Economy, forthcoming. 29

30 [29] Greiner, Ben (2004). The online recruitment system ORSEE - a guide for the organization of experiments in economics. Papers on Strategic Interaction , Max Planck Institute of Economics, Strategic Interaction Group. [30] Grossman, Sanford J. (1981). The informational role of warranties and private disclosure of product quality. Journal of Law and Economics 24(3): [31] Grossman, Sanford J. and Oliver Hart (1980). Disclosure laws and takeover bids. Journal of Finance 35(2): [32] Hagenbach, Jeanne and Eduardo Perez-Richet (2016). Communication with evidence in the lab. Working paper. [33] Heifetz, Aviad, Martin Meirer and Burkhard C. Schipper (2011). Prudent rationalizability in generalized extensive-form games. Working paper. [34] Ho, Teck-Hua and Xuanming Su (2013). A dynamic level-k model in sequential games. Management Science 59(2): [35] Huck, Steffen and Wieland Müller (2005). Burning money and (pseudo) first-mover advantages: an experimental study on forward induction. Games and Economic Behavior 51: [36] Jin, Ginger Zhe, Michael Luca, and Daniel Martin (2016). Is no news (perceived as) bad news? an experimental investigation of information disclosure. Working paper. [37] Meier, Martin and Burkhard C. Schipper (2014). Bayesian games with unawareness and unawareness perfection. Economic Theory 56(4): [38] Milgrom, Paul (1981). Good news and bad news: representation theorems and applications. The Bell Journal of Economics 12(2): [39] Milgrom, Paul and John Roberts (1986). Relying on the information of interested parties. Rand Journal of Economics 17(1): [40] Milgrom, Paul (2008). What the seller won t tell you: persuasion and disclosure in markets. Journal of Economic Perspectives 22(2): [41] Muller, R. Andrew and Asha Sadanand (2003). Order of Play, forward induction, and presentation effects in two-person games. Experimental Economics 6(1): [42] Nagel, Rosemarie (1995). Unraveling in guessing games: an experimental study. 
American Economic Review 85(5): [43] Oechssler, Jörg, Andreas Roider, and Patrick W. Schmitz (2009). Cognitive abilities and behavioral biases. Journal of Economic Behavior and Organization 72(1):

31 [44] Pearce, David G. (1984). Rationalizable strategic behavior and the problem of perfection. Econometrica 52(4): [45] Raven, J., J.C. Raven, and J.H. Court (2000). Manual for Raven s Progressive Matrices and Vocabulary Scales. San Antonio, TX: Pearson. [46] Shahriar, Quazi (2013). Forward induction and other-regarding preferences arising from an outside option: an experimental investigation. Journal of Management and Strategy 4(4): [47] Shahriar, Quazi (2014). An experimental test of the robustness and the power of forward induction. Managerial and Decision Economics 35(4): [48] Stahl, Dale O. and Paul W. Wilson (1995). On players models of other players: theory and experimental evidence. Games and Economic Behavior 10(1): [49] Wang, Joseph T.Y., Michael Spezio, and Colin F. Camerer (2010). Pinocchio s pupil: using eyetracking and pupil dilation to understand truth telling and deception in sender-receiver games. American Economic Review 100(3):

Appendix

A Proofs

A.1 Proof of Proposition 1

Proof. We characterize the prudent rationalizable strategies level by level.

Level 1: For a buyer who receives message $M$, any level-1 prudent rationalizable strategy has him choose a quantity that is a best response to some full support belief over the qualities in $M$. Any quantity that is not in $M$ can be shown to be strictly dominated by some quantity that is a best response to a quality in $M$. In particular, if he receives $\{q\}$, then he knows that the quality is $q$ and hence purchases $q$. Any strategy of the seller is level-1 prudent rationalizable.

Level 2: If the quality of the seller is $q$, then any message $M$ with $\{s, q, r\} \subseteq M$ for $s \le q < r$ is level-2 prudent rationalizable, given a full support belief over the buyer's level-1 prudent rationalizable strategies that puts sufficiently large probability on the buyer buying $r$ upon receiving message $M$ while buying $q$ upon receiving any other message consistent with $q$. Any non-singleton message $M$ with $\max M = q$ (e.g., $M = \{r, q\}$ with $r < q$) is not level-2 prudent rationalizable for the seller with quality $q$. To see this, note that given a full support belief over the buyer's strategies, this message yields a strictly smaller expected payoff than message $\{q\}$. In particular, this means that if $q = 4$, $M$ is uniquely level-1 prudent rationalizable for the buyer if and only if $M = \{4\}$. If $q < 4$, then any singleton message consistent with $q$ is not level-2 prudent rationalizable for the seller because, with a full support belief over the level-1 prudent rationalizable strategies of the buyer, the message $\{q, 4\}$ yields a higher expected payoff. Any level-1 prudent rationalizable strategy of the buyer is also level-2 prudent rationalizable for the buyer.

Level 3: A buyer who receives a non-singleton message $M$ with $\max M = q$ knows from the seller's level-2 prudent rationalizable strategies that the quality is not $q$. Hence, any level-3 prudent rationalizable strategy of the buyer must prescribe buying a quantity $r < q$. In particular, if $M = \{q, 4\}$, then any level-3 prudent rationalizable strategy of the buyer must prescribe buying $q$. Any level-2 prudent rationalizable strategy of the seller is also level-3 prudent rationalizable.
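The level-2 restriction on the seller can be checked by brute force. The sketch below only transcribes the conditions stated in the proof for the four-quality game; it is not an implementation of the prudent rationalizability procedure itself. The names `QUALITIES`, `feasible_messages`, and `level2_seller_messages` are hypothetical and do not come from the paper.

```python
from itertools import combinations

QUALITIES = (1, 2, 3, 4)  # four-quality game from the experiment


def feasible_messages(q, qualities=QUALITIES):
    """All messages (sets of qualities) that contain the true quality q."""
    others = [r for r in qualities if r != q]
    return [
        frozenset((q,) + extra)
        for k in range(len(others) + 1)
        for extra in combinations(others, k)
    ]


def level2_seller_messages(q, qualities=QUALITIES):
    """Messages that survive the level-2 conditions stated in the proof.

    The top quality keeps only its singleton message. Any lower quality
    loses its singleton and every message whose maximum equals q, so it
    keeps exactly the messages whose maximum exceeds q.
    """
    top = max(qualities)
    if q == top:
        return [frozenset({top})]
    return [m for m in feasible_messages(q, qualities) if max(m) > q]


for q in QUALITIES:
    survivors = sorted(sorted(m) for m in level2_seller_messages(q))
    print(q, survivors)
```

Under these conditions a seller with quality 3 keeps four messages, all of which contain 4, while a seller with quality 4 keeps only the fully revealing message.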

Level 4: For a seller with quality $q = 3$, the only level-4 prudent rationalizable strategy is to send message $\{3, 4\}$, for any other message consistent with $q = 3$ yields a strictly lower expected payoff because there is strictly positive probability that the buyer buys a quantity strictly less than 3. For a seller with quality $q$, any strategy prescribing a message $M$ with $\max(M \setminus \{\max M\}) = q$ and $\min M < q$ is not level-4 prudent rationalizable. To see this, note that message $\{q, \ldots, 4\}$ yields a strictly higher expected payoff than $M$ for any full support belief over the buyer's level-3 prudent rationalizable strategies. (This is stated more generally than necessary, since this case can apply only to a seller with $q = 2$. Thus, the condition states that any strategy prescribing message $\{1, 2, 4\}$ or $\{1, 2, 3\}$ is not level-4 prudent rationalizable for a seller with quality $q = 2$, because message $\{2, 3, 4\}$ yields a strictly higher payoff.) Any other level-2 prudent rationalizable strategy of the seller is also level-4 prudent rationalizable, by the same arguments as in the characterization of level-2 prudent rationalizable strategies. Any level-3 prudent rationalizable strategy of the buyer is also level-4 prudent rationalizable.

Level 5: A buyer who receives a message $M$ with $|M| \le 3$ knows from the seller's level-4 prudent rationalizable strategies that $q = \min M$. To see this, note that if $q = 4$, then any level-4 prudent rationalizable strategy prescribes message $\{4\}$; if $q = 3$, then it prescribes message $\{3, 4\}$; and if $q = 2$, then any level-4 prudent rationalizable strategy prescribes $\{2, 3\}$, $\{2, 4\}$, $\{2, 3, 4\}$, or $\{1, 2, 3, 4\}$. (When $q = 1$, then $q = 1$ must be contained in $M$.) For any other message, any purchase consistent with level-4 prudent rationalizable strategies of the buyer is also consistent with level-5 prudent rationalizable strategies of the buyer. Any level-4 prudent rationalizable strategy of the seller is also level-5 prudent rationalizable.

Level 6: For a seller with quality $q = 2$, the only level-6 prudent rationalizable strategies prescribe sending a non-singleton message $M$ with $\min M = 2$. To see this, note that the only level-4 prudent rationalizable strategy with $\min M = 1$ of a seller with $q = 2$ is $\{1, 2, 3, 4\}$. Given a full support belief over the buyer's level-5 prudent rationalizable strategies, any other level-4 prudent rationalizable strategy of the seller yields a higher expected payoff. Any level-5 prudent rationalizable strategy of the buyer is also level-6 prudent rationalizable.

Level 7: The buyer, upon receiving message $\{1, 2, 3, 4\}$, knows from the level-6 prudent rationalizable strategies of the seller that $q = 1$. In fact, for any message $M$ with $1 \in M$, the buyer now knows that $q = 1$. Consequently, any level-7 prudent rationalizable strategy of the buyer must prescribe buying 1 upon receiving a message $M$ with $1 \in M$. Any level-6 prudent rationalizable strategy of the seller is also level-7 prudent rationalizable.

No further refinements of strategies occur at higher levels of the prudent rationalizability procedure.

A.2 Proof of Proposition 2

Proof. First notice that in our persuasion games the buyer's payoff is maximized at $x = q$. Since the buyer's payoff function is strictly concave in $x$, so is his expected utility function, and hence it is never optimal for the buyer to choose a mixed strategy. This implies that in equilibrium the buyer's quantity choice is unique, whereas the seller may have multiple optimal messages. Since the seller's payoff is increasing in $x$ and equilibrium requires that the seller weakly prefers $\sigma_s(q) \in \mathcal{M}(q)$ to $\{q\}$, we have $\sigma_b(\sigma_s(q)) \ge q$ for all $q \in Q$. The strict inequality cannot hold: if it did, the buyer could make himself strictly better off by reducing $\sigma_b(\sigma_s(q))$ to $q$, which contradicts the equilibrium condition. So it must be that $\sigma_b(\sigma_s(q)) = q$; that is, in equilibrium the best the seller can do is to sell the quantity $x = q$, since if $x > q$ the buyer can switch to $x = q$ and make himself better off. Then for all $M \in \mathcal{M}(q)$ with $q \in M$, we have $q \ge \sigma_b(M)$ due to the payoff maximization of the seller. Therefore, $\sigma_b(M) \le \min\{q \mid q \in M\}$. The strict inequality cannot hold here either: if it did, the buyer could increase his payoff by increasing the quantity purchased, which contradicts the maximization condition of the equilibrium. Thus, in equilibrium the buyer is skeptical and purchases the minimum quantity contained in the received message.
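The seller's side of this equilibrium can be illustrated numerically. The sketch below assumes the skeptical buyer rule $\sigma_b(M) = \min M$ derived in the proof and the seller payoff of 4 per unit sold from the experimental instructions. It checks only that no seller type can gain by choosing another admissible message; it does not verify the buyer's optimality, which depends on the buyer's payoff function. All function names are hypothetical.

```python
from itertools import combinations

QUALITIES = (1, 2, 3, 4)
PRICE = 4  # seller earns PRICE * x, as in the experimental instructions


def messages_containing(q, qualities=QUALITIES):
    """All admissible messages for a seller of quality q (sets containing q)."""
    others = [r for r in qualities if r != q]
    return [
        frozenset((q,) + extra)
        for k in range(len(others) + 1)
        for extra in combinations(others, k)
    ]


def skeptical_buyer(message):
    """Buyer rule from the proof: purchase the lowest quality in the message."""
    return min(message)


def best_seller_payoff(q):
    """Highest payoff a seller of quality q can obtain against the skeptical buyer."""
    return max(PRICE * skeptical_buyer(m) for m in messages_containing(q))


def payoff_equivalent_messages(q):
    """Messages that give the seller the same payoff as disclosing {q}."""
    target = PRICE * q
    return [m for m in messages_containing(q) if PRICE * skeptical_buyer(m) == target]


for q in QUALITIES:
    print(q, best_seller_payoff(q), len(payoff_equivalent_messages(q)))
```

Against the skeptical buyer, the best a seller of quality $q$ can earn is $4q$, and every message whose minimum is $q$ attains it. This matches the remark in the proof that the seller's equilibrium message need not be unique even though the buyer's quantity is.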

B Experimental Instructions

B.1 Instructions for 4-4 Treatment

C

Instructions for the Market Game

Welcome to the experiment! Please now turn off your cell phones and any other electronic devices. These must remain turned off for the duration of the experiment.

The amount of money you will earn in this experiment will depend on your choices. Thus, it is in your best interest to follow these instructions carefully. You will be paid in cash at the end of the experiment.

During the experiment, we ask that you please do not talk to each other. If you have any question, please raise your hand and an experimenter will assist you.

The experiment is made up of 3 phases. The first phase consists of a repeated market game. In the second phase you will complete a simple test. The third phase consists of a questionnaire.

Phase 1

The market game in the first phase is repeated for 30 rounds. In each round you will be randomly selected as a seller or buyer and then paired up with another participant in the other role. Your role assignment is shown to you on the computer screen throughout each round.

The market works as follows: Each market consists of one seller and one buyer. The seller can sell an imaginary object with a fixed price of $4 to the buyer. The object's quality may differ. The quality is randomly chosen from 1, 2, 3, and 4 with equal probability by the computer. 4 represents the highest quality while 1 is the lowest quality.

At the beginning of each round, the seller is notified of the object's quality (q), which is displayed on the computer screen. The seller is able to supply as many objects of that quality as demanded by the buyer.

The buyer does not know the object's quality unless the seller chooses to provide some information about the quality to the buyer. The seller can communicate through the computer any set of qualities to the buyer provided that the true quality is contained in this set. For instance, if the true quality of the object is 2, then the seller can send the buyer one of the 8 messages from the right-hand side column of the following table. The associated messages you will see on the computer screen are displayed in the column on the left. Out of 4 numbers, the shaded number(s) is (are) contained in the message. So if the true quality is 2, any possible message sent by the seller must include the true quality 2.
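As a check on the count quoted above: a message is any set of qualities that contains the true quality, so with four quality levels there are $2^{4-1} = 8$ admissible messages. The listing below enumerates them for true quality 2. The order is arbitrary and need not match the order of the table shown to participants.

```latex
\[
  \bigl|\{\, M \subseteq \{1,2,3,4\} : 2 \in M \,\}\bigr| = 2^{4-1} = 8,
\]
\[
  \{2\},\ \{1,2\},\ \{2,3\},\ \{2,4\},\ \{1,2,3\},\ \{1,2,4\},\ \{2,3,4\},\ \{1,2,3,4\}.
\]
```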

After receiving the information, the buyer selects the quantity of the good (x) to purchase. The quantity to purchase is restricted to 1, 2, 3, and 4 in the experiment. So only one of these 4 integers is acceptable as the buyer's purchasing quantity.

The seller's payoff in each round is the price of the object ($4) multiplied by the number of units (x) sold to the buyer: 4 x.

The buyer's payoff in each round is determined by both the quantity purchased (x) and the true quality (q) of the object: 12 6 x q + 6 x 2 3 q x. The key to interpreting the buyer's payoff function is that for each quality q the buyer's payoff is maximized when the number of units purchased is equal to the true quality, that is x = q.

We realize that this formula may look complicated. You may want to look at the following payoff table instead. The entries in the table show your rounded payoff for each true quality level (in columns) and units purchased (in rows). For instance, if the true quality is q = 4 and you purchase 2 units (that is, x = 2), then your payoff in this round is approximately $11.

[Payoff table: columns give the quality q = 1, 2, 3, 4; rows give the units purchased x; the entries are not recoverable from this transcription.]

After the buyer informs the seller about the quantity purchased via the computer, the computer will show the seller and the buyer the quantity purchased, the true quality and their own payoffs for the round just played. For instance, if the true quality is 4 and the buyer chooses to purchase 3 units, then the seller's payoff is $12 and the buyer's payoff is $14. The experiment proceeds to the next round after both the seller and the buyer acknowledge this information by clicking the button on the computer screen.

In the next round, each participant again is randomly selected to be a buyer or a seller and randomly matched with some participant of the experiment to play the market game. The true quality of the seller in this market game is also randomly selected and may differ from the true quality of the prior round. Phase 1 ends after 30 rounds of the market game have been played.

Phase 2

Phase 2 consists of a simple test. The test is made up of 30 questions. For every question, there is a pattern with a piece missing and a number of pieces below the pattern. You have to choose which of the pieces below is the missing one to complete the pattern. For each question, one and only one of these pieces is the missing one to complete the pattern. You will score 1 point for every correct answer. After completing the test, you will be informed of your own test score. The test score will not affect the payment that you receive from the experiment.

After completing both phases 1 and 2, your cash payment will be displayed on your computer screen. Your cash payment will be your payoff from one round randomly drawn from the 30 rounds of the market game plus a $5 show-up fee.

Phase 3

While waiting to be called upon for payment, please complete the questionnaire that the experimenter will hand you. The questionnaire contains questions about demographics. Please carefully complete this questionnaire as this information is very important to us. After completing the questionnaire, please remain in your seat until you have been called upon for payment.

Thank you very much for your participation.
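The payment rule described in these instructions can be sketched as follows. This is an illustration only, not the software used in the experiment: the function names and the example participant are hypothetical, while the $4 price, the $5 show-up fee, and the random draw of one of the 30 rounds are taken from the instructions.

```python
import random

SHOW_UP_FEE = 5  # dollars, from the instructions
PRICE = 4        # dollars per unit, from the instructions


def seller_payoff(units_sold):
    """Seller's round payoff: the fixed price times the units purchased."""
    return PRICE * units_sold


def cash_payment(round_payoffs, rng):
    """Pay one uniformly drawn round plus the show-up fee.

    Returns the 1-based index of the paid round and the total payment.
    """
    paid_index = rng.randrange(len(round_payoffs))
    return paid_index + 1, round_payoffs[paid_index] + SHOW_UP_FEE


# Hypothetical participant who was the seller in all 30 rounds
# and sold 3 units every time, earning $12 per round.
payoffs = [seller_payoff(3) for _ in range(30)]
paid_round, total = cash_payment(payoffs, random.Random(0))
print(paid_round, total)  # total is 12 + 5 = 17 whichever round is drawn
```

Because only one round is paid, every round carries the same weight in expectation, whichever role the participant holds in it.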

B.2 Instructions for 2-4 Treatment

T

Instructions for the Market Game

Welcome to the experiment! Please now turn off your cell phones. These must remain turned off for the duration of the experiment.

The amount of money you will earn in this experiment will depend on your choices. Thus, it is in your best interest to follow these instructions carefully. You will be paid in cash at the end of the experiment.

During the experiment, we ask that you please do not talk to each other. If you have any question, please raise your hand and an experimenter will assist you.

The experiment is made up of 3 phases. The first phase consists of a repeated market game. In the second phase you will complete a simple test. The third phase consists of a questionnaire.

Phase 1

The market game is repeated for 30 rounds. In each round you will be randomly selected as a seller or buyer and then paired up with another participant in the other role. Your role assignment is shown to you on the computer screen as the experiment proceeds.

The market works as follows: Each market consists of one seller and one buyer. The seller can sell an imaginary object with a fixed price of $4 to the buyer. The object's quality may differ round by round and is randomly chosen from a set of numbers.

At the beginning of each round, the seller is notified of the object's quality (q), which is displayed on the computer screen. The seller is able to supply as many objects of that quality as demanded by the buyer.

The buyer does not know the object's quality unless the seller chooses to provide some information about the quality to the buyer. The seller can communicate through the computer any set of qualities to the buyer provided that (s)he does not exclude the true quality.

In the first 15 rounds, the quality is randomly chosen from 2 and 3 with equal probability by the computer. 3 represents the higher quality while 2 is the lower quality. For instance, if the true quality is 2, then the seller can send one of the following 2 messages shown in the right-hand side column of the following table to the buyer. The images in the column on the left are the associated messages displayed on the computer screen. The shaded number(s) is (are) contained in the message. So if the true quality is 2, any possible message sent by the seller must include the true quality 2.

For the remaining 15 rounds, the quality is randomly chosen from 1, 2, 3, and 4 with equal probability by the computer. 4 represents the highest quality while 1 is the lowest quality. For instance, if the true quality of the object is 2, then the seller can send the buyer one of the 8 messages from the right-hand side column of the following table. The associated messages that you will see on the computer screen are displayed in the column on the left. Out of 4 numbers, the shaded number(s) is (are) contained in the message. So in this case, any possible message sent by the seller must include the true quality 2.

After receiving the information, the buyer selects the quantity of the good (x) to purchase. The quantity to purchase is restricted to 1, 2, 3, and 4 and only one of these 4 integers is acceptable as the buyer's purchasing quantity.

The seller's payoff in each round is the price of the object ($4) multiplied by the number of units (x) sold to the buyer: 4 x.

The buyer's payoff in each round is determined by both the quantity purchased (x) and the true quality (q) of the object: 12 6 x q + 6 x 2 3 q x.

Don't panic! Here is what it means: for each quality q the buyer's payoff is maximized when the number of units purchased is equal to the true quality, that is x = q.

Instead of looking at the formula, it would be easier to look at the following payoff table. The entries in the table show your rounded payoff for each true quality level (in columns) and units purchased (in rows). For instance, if the true quality is q = 4 and you purchase 2 units (that is x = 2), then as the buyer your payoff in this round is approximately $11.

[Payoff table: columns give the quality q = 1, 2, 3, 4; rows give the units purchased x; the entries are not recoverable from this transcription.]

After the buyer informs the seller about the quantity purchased via the computer, the computer will show the seller and the buyer the quantity purchased, the true quality and their own payoffs for the round just played. For instance, if the true quality is 4 and the buyer chooses to purchase 3 units, then the seller's payoff is $12 and the buyer's payoff is $14. The experiment proceeds to the next round after both the seller and the buyer acknowledge this information by clicking the button on the computer screen.

In the next round, each participant again is randomly selected to be a buyer or a seller and randomly matched with some participant of the experiment to play the market game. The true quality of the seller in this market game is also randomly selected and may differ from the true quality of the prior round. Phase 1 ends after 30 rounds of the market game have been played.

Phase 2

Phase 2 consists of a simple test. The test is made up of 30 questions. For every question, there is a pattern with a piece missing and a number of pieces below the pattern. You have to choose which of the pieces below is the missing one to complete the pattern. For each question, one and only one of these pieces is the missing one to complete the pattern. You will score 1 point for every correct answer. After completing the test, you will be informed of your own test score. The test score will not affect the payment that you receive from the experiment.

After completing both phases 1 and 2, your cash payment will be displayed on your computer screen. Your cash payment will be your payoff from one round randomly drawn from the 30 rounds of the market game plus a $5 show-up fee.

Phase 3

While waiting to be called upon for payment, please complete the questionnaire that the experimenter will hand you. The questionnaire contains questions about demographics. Please carefully complete this questionnaire as this information is very important to us. After completing the questionnaire, please remain in your seat until you have been called upon for payment.

Thank you very much for your participation.

C Screenshots

Figure 7: Seller's Message Options

Figure 8: Buyer's Purchase Decision

Figure 9: Seller's Payoff Information at the End of Each Round

Figure 10: Buyer's Payoff Information at the End of Each Round