Post-Classical Game Theory: Opportunities for IS Researchers


1 1 / 57 Post-Classical Game Theory: Opportunities for IS Researchers Steven O. Kimbrough University of Pennsylvania kimbrough [á] wharton.upenn.edu and visiting at Karlsruhe Service Research Institute, KIT KSRI,

2 Outline
1. A Grand Challenge
2. How to Proceed?
3. Example: Oligopoly Markets (Bertrand Competition, Cournot Competition)
4. Discussion
5. A Start on Vehicle 2
6. End Matter

3 Acknowledgements
An earlier version was presented at CSWIM, 30 June 2012. Thanks to Christian, Magie, and Charles for largely convergent comments on a much earlier version. This is also a continuation of my first talk, Karlsruhe-intro-beamer.tex, on 7 May. Will try to minimize overlap.

4 The design, monitoring, and maintenance of social institutions: a grand challenge of our time. (Why is this an IS (IS/ICT/IM) topic?)
Goal of this talk: to introduce and motivate a program of research in IS, one that addresses this grand challenge.
Note: I am not saying that this is the only way forward. It is one of several non-competing alternatives. Here, I will draw extensively on results from post-classical game theory.

5 See my book: Agents, Games, and Evolution: Strategies at Work and Play, by Steven Orla Kimbrough [Kimbrough, 2012].

6 Classical game theory
1. Observe a CSI in the wild.
2. Model the CSI rigorously as a game.
3. Assume ideal rationality ⇒ predict the outcome as an equilibrium.
4. Find the equilibri(um/a) of the game.
CSI: context of strategic interaction (interdependent decision making).
What's not to like? "There are more things in heaven and earth, Horatio, / Than are dreamt of in your philosophy." (Long story. Includes prediction failures, implausible assumptions, and narrowness of scope. Checkers?)

7 Post-classical game theory
1. Observe a CSI in the wild.
2. Model the CSI rigorously as a game. NB: the model may be specified by rules or by procedures.
3. Undertake a strategy acquisition process. (Or observe it.)
   Exogenously: discover a satisfactory consideration set of strategies/policies of play.
   Endogenously: specify a learning/procedural regime whereby the agents (players) acquire strategies/policies of play.
4. Find the behavior of the resulting system. What happens when the players/agents, with their acquired strategies, play in the model?
(Note: address both strategy selection questions and institutional design questions.)

8 On the demand side...
New technologies and globalization ⇒ new possibilities for kinds of markets and other institutions. Some requirements: quickly and easily formed institutions, not supported by advertising. Example: Social Clouds (explored at KSRI).
Increasing complexity in existing institutions, evisceration or capture of governments, increasing size and potential for extraordinary returns ⇒ an urgent need to monitor and maintain institutions, to prevent their capture by special interests and their becoming extractive [Acemoglu and Robinson, 2012]. Examples: California electricity deregulation. Financial services, banking?

9 On the theory side: severe problems
Thinking of the economy as a thermodynamic machine (an energy-driven system) is out of the mainstream.
Also insufficiently factored into current thinking about design, monitoring, and management of institutions:
- Externalities, such as air pollution.
- Common pool resource problems, such as protecting the atmosphere, preserving a fishery, maintaining social capital.
- Environmental sustainability: monitoring and maintaining the well-being of environmental resources.
The account of rationality assumed by theory, with full rationality and full knowledge, is not viable.
Even the assumption of equilibrium in models (in both economics and game theory) is problematic.

10 Just looking at equilibrium
Many challenges, such as the time to reach it and too many equilibria. More fundamentally: "Unless a given game has a self-evident way to play, self-evident to the participants, the notion of a Nash equilibrium has no particular claim upon our attention." [Kreps, 1990, page 31]
And even more fundamentally...

11 It is simply not rational to play the equilibrium strategy
        5      4      3      2      1      0
5   15,15  12,17  10,15   8,13   6,11    4,9
4   17,12  13,13  10,15   8,13   6,11    4,9
3   15,10  15,10  11,11   8,13   6,11    4,9
2    13,8   13,8   13,8    9,9   6,11    4,9
1    11,6   11,6   11,6   11,6    7,7    4,9
0     9,4    9,4    9,4    9,4    9,4    5,5
Table: A cascading Prisoner's Dilemma in strategic form. Five rounds of Axelrod's stage game. Variations of GRIM TRIGGER. [Kimbrough, 2012, page 414].
If $-i$ plays $n$, then $i$ wants to play $n - 1$, for $n > 0$.
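The unraveling argument behind this table can be checked mechanically. The following is a minimal sketch, assuming that strategy n means "cooperate in rounds 1 to n, defect afterwards, and defect forever once the counterpart has defected", and assuming Axelrod's usual stage payoffs (5, 3, 1, 0). The function names `play` and `best_response` are illustrative and do not come from the book.

```python
"""Cascading Prisoner's Dilemma: rebuild the 6x6 table and find best responses."""

ROUNDS = 5
# Axelrod's stage-game payoffs (own payoff for each (own action, other action)).
STAGE = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def play(n_row, n_col):
    """Total payoffs (row, column) when strategy n_row meets strategy n_col.

    Assumed reading of strategy n: cooperate in rounds 1..n, defect afterwards,
    and defect forever once the counterpart has defected (GRIM TRIGGER).
    """
    total_row = total_col = 0
    row_triggered = col_triggered = False
    for t in range(1, ROUNDS + 1):
        a_row = "C" if t <= n_row and not row_triggered else "D"
        a_col = "C" if t <= n_col and not col_triggered else "D"
        total_row += STAGE[(a_row, a_col)]
        total_col += STAGE[(a_col, a_row)]
        # A defection observed this round triggers the counterpart from now on.
        row_triggered = row_triggered or a_col == "D"
        col_triggered = col_triggered or a_row == "D"
    return total_row, total_col


def best_response(n_other):
    """Strategy index that maximizes own total payoff against n_other."""
    return max(range(ROUNDS + 1), key=lambda n: play(n, n_other)[0])


if __name__ == "__main__":
    for n in range(ROUNDS, -1, -1):
        print(n, [play(n, m) for m in range(ROUNDS, -1, -1)])
    print({n: best_response(n) for n in range(ROUNDS + 1)})
```

Following the best responses from 5 downward leads step by step to strategy 0, whose mutual payoff is far below that of mutual strategy 5.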

12 How to Proceed? And what is the role for IS?
Broadly, the philosophical position is Pragmatism. Think: if you can't make one, then you don't know how it works. So, build realistic models (not merely stylized models) and see how they track reality.
Principles of simplicity and minimalism: simple models, minimal rationality, then elaborate as appropriate.
Claim: real progress will require procedural, computationally feasible models, tested by real data (field, experimental).
ICT? Most institutions of import will be non-trivially mediated by ICT systems. Required: design, monitoring, adjustment, transparency.
Needed: substantial social immune systems. Think: beyond autonomic computing.

13 Can you be a little more specific? A good model is Braitenberg's Vehicles.

14 Begin simply, study in depth, elaborate. Cycle.


17 After multiple cycles
Complex systems, emergent properties. We know how it works because we built it. Substantial biological verisimilitude demonstrated in the book. (Not merely stylized models.) Much detailed knowledge gained about the model systems. Doors opened for a program of research.
OK, what about an example involving an institution?

18 Bertrand Competition
Example: competition on price. Each period all firms offer a price, and the market fills all of its demand from the low-price firm.
Economic theory: collusion is impossible. Even with just two firms in the market, they will compete away their profits.
"If firm 1 really believes that firm 2 will charge a price $\hat{p}$ that is greater than the marginal cost, it will always pay firm 1 to cut its price to $\hat{p} - \varepsilon$. But firm 2 can reason the same way! Thus any price higher than marginal cost cannot be an equilibrium; the only equilibrium is the competitive equilibrium." [Varian, 2003, page 488]
Note the business literature on this: Don't do it!
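The undercutting argument in the quotation can be traced with a small discretized sketch. It assumes two firms, prices in integer cents, and a fixed cut of `step` cents that is only worth making while the new price stays above marginal cost. The function name and the numbers are hypothetical, and this illustrates the textbook reasoning only, not the learning model discussed in the talk.

```python
def undercut_until_stable(start_price, marginal_cost, step=1):
    """Alternate undercutting between two firms; prices are integer cents.

    Each firm in turn cuts to the rival's price minus `step`, as long as the
    resulting price is still strictly above marginal cost.
    """
    prices = [start_price, start_price]
    mover = 0
    while True:
        rival_price = prices[1 - mover]
        candidate = rival_price - step
        if candidate <= marginal_cost:
            break  # no profitable undercut remains
        prices[mover] = candidate
        mover = 1 - mover  # the other firm responds next
    return prices


if __name__ == "__main__":
    # Starting at 5.00 with a marginal cost of 2.00, the low price is driven
    # down to one step above marginal cost.
    print(undercut_until_stable(500, 200))
```

With a smaller `step` the resting price moves closer to marginal cost, which is the discrete counterpart of the claim that only the competitive price survives.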

19 Bertrand Competition: PROBE AND ADJUST [Kimbrough, 2012]
A kind of reinforcement learning for a continuous quantity.
Episode: a round of play. Epoch: a number of episodes.
Probe uniformly within ±δ of the anchor value in each episode.
Adjust the anchor value by ±ε at the end of each epoch.
[Figure: a number line showing the anchor value, the probe range from −δ to +δ around it, and the adjustment steps −ε and +ε.]
Record rewards per episode (up, down) and adjust according to the update policy.
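The procedure on this slide can be sketched for a single agent as follows. This is a minimal illustration under stated assumptions: the update policy compares the mean reward of upward probes with the mean reward of downward probes (one reading of an own-returns policy), and the reward function `monopoly_profit` with its parameters is a hypothetical test case. For the published pseudo-code see [Kimbrough, 2012, Kimbrough and Murphy, 2009].

```python
import random


def probe_and_adjust(reward, anchor, delta, epsilon, episodes_per_epoch, epochs, rng):
    """Probe and Adjust for one agent and one continuous quantity.

    Assumed update policy: move the anchor toward the side (up or down)
    whose probes earned the higher mean reward during the epoch.
    """
    for _ in range(epochs):
        up_rewards, down_rewards = [], []
        for _ in range(episodes_per_epoch):
            probe = rng.uniform(anchor - delta, anchor + delta)
            # File the reward under "up" or "down" relative to the anchor.
            (up_rewards if probe >= anchor else down_rewards).append(reward(probe))
        if up_rewards and down_rewards:
            mean_up = sum(up_rewards) / len(up_rewards)
            mean_down = sum(down_rewards) / len(down_rewards)
            if mean_up > mean_down:
                anchor += epsilon
            elif mean_down > mean_up:
                anchor -= epsilon
    return anchor


def monopoly_profit(q, a=100.0, slope=1.0, k=10.0):
    """Hypothetical test reward: profit of a single seller facing linear demand."""
    price = max(a - slope * q, 0.0)
    return (price - k) * q


if __name__ == "__main__":
    final = probe_and_adjust(monopoly_profit, anchor=20.0, delta=2.0, epsilon=0.5,
                             episodes_per_epoch=20, epochs=400, rng=random.Random(0))
    # The profit-maximizing quantity for these parameters is 45.
    print(final)
```

The agent needs no model of demand: it only compares realized rewards, which is what makes the procedure plausible as a minimal-rationality learning rule.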

20 Bertrand Competition: Bertrand (price) competition with PROBE AND ADJUST
Both firms use the update policy of Own Returns. Replicates standard theory.

21 Bertrand Competition: Bertrand (price) competition with PROBE AND ADJUST
Both firms use the update policy of MR-COR. Contradicts standard theory.

22 Bertrand Competition: Comments
These findings are robust to starting positions, costs to firms, etc. What does matter is the number of firms in the market. The market reverts to a competitive market after a tipping point, which is affected by the number of firms, the patience of the firms, and their epoch lengths.
See [Kimbrough, 2012, Kimbrough and Murphy, 2009] for detailed discussions, including pseudo-code. See AGEbook/nlogo/OligopolyBidPrice.html for the NetLogo program.
This simple model explains tacit collusion and its loss with increasing numbers of firms, etc. It also counsels patience. Think: executive compensation.

23 Cournot Competition: Cournot (quantity) competition
The other main theoretical model of oligopoly. Quantity competition: each period firms offer quantities of a good and the market sets the price. Each firm receives a reward that is the product of the quantity it put to the market and the realized market price.

24 Cournot Competition: Cournot reference model
Roughly: a market for a particular product supplied by n firms. During each time step, each of the supplying firms offers quantity $Q_i$ ($i = 1, 2, \ldots, n$) to the market, so that the total supply in a given period is
$$Q = \sum_{i=1}^{n} Q_i \tag{1}$$
The resulting unit price is determined by the demand function
$$P = \max\{a - \text{slope} \cdot Q,\ 0\} \tag{2}$$
Each firm $i$ receives revenues of $P \cdot Q_i$.

25 Cournot Competition: Cournot reference model (con't.)
Firms may independently, and without communication with each other, adjust the quantities they offer to the market, their $Q_i$s. In setting its $Q_i$, each firm takes into account its unit cost of production, $k_i$, and the behavior of the other firms. Each firm follows the best response strategy. If all of the firms do this, they will reach the Cournot equilibrium, in which the individual firm Cournot quantities are
$$Q_i^C(n, k_i) = \frac{a - k_i}{(n + 1) \cdot \text{slope}} \tag{3}$$
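The claim that best-response play reaches the Cournot quantities can be illustrated numerically. The sketch below assumes sequential updating (one firm at a time), ignores the price floor at zero because the example stays in the interior, and compares the result with expression (3) for firms with equal unit costs. Function names and parameter values are hypothetical.

```python
def best_response_dynamics(a, slope, costs, sweeps=200):
    """Sequential best-response updating in the linear Cournot model."""
    quantities = [0.0] * len(costs)
    for _ in range(sweeps):
        for i, k_i in enumerate(costs):
            others = sum(quantities) - quantities[i]
            # Maximize (a - slope * (q_i + others) - k_i) * q_i over q_i >= 0,
            # treating the other firms' quantities as fixed.
            quantities[i] = max((a - slope * others - k_i) / (2.0 * slope), 0.0)
    return quantities


def cournot_quantity_equal_costs(a, slope, k, n):
    """Expression (3): per-firm Cournot quantity when every firm has unit cost k."""
    return (a - k) / ((n + 1) * slope)


if __name__ == "__main__":
    print(best_response_dynamics(100.0, 1.0, [10.0, 10.0]))
    print(cournot_quantity_equal_costs(100.0, 1.0, 10.0, 2))
```

Each update treats the rivals' quantities as fixed, which is the best response assumption that the later slides question.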

26 Cournot Competition: Technical aside: the math behind Cournot [Kimbrough, 2012]
For clarity we present the underlying model and the resulting key quantities that we refer to throughout, as well as the terminology we use. To begin, we assume a linear inverse demand function:
$$P = a - \text{slope} \cdot Q \tag{4}$$
$P$ is the price realized in the market. $Q$ is the total quantity of the good supplied to the market. $a$ is the price intercept, and we assume slope > 0. We also assume that negative prices are not permitted, so (5) is what is actually assumed:
$$P = \max\{a - \text{slope} \cdot Q,\ 0\} \tag{5}$$
We begin with the duopoly case and then generalize the results. Let the agents have unit costs, $k_i$, which can differ. In the duopoly case the profit of firm 1 is then

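27 Cournot Competition
$$\pi_1 = P \cdot Q_1 - k_1 Q_1 = (a - \text{slope} \cdot (Q_1 + Q_2)) \cdot Q_1 - k_1 Q_1 \tag{6}$$
For firm 2 we have
$$\pi_2 = P \cdot Q_2 - k_2 Q_2 = (a - \text{slope} \cdot (Q_1 + Q_2)) \cdot Q_2 - k_2 Q_2 \tag{7}$$
Differentiating, we get
$$\frac{d\pi_1}{dQ_1} = a - 2 \cdot \text{slope} \cdot Q_1 - \text{slope} \cdot Q_1 \frac{dQ_2}{dQ_1} - \text{slope} \cdot Q_2 - k_1 \tag{8}$$
$$\frac{d\pi_2}{dQ_2} = a - 2 \cdot \text{slope} \cdot Q_2 - \text{slope} \cdot Q_2 \frac{dQ_1}{dQ_2} - \text{slope} \cdot Q_1 - k_2 \tag{9}$$
Setting $\frac{dQ_2}{dQ_1}$ and $\frac{dQ_1}{dQ_2}$ to 0 as usual (here is where we get reaction inconsistency!) leads to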

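28 Cournot Competition
$$0 = a - 2 \cdot \text{slope} \cdot Q_1 - \text{slope} \cdot Q_2 - k_1 \tag{10}$$
$$0 = a - 2 \cdot \text{slope} \cdot Q_2 - \text{slope} \cdot Q_1 - k_2 \tag{11}$$
and then on to
$$Q_1 = \frac{a - \text{slope} \cdot Q_2 - k_1}{2 \cdot \text{slope}} \tag{12}$$
$$Q_2 = \frac{a - \text{slope} \cdot Q_1 - k_2}{2 \cdot \text{slope}} \tag{13}$$
which when solved yield
$$Q_1^C(2, [k_1, k_2]) = \frac{a - 2k_1 + k_2}{3 \cdot \text{slope}} \tag{14}$$
$$Q_2^C(2, [k_1, k_2]) = \frac{a + k_1 - 2k_2}{3 \cdot \text{slope}} \tag{15}$$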

29 Cournot Competition
Notice that
$$Q^C(2, [k_1, k_2]) = Q_1^C(2, [k_1, k_2]) + Q_2^C(2, [k_1, k_2]) = \frac{2a - k_1 - k_2}{3 \cdot \text{slope}} \tag{16}$$
The formula generalizes. With n players having proportional costs $k_i$, $i \in \{1, 2, 3, \ldots, n\}$ (total cost = unit cost × quantity = $k_i Q_i$), we have expression (17):
$$Q^C(n, \mathbf{k}) = Q^C = \sum_{i=1}^{n} Q_i^C = \frac{n a - \sum_{i=1}^{n} k_i}{(n + 1) \cdot \text{slope}} \tag{17}$$
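The generalization to expression (17) can be checked by summing the firms' first-order conditions. The following short derivation is a sketch that assumes an interior solution (every $Q_i > 0$ and $P > 0$) and the same zero conjectural variation used to obtain (10) and (11); $Q$ denotes total quantity as in (1).

```latex
\begin{align*}
0 &= a - \text{slope}\cdot Q - \text{slope}\cdot Q_i - k_i,
   \qquad i = 1,\dots,n, \quad Q = \sum_{j=1}^{n} Q_j \\
0 &= n a - n\cdot\text{slope}\cdot Q - \text{slope}\cdot Q - \sum_{i=1}^{n} k_i
   \qquad \text{(summing over } i\text{)} \\
Q^C &= \frac{n a - \sum_{i=1}^{n} k_i}{(n+1)\cdot\text{slope}}
\end{align*}
```

For n = 2 the first line reduces to (10) and (11), and the last line reduces to (16).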

30 Cournot Competition: Is there another way of modeling this?
The Cournot conclusion follows mathematically provided you make the best response assumption (and hence are committed to reaction inconsistency). But why should you? It is behaviorally implausible. What if the agents follow PROBE AND ADJUST in learning to set their quantities?
See [Kimbrough and Murphy, 2009], "Learning to Collude Tacitly on Production Levels by Oligopolistic Agents," or Chapter 10 of Agents, Games, and Evolution.

31 Cournot Competition: Quantity setting with PROBE AND ADJUST
Agents collectively reach the Cournot quantity. That is, they individually and collectively put to the market the total quantity that is predicted by the Cournot model. Without the implausible Cournot assumptions! And under a plausible behavioral procedure. Also observed: number effects consistent with behavioral experiments.

32 Cournot Competition: Cournot (quantity) competition with PROBE AND ADJUST
Both firms update with Own Returns and settle near the Cournot equilibrium. Replicates the standard theory. Robust generally, and specifically to the number of firms.
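A small simulation can illustrate this kind of result. The sketch below assumes two firms with equal unit costs, linear demand as in (5), and an own-returns update that compares each firm's mean profit on upward and downward probes. All parameter values and the function name are hypothetical, and this is not the NetLogo model behind the slide.

```python
import random


def simulate_cournot_pa(a=100.0, slope=1.0, costs=(10.0, 10.0), start=(10.0, 50.0),
                        delta=1.0, epsilon=0.25, episodes_per_epoch=400,
                        epochs=400, seed=0):
    """Firms set quantities by Probe and Adjust, each judging by its own profit."""
    rng = random.Random(seed)
    anchors = list(start)
    n = len(anchors)
    for _ in range(epochs):
        up = [[] for _ in range(n)]
        down = [[] for _ in range(n)]
        for _ in range(episodes_per_epoch):
            # Every firm probes around its own anchor in the same episode.
            bids = [rng.uniform(x - delta, x + delta) for x in anchors]
            price = max(a - slope * sum(bids), 0.0)
            for i in range(n):
                profit = (price - costs[i]) * bids[i]
                (up[i] if bids[i] >= anchors[i] else down[i]).append(profit)
        for i in range(n):
            if up[i] and down[i]:
                gap = sum(up[i]) / len(up[i]) - sum(down[i]) / len(down[i])
                if gap > 0:
                    anchors[i] += epsilon
                elif gap < 0:
                    anchors[i] -= epsilon
    return anchors


if __name__ == "__main__":
    # With these parameters the Cournot quantity per firm is (100 - 10) / 3 = 30.
    print(simulate_cournot_pa())
```

Neither firm computes a best response or knows the demand curve; each only reacts to its own realized profits.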

33 Cournot Competition: Cournot competition with PROBE AND ADJUST
Both firms update with Market Returns and settle near the monopoly equilibrium. Contradicts the standard theory. Robust generally, and specifically to the number of firms. The Market Returns update policy is highly exploitable.

34 Cournot Competition: Cournot competition with PROBE AND ADJUST
Both firms update with MR-COR and settle near the monopoly equilibrium. Contradicts the standard theory. Robust generally, and specifically to the number of firms.

35 Cournot Competition: Cournot competition with PROBE AND ADJUST
Firm 0 updates with MR-COR, firm 1 with Own Returns. They settle near the Cournot equilibrium. Beyond the scope of the standard theory.

36 Cournot Competition: Comments
Firm 1 does slightly worse than firm 0. Firm 1 would make both firms better off by switching to MR-COR. ⇒ MR-COR is not exploitable. Robust generally, and specifically to the number of firms [Kimbrough, 2012, Kimbrough and Murphy, 2009].
Similar results for supply curve bidding with step functions (electricity markets) [Kimbrough, Murphy, working paper].
Worse, theoretical results: with reaction consistency, Cournot = Bertrand [Kimbrough, Murphy, Smeers, working paper].

37 Cournot Competition: Conclusions on oligopoly results
Standard oligopoly theory is seriously deficient. The results on display here
1. Give a unified contradiction of oligopoly theory,
2. Provide a credible explanation for why tacit collusion and market power are possible, and
3. Explain the differences between price and quantity competition without having to assume reaction inconsistency in one model and not the other.
Even very simple markets need new approaches to investigation.

38 Vehicle 1
Think of the material here on oligopoly markets and PROBE AND ADJUST as analogous to Braitenberg's Vehicle 1. How should we build Vehicle 2?

39 What do we want to know about our markets? And, more generally, our social institutions?
- Can market/institutional power be realized? If so, how? (See above on oligopoly.)
- What are the social welfare characteristics of the institution? Fairness? Productivity? ... Pareto efficiency is hardly sufficient.
- Stability? Robustness? Resilience? Reconfigurability? Autonomic potential?
- Credible alternatives and their properties?
- Externalities? Positive? Negative? Can they be endogenized? If so, how?
- What are the epistemic burdens placed on participants? How can they be reduced, and with what consequences?

40 What do we want to know about our institutions? (continued)
- Privacy? Liquidity?
- What needs to be monitored and why? (Both inside and outside of the institution.)
- How can we monitor what needs to be monitored?
- How will real people behave with a given institution, and what will its resulting behavior be?
- How can we find effective strategies for acting within a given institution?
- ...

41 Points arising
1. We can easily make the list of interesting questions much longer.
2. Just articulating the list in depth is an important research challenge.
3. While analytic, closed-form results are always welcome, we should use whatever methods yield tangible results for designing, monitoring, and maintaining social institutions: real institutions, not merely stylized models of them. This will surely involve procedural modeling and simulation (and much else).
Game theory, economics, and institutional design must be seen as branches of empirical science, not as branches of applied mathematics.

42 And IS?
The role of ICT looms large in this grand project. The field of IS is potentially, but not inevitably, a major player. Systems analysis with a higher calling: look beyond the system to the entire institution.

43 IDS Games
Another example, also from Agents, Games, and Evolution [Kimbrough, 2012, chapter 14], to illustrate the possibilities for this style of research into institutions. Also, "A Framework for Computational Strategic Analysis with an Application to Iterated Interdependent Security Games" by Yevgeniy Vorobeychik, Steven Kimbrough, and Howard Kunreuther (under review).
IDS games: InterDependent Security games. Conceived by Howard Kunreuther to model the Lockerbie bombing, among other situations. Characterized by stochastic payoffs. In theory these should not matter. In the lab they have been shown to have strong effects.

44 One-shot IDS payoff schema
                   Invest (I)               Not Invest (¬I)
Invest (I)         Y - c,  Y - c            Y - c - qL,  Y - pL
Not Invest (¬I)    Y - pL,  Y - c - qL      Y - [pL + (1 - p)qL],  Y - [pL + (1 - p)qL]
Table: Basic IDS game: expected outcomes associated with investing or not investing in security. In each cell the row player's outcome is listed first.
Y is the wealth position of each player before the game is played. c is the cost of investing in the security or protection device. p is the probability that a security event happens. It may or may not do damage, depending on what the players have chosen. If damage is done, then the loss is L.
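The payoff schema can be turned into a small calculator for exploring how the equilibria depend on the cost c. This is a sketch under stated assumptions: q is read here as the probability of an indirect loss caused by a non-investing counterpart (the slide does not define q), "N" stands for Not Invest, and the parameter values in the usage example are hypothetical.

```python
from itertools import product


def ids_payoffs(Y, c, p, q, L):
    """Expected payoffs (row player, column player) for the one-shot IDS game."""
    both_unprotected = Y - (p * L + (1 - p) * q * L)
    return {
        ("I", "I"): (Y - c, Y - c),
        ("I", "N"): (Y - c - q * L, Y - p * L),
        ("N", "I"): (Y - p * L, Y - c - q * L),
        ("N", "N"): (both_unprotected, both_unprotected),
    }


def pure_nash_equilibria(payoffs, actions=("I", "N")):
    """Action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for row, col in product(actions, repeat=2):
        row_ok = all(payoffs[(row, col)][0] >= payoffs[(alt, col)][0] for alt in actions)
        col_ok = all(payoffs[(row, col)][1] >= payoffs[(row, alt)][1] for alt in actions)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


if __name__ == "__main__":
    # Hypothetical numbers: a cheap device makes investing dominant, while a
    # dearer one creates a coordination problem with two equilibria.
    print(pure_nash_equilibria(ids_payoffs(Y=100, c=15, p=0.2, q=0.1, L=100)))
    print(pure_nash_equilibria(ids_payoffs(Y=100, c=19, p=0.2, q=0.1, L=100)))
```

In the second case both "everyone invests" and "no one invests" are self-enforcing, which is one reason the strategy selection problem matters for these games.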

45 The Strategy Selection Problem
Central to post-classical game theory; absent in classical game theory. What are good strategies for playing IDS games? And how do we find them?
Approach: once we have a candidate set, line 'em up and let 'em play. This started with Axelrod's IPD tournaments. But we need to go further.

46 [Kimbrough, 2012, chapter 3]
To be warranted (to be choosable with Pragmatic Strategic Rationality) a strategy should preferably:
1. Do well in competition with alternative strategies.
2. Do well when played against itself.
3. Do well when played against other strategies that do well against themselves.
4. Do well when played with other high-quality strategies.
5. Be robust to starting conditions.
6. Be robust to conditions of play.
7. Not be exploitable: not do very much worse than any strategy it encounters.
8. Avoid self-destructive behavior (a problem for GRIMTRIGGER).
9. Be viable in small numbers.
10. Be an effective signaler.

47 Back to IDS games
Observe subjects' behavior in IDS laboratory experiments and adduce strategies to explain their behavior (e.g., INVESTAFTERLOSS, NEVERINVEST).
Add to this candidate list strategies that intuitively seem sensible or are interesting for other reasons (e.g., FICTITIOUSPLAY, INVESTWITHPROBABILITY 0.3).
Implement the strategies in a program environment.
Line 'em up and let 'em play.
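The "line 'em up and let 'em play" step can be sketched as a round-robin tournament. The version below is a simplified illustration, not the NetLogo environment used in the research: it replaces the stochastic IDS payoffs with hypothetical expected stage payoffs, uses only three policies, and lets each policy see the counterpart's past moves (a full-feedback assumption). All names and numbers are illustrative.

```python
from itertools import combinations_with_replacement

# Hypothetical expected stage payoffs, keyed by (own move, other's move);
# "I" means invest and "N" means do not invest.
STAGE = {("I", "I"): 81.0, ("I", "N"): 71.0, ("N", "I"): 80.0, ("N", "N"): 72.0}


def always_invest(their_history):
    return "I"


def never_invest(their_history):
    return "N"


def tit_for_tat(their_history):
    # Invest first, then copy the counterpart's previous move.
    return their_history[-1] if their_history else "I"


def play_match(policy_a, policy_b, rounds=10):
    """Return the total payoffs of the two policies over one repeated game."""
    hist_a, hist_b = [], []
    total_a = total_b = 0.0
    for _ in range(rounds):
        move_a, move_b = policy_a(hist_b), policy_b(hist_a)
        total_a += STAGE[(move_a, move_b)]
        total_b += STAGE[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


def round_robin(policies, rounds=10):
    """Every policy meets every policy, itself included; return total scores."""
    scores = {name: 0.0 for name in policies}
    for a, b in combinations_with_replacement(policies, 2):
        total_a, total_b = play_match(policies[a], policies[b], rounds)
        scores[a] += total_a
        if a != b:
            scores[b] += total_b
    return scores


if __name__ == "__main__":
    print(round_robin({"AlwaysInvest": always_invest,
                       "NeverInvest": never_invest,
                       "TitForTat": tit_for_tat}))
```

Including self-play in the schedule reflects the criterion that a warranted strategy should do well when played against itself.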

48 Figure: IDS-2x2-Tournaments.nlogo, AlwaysInvest plays itself with

49 [Figure 12, from the VKK paper: Relative tournament performance (as measured by $V_T$) of the two strategies that invest and don't invest after loss, respectively. Lighter boxes correspond to lower tournament regret and, therefore, better performance in tournament relative to the best strategy.]
[Figure 13, from the VKK paper. Left panel, "Replicator Dynamics: Proportions of Policies in the Population Over Time" (policies shown: NeverInvest, TFT, InvestAfterLoss, DontInvestAfterLoss, 1TitFor2Tats, 2TitsFor1Tat, FictitiousPlay): proportion of a subset of policies through a sequence of replicator dynamics iterations that begin with each policy having an equal share of the population. Right panel, "Replicator Dynamics: Level of Investment over Time": expected level of investment over a sequence of replicator dynamics iterations that begin with each policy having an equal share of the population. Full feedback setting.]
Figure: IDS replicator dynamics with full feedback. [VKK paper] 2TitsFor1Tat wins, with TitForTat doing well. Investment takes over.
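Replicator dynamics of the kind plotted in these figures can be sketched in a few lines. The example assumes the standard discrete-time replicator update, in which a policy's population share grows in proportion to its fitness relative to the population mean. The 3x3 payoff matrix is hypothetical and purely illustrative; it is not data from the VKK paper.

```python
def replicator_step(shares, payoff):
    """One discrete-time replicator update: shares grow with relative fitness."""
    n = len(shares)
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(s * f for s, f in zip(shares, fitness))
    return [s * f / mean_fitness for s, f in zip(shares, fitness)]


def run_replicator(payoff, generations):
    """Start from equal shares and iterate; return the whole trajectory."""
    n = len(payoff)
    shares = [1.0 / n] * n
    trajectory = [shares]
    for _ in range(generations):
        shares = replicator_step(shares, payoff)
        trajectory.append(shares)
    return trajectory


# Hypothetical payoff totals (row policy against column policy), in the order
# AlwaysInvest, NeverInvest, TitForTat. Illustrative numbers only.
PAYOFF = [
    [810.0, 710.0, 810.0],
    [800.0, 720.0, 728.0],
    [810.0, 719.0, 810.0],
]

if __name__ == "__main__":
    for t, shares in enumerate(run_replicator(PAYOFF, 200)):
        if t % 50 == 0:
            print(t, [round(s, 4) for s in shares])
```

In this toy population the non-investing policy dies out and the reciprocating policy ends slightly ahead of unconditional investment. This is qualitatively in the spirit of the full-feedback result, but it should not be read as a reproduction of it.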

50 [Figure 14, from the VKK paper. Left panel, "Replicator Dynamics: Proportions of Policies in the Population Over Time" (policies shown: NeverInvest, InvestAfterLoss, InvestNAfterLoss, DontInvestAfterLoss, PFTFTPlusLossInvest, PFTFTPlusLossNotInvest, PFTFTPlusSticky): proportion of a subset of policies through a sequence of replicator dynamics iterations that begin with each policy having an equal share of the population. Right panel, "Replicator Dynamics: Level of Investment over Time": expected level of investment over a sequence of replicator dynamics iterations that begin with each policy having an equal share of the population. Partial feedback setting.]
Figure: IDS replicator dynamics with partial feedback. [VKK paper] NeverInvest wins, with PFTitForTatPlusSticky doing well. No investment takes over.

51 PFTITFORTATNSTICKY from [Kimbrough, 2012]
Assuming partial feedback (PF): begin by cooperating (playing Invest) and continue to do so until receiving an indirect loss. After receiving an indirect loss, play NotInvest until the counterpart has played Invest for N = 3 rounds in a row.
    if (policy-of-play = "PFTitForTatNSticky") [
      let NN 3
      let DaLosses 0
      if (length MyIndirectLosses = 0) [set DaLosses 0]
      if (length MyIndirectLosses > 0 and length MyIndirectLosses < NN) [set DaLosses sum MyIndirectLosses]
      if (length MyIndirectLosses >= NN) [foreach n-values NN [?] [ set DaLosses (DaLosses + item ? MyIndirectLos
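The decision rule described above might be sketched in Python as follows. Because the NetLogo listing is cut off before the decision itself, this is only one plausible reading, under loudly stated assumptions: under partial feedback the agent sees only its own indirect losses, so "the counterpart has played Invest for N rounds in a row" is approximated by "no indirect loss in the last N rounds"; losses are recorded as non-negative numbers with the most recent round last. The function name is hypothetical.

```python
def pf_tit_for_tat_n_sticky(my_indirect_losses, n=3):
    """Return "Invest" or "NotInvest" from the agent's own indirect-loss record.

    my_indirect_losses: one non-negative entry per past round, most recent
    last; a positive entry means an indirect loss was suffered in that round.
    Assumed rule: invest only if no indirect loss occurred in the last n rounds.
    """
    recent = my_indirect_losses[-n:]  # with fewer than n rounds, use them all
    recent_losses = sum(recent)
    return "Invest" if recent_losses == 0 else "NotInvest"


if __name__ == "__main__":
    history = []
    for loss in [0, 0, 1, 0, 0, 0, 0]:
        print(pf_tit_for_tat_n_sticky(history), end=" ")
        history.append(loss)
    print()
```

The "sticky" part is the window of n rounds: a single indirect loss keeps the agent out of investment until n clean rounds have passed.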

52 [Figure 15, from the VKK paper: Proportion of cooperation vs. cost in full and partial information games.]
Figure: IDS sensitivities summary on cost. [VKK paper]

53 Consequences for policy making?
Large regions of stability. Financial incentives? Can there ever be enough money?
With, say, full feedback, in which investment wins out, can we use computation and visualization to convince people (even selected people or elites) that it is better to invest?
Role of norms? Three kinds of norms [Bicchieri, 2005]:
1. Descriptive
2. Convention
3. Social (informal, but having normative force)
Formal rules or laws (e.g., a requirement to buy health insurance)?

54 In conclusion...
The study of Vehicle 2 is hardly complete. Yet, I submit, we can see a template or formula for going forward productively:
More realistic modeling + collection of real data + computational exploration ⇒ useful results for policy making and for the design, monitoring, and maintenance of social institutions.

55 References
Acemoglu, D. and Robinson, J. A. (2012). Why Nations Fail: The Origins of Power, Prosperity, and Poverty. Crown Publishers, New York, NY.
Bicchieri, C. (2005). The Grammar of Society: The Nature and Dynamics of Social Norms. Cambridge University Press, Cambridge, UK.
Kimbrough, S. O. (2012). Agents, Games, and Evolution: Strategies at Work and Play. CRC Press, Boca Raton, FL.
Kimbrough, S. O. and Murphy, F. H. (2009). Learning to collude tacitly on production levels by oligopolistic agents. Computational Economics, 33(1): and sokpapers/2009/oligopoly-panda-r2.pdf.
Kreps, D. M. (1990). Game Theory and Economic Modeling. Clarendon Press, Oxford, England.
Varian, H. R. (2003). Intermediate Microeconomics: A Modern Approach. W. W. Norton & Company, New York, NY, sixth edition.
