On Modeling Imperfect Rationality in Agent-Based Models


Steven O. Kimbrough
Operations and Information Management, University of Pennsylvania
kimbrough [at] wharton.upenn.edu
July 23, 2011, Nancy, FR, XIV Congress on Logic, Methodology and Philosophy of Science

Outline
1. Introduction
2. Main Topic
3. PROBE AND ADJUST
   - Cournot Competition
   - Bertrand Competition
4. Discussion
5. End Matter

Advertisement
Much of the material from this presentation draws upon my forthcoming book Agents, Games, and Evolution: Strategies at Work and Play, published by Taylor & Francis. The book develops procedural game theory and aims to be a principled alternative to standard accounts of rationality and strategic interaction found in classical game theory and neoclassical economics.

The Defense of Rationality point, often made in discussion, is roughly:
Yes, of course rationality (ideal rationality as defined in game theory and economics) is utterly unrealistic. Humans, let alone other animals, simply do not have the epistemic and computational powers to operate in accordance with the theory. The thing is, ideal rationality allows us to make unique predictions. There is only one way to be (ideally) rational in any given situation. Ideal rationality is (1) clearly defined, (2) mathematically grounded, and (3) principled, and (4) it leads to unique predictions. There are infinitely many ways to behave nonrationally, and so any alternative to ideal rationality has to be ad hoc and so unprincipled. And with many alternatives, these alternatives cannot give us a unique prediction.

Concluding with
This point is often followed up with the claim that individual deviations from rationality are unimportant in the general scheme of things because (a) they appear as noise and will be averaged out in real situations, and (b) failures of rationality will quickly be punished by arbitragers, yielding systemic rationality after all.
With or without the above point, the conclusion is drawn (or insinuated) that modeling with the assumption of (ideal) rationality should continue and be the preferred mode for Serious People doing modeling in the social sciences, and that there is no genuine credible alternative to Rational Choice Theory.

Comments
This is a terrible argument. I am not primarily concerned to rebut it. (Although I will make some passing comments in that direction.) I am mainly concerned with addressing the important question touched upon by the Defense of Rationality point: If not Rational Choice Theory in our models, then what?

First and quickly, some rebuttals
- Really? Can you show us some data that markets obey in the aggregate the strictures of Rational Choice Theory? We'd be especially interested in your empirical demonstration that bubbles never occur, that market economies are inherently stable and will not go into boom and bust cycles absent regulation. While you are at it, please refute the now thousands of empirical studies seeming to undermine your position.
- Why do you think that only Rational Choice Theory can be clearly defined, mathematically grounded, and principled?
- In strategic interactions, there may be 2 or more, even an infinite number of equilibria. You predict an equilibrium outcome but theory cannot say which one. Why do you say RCT yields a unique outcome?
- While you are at it, explain, when there are 2 or more equilibria, just how the agents are to coordinate their strategies so as to achieve an equilibrium outcome.

Consider Stag Hunt
Just in pure strategies, in just the one-shot game, we have two equilibria.
[Payoff matrix for Row and Column over the actions C and D; the two equilibria are marked [PN] and [N]. Payoff entries surviving in the transcription: 3 2 2 1.]
How are the players to achieve coordination?
If indefinitely iterated, every sequence of play by either player belongs to some equilibrium. Consider: Row flips a coin to determine its play each round. Matching by Column is an equilibrium. So there are very many indeed. How is any one of these to be achieved?
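As a small illustration of the two pure-strategy equilibria, the following sketch enumerates the pure Nash equilibria of a one-shot 2x2 game. The payoff numbers in `STAG_HUNT` are hypothetical, chosen only to have the usual Stag Hunt ordering; they are not the values from the slide, which did not survive transcription. The names `ACTIONS`, `STAG_HUNT`, and `pure_nash_equilibria` are likewise illustrative.

```python
from itertools import product

ACTIONS = ("C", "D")

# Hypothetical Stag Hunt payoffs (row payoff, column payoff).
# These values are an assumption, not the slide's own numbers.
STAG_HUNT = {
    ("C", "C"): (4, 4),
    ("C", "D"): (1, 3),
    ("D", "C"): (3, 1),
    ("D", "D"): (2, 2),
}


def pure_nash_equilibria(payoffs):
    """Return all action pairs where neither player gains by deviating alone."""
    equilibria = []
    for row, col in product(ACTIONS, ACTIONS):
        row_payoff, col_payoff = payoffs[(row, col)]
        # Row cannot do better by switching its own action.
        row_ok = all(row_payoff >= payoffs[(alt, col)][0] for alt in ACTIONS)
        # Column cannot do better by switching its own action.
        col_ok = all(col_payoff >= payoffs[(row, alt)][1] for alt in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(STAG_HUNT))
```

With these assumed payoffs the check finds both (C, C) and (D, D), which is the coordination problem the slide raises: the equilibrium concept alone does not say which of the two the players will reach.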

Criteria for a principled alternative to Rational Choice Theory for agents in a model
- Clearly defined
- Mathematically or algorithmically grounded
- Principled
- Leads to clear predictions, produces outcomes reliably
- Computable with plausibly-available resources
- Plausibly approximate to real processes by the agents in question
- Testable, calibratable
- Warranted from the agent's perspective

In addition...
Modesty is a virtue. If an effect advantageous to the agents can be demonstrated with a modest (cf. "minimal") rationality, that itself is interesting. Presumably, more ambitious forms of rationality will not lead to abandonment of the initial advantage.

OK, show me
Agreed: Can we find an example and do the lessons generalize? That's next.

PROBE AND ADJUST
A kind of reinforcement learning for a continuous quantity.
- Episode (= round of play).
- Epoch (= a number of episodes).
- Probe uniform ±δ in each episode.
- Adjust ±ε at the end of each epoch.
[Figure: a base value with the probe range ±δ and the adjustment step ±ε marked around it.]
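The procedure outlined on this slide can be sketched in a few lines. This is a minimal single-agent reading, not the author's implementation: it assumes the agent probes uniformly in [base - δ, base + δ] each episode, and at the end of each epoch moves the base value by ε toward whichever side (probes above or below the base) earned the higher mean return. The function name `probe_and_adjust`, its parameters, and the quadratic return function in the usage example are all hypothetical.

```python
import random


def probe_and_adjust(returns, base, delta, epsilon, epoch_len, n_epochs, seed=0):
    """Single-agent Probe and Adjust sketch (assumed update rule, see lead-in).

    returns: function mapping a probed value to the reward it earns.
    """
    rng = random.Random(seed)
    for _ in range(n_epochs):
        up, down = [], []  # rewards from probes above / at-or-below the base
        for _ in range(epoch_len):
            probe = base + rng.uniform(-delta, delta)
            (up if probe > base else down).append(returns(probe))
        # Adjust only when both sides were sampled during the epoch.
        if up and down:
            mean_up = sum(up) / len(up)
            mean_down = sum(down) / len(down)
            if mean_up > mean_down:
                base += epsilon
            elif mean_down > mean_up:
                base -= epsilon
    return base


if __name__ == "__main__":
    # Hypothetical return function with its peak at 10.
    final = probe_and_adjust(lambda x: -(x - 10.0) ** 2,
                             base=2.0, delta=0.5, epsilon=0.2,
                             epoch_len=10, n_epochs=200)
    print(final)
```

In this sketch the agent needs no model of the return function; it only compares realized rewards from nearby probes. Near the peak the base value keeps jittering by ε, so it hovers around the optimum rather than settling exactly on it.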

Cournot Competition
Classic model of oligopoly.
Quantity competition: Firms offer quantities of a good and the market sets the price.
The classic theory is undermotivated mathematically. Assumes best response behavior. This leads to the Cournot equilibrium, which lies between the monopoly quantity and the competitive quantity.

Cournot Competition: Cournot reference model
Roughly: A market for a particular product supplied by $n$ firms. During each time step each of the supplying firms offers quantity $Q_i$ ($i = 1, 2, \ldots, n$) to the market, so that the total supply in a given period is

$$Q = \sum_{i=1}^{n} Q_i \tag{1}$$

The unit price resulting is determined by the demand function

$$P = \max\{a - \mathit{slope} \cdot Q,\ 0\} \tag{2}$$

Each firm $i$ receives revenues of $P \cdot Q_i$.

Cournot Competition: Cournot reference model (cont.)
Firms may independently and without communication with each other adjust the quantities they offer to the market, their $Q_i$s. In setting their $Q_i$s each firm takes into account its unit cost of production, $k_i$, and the behavior of the other firms.
Each firm follows the best response strategy. If all of the firms do this they will reach the Cournot equilibrium in which the individual firm Cournot quantities are

$$Q_i^C(n, k_i) = \frac{a - k_i}{(n + 1) \cdot \mathit{slope}} \tag{3}$$
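The reference model can be checked numerically. The sketch below assumes all firms share the same unit cost, so that expression (3) gives each firm's quantity directly, and uses the standard best response for linear demand, $(a - k - \mathit{slope} \cdot Q_{-i}) / (2 \cdot \mathit{slope})$, where $Q_{-i}$ is the total offered by the other firms. The function names and the numbers in the usage example are hypothetical.

```python
def market_price(quantities, a, slope):
    """Demand function (2): price falls linearly in total supply, floored at 0."""
    return max(a - slope * sum(quantities), 0.0)


def best_response(others_total, a, slope, unit_cost):
    """Profit-maximizing quantity given the total offered by the other firms."""
    return max((a - unit_cost - slope * others_total) / (2.0 * slope), 0.0)


def cournot_quantity(n, a, slope, unit_cost):
    """Expression (3), under the assumption of identical unit costs."""
    return (a - unit_cost) / ((n + 1) * slope)


if __name__ == "__main__":
    n, a, slope, k = 3, 100.0, 1.0, 10.0  # hypothetical market parameters
    q = cournot_quantity(n, a, slope, k)
    # At the Cournot quantity, each firm's best response to the others is q itself.
    print(q, best_response((n - 1) * q, a, slope, k))
    print(market_price([q] * n, a, slope))
```

The check illustrates why the Cournot quantity is a fixed point of best-response behavior: when every other firm offers `q`, the best reply is again `q`.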

Cournot Competition: Is there another way of modeling this?
The Cournot conclusion follows mathematically provided you make the best response assumption. But why should you? Behaviorally implausible.
What if the agents follow PROBE AND ADJUST in learning to set their quantities?
See [Kimbrough and Murphy, 2009], "Learning to Collude Tacitly on Production Levels by Oligopolistic Agents", or Chapter 10 of Agents, Games, and Evolution.

Cournot Competition: Quantity setting with PROBE AND ADJUST
Agents collectively reach the Cournot quantity. That is, they individually and collectively put to the market the total quantity that is predicted by the Cournot model. Without the implausible Cournot assumptions! And under a plausible behavioral procedure.
Also observed: number effects consistent with behavioral experiments.

Cournot Competition: But there is more...
Part of PROBE AND ADJUST is the update policy used by the agent. What should the agent track for deciding on adjustments at the end of epochs?
The above results were obtained for agents using the Own Returns policy. What happens if instead the agents track the total returns to the industry, Market Returns?
They collectively arrive at the monopoly quantity of the low-cost producer! But this is exploitable.
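The difference between the two update policies can be made concrete with a multi-firm sketch. It is a simplified reading under stated assumptions, not the model from the cited paper: all firms share one unit cost and one epoch length, they adjust at the same time, returns are taken to be profit, and the only difference between policies is whether a firm scores its probes by its own profit ("own") or by total industry profit ("market"). The function `simulate_quantities` and every default parameter value are hypothetical. Outcomes depend on the parameters and the random seed, so no particular result is claimed here.

```python
import random


def simulate_quantities(policy, n_firms=2, a=100.0, slope=1.0, unit_cost=10.0,
                        start=20.0, delta=1.0, epsilon=0.5,
                        epoch_len=20, n_epochs=500, seed=0):
    """Probe and Adjust quantity setting under 'own' or 'market' returns tracking."""
    if policy not in ("own", "market"):
        raise ValueError("policy must be 'own' or 'market'")
    rng = random.Random(seed)
    base = [start] * n_firms
    for _ in range(n_epochs):
        up = [[] for _ in range(n_firms)]    # tracked returns, probes above base
        down = [[] for _ in range(n_firms)]  # tracked returns, probes at or below base
        for _ in range(epoch_len):
            offers = [max(b + rng.uniform(-delta, delta), 0.0) for b in base]
            price = max(a - slope * sum(offers), 0.0)
            profits = [(price - unit_cost) * q for q in offers]
            industry = sum(profits)
            for i in range(n_firms):
                tracked = profits[i] if policy == "own" else industry
                (up[i] if offers[i] > base[i] else down[i]).append(tracked)
        for i in range(n_firms):
            if up[i] and down[i]:
                mean_up = sum(up[i]) / len(up[i])
                mean_down = sum(down[i]) / len(down[i])
                if mean_up > mean_down:
                    base[i] += epsilon
                elif mean_down > mean_up:
                    base[i] = max(base[i] - epsilon, 0.0)
    return base


if __name__ == "__main__":
    for policy in ("own", "market"):
        final = simulate_quantities(policy)
        print(policy, final, sum(final))
```

For reference, with these assumed parameters the Cournot total from expression (3) is 60 and the monopoly quantity is 45, so the printed totals can be compared against those two benchmarks.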


Cournot Competition: Is there another way?
MR-COR: Market Returns, Constrained by Own Returns.
- When all agents use it, they collectively arrive at the monopoly quantity of the low-cost producer.
- If an agent defects (to Own Returns), they return to the Cournot equilibrium.
- In the defecting scenario, the MR-COR agents do just slightly worse than the Own Returns agent(s).
- But all agents are better off if they all choose MR-COR. (Stag Hunt-like)

Cournot Competition: Tacit collusion, achieved by simple agents
Contrary to the Cournot model, quantity putting in an ongoing market is an indefinitely iterated game, subject to the Folk Theorem.
PROBE AND ADJUST with MR-COR has found a natural, credible equilibrium for the indefinitely iterated (II) game.
See AGEbook/nlogo/OligopolyPutQuantity.html

Bertrand Competition: Competition on price
Each period all firms offer a price and the market takes all demand from the low-price firm.
Economic theory: collusion is impossible. Even with just two firms in the market they will compete away their profits.
"If firm 1 really believes that firm 2 will charge a price $\hat{p}$ that is greater than the marginal cost, it will always pay firm 1 to cut its price to $\hat{p} - \epsilon$. But firm 2 can reason the same way! Thus any price higher than marginal cost cannot be an equilibrium; the only equilibrium is the competitive equilibrium." [Varian, 2003, page 488]
Note the business literature on this: Don't do it!
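The undercutting logic in the quoted passage can be shown with a one-period profit calculation. The slide does not give a demand function for the Bertrand case, so this sketch assumes the same linear demand as in the Cournot reference model, inverted to give quantity at the lowest price, and assumes firms tied at the lowest price split demand equally. The function `bertrand_profits` and the numbers are hypothetical.

```python
def bertrand_profits(prices, a=100.0, slope=1.0, unit_cost=10.0):
    """One-period profits when the lowest-price firm(s) take all demand.

    Assumptions: demand at the low price is (a - price) / slope, and firms
    tied at the low price share that demand equally.
    """
    low = min(prices)
    demand = max((a - low) / slope, 0.0)
    winners = [i for i, p in enumerate(prices) if p == low]
    share = demand / len(winners)
    return [(p - unit_cost) * share if i in winners else 0.0
            for i, p in enumerate(prices)]


if __name__ == "__main__":
    print(bertrand_profits([50.0, 50.0]))  # both firms charge the same price
    print(bertrand_profits([50.0, 49.0]))  # firm 2 undercuts by one unit
```

Under these assumptions, a firm that shaves its price slightly below a shared price captures the whole market and nearly doubles its profit, while its rival earns nothing. That one-period temptation is what drives the race to the bottom in the textbook argument.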

Bertrand Competition: PROBE AND ADJUST
With Own Returns, agents indeed compete away their profits. A race to the bottom.
With MR-COR, tacit collusion to reach the monopoly price is possible. It depends on
- The number of firms in the market
- The epoch lengths of the firms
- How tolerant the firms are to sub-par returns
See AGEbook/nlogo/OligopolyBidPrice.html


Revisiting the criteria for a principled alternative to Rational Choice Theory for agents in a model
- Clearly defined: Yes, PROBE AND ADJUST is specified as an algorithm and implemented in publicly-available code.
- Mathematically or algorithmically grounded: Sure.
- Principled: Yes.
- Leads to clear predictions, produces outcomes reliably: We have seen examples; there are others.
- Computable with plausibly-available resources: Obviously so.
- Plausibly approximate to real processes by the agents in question: Credibly so.
- Testable, calibratable: Clearly.
- Warranted from the agent's perspective: Clearly. Can we find a better approach?

Further comments
- Needed: Full exploration of applicable procedures.
- Are there important and useful procedures other than PROBE AND ADJUST? Of course! Melioration, EWA, etc.
- Hope for results robust across multiple procedures.
- That such a simple procedure (PROBE AND ADJUST) can lead to tacit collusion is quite significant. Will other procedures do this? Is there any reason to think that smarter procedures will not?
- Notice that we have dispatched the coordination problem and the equilibrium selection problem in these examples.

Wither Rational Choice Theory?
While it is theoretically interesting that, e.g., Own Returns in PROBE AND ADJUST coheres with Rational Choice Theory in outcome (but not in process), really what has RCT brought to the table?
But the principle of (ideally) rational choice is: Maximize expected utility. (And Rational Choice Theory says you do.) What's not to like?

In large part, irrelevance
The injunction to maximize expected utility is not an algorithm, is not an effective procedure. But we need an effective procedure.
Consider two procedures, each involving decisions taken over time, each decision yielding a reward at the time taken. Procedure A is locally greedy, B is not. Overall, B performs better than A. Which is rational?
The advice to consider all possible procedures and to pick one with the highest expected utility will only rarely be actionable, and then only in simple or contrived cases. Compare with equilibrium prediction in game theory.

Kimbrough, S. O. and Murphy, F. H. (2009). Learning to collude tacitly on production levels by oligopolistic agents. Computational Economics, 33(1): and sokpapers/2009/oligopoly-panda-r2.pdf.
Varian, H. R. (2003). Intermediate Microeconomics: A Modern Approach. W. W. Norton & Company, New York, NY, sixth edition.
