An Introduction to Agent-Based Modeling Unit 9: Advanced Topics

1 An Introduction to Agent-Based Modeling Unit 9: Advanced Topics Bill Rand Assistant Professor of Business Management Poole College of Management North Carolina State University

2 Using Big Data and Agent-Based Modeling to Understand Social Media Diffusion William Rand in collaboration with David Darmon, Jimpei Harada, Jared Sylvester, and Michelle Girvan

3 The Beauty of Big Data Individuals leave their footprints (and fingerprints) in the digital sand. These traces represent digital / social signals of human behavior. We can use these signals to explore the complexity of individuals and digital environments.

4 The Challenge of Big Data What do we do with all this data? Aggregation of the data eliminates the richness of it The real benefit is when the data is used at the individual level How to make sense of that individual-level information?

5 Agent-Based Modeling Agent-Based Modeling provides a way to understand individual-level interactions Traditionally, agent-based models use simple rules derived from theory If we could create ABMs directly from big data we would have an individual-level detailed model derived directly from digital traces

6 The Solution Use Machine Learning to derive individual-level rules for agents automatically. Then use the resulting ABM to understand the overall emergent properties of agent interactions.

7 The Computational Mechanics Framework: Creating Epsilon Machines. Assume the observed time series was generated by a conditionally stationary stochastic process. Explicitly learn the predictive distribution by grouping together pasts x that give equivalent predictions. A Causal State Model (CSM) is built for each user using Causal State Splitting Reconstruction (CSSR): begin with one state and divide as necessary.
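The idea of grouping pasts that give equivalent predictions can be sketched in a few lines of Python. This is only an illustration, not CSSR itself: CSSR starts from one state and splits it using statistical hypothesis tests, whereas this sketch uses a fixed tolerance on the estimated probabilities. The function names, the `tolerance` parameter, and the example series are all hypothetical.

```python
from collections import defaultdict

def predictive_distributions(series, history_len):
    """Estimate P(next = 1 | past) for every observed past of length history_len."""
    counts = defaultdict(lambda: [0, 0])  # past -> [times followed by 0, times followed by 1]
    for i in range(history_len, len(series)):
        past = tuple(series[i - history_len:i])
        counts[past][series[i]] += 1
    return {past: c[1] / (c[0] + c[1]) for past, c in counts.items()}

def group_pasts(pred, tolerance=0.1):
    """Greedily group pasts whose estimated P(next = 1) lies within
    `tolerance` of the first member of the current group.
    Assumption: a crude stand-in for the statistical tests used by CSSR."""
    groups = []  # list of (reference probability, [pasts])
    for past, p in sorted(pred.items(), key=lambda kv: kv[1]):
        if groups and abs(groups[-1][0] - p) <= tolerance:
            groups[-1][1].append(past)
        else:
            groups.append((p, [past]))
    return groups

# Hypothetical binary series that strictly alternates between 0 and 1.
series = [0, 1] * 20
pred = predictive_distributions(series, history_len=2)
states = group_pasts(pred)
```

For this alternating series the two observed pasts predict opposite outcomes, so they end up in two separate groups, which play the role of two causal states.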

8 Predicting Twitter Engagement Using ML: Computational Mechanics. Call the model learned for each user the causal state model (CSM). The CSM has the unique property of being maximally predictive and minimally complex. Learn this state-space representation of the process using Causal State Splitting Reconstruction (CSSR).

9 Two Case Studies Predicting Twitter Behavior Identifying Optimal Timing for Messaging

10 Understanding the Predictive Power of Computational Mechanics and Echo State Networks in Social Media David Darmon, Jared Sylvester, Michelle Girvan, and William Rand SocialCom 2013

11 Predicting Twitter Engagement Using ML Motivation Opportunity: Social media is a great marketing channel. Challenge: However, there is a lot of noise, and it's not apparent when users are paying attention. Solution: Understand when users are active. Goal: Create a tool that can accurately predict when different segments will post content.

12 Predicting Twitter Engagement Using ML The Dataset Twitter users embedded in a 15k user follower network. Statuses of all users collected over 7 weeks. Select 3k subset of most frequently tweeting users.

13 Predicting Twitter Engagement Using ML The Setup User: DanielZeevi
Timestamp  Tweet Text
:54:06  Is Your Gmail Social? How to Use Gma
:11:22  Facebook's Embedded Posts Now Ava
:14:06  The Credible Hulk
:29:02  25 Things You Didn't Know About Nin
:32:59  Twitter Users: Revoke and Reestablish
:48:46  10 Brilliant Facebook Marketing Tactics
:17:11  Google Now Adds Cards for NCAA F
:18:03  What is the NSA Really Up To? [COM
:39:04  6 Things Every Good Business Blog MU

14 Predicting Twitter Engagement Using ML The Setup Bin the Twitter data in time, giving a discrete time series for each user v at time t with two possible values: user v doesn't tweet, or user v tweets. (Horizontal axis: time.)
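A minimal sketch of the binning step, assuming timestamps are given in seconds from the start of the observation window and that the two states are encoded as 0 (no tweet in the bin) and 1 (at least one tweet). The encoding, the 10-minute bin width, and the example timestamps are assumptions made for illustration; the slides do not give them.

```python
def bin_tweets(timestamps, start, end, bin_width):
    """Return a 0/1 list with one entry per time bin:
    1 if the user tweeted at least once in that bin, else 0."""
    n_bins = (end - start + bin_width - 1) // bin_width  # ceiling division
    series = [0] * n_bins
    for t in timestamps:
        if start <= t < end:
            series[(t - start) // bin_width] = 1
    return series

# Hypothetical tweet times (seconds) for one user over a 40-minute window.
tweet_times = [30, 620, 650, 1900]
series = bin_tweets(tweet_times, start=0, end=2400, bin_width=600)
print(series)
```

Two tweets that fall in the same bin collapse to a single 1, so the series records whether the user was active, not how many times they tweeted.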

15 User: HadiJayaPutra

16 Results

17 Predicting Twitter Engagement Using ML Testing Procedure Build a model for each user separately. Training: 45 days. Testing: 4 days. Look back 10 steps. Predict ahead 1 step. Evaluate with 0-1 loss. Compare to a majority-vote baseline.
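The testing procedure can be sketched as follows, assuming a binary (0/1) series per user. The majority-vote baseline always predicts the symbol that was more common in training, and 0-1 loss is the fraction of wrong one-step-ahead predictions. The "periodic" model below is a toy stand-in chosen for illustration; it is neither the CSM nor the ESN from the study, and the series is made up.

```python
def zero_one_loss(predicted, actual):
    """Fraction of predictions that differ from the observed value."""
    return sum(p != a for p, a in zip(predicted, actual)) / len(actual)

def majority_baseline(train):
    """Symbol seen most often in training (ties go to 0)."""
    return 1 if sum(train) * 2 > len(train) else 0

def evaluate(model, series, train_len, lookback=10):
    """Predict each test symbol one step ahead from the previous
    `lookback` symbols, then score with 0-1 loss."""
    preds, actual = [], []
    for i in range(train_len, len(series)):
        preds.append(model(series[i - lookback:i]))
        actual.append(series[i])
    return zero_one_loss(preds, actual)

# Hypothetical user who tweets in every fourth time bin.
series = [0, 0, 0, 1] * 15
train_len = 48

base = majority_baseline(series[:train_len])
baseline_loss = evaluate(lambda past: base, series, train_len)

# Toy stand-in model: predict a tweet exactly when the last three bins were quiet.
periodic_loss = evaluate(lambda past: int(past[-3:] == [0, 0, 0]), series, train_len)
```

On this artificial series the baseline misses every tweet, while the toy model exploits the temporal pattern. That gap is the kind of improvement over the base rate that the comparison is designed to measure.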

18 CSM vs. ESN

19 CSM vs. ESN

20 Base Rate: CSM Rate: ESN Rate: User: DanielZeevi

21 Forecasting High Tide: Predicting Times of Elevated Activity in Online Social Media David Darmon, Jimpei Harada, William Rand, and Michelle Girvan ASONAM, 2015

32 Conclusions Machine Learning plus Agent-Based Modeling provides a robust and powerful solution to harnessing the power of big data. This method can provide predictive power that exceeds traditional aggregate methods. Limitation: Requires high-resolution time series data.

33 Future Work Construction of a software platform that would automatically generate agent-based models in a uniform way Implementation on GPGPU architectures Tests of the long-range predictive power of these models

34 Elaborated and Realistic Models Many ABMs are very simple. Elaborated and Realistic (ER) models are the opposite: they include a variety of mechanisms, use empirical data, match a variety of outcomes, and are more easily falsifiable. Imagine a Schelling model that matched Chicago's urban patterning.

35 Criticisms of ER Models Highly contingent May not be generalizable Very difficult to understand

36 False Dichotomy? We do not need to choose between simple and ER models Pattern Oriented Modeling (POM; Grimm et al., 2006) argues that one model should be able to replicate patterns at multiple levels of granularity This could entail a model that is both ER and generalizable However, such a singular perfect model might be difficult to create

37 Full Spectrum Modeling Full Spectrum Modeling is the idea that we should create a suite of models at different levels of granularity You can construct simple models and elaborate and realize them as necessary to create specific models Simple models explore the necessity and importance of mechanisms ER models can explore specific instances and make particular forecasts

38 A Suite of Models Rather than thinking of a model as a single artifact, think of models as suites. "Kill your darlings." - Faulkner

39 Typical Modeling Setup Two Groups of Researchers Subject Matter Experts Model Implementers Typical Modeling Lifecycle 1. SMEs design model 2. Model implementers build model 3. Implementers present results to SMEs

40 Problems with Typical Setup Model designs are rarely complete. Early results could have dramatic implications for model design. Lack of communication results in lack of understanding. Saying "That's what the model shows." is never enough; model designers need to understand why the model shows that.

41 Iterative Modeling It is important for the model designer and implementer to communicate often. New Modeling Lifecycle: 1. Specify Minimal Model. 2. Implement Minimal Model. 3. Communicate Results: (A) Revise Model Design; (B) Collect additional data as necessary. 4. Expand Minimal Model Minimally, then go to #1. Fail Fast. Just-In-Time (JIT) Model Construction.

42 ABMs that are not for Research Most of what we have discussed is ABMs in the context of scientific research But ABM can be used in other contexts as well Communication Persuasion Education Decision Support

43 The Problem with Complex Systems Complex systems often require knowledge from multiple subject areas, but also knowledge from people with different backgrounds, e.g., in urban planning: scientists, policymakers, citizens, businesses, city managers, emergency services.

44 ABM as Communication An ABM can be created as an object to think with (Seymour Papert), i.e., a shared focal object This can then be potentially understood and examined by all stakeholders as a communication tool Using a language like NetLogo facilitates this by providing a simple language to understand Imagine CitySim as a solution to our urban planning problem

45 ABMs as Persuasion Now that the model has been constructed and communicated Stakeholders can use CitySim to argue for their own policies policy flight simulator - Holland, 1996

46 ABM as Education The same model could also be used to educate students about how complex systems work. ABM has been used for chemistry, materials science, electromagnetism, and evolution, among many other subjects. ABM enables learning because micro-rules are often easier to understand than macro-behaviors. ABM encourages a generative understanding of phenomena.

47 ABM as Decision Support ABM can be used to explore the effect of a wide variety of strategies For instance, explore how to use ABM to identify optimal word-of-mouth marketing decisions (in press, JMR, Chica and Rand)

48 Anonymous Procedures Anonymous procedures are bits of code that we do not give names to; we can pass these procedures to other procedures to solve complex problems. SET SETUP1 [ [] -> RANDOM 4 ] SET SETUP2 [ [] -> BLUE ] ASK PATCHES [ SET PCOLOR RUNRESULT SETUP1 (or SETUP2) ]
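For readers more familiar with general-purpose languages, the same pattern can be sketched in Python, where a `lambda` plays the role of a NetLogo anonymous reporter. This is an analogue, not NetLogo code: `color_patches` and the patch count are hypothetical, and the constant 105 is NetLogo's numeric color code for blue.

```python
import random

BLUE = 105  # NetLogo's numeric color code for blue

# Two unnamed reporters stored in variables, mirroring SETUP1 and SETUP2.
setup1 = lambda: random.randrange(4)  # like [ [] -> random 4 ]: an integer in 0..3
setup2 = lambda: BLUE                 # like [ [] -> blue ]

def color_patches(n_patches, setup):
    """Call the supplied reporter once per patch, like
    ask patches [ set pcolor runresult setup ]."""
    return [setup() for _ in range(n_patches)]

random_colors = color_patches(5, setup1)
blue_colors = color_patches(5, setup2)
```

The caller picks the behavior by choosing which reporter to pass in, and `color_patches` itself never changes.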

49 RUN / RUNRESULT RUN and RUNRESULT both take code, given as a string, a set of commands, or an anonymous procedure, and run it. This allows you to create code dynamically during the run of the model. The difference is that RUN executes commands, while RUNRESULT evaluates a reporter and returns its result. RUN [ BK 1 LT 90 RT 90 ] SET HEADING RUNRESULT [ * 2 ]

50 MAP MAP takes a reporter and a list and applies the reporter to all elements of the list. map [ val -> round (val / 1000)] [ ] (company revenue in 1000s). (map [ [rev emp] -> round ((rev / emp) / 1000)] [ ] [ ]) (company revenue per employee in thousands). From El Farol: sum (map [ [weight week] -> weight * week ] butfirst strategy subhistory)
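The three MAP uses above have direct Python analogues, sketched here with the built-in `map`. The input lists did not survive in the slide text, so the revenue, employee, strategy, and history values below are made-up illustrations. One caveat: Python's `round` rounds exact halves to the nearest even integer, which may differ from NetLogo's `round`, so the sample values avoid halves.

```python
# Hypothetical company data (the slide's own lists were lost).
revenues = [1200000, 450000, 98000]
employees = [10, 5, 2]

# One-list map: revenue in thousands.
revenue_k = list(map(lambda val: round(val / 1000), revenues))

# Two-list map: revenue per employee in thousands.
per_employee_k = list(map(lambda rev, emp: round((rev / emp) / 1000),
                          revenues, employees))

# El Farol style weighted sum: skip the first strategy entry
# (like butfirst), then multiply weights by past attendance pairwise.
strategy = [0.1, 0.5, -0.25, 1.0]   # hypothetical: first entry, then weights
subhistory = [40, 60, 20]           # hypothetical recent attendance
prediction = sum(map(lambda weight, week: weight * week,
                     strategy[1:], subhistory))
```

With these inputs, `revenue_k` is `[1200, 450, 98]`, `per_employee_k` is `[120, 90, 49]`, and `prediction` is `25.0`.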

51 REDUCE REDUCE takes a reporter and a list and combines the items of the list from left to right into a single value, feeding each result back in as the first input for the next item. (reduce + [ ]) / 5 (average). reduce [ [input1 input2] -> ifelse-value (input2 = 35) [input1 + 1] [input1]] [ ] (counts the number of occurrences of 35)
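A Python analogue using `functools.reduce` shows both uses. The list on the slide was lost, so `attendance` is made-up data. One design point worth noting: Python's `reduce` accepts an explicit initial accumulator (the `0` below), whereas NetLogo's REDUCE starts from the first list item, so a counting reporter in NetLogo is typically run on a list with a 0 prepended.

```python
from functools import reduce
import operator

attendance = [35, 60, 35, 20, 50]  # hypothetical data

# Sum by folding + over the list, then divide by the length to get the average.
average = reduce(operator.add, attendance) / 5

# Count occurrences of 35: the accumulator starts at 0 and is
# incremented each time the next item equals 35.
count_35 = reduce(lambda acc, item: acc + 1 if item == 35 else acc,
                  attendance, 0)
```

Here `average` is `40.0` and `count_35` is `2`.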

52 Participatory Simulation Participatory Simulation is the creation of simulations that incorporate input from human agents along with simulated agents. In NetLogo, this is implemented via HubNet. For instance, in the HubNet Disease model you can control humans trying to avoid catching a disease.

53 Creating a HubNet Model Basic architecture: One computer is the server (host). Other computers / terminals can connect to this computer (clients). Steps: 1. Initialize HubNet (HUBNET-RESET). 2. Listen to clients (HUBNET-MESSAGE-WAITING?, HUBNET-FETCH-MESSAGE). 3. Process messages (HUBNET-ENTER-MESSAGE?, HUBNET-EXIT-MESSAGE?, HUBNET-MESSAGE-SOURCE, HUBNET-MESSAGE-TAG, HUBNET-MESSAGE).

54 Basic Techniques Many applications map a client to a turtle; this is often done by setting a turtles-own variable. Need to programmatically send messages from host to clients: add elements to the client interface and use HUBNET-SEND. Customize what a client sees: Client Perspectives and Client Overrides.

55 System Dynamics Modeling SDM represents complex systems using numerical states of the world called stocks and changes in those states called flows. In NetLogo there is a tool called the System Dynamics Modeler, which is similar to other toolkits, e.g., STELLA. SDM can be used in combination with ABM.

56 Wolf-Sheep Predation (docked) This model shows how you can compare ABM and SDM models. ABM has discrete units, while SDM tends to represent things in continuous variables, so the results can be different.
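The discrete-versus-continuous difference can be seen in a much smaller example than Wolf-Sheep Predation. The sketch below is a deliberately simplified illustration, not the docked model: a single population that only dies off, once as a continuous stock drained by a flow and once as individual agents that each die with the same probability per step. The function names, rate, and step count are assumptions.

```python
import random

def stock_decay(stock, rate, steps):
    """System-dynamics view: a continuous stock with an outflow
    proportional to its current level."""
    for _ in range(steps):
        stock -= rate * stock  # flow out of the stock this step
    return stock

def agent_decay(n_agents, rate, steps, rng):
    """Agent-based view: each individual survives a step with
    probability 1 - rate, so the population is always a whole number."""
    for _ in range(steps):
        n_agents = sum(1 for _ in range(n_agents) if rng.random() >= rate)
    return n_agents

rng = random.Random(42)  # fixed seed so the run is repeatable
continuous = stock_decay(10.0, 0.5, 50)
discrete = agent_decay(10, 0.5, 50, rng)
```

The stock shrinks toward zero but stays positive as a tiny fraction of an individual, while the agent population is an integer and will typically hit exactly zero. In a predator-prey setting that difference matters: an extinct agent population cannot recover, whereas a fractional stock can regrow.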

57 Tabonuco-Yagrumo Hybrid Model This model contains a hybrid of ABM and SDM The ABM controls the location of trees, while SDM controls the death and birth of trees

58 Extensions API The extensions API in NetLogo gives you the ability to create new commands for the NetLogo language Often this is done to give NetLogo access to external software packages e.g., GIS and Network tools

59 GIS Extension (gis) Gives you the ability to read, write and interact with GIS data in NetLogo gis:load-dataset gis:envelope, gis:envelope-union-of, gis:intersecting gis:feature-list-of gis:property-value gis:draw, gis:set-drawing-color

60 Network Extension (nw) Gives you the ability to generate standard networks and calculate statistics about them nw:generate-preferential-attachment nw:generate-random nw:betweenness-centrality

61 LevelSpace Extension (ls) The LevelSpace Extension gives you the possibility to call other models from your model ls:reset ls:create-models ls:ask ls:models

62 The Controlling API and the Mathematica Link NetLogo can be invoked by another program if that program is running on the Java Virtual Machine This means you can call NetLogo code from Java, Scala, Clojure, Groovy, JRuby, Jython, etc. The Mathematica Link operates similarly allowing you to call NetLogo code from inside Mathematica

63 Future of ABM

64 Automatic Generation of Agent Rules More work needs to be done about how to create rules from data sources automatically These rules also need to be validated Causal State Modeling is one example of this New sources of data: big data, administrative data, natural language data, social data, app data

65 Improved Methods of Validation and Calibration We need rigorous guidelines to follow to show that our models have been validated appropriately A statistics-like suite of tests Make tools like BehaviorSearch easier to use so that users can easily calibrate models

66 Models Using Streaming Data If we can build models automatically and continuously validate them, then we could construct a model that updates continually on the basis of streaming data. We could then use this to support decision-making in real time.

67 Unit 9 Overview: Big Data + ABM; Design Guidelines of ABM; Other Uses of ABM; Advanced Programming Constructs; Participatory Simulation; System Dynamics Modeling; Extensions; Future of ABM; Unit 9 Slides; Unit 9 Test