Generative Models of Cultural Decision Making for Virtual Agents Based on User's Reported Values

Elnaz Nouri and David Traum
Institute for Creative Technologies, Computer Science Department, University of Southern California, USA

Abstract. Building computational models of cultural decision making for virtual agents from behavioral data is challenging because it is difficult to find a reasonable mapping between the statistical data and the computational model. This paper shows how the weights of a multi-attribute, utility-based decision making model can be set according to the values that people report in a survey. If survey data from different cultures is available, the same procedure can be used to simulate cultural decision making behavior. We used survey data from two sets of players, from the US and India, who played the Dictator Game and the Ultimatum Game online. Analyzing the values they reported in the survey enabled us to set our model's parameters according to their culture and to simulate their behavior in the Ultimatum Game.

Keywords: Cultural Decision Making Models, Dictator Game, Prediction Models

1 Introduction

In this paper we address one of the major challenges in building behavioral models of culture through data-driven approaches: finding an appropriate mapping from statistical behavior data onto computational models. We show how the values that people report can be used to build generative models of cultural decision making. The model is a multi-attribute decision making model that takes into account several valuation functions, such as the utility for self, the utility for the others in the interaction, competitiveness, and so on. This model can be used by virtual humans to make a wide variety of decisions in different contexts, including interactions with either other virtual agents or humans [3].
2 Background

2.1 The Multi Attribute Relational Value Decision Making Model

We briefly introduce the MARV (Multi Attribute Relational Value) framework; see [2] for more details. In [3], cultural decision models based on this framework were built by setting the weights on the attributes according to Hofstede's dimensional scores [1]. Our goal here is instead to derive the weights on the attributes from survey reports.

The framework considers a number of different metrics for evaluating a given situation. The metrics considered include: Self Interest (the agent's own utility); Other Interest (the utility of another); Total Utility (the sum of the individual utilities of all participants); Average Utility (which may not be derivable from Total Utility when the number of participants is variable); Relative Utilities (viewed in several ways, such as self/total, self/other, self-other, self/average); Minimum Utility; and Uncertainty (variation among possible outcomes).

Each of these metrics can be given one or more valuations by choosing an optimum point and a scale. Each individual agent has a vector of weights, one per valuation, indicating the relative importance of that valuation. The total value of each choice is the sum over valuations of the product of value and weight, as shown in formula 1:

$$\mathrm{Utility}(\mathit{choice}_i) = \sum_j W_j V_j \qquad (1)$$

For every decision, the agent calculates the utility of all of its possible choices and selects the one with the highest overall valuation (according to the agent's knowledge and its ability to calculate or estimate these values).

2.2 Self Reported Values in Games with MARV Survey

Based on the set of attributes in the MARV model, [4] designed a survey of 8 questions for eliciting people's values after they make their decisions. [4] used the survey to collect the values of people playing the Ultimatum Game and the Dictator Game over 100 points. The effect of culture on offers and values was investigated by recruiting people from the US and India. 101 and 107 people from each country were recruited through Amazon Mechanical Turk to play the Ultimatum Game and the Dictator Game, respectively.
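As a minimal sketch of formula 1 and the selection rule of Section 2.1, the Python snippet below computes the weighted sum of valuations for each choice and picks the choice with the highest total. The function names, the two-attribute choices and the weight vectors are hypothetical and serve only as an illustration; they are not taken from the MARV implementation.

```python
from typing import Dict


def marv_utility(valuations: Dict[str, float], weights: Dict[str, float]) -> float:
    """Formula 1: sum over valuations j of W_j * V_j for one choice."""
    return sum(weights[name] * value for name, value in valuations.items())


def best_choice(choices: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> str:
    """Return the name of the choice with the highest overall valuation."""
    return max(choices, key=lambda name: marv_utility(choices[name], weights))


# Hypothetical example: two ways of splitting 100 points,
# described only by the "self" and "other" valuations.
choices = {
    "keep_80": {"self": 80, "other": 20},
    "split_50": {"self": 50, "other": 50},
}

# An agent that mostly cares about its own points keeps 80 ...
selfish_weights = {"self": 1.0, "other": 0.5}
# ... while an agent that weights the other player's points more prefers the even split.
prosocial_weights = {"self": 0.2, "other": 1.0}

print(best_choice(choices, selfish_weights))
print(best_choice(choices, prosocial_weights))
```

Changing only the weight vector changes the selected choice, which is how the framework represents different agents with the same set of valuations.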
In the Dictator Game, the player is asked to split 100 points between themselves and the other player. The Ultimatum Game is played similarly, but the other player decides whether to accept the offer. If the offer is accepted, the points are split accordingly; if it is rejected, both players end up with zero points. Participants were asked to report how much they cared about each of the values (shown in Table 1) on a scale from -5 to 5.

3 Mapping the Reported Survey Values to Utility Valuation Functions

Step 1: Calculate Basic Valuation Functions for the Choices

We first defined a set of simple mathematical functions ($F$) capturing the meaning of each survey question. These functions are called valuation functions. Table 1 shows the definitions of the functions.

| Value | Survey Description | Valuation Function |
|---|---|---|
| $V_{self}$ | Getting a lot of points | $f_{self} = 100 - \mathit{offer}$ |
| $V_{other}$ | The other player getting a lot of points | $f_{other} = \mathit{offer}$ |
| $V_{compete}$ | Getting more points than the other player | $f_{compete} = f_{self} / f_{other} = (100 - \mathit{offer}) / \mathit{offer}$ |
| $V_{equal}$ | Having the same number of points as the other player | $f_{equal} = 50 / (50 - \lvert f_{self} - f_{other} \rvert) = 50 / (50 - \lvert (100 - \mathit{offer}) - \mathit{offer} \rvert)$ |
| $V_{joint}$ | Making sure that added together we got as many points as possible | $f_{joint} = 100$ |
| $V_{rawls}$ | The player with fewest points gets as many as possible [5] | $f_{rawls} = \min\{f_{self}, f_{other}\} = \min\{\mathit{offer}, 100 - \mathit{offer}\}$ |
| $V_{lowerbound}$ | Making sure to get some points (even if not as many as possible) | $f_{lowerbound} = \min\{f_{self}, f_{other}\} = \min\{\mathit{offer}, 100 - \mathit{offer}\}$ |
| $V_{chance}$ | The chance to get a lot of points (even if there's also a chance not to get any points) | $f_{chance} = 100 - \mathit{offer}$ |

Table 1. MARV Value Survey

The definitions of the $f_{self}$ and $f_{other}$ functions are based on the structure of the game (see footnote 1). The utility of each choice is calculated as a linear combination of the valuation functions ($f_j$) and the appropriate weights ($W_j$) on them. The reported importance of each value ($R_j$) contributes to the weight on the corresponding valuation function. The weight on each valuation function is defined as $W_j = S_j R_j$, so the formula is:

$$\mathrm{Utility}(c_i \in C) = \sum_{f_j \in F} W_j f_j = \sum_{f_j \in F} S_j R_j f_j \qquad (2)$$

In formula 2, $f_j$ refers to the valuation functions, $R_j$ is the corresponding importance reported by the player, and $S_j$ is a scale parameter of the model that is set according to the dataset.

Example: When the proposed offer is 20, the split is (80 self, 20 other). The valuation functions for this offer (choice) are calculated according to the formulas in Table 1 (see footnote 2).
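The valuation functions of Table 1 and the weighted sum of formula 2 can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function and key names are hypothetical, and because the table's formulas are undefined wherever a denominator is zero (offers of 0, 25 and 75), the sketch simply lets Python raise `ZeroDivisionError` for those offers instead of guessing a value.

```python
from typing import Dict


def valuations(offer: int) -> Dict[str, float]:
    """Valuation functions of Table 1 for a game played over 100 points.

    `offer` is the number of points given to the other player.
    Raises ZeroDivisionError where Table 1's ratios are undefined.
    """
    f_self = 100 - offer
    f_other = offer
    return {
        "self": f_self,
        "other": f_other,
        "compete": f_self / f_other,                 # undefined for offer == 0
        "equal": 50 / (50 - abs(f_self - f_other)),  # undefined for offer in (25, 75)
        "joint": 100,
        "rawls": min(f_self, f_other),
        "lowerbound": min(f_self, f_other),
        "chance": 100 - offer,
    }


def utility(offer: int, reported: Dict[str, float], scales: Dict[str, float]) -> float:
    """Formula 2: sum over valuation functions of S_j * R_j * f_j."""
    vals = valuations(offer)
    return sum(scales[name] * reported[name] * vals[name] for name in vals)


# Valuations for the offer of 20 used in the example (split: 80 self, 20 other).
for name, value in valuations(20).items():
    print(f"f_{name} = {value}")
```

For the offer of 20, `valuations(20)["compete"]` evaluates to 80/20 = 4.0 and `valuations(20)["equal"]` to 50/(50 - 60) = -5.0.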
Utility equation of the choice "offer 20" for the proposer in the Dictator Game:

$$\mathrm{Utility}(\mathit{choice}_{offer20}) = (80\,W_{self}) + (20\,W_{other}) + \left(\tfrac{80}{20}\,W_{compete}\right) + \left(\tfrac{50}{50 - \lvert 80 - 20 \rvert}\,W_{equal}\right) + (100\,W_{joint}) + (20\,W_{rawls}) + (20\,W_{lowerbound}) + (80\,W_{chance})$$

Footnote 1: Note that these valuation functions can be generalized to other games based on the definition of the $f_{self}$ and $f_{other}$ functions (for example, in the case of the Dictator Game being played over 100 points, $f_{self} + f_{other} = 100$). Their interpretation depends on the definition of the game, but the other functions can be computed from these two basic functions.

Footnote 2: Note that if the offer is rejected in the Ultimatum Game, then $f_{self} = 0$ and $f_{other} = 0$ according to the rules of the game, but the remaining valuation functions can be calculated as in the previous cases.

Step 2: Find Appropriate Scales for the Model based on the Culture

Next, we find the scales $S_j$ on the valuation functions in formula 2. This is done by searching the space of possible values for each scale variable, calculating the utilities, and comparing the squared distance between the behavior generated for the set of training observations and the actual cultural data. In our simulations, all possible combinations of scales drawn from the range $s \in S = \{10^{-3} \text{ to } 10^{3}\}$ were tried in order to find the culture-specific scales on each attribute. The combination of scales that results in the minimum distance from the cultural data is selected as the set of scales on the valuation functions for the members of that culture.

Step 3: Calculate Prior Probability of Choices based on the Culture

With appropriate scales found for each culture, the model can be used for a deterministic calculation of the utility of each choice based on each player's value profile. However, our agent uses the notion of Expected Utility (formula 3) to assess the desirability of a choice. At time $t$ in the game, the Expected Utility of each choice ($c_t$) is calculated by:

$$E(\mathrm{Utility}(c_t)) = \mathrm{Utility}(c_t)^{\alpha} \cdot P(c_t)^{\beta} \qquad (3)$$

Variations in the decisions of individuals within a culture are captured by calculating the probability of the offers based on the player's value profile, using the Bayesian likelihood of choosing each offer amount given the offers made by people of the player's culture who hold similar values. For each value ($v_j \in V$), the distribution of offers for each possible answer is determined:

$$\forall c_i \in C,\ v_j \in V:\quad P(c_i \mid v_j) = \frac{P(v_j \mid c_i)\,P(c_i)}{P(v_j)} = \frac{P(c_i \cap v_j)}{P(v_j)} \qquad (4)$$

The general formula for the probability of offer $c_t$, when the player has reported a profile $v_{self}, v_{other}, \ldots$ of importance on the survey questions, is:

$$P(c_t \mid v_{self}, v_{other}, \ldots) = \frac{\prod_{v_j \in V} P(c_t \cap v_j)}{\prod_{v_j \in V} P(v_j)} \qquad (5)$$

These probabilities are based on the reports and offers made by players from the cultural group that the player belongs to.
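Returning to Step 2, the exhaustive search over scale combinations can be sketched as below. Several details are assumptions made for illustration and are not specified in the paper: only three of the eight valuation functions are used, candidate offers are restricted to multiples of 10, the grid contains the powers of ten from 10^-3 to 10^3, and the distance is the sum of squared differences between each player's predicted and actual offer. All names and the toy training data are hypothetical.

```python
from itertools import product

# Assumed subset of Table 1's valuation functions (offer = points given away).
VALUATIONS = {
    "self": lambda offer: 100 - offer,
    "other": lambda offer: offer,
    "rawls": lambda offer: min(offer, 100 - offer),
}
OFFERS = range(0, 101, 10)  # assumed set of candidate offers


def predict_offer(reported, scales):
    """Offer with the highest formula-2 utility: sum of S_j * R_j * f_j(offer)."""
    def utility(offer):
        return sum(scales[k] * reported[k] * f(offer) for k, f in VALUATIONS.items())
    return max(OFFERS, key=utility)


def fit_scales(training, grid):
    """Try every combination of scales and keep the one with the smallest
    squared distance between predicted and actual offers.

    `training` is a list of (reported_values, actual_offer) pairs.
    """
    names = list(VALUATIONS)
    best_scales, best_dist = None, float("inf")
    for combo in product(grid, repeat=len(names)):
        scales = dict(zip(names, combo))
        dist = sum((predict_offer(r, scales) - actual) ** 2 for r, actual in training)
        if dist < best_dist:
            best_scales, best_dist = scales, dist
    return best_scales, best_dist


# Toy training data (hypothetical): reported importance per value, and the offer made.
training = [
    ({"self": 5, "other": 0, "rawls": 0}, 0),
    ({"self": 1, "other": 0, "rawls": 5}, 50),
]
grid = [10 ** e for e in range(-3, 4)]
scales, dist = fit_scales(training, grid)
print(scales, dist)
```

The cost of the search grows exponentially with the number of scale variables, so a full run over all eight valuation functions visits 7^8 combinations with this grid.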

Step 4: Select the Choice

This is the final step in the algorithm: the decision at time $t$ is made from the set of possible choices ($C$) by selecting the choice associated with the highest expected utility (previously shown in formula 3) (see footnote 3).

4 Evaluation

We use the algorithm to generate behavior for agents representing the US and India. We use ten-fold cross-validation to split our dataset into training and test segments. We evaluate our model in two ways.

Comparing the Model's Output with the Actual Data

The distribution of the offers made by the model was compared against the actual offers and two baselines:

Selfish baseline: This model chooses the decisions that maximize self-utility.

Majority baseline: This model chooses the most common decision made by the members of the culture. For both US and Indian players, in both the Dictator Game and the Ultimatum Game, this was an offer of 50 (the equal split).

In Table 2, the distributions of offers made by our model and by the baseline models are compared to the actual data using the KL divergence distance metric (see footnote 4). Our model performs significantly better than both baselines for US and Indian players in both games.

| Game | Model | US | India |
|---|---|---|---|
| Dictator Game | Selfish Model | | |
| Dictator Game | Majority Model | | |
| Dictator Game | Our Model | | |
| Ultimatum Game | Selfish Model | | |
| Ultimatum Game | Majority Model | | |
| Ultimatum Game | Our Model | | |

Table 2. KL Divergence Distance of the Simulated Behavior to Human Behavior

Predicting Future Offers of Players based on their Values

In Table 3 we compare the performance of our model with that of the Selfish model and the Majority model in predicting the future offers of players given their values. Our model outperforms the other models in the prediction task in all conditions except for US players playing the Ultimatum Game.

Footnote 3: It is possible to define other selection functions at this step. For example, one can use a selection function that enables the agent to make a decision as soon as a minimum threshold is met on a specific value. We discuss this in the future work section.
Footnote 4: To measure the difference between the distributions of offers, we use the Kullback-Leibler divergence between two probability distributions.
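A minimal sketch of the Kullback-Leibler comparison is given below. The paper does not state the direction of the divergence or how empty bins are handled, so two assumptions are made here: the human offers are treated as the reference distribution P and the model offers as Q, and a small additive smoothing constant keeps every bin non-zero. The function names and the toy offer lists are hypothetical.

```python
import math
from collections import Counter


def offer_distribution(offers, support, smoothing=1e-6):
    """Empirical distribution of offers over `support`, with additive
    smoothing (an assumption) so that no bin has zero probability."""
    counts = Counter(offers)
    total = len(offers) + smoothing * len(support)
    return {o: (counts[o] + smoothing) / total for o in support}


def kl_divergence(p, q):
    """D_KL(P || Q) = sum over offers of P(o) * log(P(o) / Q(o))."""
    return sum(p[o] * math.log(p[o] / q[o]) for o in p)


support = range(0, 101, 10)

# Toy data (hypothetical): offers by humans, by a majority baseline, and by a model.
human = offer_distribution([50, 50, 40, 50, 30], support)
majority = offer_distribution([50, 50, 50, 50, 50], support)
model = offer_distribution([50, 50, 40, 50, 50], support)

print("majority baseline:", kl_divergence(human, majority))
print("model:", kl_divergence(human, model))
```

A smaller value means that the simulated distribution of offers is closer to the human one; the divergence is zero only when the two distributions are identical.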

| Game | Model | US | India |
|---|---|---|---|
| Dictator Game | Selfish Model | 12% | 20% |
| Dictator Game | Majority Model | 52% | 20% |
| Dictator Game | Our Model | 56% | 28% |
| Ultimatum Game | Selfish Model | 0% | 4% |
| Ultimatum Game | Majority Model | 52% | 56% |
| Ultimatum Game | Our Model | 48% | 64% |

Table 3. Accuracy of the Prediction of the Offers

5 Conclusions and Future Work

In this paper we provided an algorithm for mapping people's reported values to the utility calculation component of an agent. Cultural variations in behavior are captured by adjusting the weights of the attributes in the model according to the data and by calculating the Bayesian probabilities of the decisions occurring in the culture group. We compared the output of our model to human behavioral data using two well-known games, the Dictator Game and the Ultimatum Game, which are widely used by researchers to study decision making among different groups of people. One advantage of this approach is that it is based on a short 8-question survey and does not rely on other culture models. We acknowledge that there are other possible mapping functions for this purpose. We address these mappings in future work.

References

1. Hofstede, G., Hofstede, G.J., Minkov, M.: Cultures and Organizations: Software of the Mind, 3rd edn. McGraw-Hill Professional (2010)
2. Nouri, E., Georgila, K., Traum, D.: Culture-specific models of negotiation for virtual characters: multi-attribute decision-making based on culture-specific values. Journal of AI and Society 1(1) (2014)
3. Nouri, E., Traum, D.: A cultural decision-making model for virtual agents playing negotiation games. In: Proceedings of the International Workshop on Culturally Motivated Virtual Characters, 11th International Conference on Intelligent Virtual Agents (2011)
4. Nouri, E., Traum, D.: Prediction of game behavior based on culture factors. In: Proceedings of the Group Decision and Negotiation Conference (2013)
5. Rawls, J.: Some reasons for the maximin criterion. The American Economic Review 64(2) (1974)