Modeling individual differences in socioeconomic game playing

Derrik E. Asher 1, Shunan Zhang 1, Andrew Zaldivar 1, Michael D. Lee 1 and Jeffrey L. Krichmar 1,2
1 Department of Cognitive Sciences, University of California, Irvine
2 Department of Computer Science, University of California, Irvine

Abstract

Game theory has been useful for understanding risk-taking and cooperative behavior. In the present study, subjects played the Hawk-Dove game with simulated and embodied (robotic) neural agents which used a neurobiologically plausible model of action selection and adaptive behaviors. Subjects had their serotonin levels temporarily altered through acute tryptophan depletion (ATD). The traditional assumption for subject data from game-theory ATD or human-robot interaction (HRI) studies is that all participants come from the same underlying distribution or group. We used probabilistic graphical models to determine potential sub-group affiliations based on the subjects' responses while playing the Hawk-Dove game. The results from the models indicate that sub-groups exist within the subject population. We find that a two-group model, with one group that tends toward cooperation and another that tends toward aggression, best describes subject behavior in response to ATD and embodiment.

Keywords: Adaptive systems; Human-robot interaction; Neurotransmitters; Cognitive Robotics; Bayesian inference; Graphical models; Individual Differences.

Introduction

Economic game theory has had a long, productive history of predicting and describing human behavior in cooperative and competitive situations (Maynard Smith, 1982; Nowak, Page, & Sigmund, 2000; Skyrms, 2001). The theory of games has also been used to illuminate the neural basis of economic and social decision-making (Lee, 2008; Rilling & Sanfey, 2011). However, these studies typically have people play against opponents with set strategies and predictable behavior. Moreover, in most of these studies, subjects make decisions while sitting in front of an antiseptic computer screen. The present study addresses these issues by having subjects play a socioeconomic game, known as Hawk-Dove, against an autonomous robot with the ability to adapt its behavior to the game situation.

Neuromodulatory systems, such as dopamine and serotonin, appear to be relevant to decision-making in social situations. The serotonergic (5-HT) and dopaminergic (DA) systems oppose each other with respect to predicting punishment (5-HT) versus predicting reward (DA) (Boureau & Dayan, 2011).

We developed a computational model of neuromodulation and action selection based on the assumptions that dopamine levels are related to the expected reward of an action, and that serotonin levels are related to the expected cost or risk of an action (Asher, Zaldivar, & Krichmar, 2010; Zaldivar, Asher, & Krichmar, 2010). The model of neuromodulation and action selection demonstrated the ability to adapt to the game situation and its opponent's strategy. The model was embedded in both simulated and embodied neural agents to investigate reciprocal social interactions in games of cooperation and conflict with people (Asher, Zaldivar, Barton, Brewer, & Krichmar, submitted).

Subjects played a series of Hawk-Dove games against robotic and simulated agents. The effects of serotonergic levels on adaptive behavior in these games were tested either by simulating serotonergic lesions in the neural agent, which result in a more aggressive agent, or by lowering the central nervous system (CNS) serotonin levels of people through a dietary manipulation called acute tryptophan depletion (ATD), which has been shown to decrease cooperation and lower harm-aversion (Crockett, Clark, Tabibnia, Lieberman, & Robbins, 2008; Wood, Rilling, Sanfey, Bhagwagar, & Rogers, 2006). A major finding of the study was that people changed their overall strategies in response to changes in the neural agent's state. Subjects tended to deploy either Tit-For-Tat (T4T) or Win-Stay, Lose-Shift (WSLS) strategies during game play. In a T4T strategy, a subject copies the most recent move of the opposing player. In a WSLS strategy, a subject repeats the previous action if it led to a positive payoff (Win-Stay), or switches to a different action if the previous action led to zero or negative payoff (Lose-Shift). When playing against a more aggressive neural agent, which had a lesion to its serotonergic system, subjects switched from a WSLS strategy to a T4T strategy. This change in strategy was independent of whether the neural agent was a robot or a computer simulation, and independent of subject tryptophan levels.

In the present study, we test whether embodiment and lowering serotonin have an effect on individual subject behavior during Hawk-Dove game playing by using hierarchical latent mixture models with Bayesian inference. This framework for developing and evaluating structured cognition offers a principled and comprehensive approach for modeling individual differences and their use of cognitive strategies (Lee, 2008; Lee, Zhang, Munro, & Steyvers, 2011). The hierarchical nature of the models allows variation across individuals in the parameters controlling cognitive processes to be accommodated. We find that two categories of subjects, one that tends to be more aggressive and one that tends to be more cooperative, best describe subject behavior in response to ATD and embodiment.
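The Tit-For-Tat and Win-Stay, Lose-Shift rules described in the Introduction can be written as two small decision functions. The following is a minimal sketch, not code from the study: the move labels, function names, and the use of a numeric payoff are assumptions made for illustration, and the first move of a game series (for which no previous outcome exists) is deliberately left unspecified.

```python
# Hypothetical labels for the two Hawk-Dove actions (illustrative only).
HAWK = "hawk"
DOVE = "dove"


def tit_for_tat(opponent_last_move):
    """T4T: copy the most recent move of the opposing player."""
    return opponent_last_move


def win_stay_lose_shift(own_last_move, last_payoff):
    """WSLS: repeat the previous action after a positive payoff,
    otherwise (zero or negative payoff) switch to the other action."""
    if last_payoff > 0:
        return own_last_move  # Win-Stay
    return DOVE if own_last_move == HAWK else HAWK  # Lose-Shift


if __name__ == "__main__":
    # Example: the subject played Dove, the opponent played Hawk,
    # and the subject received a payoff of zero.
    print(tit_for_tat(HAWK))             # copies the opponent's Hawk
    print(win_stay_lose_shift(DOVE, 0))  # shifts away from Dove
```

Because T4T depends only on the opponent's last move while WSLS depends only on the subject's own last move and payoff, the two rules can be told apart from a sequence of game outcomes whenever they predict different next moves.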
Experiment

Subjects

Eight subjects (three female; mean age: 26.6 years; standard deviation of age: 3.8 years) participated in this study.
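With a sample of this size, the two-group idea from the Introduction can be illustrated with a deliberately simplified latent mixture. The sketch below is not the hierarchical model used in this study. It assumes that each subject's number of Dove (cooperative) choices is binomial, that the two groups have cooperation rates theta_agg < theta_coop with uniform priors on a grid, and that membership in either group is equally likely a priori. The counts, the number of games, and all function names are hypothetical and chosen only for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials with rate p."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def cooperative_membership(dove_counts, n_games, grid_size=49):
    """Posterior probability that each subject belongs to the cooperative
    group, marginalized over a grid of (theta_agg, theta_coop) pairs
    with theta_agg < theta_coop (assumed model, see lead-in)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    evidence = 0.0
    weighted = [0.0] * len(dove_counts)
    for idx, theta_agg in enumerate(grid):
        like_agg = [binom_pmf(k, n_games, theta_agg) for k in dove_counts]
        for theta_coop in grid[idx + 1:]:
            like_coop = [binom_pmf(k, n_games, theta_coop) for k in dove_counts]
            # Each subject's likelihood under a 50/50 mixture of the groups.
            mix = [0.5 * a + 0.5 * c for a, c in zip(like_agg, like_coop)]
            joint = math.prod(mix)  # unnormalized posterior of this grid point
            evidence += joint
            for i, (c, m) in enumerate(zip(like_coop, mix)):
                weighted[i] += joint * (0.5 * c / m)
    return [w / evidence for w in weighted]


if __name__ == "__main__":
    # Hypothetical Dove counts out of 20 games for eight subjects.
    hypothetical_counts = [3, 5, 4, 16, 17, 15, 4, 18]
    for k, p in zip(hypothetical_counts,
                    cooperative_membership(hypothetical_counts, 20)):
        print(f"Dove choices: {k:2d}  P(cooperative group) = {p:.3f}")
```

Ordering the two rates (theta_agg < theta_coop) keeps the group labels identifiable. A hierarchical version would additionally place group-level distributions over individual subject parameters.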

[1] Jeffrey L. Krichmar, et al. Effect of neuromodulation on performance in game playing: A modeling study, 2010, 2010 IEEE 9th International Conference on Development and Learning.

[2] D. Nutt, et al. Acute Tryptophan Depletion. Part II: Clinical Effects and Implications, 2005.

[3] Daeyeol Lee. Game theory and neural basis of social decision making, 2008, Nature Neuroscience.

[4] K. Lesch, et al. Looking on the Bright Side of Serotonin Transporter Gene Variation, 2011, Biological Psychiatry.

[5] Matthew D. Lieberman, et al. Serotonin Modulates Behavioral Reactions to Unfairness, 2008, Science.

[6] Jeffrey N. Rouder, et al. A hierarchical model for estimating response time distributions, 2005, Psychonomic Bulletin & Review.

[7] Saori C. Tanaka, et al. Low-Serotonin Levels Increase Delayed Reward Discounting in Humans, 2008, The Journal of Neuroscience.

[8] J. M. Smith, et al. Evolution and the theory of games, 1976.

[9] Jonathan D. Cohen, et al. The Neural Basis of Economic Decision-Making in the Ultimatum Game, 2003, Science.

[10] E. Wagenmakers, et al. Bayesian parameter estimation in the Expectancy Valence model of the Iowa gambling task, 2010.

[11] J. Rilling, et al. The neuroscience of social decision-making, 2011, Annual Review of Psychology.

[12] M. Lee. Three case studies in the Bayesian analysis of cognitive models, 2008, Psychonomic Bulletin & Review.

[13] D. Nutt, et al. Acute tryptophan depletion. Part II: clinical effects and implications, 2005, The Australian and New Zealand Journal of Psychiatry.

[14] C. Breazeal, et al. Robots that imitate humans, 2002, Trends in Cognitive Sciences.

[15] Cynthia Breazeal, et al. Effect of a robot on user perceptions, 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).

[16] B. Newell, et al. The Right Tool for the Job? Comparing an Evidence Accumulation and a Naive Strategy Selection Model of Decision Making, 2011.

[17] J. Rilling, et al. Effects of Tryptophan Depletion on the Performance of an Iterated Prisoner's Dilemma Game in Healthy Adults, 2006, Neuropsychopharmacology.

[18] Jeffrey L. Krichmar, et al. Brain-Based Devices for the Study of Nervous Systems and the Development of Intelligent Machines, 2005, Artificial Life.

[19] Saori C. Tanaka, et al. Serotonin Differentially Regulates Short- and Long-Term Prediction of Rewards in the Ventral and Dorsal Striatum, 2007, PLoS ONE.

[20] P. Dayan, et al. Opponency Revisited: Competition and Cooperation Between Dopamine and Serotonin, 2010, Neuropsychopharmacology.

[21] Michael D. Lee, et al. Psychological models of human and optimal performance in bandit problems, 2011, Cognitive Systems Research.

[22] Jeffrey L. Krichmar, et al. Simulation of How Neuromodulation Influences Cooperative Behavior, 2010, SAB.

[23] P. Chapman. Unknown Title, 2008, Integrated Environmental Assessment and Management.

[24] T. Leddy. Pacific Division of the American Philosophical Association, 1989.

[25] M. Nowak, et al. Fairness versus reason in the ultimatum game, 2000, Science.

[26] Jeffrey L. Krichmar, et al. Reciprocity and Retaliation in Social Games With Adaptive Agents, 2012, IEEE Transactions on Autonomous Mental Development.

[27] W. M. Keck, et al. Machine Psychology: Autonomous Behavior, Perceptual Categorization and Conditioning in a Brain-based Device, 2002.