Convergence in human decision-making dynamics

The two-alternative forced-choice task, a class of binary decision-making tasks, has been used extensively in psychology and behavioral economics experiments to investigate human decision making. The human subject chooses between two options at regular time intervals and receives a reward after each choice. For a variety of reward structures, these experiments show that aggregate behavior converges to rewards that are often suboptimal. In this paper we present two models of human decision making: the Win-Stay, Lose-Switch (WSLS) model and a deterministic limit of the popular Drift Diffusion (DD) model. With these models we prove that human behavior converges to the observed aggregate decision making for reward structures with matching points. The analysis is motivated by human-in-the-loop systems, where humans are often required to make repeated choices among finite alternatives in response to evolving system performance measures. We discuss how the convergence result applies to the design of human-in-the-loop systems, using a map from the human subject to a human supervisor.
