Applying Cognitive Architectures to Decision-Making: How Cognitive Theory and the Equivalence Measure Triumphed in the Technion Prediction Tournament

Terrence C. Stewart (tcstewar@uwaterloo.ca)
Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, ON, N2L 3G1

Robert West (robert_west@carleton.ca)
Institute of Cognitive Science, Carleton University, Ottawa, ON, K1S 5B6

Christian Lebiere (cl@cmu.edu)
Psychology Department, Carnegie Mellon University, Pittsburgh, PA, 15213

Abstract

For the Technion Prediction Tournament, we developed a model of making repeated binary choices between a safe option and a risky option. The model is based on the ACT-R declarative memory system, with the use of the Blending mechanism and sequential dependencies. By using established cognitive theory, rather than specialized machine learning techniques, our model was the most predictive when generalizing to new conditions. However, we did not tweak parameters to minimize prediction error; instead, we maximized the number of different conditions producing statistically equivalent behavior. Had we not done this, the model would not have won the tournament. This leads to the paradoxical result that, by emphasizing cognitive explanation over prediction, we achieved more accurate predictions.

Keywords: decision-making; cognitive modeling; equivalence; ACT-R; blending; sequential dependencies

Repeated Binary Choice Decisions

The effects of rewards on decision-making are highly studied, and a wide variety of effects have been observed. In the simplest paradigm, two choices are presented to a participant, and once a decision has been made an explicit numerical reward is provided. If this process is repeated many times, participants will start to favor one choice over the other if it is rewarded more. The standard empirical result is "probability matching": if option A has a probability p of providing more reward than option B, then participants choose option A with probability p. Interestingly, this is very different from the optimal strategy of always choosing A if p > 0.5 and otherwise always choosing B. However, Friedman and Massaro (1998) note that "probability matching in binary choice ... is less robust than most psychologists seem to believe."

Many more complex effects have since been identified. For example, in the Loss Rate Effect, "when the action that maximizes expected value increases the probability of losses, people tend to avoid it" (Erev & Barron, 2005, p. 917). That is, a choice that has a higher expected value in the long run will be chosen less often if it consists of many small losses and few large gains. In the Payoff Variability Effect, individuals will switch from risk-seeking to risk-aversion depending on the variance of the reward. However, many of these effects are subject to variation as a function of learning, individual differences, and other factors (Lebiere, Gonzalez & Martin, 2007).

Technion Prediction Tournament

To encourage the creation and evaluation of models of this fundamental component of human decision-making, Ido Erev organized a competitive modeling tournament called the Technion Prediction Tournament. The tournament had three divisions; the division of interest for us involved modeling human behavior in different versions of a repeated binary-choice game.

Empirical data was gathered on 120 randomly chosen conditions with different rewards. In each condition, one option always produced the same deterministic reward M, while the other option produced the reward H with probability pH and otherwise produced the reward L. Rewards and probabilities were chosen to make the expected value of each choice roughly even, emphasizing attitudes toward risk rather than abilities to estimate reward. For each condition, 20 participants made 100 decisions, receiving a numerical reward after each choice. This type of task is meant to capture the essential qualities of what most people would call a game or competition (e.g., tennis, baseball, boxing, rock-paper-scissors, poker).

The competition also included two other divisions in which only a single choice was made, after subjects either learned or were told the reward structure. Intuitively, those divisions model informed human decision-making; neither is considered here. For complete details on the tournament, see Erev et al. (in press).

As part of the competition, empirical data on 60 of the 120 conditions was publicly released. Researchers were free to use this data to produce predictive models, which were then tested by examining their predictions on the remaining 60 conditions. The model presented here won the tournament in the repeated game division. That is, it produced more accurate predictions (in terms of mean squared error) on the testing data set than any of the other models in the division. Due to
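The structure of one condition in the repeated game division can be sketched as a short simulation. This is a minimal illustration, not the tournament's actual code: the condition parameters in the usage example (M = 3, H = 4, pH = 0.8, L = 0, giving a risky expected value of 3.2, roughly even with the safe 3) are hypothetical, and a simple probability-matching chooser stands in for a real participant.

```python
import random

def run_condition(m, h, p_h, l, n_trials=100, seed=0):
    """Simulate one condition of the repeated binary-choice game.

    A 'safe' option always pays m; a 'risky' option pays h with
    probability p_h and l otherwise. The agent probability-matches:
    it picks the risky option with probability equal to the fraction
    of past trials on which the risky payoff exceeded the safe one.
    Returns the proportion of trials on which risky was chosen.
    """
    rng = random.Random(seed)
    risky_wins = 0     # trials so far where risky paid more than safe
    risky_choices = 0  # trials on which the agent chose risky
    for t in range(n_trials):
        # draw this trial's risky payoff
        risky_payoff = h if rng.random() < p_h else l
        # matching probability estimated from experience (0.5 before any data)
        p_est = risky_wins / t if t > 0 else 0.5
        if rng.random() < p_est:
            risky_choices += 1
        if risky_payoff > m:
            risky_wins += 1
    return risky_choices / n_trials

# Hypothetical condition: safe M=3 vs. risky H=4 (p=0.8) else L=0
risky_rate = run_condition(m=3, h=4, p_h=0.8, l=0)
```

Since the risky option pays more than the safe option with probability pH here, a probability-matching agent drifts toward choosing risky at roughly that rate, rather than always choosing the higher-expected-value option as a maximizer would.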

[1] Zenon W. Pylyshyn. Computation and Cognition: Toward a Foundation for Cognitive Science. MIT Press, 1984.

[2] Alex M. Andrew. Review of Computation and Cognition: Towards a Foundation for Cognitive Science, by Zenon W. Pylyshyn (MIT Press, Cambridge, Mass.). Robotica, 1985.

[3] John R. Anderson et al. Reflections of the Environment in Memory. 1991.

[4] Anthony C. Davison et al. Bootstrap Methods and Their Application. 1998.

[5] C. Lebiere et al. The Atomic Components of Thought. 1998.

[6] Dominic W. Massaro et al. Understanding Variability in Binary and Continuous Choice. 1998.

[7] Scott Sanner et al. Achieving Efficient and Cognitively Plausible Learning in Backgammon. ICML, 2000.

[8] Christian Lebiere et al. Simple Games as Dynamic, Coupled Systems: Randomness and Other Emergent Properties. Cognitive Systems Research, 2001.

[9] Christian Lebiere et al. Sequence Learning in the ACT-R Cognitive Architecture: Empirical Analysis of a Hybrid Model. In Sequence Learning, 2001.

[10] C. L. Giles et al. Sequence Learning: Paradigms, Algorithms, and Applications. 2001.

[11] W. Tryon. Evaluating statistical difference, equivalence, and indeterminacy using inferential confidence intervals: An integrated alternative method of conducting null hypothesis statistical tests. Psychological Methods, 2001.

[12] L. Barker et al. Assessing equivalence: An alternative to the use of difference tests for measuring disparities in vaccination coverage. American Journal of Epidemiology, 2002.

[13] Cleotilde Gonzalez et al. Instance-Based Learning in Dynamic Decision Making. 2003.

[14] Dario D. Salvucci et al. Choice and Learning under Uncertainty: A Case Study in Baseball Batting. 2003.

[15] I. Erev et al. On adaptation, maximization, and reinforcement learning among cognitive strategies. Psychological Review, 2005.

[16] R. Sun. Cognition and Multi-Agent Interactions: From Cognitive Modeling to Social Simulation. 2005.

[17] C. Lebiere et al. Stochastic Resonance in Human Cognition: ACT-R Versus Game Theory, Associative Neural Networks, Recursive Neural Networks, Q-Learning, and Humans. 2005.

[18] Christian Lebiere et al. Cognition and Multi-Agent Interaction: Cognitive Architectures, Game Playing, and Human Evolution. 2005.

[19] Christian Lebiere et al. Cognition and Multi-Agent Interaction: From Cognitive Modeling to Social Simulation. 2006.

[20] Terrence C. Stewart et al. A Methodology for Computational Cognitive Modelling. 2007.

[21] T. C. Stewart et al. Equivalence: A Novel Basis for Model Comparison. 2007.

[22] Michael K. Martin et al. Instance-Based Decision Making Model of Repeated Binary Choice. 2007.

[23] Alvin E. Roth et al. A Choice Prediction Competition: Choices from Experience and from Description. 2010.