Simple Models of Discrete Choice and Their Performance in Bandit Experiments

Recent operations management papers model customers as solving multiarmed bandit problems, positing that consumers use a particular heuristic when choosing among suppliers. These papers then analyze the resulting competition among suppliers and mathematically characterize the equilibrium actions. It remains an open question, however, whether the customer models on which these analyses are built are reasonable representations of actual consumer choice. In this paper, we empirically investigate how well these choice rules match actual performance as people solve two-armed Bernoulli bandit problems. We find that some of the most analytically tractable models perform best in tests of model fit. We also find that the expected number of consecutive trials of a given supplier is increasing in its expected quality level, with increasing differences, a result consistent with the models' predictions as well as with loyalty effects described in the popular management literature.
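To make the run-length claim concrete, the following is a minimal sketch and not one of the models tested in the paper. It assumes a hypothetical "win-stay, lose-shift" customer who keeps visiting a supplier until the first unsatisfactory trial. Under that assumption the number of consecutive trials is geometric with mean 1/(1 - p), where p is the supplier's success probability, so the expected run length is increasing in p with increasing differences. The function names and parameters are illustrative only.

```python
import random


def mean_run_length(p, n_runs=20000, seed=0):
    """Simulated average number of consecutive trials on one Bernoulli arm.

    Assumption (not from the paper): the customer follows a win-stay,
    lose-shift rule, so a run ends with the first failed trial.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_runs):
        length = 1  # the run always contains at least one trial
        # Each success (probability p) leads to one more trial on the same arm.
        while rng.random() < p:
            length += 1
        total += length
    return total / n_runs


def expected_run_length(p):
    """Closed-form mean of the geometric run length under the same rule."""
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    for p in (0.2, 0.4, 0.6, 0.8):
        print(p, round(mean_run_length(p), 2), round(expected_run_length(p), 2))
```

Equal steps in p produce growing steps in expected run length under this rule: moving from p = 0.6 to p = 0.8 adds more consecutive trials than moving from p = 0.4 to p = 0.6. Richer heuristics would give different magnitudes, so the sketch illustrates only the qualitative pattern.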
