Human Collective Intelligence under Dual Exploration-Exploitation Dilemmas

The exploration-exploitation dilemma is a recurrent adaptive problem for humans as well as non-human animals. Given a fixed time/energy budget, every individual faces a fundamental trade-off between exploring for better resources and exploiting known resources to optimize overall performance under uncertainty. Colonies of eusocial insects are known to solve this dilemma successfully via evolved coordination mechanisms that function at the collective level. For humans and other non-eusocial species, however, this dilemma operates within individuals as well as between individuals, because group members may be motivated to take excessive advantage of others' exploratory findings through social learning. Thus, even though social learning can reduce collective exploration costs, the emergence of a disproportionate number of “information scroungers” may severely undermine its potential benefits. We investigated experimentally whether social learning opportunities might improve the performance of human participants working on a “multi-armed bandit” problem in groups, where they could learn about each other's past choice behaviors. Results showed that, even though information scroungers emerged frequently in groups, social learning opportunities reduced total group exploration time while increasing harvesting from better options, and consequently improved collective performance. Surprisingly, enriching social information by allowing participants to observe others' evaluations of chosen options (e.g., Amazon's 5-star rating system) in addition to choice-frequency information had a detrimental impact on performance compared to the simpler situation with only the choice-frequency information. These results indicate that human groups can handle the fundamental “dual exploration-exploitation dilemmas” successfully, and that social learning about simple choice frequencies can help produce collective intelligence.
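As a rough illustration of the task structure described above, the following Python sketch simulates a group of learners on a shared multi-armed bandit, where each learner can optionally see how often each option was chosen by the group in the previous round. It is not the authors' experimental procedure or model: the softmax choice rule, the additive frequency bonus, and every name and parameter (`run_group`, `social_weight`, `temperature`, `noise_sd`, the arm means) are assumptions made only for illustration.

```python
import math
import random


def softmax_choice(values, temperature, rng):
    """Sample an arm index with probability proportional to exp(value / temperature)."""
    peak = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - peak) / temperature) for v in values]
    return rng.choices(range(len(values)), weights=weights, k=1)[0]


def run_group(arm_means, n_agents=5, n_rounds=100, social_weight=0.0,
              temperature=0.2, noise_sd=0.5, seed=0):
    """Return the mean payoff per choice for a group playing a shared bandit.

    social_weight = 0 gives purely individual learners (assumed baseline);
    larger values add a bonus to arms that more group members chose in the
    previous round (an assumed stand-in for choice-frequency information).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    estimates = [[0.0] * n_arms for _ in range(n_agents)]  # private payoff estimates
    pulls = [[0] * n_arms for _ in range(n_agents)]        # private pull counts
    last_freq = [0.0] * n_arms                             # last round's choice shares
    total = 0.0
    for _ in range(n_rounds):
        counts = [0] * n_arms
        for a in range(n_agents):
            scores = [estimates[a][k] + social_weight * last_freq[k]
                      for k in range(n_arms)]
            arm = softmax_choice(scores, temperature, rng)
            reward = rng.gauss(arm_means[arm], noise_sd)
            pulls[a][arm] += 1
            # Incremental sample-mean update of the chosen arm's estimate.
            estimates[a][arm] += (reward - estimates[a][arm]) / pulls[a][arm]
            counts[arm] += 1
            total += reward
        last_freq = [c / n_agents for c in counts]
    return total / (n_agents * n_rounds)
```

Calling `run_group([0.2, 0.5, 1.0], social_weight=0.0)` and `run_group([0.2, 0.5, 1.0], social_weight=0.5)` over several seeds gives one way to compare individual and socially informed groups in this toy setting; whether the social bonus helps depends on the assumed parameters and says nothing about the empirical results reported here.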
