The wisdom of the inner crowd in three large natural experiments

The quality of decisions depends on the accuracy of estimates of relevant quantities. According to the wisdom of crowds principle, accurate estimates can be obtained by combining the judgements of different individuals1,2. This principle has been successfully applied to improve, for example, economic forecasts3–5, medical judgements6–9 and meteorological predictions10–13. Unfortunately, there are many situations in which it is infeasible to collect judgements of others. Recent research proposes that a similar principle applies to repeated judgements from the same person14. This paper tests this promising approach on a large scale in a real-world context. Using proprietary data comprising 1.2 million observations from three incentivized guessing competitions, we find that within-person aggregation indeed improves accuracy and that the method works better when there is a time delay between subsequent judgements. However, the benefit pales against that of between-person aggregation: the average of a large number of judgements from the same person is barely better than the average of two judgements from different people.

The authors use large, real-world guessing competition datasets to test whether accuracy can be improved by aggregating repeated estimates by the same individual. They find that estimates do improve, but substantially less than with between-person aggregation.
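The contrast between within-person and between-person aggregation can be illustrated with a toy error model. This is only a sketch under stated assumptions, not the authors' analysis: each estimate is taken to equal the truth plus a stable person-specific bias plus independent noise, biases are assumed to average to zero across people, and the function names and variance values are hypothetical.

```python
# Toy model (an assumption, not taken from the paper):
#   estimate = truth + person_bias + noise
# person_bias has variance bias_var across people and mean zero;
# noise has variance noise_var and is independent across estimates.

def mse_within(bias_var: float, noise_var: float, k: int) -> float:
    """Expected squared error of the mean of k estimates from ONE person.

    Averaging shrinks the noise term but leaves the person's bias intact.
    """
    return bias_var + noise_var / k


def mse_between(bias_var: float, noise_var: float, n: int) -> float:
    """Expected squared error of the mean of one estimate each from n people.

    Averaging across people shrinks both the bias and the noise term.
    """
    return (bias_var + noise_var) / n


if __name__ == "__main__":
    bias_var, noise_var = 1.0, 1.0  # hypothetical values for illustration
    for k in (1, 2, 4, 100):
        print(f"one person, {k:>3} estimates: {mse_within(bias_var, noise_var, k):.3f}")
    print(f"two people, 1 estimate each: {mse_between(bias_var, noise_var, 2):.3f}")
```

Under this model the within-person error can never fall below `bias_var`, however many estimates are averaged, whereas the between-person error keeps shrinking as people are added. That is one simple way to see why a large inner crowd might do only about as well as a pair of different people.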

[1]  Aaron S Benjamin,et al.  Knowing the crowd within: Metacognitive limits on combining multiple judgments. , 2014, Journal of memory and language.

[2]  H. J. Eysenck,et al.  The validity of judgments as a function of the number of judges. , 1939 .

[3]  A. C. Haddon,et al.  Memories of My Life , 1908, Nature.

[4]  S. Dehaene,et al.  The Number Sense: How the Mind Creates Mathematics. , 1998 .

[5]  Jan Lorenz,et al.  The wisdom of crowds in one mind: How individuals can simulate the knowledge of diverse societies to reach better decisions , 2011 .

[6]  D. Helbing,et al.  How social influence can undermine the wisdom of crowd effect , 2011, Proceedings of the National Academy of Sciences.

[7]  F. Sanders On Subjective Probability Forecasting , 1963 .

[8]  A. Timmermann Forecast Combinations , 2005 .

[9]  Stefan M. Herzog,et al.  Think twice and then: combining or choosing in dialectical bootstrapping? , 2014, Journal of experimental psychology. Learning, memory, and cognition.

[10]  Christian Genest,et al.  Combining Probability Distributions: A Critique and an Annotated Bibliography , 1986 .

[11]  Francis Tuerlinckx,et al.  Measuring the crowd within again: a pre-registered replication study , 2013, Front. Psychol..

[12]  K. Gordon,et al.  Further Observations on Group Judgments of Lifted Weights , 1935 .

[13]  Johannes Müller-Trede Repeated judgment sampling: Boundaries , 2011, Judgment and Decision Making.

[14]  V. Genre,et al.  Combining expert forecasts: Can anything beat the simple average? , 2012 .

[15]  S. F. Klugman,et al.  Group Judgments for Familiar and Unfamiliar Materials , 1945 .

[16]  Stefan M. Herzog,et al.  The Wisdom of Many in One Mind , 2009, Psychological science.

[17]  M. G. Preston Note on the reliability and the validity of the group judgment. , 1938 .

[18]  Stefan M. Herzog,et al.  Harnessing the wisdom of the inner crowd , 2014, Trends in Cognitive Sciences.

[19]  R. H. Hooker Mean or Median , 1907, Nature.

[20]  Jens Krause,et al.  Detection Accuracy of Collective Intelligence Assessments for Skin Cancer Diagnosis. , 2015, JAMA dermatology.

[21]  Jonathan Baron,et al.  Combining multiple probability predictions using a simple logit model , 2014 .

[22]  C. Holstein An Experiment in Probabilistic Weather Forecasting , 1971 .

[23]  F. Galton The Ballot-Box , 1907, Nature.

[24]  Michael Vitale,et al.  The Wisdom of Crowds , 2015, Cell.

[25]  R. Clemen Combining forecasts: A review and annotated bibliography , 1989 .

[26]  A. Nieder Counting on neurons: the neurobiology of numerical competence , 2005, Nature Reviews Neuroscience.

[27]  Richard P. Larrick,et al.  Intuitions About Combining Opinions: Misappreciation of the Averaging Principle , 2006, Manag. Sci..

[28]  Anders Krogh,et al.  Neural Network Ensembles, Cross Validation, and Active Learning , 1994, NIPS.

[29]  J. Armstrong,et al.  Principles of Forecasting: A Handbook for Researchers and Practitioners , 2006 .

[30]  Robin M. Hogarth,et al.  A note on aggregating opinions , 1978 .

[31]  Jeffrey A. Baars,et al.  Performance of national weather service forecasts compared to operational, consensus, and weighted model output statistics , 2005 .

[32]  Stefan M. Herzog,et al.  Boosting medical diagnostics by pooling independent judgments , 2016, Proceedings of the National Academy of Sciences.

[33]  Lisa Werner,et al.  Principles of forecasting: A handbook for researchers and practitioners , 2002 .

[34]  A. Jenness,et al.  The role of discussion in changing opinion regarding a matter of fact. , 1932 .

[35]  T. L. Kelley The applicability of the Spearman-Brown formula for the measurement of reliability. , 1925 .

[36]  H. Pashler,et al.  Measuring the Crowd Within , 2008, Psychological science.

[37]  J. Stroop Is the judgment of the group better than that of the average member of the group , 1932 .

[38]  A. Benjamin,et al.  Smaller is better (when sampling from the crowd within): Low memory-span individuals benefit more from multiple opportunities for estimation. , 2010, Journal of experimental psychology. Learning, memory, and cognition.

[39]  James Surowiecki The wisdom of crowds: Why the many are smarter than the few and how collective wisdom shapes business, economies, societies, and nations , 2004, Doubleday Books.

[40]  Jonathan Baron,et al.  Two Reasons to Make Aggregated Probability Forecasts More Extreme , 2014, Decis. Anal..

[41]  K. Gordon,et al.  Group judgments in the field of lifted weights. , 1924 .

[42]  Julie L. Booth,et al.  Development of numerical estimation in young children. , 2004, Child development.

[43]  Julie L. Booth,et al.  Developmental and individual differences in pure numerical estimation. , 2006, Developmental psychology.

[44]  Marco Zorzi,et al.  Numerical estimation in preschoolers. , 2010, Developmental psychology.

[45]  J. Krueger,et al.  The First Cut is the Deepest: Effects of Social Projection and Dialectical Bootstrapping on Judgmental Accuracy , 2014 .

[46]  Ralf H. J. M. Kurvers,et al.  Collective Intelligence Meets Medical Decision-Making: The Collective Outperforms the Best Radiologist , 2015, PloS one.

[47]  H Gu,et al.  The effects of averaging subjective probability estimates between and within judges. , 2000, Journal of experimental psychology. Applied.

[48]  F. Galton Vox Populi , 1907, Nature.

[49]  R. Siegler,et al.  The Development of Numerical Estimation , 2003, Psychological science.

[50]  R. L. Winkler,et al.  Coherent combination of experts' opinions , 1995 .

[51]  Jack L. Treynor Market Efficiency and the Bean Jar Experiment , 1987 .

[52]  Melissa E. Libertus,et al.  Comment on "Log or Linear? Distinct Intuitions of the Number Scale in Western and Amazonian Indigene Cultures" , 2009, Science.

[53]  W. Heath The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies , 2008 .

[54]  Stefan M. Herzog,et al.  The Potential of Collective Intelligence in Emergency Medicine: Pooling Medical Students’ Independent Decisions Improves Diagnostic Performance , 2017, Medical decision making : an international journal of the Society for Medical Decision Making.

[55]  Calvin Blackwell,et al.  The wisdom of the few or the wisdom of the many? An indirect test of the marginal trader hypothesis , 2011 .

[56]  Robert L. Vislocky,et al.  Improved Model Output Statistics Forecasts through Model Consensus , 1995 .

[57]  Albert E. Mannes Are We Wise About the Wisdom of Crowds? The Use of Group Judgments in Belief Revision , 2009, Manag. Sci..