The Potential of Collective Intelligence in Emergency Medicine: Pooling Medical Students’ Independent Decisions Improves Diagnostic Performance

Background. Evidence suggests that pooling multiple independent diagnoses can improve diagnostic accuracy in well-defined tasks. We investigated whether this is also the case for diagnostics in emergency medicine, an ill-defined task environment where diagnostic errors are rife.

Methods. A computer simulation study was conducted based on empirical data from 2 published experimental studies. In the computer experiments, 285 medical students independently diagnosed 6 simulated patients arriving at the emergency room with dyspnea. Participants’ diagnoses (n = 1,710), confidence ratings, and expertise levels were entered into a computer simulation. Virtual groups of different sizes were randomly created, and 3 collective intelligence rules (follow-the-plurality rule, follow-the-most-confident rule, and follow-the-most-senior rule) were applied to combine the independent decisions into a final diagnosis. For different group sizes, the performance levels (i.e., percentage of correct diagnoses) of the 3 collective intelligence rules were compared with each other and against the average individual accuracy.

Results. For all collective intelligence rules, combining independent decisions substantially increased performance relative to average individual performance. For groups of 4 or fewer, the follow-the-most-confident rule outperformed the other rules; for larger groups, the follow-the-plurality rule performed best. For example, combining 5 independent decisions using the follow-the-plurality rule increased diagnostic accuracy by 22 percentage points. These results were robust across case difficulty and expertise level.

Limitations of the study include the use of simulated patients diagnosed by medical students. Whether results generalize to clinical practice is currently unknown.

Conclusion. Combining independent decisions may substantially improve the quality of diagnoses in emergency medicine and may thus enhance patient safety.
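The pooling procedure described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' simulation code. The record fields (`diagnosis`, `confidence`, `seniority`), the function names, the random tie-breaking, and the toy records are all assumptions made for illustration; the abstract does not specify how ties were resolved or how the data were encoded.

```python
import random
from collections import Counter

# Each record is one participant's independent decision on a single case.
# Field names are hypothetical; the toy values below are NOT study data.

def plurality(members, rng):
    """Follow-the-plurality: pick the most frequent diagnosis.
    Ties are broken at random (an assumption, not stated in the source)."""
    counts = Counter(m["diagnosis"] for m in members)
    top = max(counts.values())
    return rng.choice(sorted(d for d, c in counts.items() if c == top))

def _follow_max(members, key, rng):
    """Adopt the diagnosis of the member with the highest value of `key`,
    choosing at random among members tied for the maximum (assumption)."""
    best = max(m[key] for m in members)
    return rng.choice([m["diagnosis"] for m in members if m[key] == best])

def most_confident(members, rng):
    """Follow-the-most-confident rule."""
    return _follow_max(members, "confidence", rng)

def most_senior(members, rng):
    """Follow-the-most-senior rule."""
    return _follow_max(members, "seniority", rng)

def simulate(pool, truth, rule, group_size, n_groups=1000, seed=0):
    """Draw random virtual groups from `pool` (all decisions for one case)
    and return the share of groups whose pooled diagnosis equals `truth`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_groups):
        group = rng.sample(pool, group_size)  # sampling without replacement
        hits += rule(group, rng) == truth
    return hits / n_groups

# Toy group of three, constructed so that no rule needs a tie-break.
demo_group = [
    {"diagnosis": "A", "confidence": 3, "seniority": 1},
    {"diagnosis": "B", "confidence": 9, "seniority": 2},
    {"diagnosis": "A", "confidence": 5, "seniority": 7},
]
demo_rng = random.Random(0)
print(plurality(demo_group, demo_rng))       # A (two of three votes)
print(most_confident(demo_group, demo_rng))  # B (confidence 9)
print(most_senior(demo_group, demo_rng))     # A (seniority 7)
```

Calling `simulate` once per rule and group size on the same pool, and comparing each result with the pool's mean individual accuracy, would mirror the comparison described in the Methods.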
