How generalizable is good judgment? A multi-task, multi-benchmark study

Good judgment is often gauged against two gold standards – coherence and correspondence. Judgments are coherent if they demonstrate consistency with the axioms of probability theory or propositional logic. Judgments are correspondent if they agree with ground truth. When gold standards are unavailable, silver standards such as consistency and discrimination can be used to evaluate judgment quality. Individuals are consistent if they assign similar judgments to comparable stimuli, and they discriminate if they assign different judgments to dissimilar stimuli. We ask whether "superforecasters", individuals with noteworthy correspondence skills (see Mellers et al., 2014), show superior performance on laboratory tasks assessing other standards of good judgment. Results showed that superforecasters either tied with or outperformed less correspondent forecasters and undergraduates with no forecasting experience on tests of consistency, discrimination, and coherence. While multifaceted, good judgment may be a more unified concept than previously thought.

[1]  T. R. Stewart,et al.  Dimensions of Judgment: Factor Analysis of Individual Differences , 2012 .

[2]  Jay J.J. Christensen-Szalanski,et al.  Physicians' use of probabilistic information in a real clinical setting. , 1981 .

[3]  Peter Ayton,et al.  Task influences on judgemental forecasting , 1987 .

[4]  Jennifer Tsai,et al.  Coherence and Correspondence Competence: Implications for Elicitation and Aggregation of Probabilistic Forecasts of World Events , 2012 .

[5]  M. Dhami Bailing and Jailing the Fast and Frugal Way , 2001, Journal of Behavioral Decision Making, 14: 141-168. DOI: 10.1002/bdm.371

[6]  Thomas S. Wallsten,et al.  Base rate effects on the interpretations of probability and frequency expressions , 1986 .

[7]  Adam J. L. Harris,et al.  Communicating environmental risks: Clarifying the severity effect in interpretations of verbal probability expressions. , 2011, Journal of experimental psychology. Learning, memory, and cognition.

[8]  J. Shanteau COMPETENCE IN EXPERTS: THE ROLE OF TASK CHARACTERISTICS , 1992 .

[9]  A. Piquero,et al.  USING THE CORRECT STATISTICAL TEST FOR THE EQUALITY OF REGRESSION COEFFICIENTS , 1998 .

[10]  David R. Mandel,et al.  Instruction in information structuring improves Bayesian judgment in intelligence analysts , 2015, Front. Psychol..

[11]  David R. Mandel,et al.  Probabilistic Coherence Weighting for Optimizing Expert Forecasts , 2013, Decis. Anal..

[12]  Ebbe B. Ebbesen,et al.  Decision Making and Information Integration in the Courts: The Setting of Bail , 1975 .

[13]  J. Payne,et al.  Crime seriousness, recidivism risk, and causal attributions in judgments of prison term by students and experts. , 1977 .

[14]  Valerie F Reyna,et al.  Developmental Reversals in Risky Decision Making , 2014, Psychological science.

[15]  Paul E. Meehl,et al.  Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence , 1996 .

[16]  Peter Ayton,et al.  The psychology of forecasting , 1986 .

[17]  Philip Tetlock,et al.  The psychology of intelligence analysis: drivers of prediction accuracy in world politics. , 2015, Journal of experimental psychology. Applied.

[18]  Robert A. Olsen Desirability bias among professional investment managers: some evidence from experts , 1997 .

[19]  J. Scott Armstrong,et al.  Structured Analogies for Forecasting , 2007 .

[20]  K. Stanovich What Intelligence Tests Miss , 2017 .

[21]  H. Vincent Poor,et al.  Aggregating Large Sets of Probabilistic Forecasts by Weighted Coherent Adjustment , 2011, Decis. Anal..

[22]  George Wright,et al.  Coherence, Calibration, and Expertise in Judgmental Probability Forecasting , 1994 .

[23]  Amnon Rapoport,et al.  Measuring the Vague Meanings of Probability Terms , 1986 .

[24]  J. Baron,et al.  Heuristics and biases in diagnostic reasoning: II. Congruence, information, and certainty , 1988 .

[25]  R. Zeckhauser,et al.  The Value of Precision in Probability Assessment: Evidence from a Large-Scale Geopolitical Forecasting Tournament , 2018 .

[26]  A. Tversky,et al.  Evidential impact of base rates , 1981 .

[27]  D. Mandel Accuracy of Intelligence Forecasts From the Intelligence Consumer’s Perspective , 2015 .

[28]  Kenneth R. Hammond,et al.  Beyond Rationality: The Search for Wisdom in a Troubled Time , 2007 .

[29]  A. H. Murphy,et al.  Probability Forecasting in Meteorology , 1984 .

[30]  H. J. Einhorn Expert judgment: Some necessary conditions and an example. , 1974 .

[31]  Ray W. Cooksey,et al.  Judgment analysis : theory, methods, and applications , 1996 .

[32]  Brad M. Barber,et al.  Boys Will Be Boys: Gender, Overconfidence, and Common Stock Investment , 1998 .

[33]  Tobias F. Rötheli Superforecasting: the art and science of prediction , 2017 .

[34]  H. Arkes,et al.  Failure to Adopt Beneficial Therapies Caused by Bias in Medical Evidence Evaluation , 2006, Medical decision making : an international journal of the Society for Medical Decision Making.

[35]  Geoff Norman,et al.  Overconfidence in clinical decision making. , 2008, The American journal of medicine.

[36]  Alan Barnes,et al.  Accuracy of forecasts in strategic intelligence , 2014, Proceedings of the National Academy of Sciences.

[37]  Valerie F. Reyna,et al.  Coherence and Correspondence Criteria for Rationality: Experts' Estimation of Risks of Sexually Transmitted Infections. , 2005 .

[38]  Gerd Gigerenzer,et al.  How to Improve Bayesian Reasoning Without Instruction: Frequency Formats , 1995 .

[39]  Robert P. Mahan,et al.  The Use of Base Rate Information as a Function of Experienced Consistency , 2005 .

[40]  Philip Tetlock,et al.  Identifying and Cultivating Superforecasters as a Method of Improving Probabilistic Predictions , 2015, Perspectives on psychological science : a journal of the Association for Psychological Science.

[41]  Sydney E. Scott,et al.  Psychological Strategies for Winning a Geopolitical Forecasting Tournament , 2014, Psychological science.

[42]  David J. Weiss,et al.  Empirical Assessment of Expertise , 2003, Hum. Factors.

[43]  R. Dawes,et al.  Heuristics and Biases: Clinical versus Actuarial Judgment , 2002 .

[44]  Gerd Gigerenzer,et al.  How bad is incoherence , 2016 .

[45]  J. Bring,et al.  How do GPs use clinical information in their judgements of heart failure? A clinical judgement analysis study. , 1998, Scandinavian journal of primary health care.

[46]  Kenneth R. Hammond,et al.  Coherence and correspondence theories in judgment and decision making. , 2000 .

[47]  Philip T. Dunwoody Theories of truth as assessment criteria in judgment and decision making , 2009, Judgment and Decision Making.

[48]  Gideon Keren,et al.  Facing uncertainty in the game of bridge: A calibration study , 1987 .

[49]  G. Gaus,et al.  Expert Political Judgment: How Good Is It? How Can We Know? , 2007, Perspectives on Politics.

[50]  Keith E. Stanovich,et al.  The Rationality Quotient: Toward a Test of Rational Thinking , 2016 .

[51]  Rick P. Thomas,et al.  Criteria for performance evaluation , 2009, Judgment and Decision Making.

[52]  N V Dawson,et al.  Systematic errors in medical decision making: judgment limitations. , 1987, Journal of general internal medicine.

[53]  Elke U. Weber,et al.  Contextual Effects in the Interpretations of Probability Words: Perceived Base Rate and Severity of Events , 1990 .

[54]  H. Vincent Poor,et al.  Aggregating Probabilistic Forecasts from Incoherent and Abstaining Experts , 2008, Decis. Anal..

[55]  Chris Guthrie,et al.  Blinking on the Bench: How Judges Decide Cases , 2007 .

[56]  D. Kahneman,et al.  Conditions for intuitive expertise: a failure to disagree. , 2009, The American psychologist.