Probabilistic Coherence Weighting for Optimizing Expert Forecasts

Methods for eliciting and aggregating expert judgment are necessary when decision-relevant data are scarce. Such methods have been used to aggregate the judgments of a large, heterogeneous group of forecasters, as well as the multiple judgments produced by an individual forecaster. This paper addresses how multiple related individual forecasts can be used to improve the aggregation of probabilities for a binary event across a set of forecasters. We extend previous efforts that use the probabilistic incoherence of an individual forecaster's subjective probability judgments to weight and aggregate the judgments of multiple forecasters, with the goal of increasing forecast accuracy. With data from two studies, we describe an approach for eliciting extra probability judgments to (i) adjust the judgments of each individual forecaster, and (ii) assign weights to the judgments to aggregate over the entire set of forecasters. We show improvement of up to 30% over the established benchmark of a simple equal-weighted average of forecasts. We also describe how this method can be used to remedy the “fifty-fifty blip” that occurs when forecasters use the probability value of 0.5 to represent epistemic uncertainty.
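The two steps named in the abstract, adjusting each forecaster's judgments and weighting forecasters by incoherence, can be illustrated with a minimal sketch. It is not the paper's procedure, which the abstract does not specify. It assumes each forecaster supplies one extra judgment, the probability of the complementary event, and that incoherence is the amount by which the pair fails to sum to 1. The exponential weighting rule, the `beta` parameter, and all function names are hypothetical choices made only for illustration.

```python
import math


def incoherence(p_event, p_complement):
    """Assumed incoherence measure: how far P(A) + P(not A) is from 1."""
    return abs(p_event + p_complement - 1.0)


def coherentize(p_event, p_complement):
    """Adjust P(A) by projecting the pair onto the nearest coherent pair.

    The Euclidean projection of (p, q) onto the line p + q = 1 moves both
    values by half of the excess, which gives (p - q + 1) / 2 for P(A).
    """
    return (p_event - p_complement + 1.0) / 2.0


def coherence_weighted_forecast(judgments, beta=5.0):
    """Aggregate adjusted forecasts with weights that shrink with incoherence.

    judgments: list of (P(A), P(not A)) pairs, one per forecaster.
    beta: hypothetical sensitivity parameter; beta = 0 reduces to the
          equal-weighted average of the adjusted forecasts.
    """
    weights = [math.exp(-beta * incoherence(p, q)) for p, q in judgments]
    adjusted = [coherentize(p, q) for p, q in judgments]
    return sum(w * a for w, a in zip(weights, adjusted)) / sum(weights)


def brier_score(forecast, outcome):
    """Squared error between a probability forecast and a 0/1 outcome."""
    return (forecast - outcome) ** 2


if __name__ == "__main__":
    # Three hypothetical forecasters: coherent, incoherent, coherent.
    judgments = [(0.7, 0.3), (0.9, 0.5), (0.5, 0.5)]
    weighted = coherence_weighted_forecast(judgments, beta=5.0)
    equal = coherence_weighted_forecast(judgments, beta=0.0)
    print("coherence-weighted:", round(weighted, 4))
    print("equal-weighted:", round(equal, 4))
    print("Brier score if the event occurs:", round(brier_score(weighted, 1), 4))
```

In this sketch the second forecaster's pair sums to 1.4, so the forecast is adjusted from 0.9 to 0.7 and receives a smaller weight than the two coherent forecasters. Setting `beta` to 0 gives a simple equal-weighted average, the kind of benchmark the abstract compares against, here applied to the adjusted forecasts.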
