Evaluation of Probabilities: A Level Playing Field?

Scoring rules provide overall measures of the “goodness” of probabilities, and other measures (often from decompositions of scoring rules) relate to specific attributes such as calibration and sharpness. This paper briefly reviews scoring rules and probability evaluation and attempts to extend past work in two directions. The first extension generalizes the development of scoring rules that better reflect the difficulty of forecasting situations and the skill associated with probability forecasts, with the goal of obtaining measures that yield comparable scores. The second extension involves the use of scoring rules to evaluate probabilities without perfect knowledge of the actual outcomes of the events or variables to which the probabilities refer. This is in the spirit of attempting to create a level playing field for probability evaluation by making evaluation measures comparable and expanding the set of probabilities that can be evaluated.
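As a concrete illustration of the kind of measure discussed above, the following is a minimal sketch of the quadratic (Brier) score for binary events and of a skill score that rescales it against a constant base-rate forecast. It is not the specific family of rules developed in the paper. The function names `brier_score` and `skill_score`, and the choice of the sample base rate as the default reference forecast, are assumptions made for this example.

```python
def brier_score(probs, outcomes):
    """Mean quadratic score for binary events (0 is perfect, lower is better)."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - x) ** 2 for p, x in zip(probs, outcomes)) / len(probs)


def skill_score(probs, outcomes, baseline=None):
    """Skill relative to a constant reference forecast: 1 - BS / BS_ref.

    Assumption for this sketch: when no baseline is given, the reference
    forecast is the sample base rate of the observed outcomes.
    """
    if baseline is None:
        baseline = sum(outcomes) / len(outcomes)
    reference = brier_score([baseline] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("reference forecast is already perfect; skill is undefined")
    return 1 - brier_score(probs, outcomes) / reference


if __name__ == "__main__":
    forecasts = [0.9, 0.1, 0.8, 0.2]
    observed = [1, 0, 1, 0]
    print(brier_score(forecasts, observed))
    print(skill_score(forecasts, observed))
```

Dividing by the reference score is one simple way to account for how hard a forecasting situation is: the same raw score counts for more when the base-rate forecast itself scores poorly. A skill score of 1 indicates perfect forecasts, 0 indicates no improvement over the reference, and negative values indicate forecasts worse than the reference.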
