On Testing the Validity of Sequential Probability Forecasts

Events are observed sequentially, and at each stage a probability for the next event is assessed. We consider the relationship between the sequence of probability forecasts and the sequence of outcomes. We argue that the forecasts may be considered “empirically valid” when both these sequences are consistent with a common joint distribution for the events. To aid in assessing validity, we introduce various test statistics that measure, in a natural way, the empirical performance of the probability forecasts in the light of the outcomes obtained. We show that, under the null hypothesis of forecast validity, such statistics will, after suitable normalization, have a standard normal (or chi-squared) distribution, virtually irrespective of the common joint distribution supposed to underlie both sequences.
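
As a rough illustration of the kind of normalized statistic described above, the following Python sketch computes a simple overall calibration statistic for binary events: the sum of (outcome minus forecast), divided by the square root of the sum of the forecast variances p(1 - p). This particular statistic, the function names, and the simulation setup are assumptions chosen for illustration and are not necessarily those studied in the paper. The simulated forecasts depend on the previous outcome, to mimic the sequential setting, and are valid by construction.

```python
import math
import random


def calibration_z(forecasts, outcomes):
    """Normalized sum of (outcome - forecast) for binary outcomes.

    Illustrative statistic (an assumption, not taken from the paper).
    The normalizer is the square root of the sum of p * (1 - p).
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have equal length")
    numerator = sum(y - p for p, y in zip(forecasts, outcomes))
    variance = sum(p * (1.0 - p) for p in forecasts)
    if variance <= 0.0:
        raise ValueError("forecasts must not all be 0 or 1")
    return numerator / math.sqrt(variance)


def simulate_valid_forecasts(n, seed=0):
    """Simulate a sequence in which each forecast is valid by construction.

    Hypothetical setup: the forecast is 0.7 after an outcome of 1 and
    0.3 after an outcome of 0, and each outcome is drawn with exactly
    the forecast probability.
    """
    rng = random.Random(seed)
    forecasts, outcomes = [], []
    previous = 0
    for _ in range(n):
        p = 0.7 if previous == 1 else 0.3
        y = 1 if rng.random() < p else 0
        forecasts.append(p)
        outcomes.append(y)
        previous = y
    return forecasts, outcomes


if __name__ == "__main__":
    forecasts, outcomes = simulate_valid_forecasts(2000)
    z = calibration_z(forecasts, outcomes)
    # Under validity, z should look like one draw from a standard normal.
    print(f"z = {z:.3f}")
```

With valid forecasts, values of z far outside the range of roughly -2 to 2 should be rare. Forecasts that are systematically too high or too low would push z away from zero as the sequence grows.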