Some problems of statistical prediction.

SUMMARY

A general framework is introduced for the study of inference and decision predictions about the outcome of a future experiment from the data of an independent informative experiment. This allows a simple classification of prediction problems, and shows the place of standard inference predictions within the framework. A Bayesian approach to decision prediction is then presented and techniques appropriate to a variety of realistic utility functions are developed. Finally, some prediction problems associated with classes of experiments are considered.

Statistical prediction is the use of the data from an informative experiment E to make some statement about the outcome of a future experiment F. The prediction statements commonly treated in the literature are of inference type, in which the purpose is to give some indication of the likely outcome of F, or to suggest some subset of possible outcomes in which the actual outcome of F is likely to fall. There are also, however, prediction problems of a decision type, for which the decision space consists of subsets of the outcome space of F, and where the prediction is related in a much more precise way to some specific purpose.

Our own interest in the subject has arisen from decision problems in the supply of hospital engineering services (e.g. oxygen, gas, conditioned air, suction). In a simple version the supply system may be supposed to function at a series of independent operations, at each of which a constant quantity r (e.g. number of outlets) of the commodity is available for supply. At each operation of the system some variable quantity y is demanded; this may be below or above r. If y > r the system has failed to meet demand in full, and if y < r the system has oversupplied. The extent to which fixing the supply at r is satisfactory depends on the relative demerits of failing to meet demand and of oversupplying, and on the variation in y.
Here we can suppose F to be the observation of a free demand, unrestricted by the limited supply. The informative experiment E may consist of demands x_1, ..., x_n on an existing similar system which has been overdesigned, so that E consists of n replicates of F. If the existing system is not overdesigned but supplies r_1, say, at each operation, then E may be regarded as n replicates of F, with observations truncated or censored at r_1; this case can be treated only by asymptotic methods and we shall not consider it here.

Our purpose in this paper is first to suggest a clear and flexible framework within which such inference and decision prediction can be discussed, to indicate briefly how existing inference procedures fall within this framework, and then to develop the model towards specific decision prediction procedures. We do this for the case in which E and F are independent experiments; thus we do not consider the situation where the outcome of E is a part realization of a stochastic process and F is the continuation of the process.
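As a rough illustration of the supply problem described above, the following sketch chooses a supply level r from observed demands by minimizing an average loss. It is not the Bayesian procedure developed in this paper: it is a simple plug-in calculation over the empirical distribution of the demands. The piecewise-linear loss, the cost values, the demand figures and all function names are assumptions introduced only for this example.

```python
from typing import Sequence


def linear_loss(r: float, y: float, under_cost: float, over_cost: float) -> float:
    """Assumed loss: proportional to the shortfall if y > r, to the excess otherwise."""
    if y > r:
        return under_cost * (y - r)  # demand not fully met
    return over_cost * (r - y)       # oversupply (zero when y == r)


def empirical_expected_loss(
    r: float, demands: Sequence[float], under_cost: float, over_cost: float
) -> float:
    """Average loss of supplying r against the observed demands x_1, ..., x_n."""
    return sum(linear_loss(r, y, under_cost, over_cost) for y in demands) / len(demands)


def best_supply_level(
    demands: Sequence[float], under_cost: float, over_cost: float
) -> float:
    """Return the observed demand value that minimizes the empirical expected loss.

    With this piecewise-linear loss the average loss is convex in r with kinks
    only at the observed demands, so it suffices to search those values.
    """
    candidates = sorted(set(demands))
    return min(
        candidates,
        key=lambda r: empirical_expected_loss(r, demands, under_cost, over_cost),
    )


# Hypothetical demands from an overdesigned system (illustrative numbers only).
observed_demands = [3, 5, 4, 6, 2, 5, 7, 4]
# Assumed costs: failing to meet demand is four times as costly as oversupplying.
r_star = best_supply_level(observed_demands, under_cost=4.0, over_cost=1.0)
print(r_star, empirical_expected_loss(r_star, observed_demands, 4.0, 1.0))
# prints: 6 2.125
```

Raising the assumed cost of unmet demand relative to oversupply pushes the chosen r towards the upper end of the observed demands, which reflects the dependence on relative demerits and on the variation in y noted above.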