Assessment of Numeric Predictions

Most people divide prediction into two families: classification and numeric prediction. In classification, the goal is to assign an unknown case into one of several competing categories (benign versus malignant, tank versus truck versus rock, and so forth). In numeric prediction, the goal is to assign a specific numeric value to a case (expected profit of a trade, expected yield in a chemical batch, and so forth). Actually, such a clear distinction between classification and numeric prediction can be misleading because they blend into each other in many ways. We may use a numeric prediction to perform classification based on a threshold. For example, we may decree that a tissue sample should be called malignant if and only if a numeric prediction model’s output exceeds, say, 0.76. Conversely, we may sometimes use a classification model to predict values of an ordinal variable, although this can be dangerous if not done carefully. Ultimately, the choice of a numeric or classification model depends on the nature of the data and on a sometimes arbitrary decision by the experimenter. This chapter discusses methods for assessing models that make numeric predictions.