A likelihood ratio approach to classification problems using discrete data

Abstract: One form of inference found in many applied areas involves classifying individuals into one of several possible populations or categories on the basis of inconclusive data. This paper discusses a strategy for classification using data classes that are either naturally discrete or formed by discrete partitions of continuous data classes. The strategy makes use of the likelihood ratio and is scale-free in the sense that the original measures identifying data class levels or states are not used, whatever their scale properties. The approach is algebraically simple: no calculations of covariance or correlation are required, and no distributional assumptions of any kind are needed. A convergence process, applicable even when the number of data classes is small, allows the probability of misclassification errors to be calculated readily using the normal distribution. The approach is particularly useful when data classes are conditionally independent, but it remains useful when this assumption is not valid, provided that there are sufficient data. Also discussed are measures of the classification or inferential value of individual data classes and of various subsets of data classes.
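
The abstract gives no formulas, so the following is only a minimal sketch of how a likelihood ratio classifier over discrete, conditionally independent data classes might look. It is not the paper's own procedure. The two populations `H1` and `H2`, the probability tables `P_H1` and `P_H2`, the zero threshold, and the function names are all hypothetical. The normal approximation to the misclassification probability is an assumption too: it treats the summed log likelihood ratio as approximately normal under `H1`, which is one plausible reading of the convergence process mentioned above.

```python
import math

# Hypothetical conditional probabilities P(state | population) for three
# discrete data classes. The numbers are illustrative, not from the paper.
P_H1 = [[0.7, 0.2, 0.1], [0.6, 0.4], [0.5, 0.3, 0.2]]
P_H2 = [[0.2, 0.3, 0.5], [0.3, 0.7], [0.2, 0.3, 0.5]]


def log_likelihood_ratio(states):
    """Sum the per-class terms log P(state | H1) / P(state | H2).

    Adding the terms assumes the data classes are conditionally
    independent given the population.
    """
    return sum(
        math.log(P_H1[i][s] / P_H2[i][s]) for i, s in enumerate(states)
    )


def classify(states, threshold=0.0):
    """Assign H1 when the log likelihood ratio exceeds the threshold."""
    return "H1" if log_likelihood_ratio(states) > threshold else "H2"


def normal_error_given_h1(threshold=0.0):
    """Approximate P(classified as H2 | H1) with a normal distribution.

    Assumption: the summed log likelihood ratio is roughly normal under
    H1, with mean and variance equal to the sums of the per-class means
    and variances.
    """
    mean = 0.0
    var = 0.0
    for p1, p2 in zip(P_H1, P_H2):
        terms = [math.log(a / b) for a, b in zip(p1, p2)]
        m = sum(a * t for a, t in zip(p1, terms))
        mean += m
        var += sum(a * (t - m) ** 2 for a, t in zip(p1, terms))
    z = (threshold - mean) / math.sqrt(var)
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Only the state indices enter the calculation, never the underlying measurement values, which matches the scale-free property described in the abstract. For example, `classify([0, 0, 0])` assigns the observation to `H1` because every per-class ratio favors that population.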