Maximal predictive classification

Consider n point populations characterised by v dichotomous variables, constant within populations. For any classification into k classes, a list of values (class predictors) can be constructed for the variables of each class and used to predict the properties of any individual belonging to that class. The maximal predictive criterion selects the partition of the n populations into k classes that maximises the number, Wk, of correct predictions. The average number, Bk, of properties correctly predicted for members of each class using the class predictors of the other k - 1 classes measures the separation between classes. The best choice of k is related to maximising Wk - Bk. A general method is given for defining optimal hierarchical classification using any optimal k-class criterion, not necessarily depending on taxonomic distance. Maximal predictive classes have an optimal identification property; other properties, useful for constructing search algorithms, are also given. An example illustrates the results. Multilevel qualitative variables and differing probabilities of occurrence for each population are acceptable, but random variation within populations needs further consideration.
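The criterion described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that each class predictor is the majority value of each variable within the class (ties resolved to 1), that Wk is the total count of matches between each population and its own class predictor, and that Bk is the mean count of matches against the predictors of the other k - 1 classes. The function names, the toy data, and the exhaustive search (feasible only for very small n) are all hypothetical.

```python
from itertools import product


def class_predictors(data, labels, k):
    """Assumed predictor: majority value (0/1) of each variable per class; ties go to 1."""
    v = len(data[0])
    predictors = []
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        predictors.append(
            [1 if 2 * sum(r[j] for r in members) >= len(members) else 0
             for j in range(v)]
        )
    return predictors


def matches(row, predictor):
    """Number of variables on which a population agrees with a predictor."""
    return sum(1 for x, p in zip(row, predictor) if x == p)


def within_score(data, labels, k):
    """W_k: correct predictions when each population uses its own class predictor."""
    preds = class_predictors(data, labels, k)
    return sum(matches(row, preds[lab]) for row, lab in zip(data, labels))


def between_score(data, labels, k):
    """B_k (assumed form): mean matches against the predictors of the other classes."""
    preds = class_predictors(data, labels, k)
    total = sum(
        matches(row, preds[c])
        for row, lab in zip(data, labels)
        for c in range(k)
        if c != lab
    )
    return total / (len(data) * (k - 1))


def best_partition(data, k):
    """Exhaustive search over labelings with k non-empty classes, maximising W_k."""
    best = None
    for labels in product(range(k), repeat=len(data)):
        if len(set(labels)) < k:
            continue  # skip labelings that leave a class empty
        w = within_score(data, labels, k)
        if best is None or w > best[0]:
            best = (w, labels)
    return best


# Hypothetical toy data: 4 populations, 3 dichotomous variables.
data = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0]]
w, labels = best_partition(data, 2)
print(w, labels, between_score(data, labels, 2))
```

On this toy data the search groups the first two populations together and the last two together. A large Wk combined with a small Bk corresponds to the well-separated classes the abstract describes. The exhaustive search is used only for clarity, since the number of labelings grows as k to the power n.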
