Combining One-Class Classifiers to Classify Missing Data

This paper presents a new method for handling missing feature values in classification. The idea is to form an ensemble of one-class classifiers, each trained on a single feature, a preselected group of features, or a dissimilarity representation computed from the features. When any feature values are missing for a data point to be labeled, the ensemble can still make a reasonable decision based on the remaining classifiers. In contrast to standard algorithms for the missing-features problem, this makes it possible to build a single ensemble that can classify test objects under every possible pattern of missing features without retraining a classifier for each combination. Additionally, the training set does not need to be uncorrupted to train such an ensemble. The performance of the proposed ensemble is compared with standard methods for the missing-features problem on several UCI datasets.
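The core mechanism can be illustrated with a minimal sketch. The class name, the choice of a simple Gaussian density as the per-feature one-class model, and the use of NaN to mark missing values are all illustrative assumptions, not the paper's actual estimators (the paper's ensemble members could equally be Parzen or other one-class models); the point is only that one model is fitted per (class, feature) pair, and at prediction time only the models for observed features contribute to the combined score:

```python
import numpy as np

class MissingFeatureEnsemble:
    """Illustrative sketch (not the paper's exact method): one Gaussian
    one-class model per (class, feature) pair. At prediction time only
    the models for observed (non-NaN) features vote, so no retraining
    is needed for any combination of missing features."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.n_features_ = X.shape[1]
        self.params_ = {}  # (class, feature) -> (mean, std)
        for c in self.classes_:
            Xc = X[y == c]
            for j in range(self.n_features_):
                col = Xc[:, j]
                col = col[~np.isnan(col)]  # the training set may itself be corrupted
                self.params_[(c, j)] = (col.mean(), col.std() + 1e-9)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            observed = np.flatnonzero(~np.isnan(x))
            scores = []
            for c in self.classes_:
                # average log-density over the per-feature models that
                # can actually be evaluated for this object
                ll = np.mean([
                    -0.5 * ((x[j] - self.params_[(c, j)][0])
                            / self.params_[(c, j)][1]) ** 2
                    - np.log(self.params_[(c, j)][1])
                    for j in observed
                ])
                scores.append(ll)
            preds.append(self.classes_[int(np.argmax(scores))])
        return np.array(preds)
```

Because each ensemble member depends on only one feature, a test object with any subset of its features missing is scored by the surviving members alone, which is exactly what lets the same trained ensemble cover all missingness patterns.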
