Optimum feature selection for decision functions

Feature selection is one of the most important steps in the design of pattern classifiers. This paper presents an optimum feature selection method applicable to arbitrary (nonlinear) decision functions. It is assumed that a finite number of training samples (a training set) is given for each pattern class and that the decision function is designed from these training sets. The training sets are first edited by removing the samples that the decision function classifies incorrectly. The feature selection problem is then transformed into a modified zero-one integer program. Under a chosen permissible error, the method finds a minimum feature subset that is combinatorially optimum. Numerical examples of feature selection for a linear and a quadratic decision function are presented.
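
As an illustration of the zero-one formulation, the following Python sketch finds a minimum feature subset by brute-force enumeration. This is not the paper's exact program: the binary matrix `A`, the function `min_feature_subset`, and the `permissible_error` parameter are hypothetical names introduced here, under the simplifying assumption that `A[i][j] = 1` means feature `j` separates edited training-sample pair `i`.

```python
# A minimal sketch (not the paper's exact formulation): feature selection cast
# as a zero-one integer program, solved here by exhaustive enumeration.
from itertools import combinations

import numpy as np


def min_feature_subset(A: np.ndarray, permissible_error: float = 0.0):
    """Brute-force zero-one search for a combinatorially optimum feature subset.

    A                 : (n_pairs, n_features) binary matrix; A[i][j] = 1 if
                        feature j separates edited training-sample pair i
                        (an assumed, simplified encoding).
    permissible_error : fraction of pairs allowed to remain unseparated.
    Returns the indices of a minimum feasible subset, or None if none exists.
    """
    n_pairs, n_features = A.shape
    max_unseparated = int(permissible_error * n_pairs)
    # Enumerate subsets in order of increasing size, so the first feasible
    # subset found is guaranteed to be of minimum cardinality.
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            # A pair is separated if at least one selected feature separates it.
            separated = A[:, subset].any(axis=1)
            if n_pairs - separated.sum() <= max_unseparated:
                return list(subset)
    return None


if __name__ == "__main__":
    # Toy separation matrix: 5 sample pairs, 4 candidate features.
    A = np.array([[1, 0, 0, 1],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1]])
    print(min_feature_subset(A, permissible_error=0.0))  # -> [0, 1, 2]
```

Because subsets are enumerated in order of increasing size, the first feasible subset returned is minimum, matching the combinatorial-optimality claim; for realistic feature counts, an integer-programming solver would replace the enumeration.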