Mathematical Methods of Feature Selection in Pattern Recognition

In the 15 years of its existence, pattern recognition has made considerable progress on both the theoretical and practical fronts. Starting from the original application of pattern recognition techniques to the problem of character recognition, at the time when the field was conceived, these techniques have now penetrated such diverse areas of science as medical diagnosis, remote sensing, fingerprint and speech recognition, image classification, etc.* This wide applicability derives from the inherent generality of pattern recognition, which is a direct consequence of the adopted three-stage concept of the pattern recognition process. According to this concept, the process of pattern recognition is viewed as a sequence of three independent functions: representation, feature selection and classification (Fig. 1). Among these functions, only the representation stage, which transforms the input patterns into a form suitable for computer processing, is problem-dependent. Both the feature selector, the function of which is to reduce the dimensionality of the representation vector, and the classifier, which carries out the actual decision process, work with a vector of measurements which can be considered as an abstract pattern. As a result, the feature selection and classification stages can be implemented using mathematical methods irrespective of the original application. Naturally, this has had a beneficial effect on the progress in the theory of pattern recognition. Although all three stages of the pattern recognition system play an essential role in the process of classifying patterns by machine, the quality of the system's performance depends chiefly on the feature selector. The reasons
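The three-stage decomposition described above can be made concrete with a minimal sketch. The particular choices here (raw measurement vectors for representation, variance ranking for feature selection, a nearest-mean rule for classification) are illustrative assumptions, not methods from the paper; the point is only that the feature selector and classifier operate on abstract measurement vectors, independent of the original application.

```python
import numpy as np

# Hypothetical sketch of the three-stage process: representation ->
# feature selection -> classification. The concrete techniques are
# assumptions chosen for brevity.

def represent(raw_patterns):
    """Representation: map input patterns to measurement vectors
    (here simply casting to a float array)."""
    return np.asarray(raw_patterns, dtype=float)

def select_features(X, k):
    """Feature selection: reduce dimensionality by keeping the k
    measurements with the largest variance across the sample."""
    order = np.argsort(X.var(axis=0))[::-1]
    return order[:k]

def nearest_mean_classifier(X_train, y_train, X_test):
    """Classification: assign each test vector to the class whose
    training mean is nearest in Euclidean distance."""
    classes = np.unique(y_train)
    means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(X_test[:, None, :] - means[None, :, :], axis=2)
    return classes[dist.argmin(axis=1)]

# Tiny synthetic example: two classes separated along dimension 0;
# dimensions 1 and 2 carry almost no discriminative information.
X = represent([[0.0, 5.0, 5.0], [0.1, 5.1, 5.0],
               [1.0, 5.0, 5.1], [1.1, 5.1, 5.1]])
y = np.array([0, 0, 1, 1])

keep = select_features(X, k=1)   # picks the high-variance dimension
pred = nearest_mean_classifier(X[:, keep], y, X[:, keep])
```

Note that only the representation step would change if the input were characters, medical records, or speech; the selector and classifier see nothing but an abstract measurement vector.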
