Feature dimensionality reduction for the verification of handwritten numerals

A novel method based on multi-modal discriminant analysis is proposed to reduce feature dimensionality. First, each class is divided into several clusters by the k-means algorithm. The optimal discriminant analysis is then implemented by multi-modal mapping. The method uses only those training samples on and near the effective decision boundary to generate the between-class scatter matrix, and it therefore requires less CPU time than other nonparametric discriminant analysis (NDA) approaches [Fukunaga and Mantock in IEEE Trans PAMI 5(6):671–677, 1983; Bressan and Vitria in Pattern Recognit Lett 24(5):2743–2749, 2003]. In addition, no prior assumptions about class and cluster densities are needed. To achieve high verification performance on easily confused handwritten numeral pairs, a hybrid feature extraction scheme is developed, consisting of a set of gradient-based wavelet features and a set of geometric features. The proposed dimensionality reduction algorithm is used to compress these features, and it outperforms principal component analysis (PCA) and other NDA approaches. Experiments show that the proposed method achieves high feature compression without sacrificing discriminant ability for classification. As a result, the method can reduce the training complexity of an artificial neural network (ANN) and make the ANN classifier more reliable.
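The pipeline described above (cluster each class with k-means, build a between-class scatter matrix from boundary samples only, then solve for a discriminant projection) can be illustrated with a minimal NumPy sketch. This is not the authors' exact formulation, which the abstract does not give. It assumes a classical NDA-style between-class term built from differences between a sample and its nearest other-class neighbour, and it treats the fraction of samples with the smallest such distances as the "boundary" set. The function names, `boundary_frac`, `reg`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def kmeans(X, k, rng, iters=20):
    """Plain Lloyd's k-means; returns the cluster label of every row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old center
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def boundary_discriminant(X, y, n_clusters=2, boundary_frac=0.3,
                          n_components=1, reg=1e-6, seed=0):
    """Return a (dim, n_components) projection matrix (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, dim = X.shape

    # Within-class scatter, accumulated per k-means cluster of each class,
    # so a multi-modal class is not penalized for the spread between its modes.
    S_w = np.zeros((dim, dim))
    for c in np.unique(y):
        Xc = X[y == c]
        labels = kmeans(Xc, n_clusters, rng)
        for j in np.unique(labels):
            Z = Xc[labels == j] - Xc[labels == j].mean(axis=0)
            S_w += Z.T @ Z

    # Nearest other-class neighbour of every sample (assumed boundary criterion).
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    d[y[:, None] == y[None, :]] = np.inf  # mask same-class pairs
    nn = d.argmin(axis=1)
    nn_dist = d[np.arange(n), nn]

    # Keep only the samples closest to the other class ("near the boundary").
    keep = np.argsort(nn_dist)[: max(1, int(boundary_frac * n))]
    D = X[keep] - X[nn[keep]]
    S_b = D.T @ D

    # Solve S_b v = lambda S_w v by whitening S_w, then a symmetric eigenproblem.
    evals, evecs = np.linalg.eigh(S_w + reg * np.eye(dim))
    W = evecs / np.sqrt(evals)
    vals, vecs = np.linalg.eigh(W.T @ S_b @ W)
    top = np.argsort(vals)[::-1][:n_components]
    return W @ vecs[:, top]

# Synthetic two-class data: each class has two modes, classes differ along axis 0.
rng = np.random.default_rng(0)
modes = {0: [(-2.0, 4.0), (-2.0, -4.0)], 1: [(2.0, 4.0), (2.0, -4.0)]}
X_parts, y_parts = [], []
for label, centers in modes.items():
    for cx, cy in centers:
        pts = rng.normal(0.0, 0.5, size=(50, 5))
        pts[:, 0] += cx
        pts[:, 1] += cy
        X_parts.append(pts)
        y_parts.append(np.full(50, label))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

P = boundary_discriminant(X, y)  # 5 features reduced to 1
Z = X @ P                        # projected features
```

In this toy setup the classes are separated only along the first axis, so the learned direction should align closely with it. Because only a fraction of the samples contributes to the between-class scatter, that matrix is cheaper to accumulate than one built from all samples, which is the kind of saving the abstract refers to.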

[1] Trevor Hastie, et al. Flexible discriminant and mixture models, 2000.

[2] Robert Tibshirani, et al. Discriminant Adaptive Nearest Neighbor Classification, 1995, IEEE Trans. Pattern Anal. Mach. Intell.

[3] Ching Y. Suen, et al. Analysis of Class Separation and Combination of Class-Dependent Features for Handwriting Recognition, 1999, IEEE Trans. Pattern Anal. Mach. Intell.

[4] Jinbo Bi, et al. Dimensionality Reduction via Sparse Support Vector Machines, 2003, J. Mach. Learn. Res.

[5] Kari Torkkola, et al. Feature Extraction by Non-Parametric Mutual Information Maximization, 2003, J. Mach. Learn. Res.

[6] Keinosuke Fukunaga, et al. Introduction to Statistical Pattern Recognition (2nd ed.), 1990.

[7] Keinosuke Fukunaga, et al. Introduction to Statistical Pattern Recognition, 1972.

[8] G. Kim, et al. Feature Selection Using Genetic Algorithms for Handwritten Character Recognition, 2004.

[9] R. Fisher. The Use of Multiple Measurements in Taxonomic Problems, 1936.

[10] David A. Landgrebe, et al. Feature Extraction Based on Decision Boundaries, 1993, IEEE Trans. Pattern Anal. Mach. Intell.

[11] M. Bressan, et al. Nonparametric discriminant analysis and nearest neighbor classification, 2003, Pattern Recognit. Lett.

[12] Isabelle Guyon, et al. An Introduction to Variable and Feature Selection, 2003, J. Mach. Learn. Res.

[13] Michael I. Jordan, et al. Dimensionality Reduction for Supervised Learning with Reproducing Kernel Hilbert Spaces, 2004, J. Mach. Learn. Res.

[14] Luiz Eduardo Soares de Oliveira, et al. Impacts of verification on a numeral string recognition system, 2003, Pattern Recognit. Lett.

[15] Luiz Eduardo Soares de Oliveira, et al. A Methodology for Feature Selection Using Multiobjective Genetic Algorithms for Handwritten Digit String Recognition, 2003, Int. J. Pattern Recognit. Artif. Intell.

[16] W. C. Guenther, et al. Analysis of variance, 1968, The Mathematical Gazette.

[17] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.

[18] Ching Y. Suen, et al. Computer recognition of unconstrained handwritten numerals, 1992, Proc. IEEE.

[19] Mineichi Kudo, et al. Comparison of algorithms that select features for pattern classifiers, 2000, Pattern Recognit.

[20] Hiroshi Sako, et al. Handwritten digit recognition: investigation of normalization and feature extraction techniques, 2004, Pattern Recognit.

[21] Keinosuke Fukunaga, et al. A Branch and Bound Algorithm for Feature Subset Selection, 1977, IEEE Transactions on Computers.

[22] K. Fukunaga, et al. Nonparametric Discriminant Analysis, 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23] Claus Bahlmann, et al. Online handwriting recognition with support vector machines - a kernel approach, 2002, Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition.

[24] Guangyi Chen, et al. Invariant Fourier-wavelet descriptor for pattern recognition, 1999, Pattern Recognit.

[25] R. Tibshirani, et al. Discriminant Analysis by Gaussian Mixtures, 1996.

[26] Ingrid Daubechies, et al. Ten Lectures on Wavelets, 1992.

[27] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[28] Anil K. Jain, et al. Dimensionality reduction using genetic algorithms, 2000, IEEE Trans. Evol. Comput.

[29] Josef Kittler, et al. Floating search methods in feature selection, 1994, Pattern Recognit. Lett.