Evaluation of weighted Fisher criteria for large category dimensionality reduction in application to Chinese handwriting recognition

To improve the class separability of Fisher linear discriminant analysis (FDA) for large-category problems, we investigate the weighted Fisher criterion (WFC), which integrates weighting functions into dimensionality reduction. The objective of WFC is to maximize the sum of weighted distances over all class pairs. By assigning larger weights to the most confusable classes, WFC improves class separation while the solution remains an eigen-decomposition problem. We evaluate five weighting functions in three weighting spaces on a typical large-category problem, handwritten Chinese character recognition. Four of the weighting functions are based on existing methods, namely FDA, the approximate pairwise accuracy criterion (aPAC), the power function (POW), and confused distance maximization (CDM); the fifth is a new one based on K-nearest neighbors (KNN). Each weighting function can be calculated in the original feature space, the low-dimensional space, or the fractional space. Our experiments on a 3,755-class Chinese handwriting database demonstrate that WFC improves classification accuracy significantly compared to FDA. Among the weighting functions, the KNN method in the original space is the most competitive: it achieves significantly higher classification accuracy and has low computational complexity. To further improve performance, we propose a nonparametric extension of the KNN method from the class level to the sample level. The sample-level KNN (SKNN) method is shown to significantly outperform other methods in Chinese handwriting recognition, such as locally linear discriminant analysis (LLDA), neighbor class linear discriminant analysis (NCLDA), and heteroscedastic linear discriminant analysis (HLDA).
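The weighted criterion described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the common pairwise form of the between-class scatter, S_B = sum over i<j of p_i p_j w_ij (m_i - m_j)(m_i - m_j)^T, and a simple class-level KNN weighting in which w_ij = 1 when class j is among the k nearest class means of class i (or vice versa) and 0 otherwise, measured by Euclidean distance in the original feature space. The function name `weighted_fisher_projection`, the parameter `k`, and the small ridge added to the within-class scatter are all assumptions made for illustration.

```python
import numpy as np


def weighted_fisher_projection(X, y, n_components, k=2):
    """Hypothetical sketch of a KNN-weighted Fisher projection.

    X: (n_samples, n_features) array, y: (n_samples,) integer labels.
    Returns a (n_features, n_components) projection matrix.
    """
    classes = np.unique(y)
    n, d = X.shape
    C = len(classes)

    # Class means and empirical class priors.
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([(y == c).mean() for c in classes])

    # Within-class scatter, with a small ridge (an assumption) so that
    # the Cholesky factorization below is well defined.
    Sw = np.zeros((d, d))
    for c, m in zip(classes, means):
        Xc = X[y == c] - m
        Sw += Xc.T @ Xc
    Sw = Sw / n + 1e-6 * np.eye(d)

    # Squared Euclidean distances between class means; the diagonal is
    # set to infinity so a class is never its own neighbor.
    diff = means[:, None, :] - means[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist2, np.inf)

    # Assumed KNN weighting: 1 for the k nearest classes, symmetrized.
    W = np.zeros((C, C))
    nearest = np.argsort(dist2, axis=1)[:, :k]
    W[np.arange(C)[:, None], nearest] = 1.0
    W = np.maximum(W, W.T)

    # Weighted pairwise between-class scatter.
    Sb = np.zeros((d, d))
    for i in range(C):
        for j in range(i + 1, C):
            if W[i, j] > 0:
                dij = means[i] - means[j]
                Sb += priors[i] * priors[j] * W[i, j] * np.outer(dij, dij)

    # Solve Sb v = lambda Sw v by whitening with Sw = L L^T, which turns
    # it into an ordinary symmetric eigen-decomposition.
    L = np.linalg.cholesky(Sw)
    L_inv = np.linalg.inv(L)
    evals, evecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    order = np.argsort(evals)[::-1][:n_components]
    return L_inv.T @ evecs[:, order]
```

Calling `weighted_fisher_projection(X, y, n_components=2, k=1)` returns a matrix `V`, and `X @ V` gives the reduced features. Setting every weight to 1 recovers the ordinary FDA between-class scatter, so the only change relative to FDA is which class pairs contribute to S_B; the eigen-decomposition step is unchanged.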
