Handwritten digit classification

Pattern recognition is one of the major challenges in statistics. Its goal is to extract features that allow patterns to be classified into categories. A well-known example in this field is handwritten digit recognition, where each digit must be assigned to one of 10 classes using some classification method. Our purpose is to present alternative classification methods based on statistical techniques. We compare a multivariate approach with a probabilistic approach and conclude that both provide similar results in terms of test-error rate. Experiments are performed on the well-known MNIST and USPS databases, using binary-level images. As an additional contribution, we introduce a novel method to binarize images, based on statistical concepts associated with the written trace of the digit.
