Deep Self-Taught Learning for Handwritten Character Recognition

Recent theoretical and empirical work in statistical machine learning has demonstrated the importance of learning algorithms for deep architectures, i.e., function classes obtained by composing multiple non-linear transformations. Self-taught learning (exploiting unlabeled examples or examples from other distributions) has already been applied to deep learners, but mostly to show the advantage of unlabeled examples. Here we explore the advantage brought by out-of-distribution examples. For this purpose we developed a powerful generator of stochastic variations and noise processes for character images, including not only affine transformations but also slant, local elastic deformations, changes in thickness, background images, grey level changes, contrast, occlusion, and various types of noise. The out-of-distribution examples are obtained from these highly distorted images or by including examples of object classes different from those in the target test set. We show that deep learners benefit more from out-of-distribution examples than a corresponding shallow learner, at least in the area of handwritten character recognition. In fact, we show that they beat previously published results and reach human-level performance on both handwritten digit classification and 62-class handwritten character recognition.

[1]  Pascal Vincent,et al.  A Connection Between Score Matching and Denoising Autoencoders , 2011, Neural Computation.

[2]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[3]  Rajat Raina,et al.  Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.

[4]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[5]  Patrice Y. Simard,et al.  Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..

[6]  Yoshua Bengio,et al.  Exploring Strategies for Training Deep Neural Networks , 2009, J. Mach. Learn. Res..

[7]  Thomas Hofmann,et al.  Greedy Layer-Wise Training of Deep Networks , 2007 .

[8]  Yoshua Bengio,et al.  Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.

[9]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[10]  Yoshua Bengio,et al.  Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.

[11]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[12]  Quoc V. Le,et al.  Measuring Invariances in Deep Networks , 2009, NIPS.

[13]  Xinhua Zhuang,et al.  Image Analysis Using Mathematical Morphology , 1987, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Yoshua Bengio,et al.  An empirical evaluation of deep architectures on problems with many factors of variation , 2007, ICML '07.

[15]  Geoffrey E. Hinton,et al.  Deep Boltzmann Machines , 2009, AISTATS.

[16]  Luiz S. Oliveira,et al.  Supervised learning of fuzzy ARTMAP neural networks through particle swarm optimization , 2007 .

[17]  Marc'Aurelio Ranzato,et al.  Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.

[18]  Jason Weston,et al.  Deep learning via semi-supervised embedding , 2008, ICML '08.

[19]  Mohamed Cheriet,et al.  Estimating accurate multi-class probabilities with support vector machines , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[20]  Rafael Llobet,et al.  Fast and Accurate Handwritten Character Recognition Using Approximate Nearest Neighbours Search on Large Databases , 2000, SSPR/SPR.

[21]  Luiz Eduardo Soares de Oliveira,et al.  Automatic Recognition of Handwritten Numerical Strings: A Recognition and Verification Strategy , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[22]  Honglak Lee,et al.  Sparse deep belief net model for visual area V2 , 2007, NIPS.

[23]  Jonathan Baxter,et al.  Learning internal representations , 1995, COLT '95.

[24]  Jean Serra,et al.  Image Analysis and Mathematical Morphology , 1983 .