Convolutional Neural Network Committees for Handwritten Character Classification

In 2010, after many years of stagnation, the record error rate on the MNIST handwriting recognition benchmark dropped from 0.40% to 0.35%. Here we report an error rate of 0.27% for a committee of seven deep CNNs trained on graphics cards, narrowing the gap to human performance. We also apply the same architecture to NIST SD 19, a more challenging dataset that includes lowercase and uppercase letters. A committee of seven CNNs obtains the best results published so far for both NIST digits and NIST letters. The robustness of our method is verified by analyzing 78125 different 7-net committees.
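
As a rough illustration of what a committee of classifiers does, the sketch below averages the per-class scores of several members and picks the highest-scoring class. The abstract does not say how the seven CNNs are combined, so plain averaging is an assumption here, and the names `committee_predict` and `count_committees` are hypothetical. The count 78125 equals 5^7, which would be consistent with choosing one of five candidate nets for each of seven committee slots; that reading is also an assumption, not something stated above.

```python
import itertools

def committee_predict(member_probs):
    """Combine members by averaging class scores (assumed rule, not from the abstract).

    member_probs[m][s][c] is the score that member m assigns to class c on sample s.
    Returns one predicted class index per sample.
    """
    n_members = len(member_probs)
    n_samples = len(member_probs[0])
    predictions = []
    for s in range(n_samples):
        n_classes = len(member_probs[0][s])
        # Mean score per class across all committee members.
        avg = [sum(m[s][c] for m in member_probs) / n_members
               for c in range(n_classes)]
        # Index of the largest averaged score (first index wins ties).
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions

def count_committees(candidates_per_slot, slots):
    """Count committees formed by picking one candidate for each slot."""
    return sum(1 for _ in itertools.product(range(candidates_per_slot), repeat=slots))
```

With two toy members that disagree on the second sample, `committee_predict([[[0.9, 0.1], [0.4, 0.6]], [[0.6, 0.4], [0.7, 0.3]]])` averages the scores to `[0.75, 0.25]` and `[0.55, 0.45]`, so the committee picks class 0 for both samples.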
