TextCaps: Handwritten Character Recognition With Very Small Datasets

Many localized languages struggle to reap the benefits of recent advancements in character recognition systems because they lack a substantial amount of labeled training data. Generating large amounts of labeled data for such languages is difficult, and deep learning techniques cannot learn properly from a small number of training samples. We solve this problem by introducing a technique that generates new training samples from existing ones by adding controlled random noise to their corresponding instantiation parameters, yielding realistic augmentations that reflect the actual variations present in human handwriting. With a mere 200 training samples per class, our results surpass existing character recognition results on the EMNIST-letter dataset while matching existing results on three other datasets: EMNIST-balanced, EMNIST-digits, and MNIST. We also develop a strategy for effectively combining loss functions to improve reconstructions. Our system is useful for character recognition in localized languages that lack much labeled training data, and even in related, more general contexts such as object recognition.
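The augmentation idea described above (perturb a sample's instantiation parameters with bounded random noise, then decode the perturbed vector into a new image) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 16-dimensional parameter vector, the noise bound `max_delta`, and the toy linear decoder are all assumptions made for illustration. In a real system the decoder would be the trained reconstruction network.

```python
import numpy as np

def perturb_instantiation(params, max_delta, rng):
    """Return a copy of `params` with uniform noise in [-max_delta, max_delta] added to each entry."""
    params = np.asarray(params, dtype=float)
    noise = rng.uniform(-max_delta, max_delta, size=params.shape)
    return params + noise

def augment(params, decoder, n_new, max_delta=0.05, seed=0):
    """Decode `n_new` perturbed copies of one sample's instantiation parameters."""
    rng = np.random.default_rng(seed)
    return np.stack([
        decoder(perturb_instantiation(params, max_delta, rng))
        for _ in range(n_new)
    ])

# Hypothetical stand-in for a trained decoder: a fixed random linear map
# followed by a sigmoid, producing a 28x28 array with values in (0, 1).
_W = np.random.default_rng(42).normal(size=(16, 28 * 28))

def toy_decoder(params):
    logits = np.asarray(params, dtype=float) @ _W
    return (1.0 / (1.0 + np.exp(-logits))).reshape(28, 28)

# Assumed 16-dimensional instantiation vector for one training sample.
original = np.zeros(16)
new_images = augment(original, toy_decoder, n_new=5)
```

Keeping `max_delta` small is what makes the noise "controlled": each generated image stays close to the original sample while still varying, which is the property the abstract relies on for realistic augmentations.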
