Robust Handwriting Recognition with Limited and Noisy Data

Despite the advent of deep learning in computer vision, the general handwriting recognition problem is far from solved. Most existing approaches focus on handwriting datasets that have clearly written text and carefully segmented labels. In this paper, we instead focus on learning handwritten characters from maintenance logs, a constrained setting where data is very limited and noisy. We break the problem into two consecutive stages of word segmentation and word recognition respectively, and utilize data augmentation techniques to train both stages. Extensive comparisons with popular baselines for scene-text detection and word recognition show that our system achieves a lower error rate and is more suited to handle noisy and difficult documents.

[1]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[2]  R. Smith,et al.  An Overview of the Tesseract OCR Engine , 2007, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007).

[3]  Yi Li,et al.  R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[4]  Ankush Gupta,et al.  Synthetic Data for Text Localisation in Natural Images , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Ernest Valveny,et al.  ICDAR 2015 competition on Robust Reading , 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).

[6]  Shuchang Zhou,et al.  EAST: An Efficient and Accurate Scene Text Detector , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  C. V. Jawahar,et al.  Deep Feature Embedding for Accurate Recognition and Retrieval of Handwritten Text , 2016, 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR).

[8]  Andreas Keller,et al.  Lexicon-free handwritten word spotting using character HMMs , 2012, Pattern Recognit. Lett..

[9]  Andrew Zisserman,et al.  Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition , 2014, ArXiv.

[10]  Fred J. Damerau,et al.  A technique for computer detection and correction of spelling errors , 1964, CACM.

[11]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[12]  Dongyoon Han,et al.  Character Region Awareness for Text Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Jianying Hu,et al.  Writer independent on-line handwriting recognition using an HMM approach , 2000, Pattern Recognit..

[14]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[15]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[16]  Lior Wolf,et al.  CNN-N-Gram for HandwritingWord Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Joan Puigcerver,et al.  Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition? , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).

[18]  Hermann Ney,et al.  Sequence-discriminative training of recurrent neural networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[19]  Horst Bunke,et al.  Automatic segmentation of the IAM off-line database for handwritten English text , 2002, Object recognition supported by user interaction for service robots.

[20]  Sargur N. Srihari,et al.  On-Line and Off-Line Handwriting Recognition: A Comprehensive Survey , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[21]  Hermann Ney,et al.  Handwriting Recognition with Large Multidimensional Long Short-Term Memory Recurrent Neural Networks , 2016, 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR).

[22]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[23]  Laurence Likforman-Sulem,et al.  Text line segmentation of historical documents: a survey , 2007, International Journal of Document Analysis and Recognition (IJDAR).

[24]  C. V. Jawahar,et al.  Improving CNN-RNN Hybrid Networks for Handwriting Recognition , 2018, 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR).

[25]  Albert Gordo,et al.  Rosetta: Large Scale System for Text Detection and Recognition in Images , 2018, KDD.

[26]  Xuelong Li,et al.  PixelLink: Detecting Scene Text via Instance Segmentation , 2018, AAAI.

[27]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Sepp Hochreiter,et al.  Self-Normalizing Neural Networks , 2017, NIPS.

[29]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[30]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Lianwen Jin,et al.  A Multi-Object Rectified Attention Network for Scene Text Recognition , 2019, Pattern Recognit..

[32]  Horst Bunke,et al.  The IAM-database: an English sentence database for offline handwriting recognition , 2002, International Journal on Document Analysis and Recognition.

[33]  Kaiming He,et al.  Focal Loss for Dense Object Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[35]  Claus Bahlmann,et al.  The writer independent online handwriting recognition system frog on hand and cluster generative statistical dynamic time warping , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Alicia Fornés,et al.  Convolve, Attend and Spell: An Attention-based Sequence-to-Sequence Model for Handwritten Word Recognition , 2018, GCPR.

[37]  Marcus Liwicki,et al.  A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks , 2007 .

[38]  Marcelo Worsley,et al.  Multimodal Learning Analytics and Education Data Mining: using computational technologies to measure complex learning tasks , 2016, J. Learn. Anal..

[39]  Patrick J. Grother,et al.  NIST Special Database 19 Handprinted Forms and Characters Database , 1995 .