Paper Information - Field Typing for Improved Recognition on Heterogeneous Handwritten Forms

Offline handwriting recognition has progressed continuously over the past decades. However, existing methods are typically benchmarked on free-form text datasets that are biased towards good-quality images and handwriting styles, and towards homogeneous content. In this paper, we show that state-of-the-art algorithms employing long short-term memory (LSTM) layers do not readily generalize to real-world structured documents, such as forms, because the content of such documents is highly heterogeneous, often out-of-vocabulary, and inherently ambiguous. To address this, we propose leveraging the content type within an LSTM-based architecture. Furthermore, we introduce a procedure for generating synthetic data to train this architecture without requiring expensive manual annotations. We demonstrate the effectiveness of our approach at transcribing text on a challenging, real-world dataset of European Accident Statements.
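The idea of synthetic training data that needs no manual annotation can be illustrated with a minimal sketch. It is not the procedure from the paper: the field types (`date`, `phone`, `plate`), their formats, and all function names below are hypothetical and chosen only for illustration. The sketch samples a text string together with its field type, so both the transcription and the type label are known by construction; a full pipeline would presumably also render each string as a handwriting-like image, which is omitted here.

```python
import random
import string

# Hypothetical field types and formats, assumed for illustration only;
# the paper's actual type inventory and generators are not described here.

def sample_date(rng):
    # Day is capped at 28 so every sampled date is valid in any month.
    day = rng.randint(1, 28)
    month = rng.randint(1, 12)
    year = rng.randint(1950, 2020)
    return f"{day:02d}/{month:02d}/{year}"

def sample_phone(rng):
    # Ten random digits, with no country-specific structure.
    return "".join(rng.choice(string.digits) for _ in range(10))

def sample_plate(rng):
    # Three uppercase letters, a dash, three digits (an invented format).
    letters = "".join(rng.choice(string.ascii_uppercase) for _ in range(3))
    digits = "".join(rng.choice(string.digits) for _ in range(3))
    return f"{letters}-{digits}"

GENERATORS = {
    "date": sample_date,
    "phone": sample_phone,
    "plate": sample_plate,
}

def generate_samples(n, seed=0):
    """Return n (field_type, text) pairs; labels come for free."""
    rng = random.Random(seed)
    types = sorted(GENERATORS)  # fixed order keeps sampling reproducible
    samples = []
    for _ in range(n):
        field_type = rng.choice(types)
        text = GENERATORS[field_type](rng)
        samples.append((field_type, text))
    return samples

if __name__ == "__main__":
    for field_type, text in generate_samples(5, seed=42):
        print(field_type, text)
```

Each pair could then serve as a training target for a recognizer that receives the field type as an extra input, which is one plausible reading of "leveraging the content type".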
