Recognition: Learning Where to Start and When to Stop

Text line detection and localization is a crucial step in full-page document analysis, but it still suffers from the heterogeneity of real-life documents. In this paper, we present a new approach to full-page text recognition. Text lines are localized by regression, using Fully Convolutional Neural Networks with Multidimensional Long Short-Term Memory layers as contextual layers. To increase the efficiency of this localization method, only the position of the left side of each text line is predicted. The text recognizer is then in charge of predicting where the text to recognize ends. This method has shown good results for full-page text recognition on the highly heterogeneous Maurdor dataset.
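The division of labor described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `LineStart` record, the `crop_from_start` helper, and the `<eol>` end-of-line symbol are all hypothetical names introduced here, and the neural localizer and recognizer themselves are left out. The sketch assumes the localizer outputs a left-side position and a line height, and that the recognizer signals the end of the line with a dedicated symbol.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Grayscale page stored as rows of pixel values (assumption for this sketch).
Page = List[List[int]]

# Hypothetical end-of-line symbol that the recognizer is assumed to emit.
EOL = "<eol>"


@dataclass
class LineStart:
    """Hypothetical localizer output: the left side of one text line."""
    x: int       # x coordinate of the left side, in pixels
    y: int       # y coordinate of the top of the line, in pixels
    height: int  # estimated line height, in pixels


def crop_from_start(page: Page, start: LineStart) -> Page:
    """Crop a strip from the predicted left side to the right page border.

    The right end of the line is unknown at this stage, so the strip
    simply extends to the edge of the page.
    """
    top = max(0, start.y)
    bottom = min(len(page), start.y + start.height)
    left = max(0, start.x)
    return [row[left:] for row in page[top:bottom]]


def truncate_at_eol(symbols: Sequence[str]) -> str:
    """Keep only the characters decoded before the first end-of-line symbol."""
    kept = []
    for symbol in symbols:
        if symbol == EOL:
            break
        kept.append(symbol)
    return "".join(kept)


if __name__ == "__main__":
    # Toy 4x6 page whose pixel value encodes its row and column.
    page = [[10 * r + c for c in range(6)] for r in range(4)]
    strip = crop_from_start(page, LineStart(x=2, y=1, height=2))
    print(len(strip), len(strip[0]))
    # Pretend recognizer output: text, the stop symbol, then clutter to its right.
    print(truncate_at_eol(["a", "b", EOL, "c"]))
```

In this sketch the localizer only has to regress the left side of each line, while anything the recognizer decodes after the stop symbol, such as text from a neighboring column, is discarded.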
