Space Displacement Localization Neural Networks to locate origin points of handwritten text lines in historical documents

We describe a new method for detecting and localizing multiple objects in an image using context-aware deep neural networks. Common architectures proceed either locally, with pixel-wise sliding windows, or globally, by predicting object localizations for a full image. We improve on this by training a semi-local model to detect and localize objects inside a large image region that covers an object or a part of it. Context knowledge is integrated by combining multiple predictions for different regions through a spatial context layer modeled as an LSTM network. The proposed method is applied to a complex problem in historical document image analysis, where we show that it is capable of robustly detecting text lines in the images from the ANDAR-TL competition. Experiments indicate that the model can cope with difficult situations and, like other deep models in vision, reach the state of the art.
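As a rough illustration of the semi-local idea, the sketch below tiles a page into large overlapping regions and maps a point predicted inside a region back to page coordinates. It is not the authors' implementation: the function names `iter_regions` and `to_page_coords`, the square region shape, the `size` and `stride` parameters, and the normalized [0, 1] output convention are all assumptions made for this example. The detection network and the LSTM context layer are left out entirely.

```python
from typing import Iterator, Tuple

# (x0, y0, width, height) in page pixels; a hypothetical convention for this sketch
Region = Tuple[int, int, int, int]


def iter_regions(page_w: int, page_h: int, size: int, stride: int) -> Iterator[Region]:
    """Yield overlapping regions of side `size` that together cover the page."""
    xs = list(range(0, max(page_w - size, 0) + 1, stride))
    ys = list(range(0, max(page_h - size, 0) + 1, stride))
    # Add a final column/row so the right and bottom borders are covered.
    if xs[-1] + size < page_w:
        xs.append(page_w - size)
    if ys[-1] + size < page_h:
        ys.append(page_h - size)
    for y0 in ys:
        for x0 in xs:
            # Clip the region to the page when the page is smaller than `size`.
            yield (x0, y0, min(size, page_w), min(size, page_h))


def to_page_coords(region: Region, local_xy: Tuple[float, float]) -> Tuple[float, float]:
    """Map a point given in normalized [0, 1] region coordinates to page pixels."""
    x0, y0, w, h = region
    return (x0 + local_xy[0] * w, y0 + local_xy[1] * h)


if __name__ == "__main__":
    regions = list(iter_regions(page_w=100, page_h=60, size=40, stride=30))
    # A (hypothetical) per-region model would output local points such as (0.5, 0.25).
    print(len(regions), to_page_coords(regions[-1], (0.5, 0.25)))
```

Because the regions overlap, the same object can be predicted from several regions. Reconciling those predictions is the role the abstract assigns to the spatial context layer.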
