Morphology-based hierarchical representation with application to text segmentation in natural images

Many text segmentation methods are elaborate and therefore unsuitable for real-time implementation on mobile devices. Given the growing number of mobile applications that need text extraction, there is a clear need for a method that is efficient, effective, and robust to noise, blur, and uneven illumination. We propose a hierarchical image representation, based on the morphological Laplace operator, that yields a robust text segmentation. This representation relies on several sound theoretical tools; its computation ultimately reduces to a simple labeling algorithm, and text segmentation and grouping reduce to simple tree-based processing. We also show that the method can be applied to document binarization, with the added benefit of also recovering reverse-video text.
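
To make the two ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of the morphological Laplace operator as (dilation - f) - (f - erosion), a 3x3 square structuring element, and a plain 4-connected flood fill as the "simple labeling" step. The function names `morphological_laplacian` and `label_by_sign` are hypothetical, and the construction of the hierarchy (tree) from the labeled regions is omitted.

```python
def morphological_laplacian(img):
    """Return dilation(img) + erosion(img) - 2*img.

    Assumption: 3x3 square structuring element, window clipped at the border.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            # (dilation - f) - (f - erosion)
            out[y][x] = max(window) + min(window) - 2 * img[y][x]
    return out


def label_by_sign(lap):
    """Label 4-connected regions on which the Laplacian keeps the same sign.

    Returns (labels, signs): a label image with labels starting at 1, and a
    dict mapping each label to -1, 0, or +1.
    """
    h, w = len(lap), len(lap[0])

    def sign(v):
        return (v > 0) - (v < 0)

    labels = [[0] * w for _ in range(h)]
    signs = {}
    next_label = 0
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                continue  # pixel already belongs to a region
            next_label += 1
            s = sign(lap[y][x])
            signs[next_label] = s
            labels[y][x] = next_label
            stack = [(y, x)]
            while stack:  # iterative flood fill over same-sign neighbours
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not labels[ny][nx]
                            and sign(lap[ny][nx]) == s):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
    return labels, signs


# Toy image: a one-pixel-wide dark stroke (0) on a bright background (10).
image = [[10, 10, 0, 10, 10] for _ in range(5)]
laplacian = morphological_laplacian(image)
labels, signs = label_by_sign(laplacian)
```

In this toy case the Laplacian is positive on the dark stroke and negative on the bright pixels right next to it, so the sign change marks the stroke boundary. With the same definition, bright text on a dark background gives a negative response on the stroke, which suggests why a sign-based representation can treat normal and reverse-video text symmetrically.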
