Transform invariant text extraction

Automatically extracting texts from natural images is very useful for many applications such as augmented reality. Most of the existing text detection systems require that the texts to be detected (and recognized) in an image are taken from a nearly frontal viewpoint. However, texts in most images taken naturally by a camera or a mobile phone can have a significant affine or perspective deformation, making the existing text detection and the subsequent OCR engines prone to failures. In this paper, based on stroke width transform and texture invariant low-rank transform, we propose a framework that can detect and rectify texts in arbitrary orientations in the image against complex backgrounds, so that the texts can be correctly recognized by common OCR engines. Extensive experiments show the advantage of our method when compared to the state of art text detection systems.

[1]  K. Schittkowski,et al.  NONLINEAR PROGRAMMING , 2022 .

[2]  A. Pais,et al.  O(4) Treatment of the Electromagnetic-Weak Synthesis. , 1972 .

[3]  Jin Hyung Kim,et al.  Texture-Based Approach for Text Detection in Images Using Support Vector Machines and Continuously Adaptive Mean Shift Algorithm , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[4]  Amandeep Kaur,et al.  Hough transform based fast skew detection and accurate skew correction methods , 2008, Pattern Recognit..

[5]  David S. Doermann,et al.  Camera-based analysis of text and documents: a survey , 2005, International Journal of Document Analysis and Recognition (IJDAR).

[6]  Alan L. Yuille,et al.  Detecting and reading text in natural scenes , 2004, CVPR 2004.

[7]  Palaiahnakote Shivakumara,et al.  New Wavelet and Color Features for Text Detection in Video , 2010, 2010 20th International Conference on Pattern Recognition.

[8]  Yonatan Wexler,et al.  Detecting text in natural scenes with stroke width transform , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9]  Premkumar Natarajan,et al.  Character-Stroke Detection for Text-Localization and Extraction , 2007, Ninth International Conference on Document Analysis and Recognition (ICDAR 2007).

[10]  Emmanuel J. Candès,et al.  A Singular Value Thresholding Algorithm for Matrix Completion , 2008, SIAM J. Optim..

[11]  Hong Yan,et al.  Skew Correction of Document Images Using Interline Cross-Correlation , 1993, CVGIP Graph. Model. Image Process..

[12]  Bingsheng He,et al.  Some convergence properties of a method of multipliers for linearly constrained monotone variational inequalities , 1998, Oper. Res. Lett..

[13]  David S. Doermann,et al.  Geometric Rectification of Camera-Captured Document Images , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Yin Zhang,et al.  Fixed-Point Continuation for l1-Minimization: Methodology and Convergence , 2008, SIAM J. Optim..

[15]  Jing Zhang,et al.  Text Detection Using Edge Gradient and Graph Spectrum , 2010, 2010 20th International Conference on Pattern Recognition.

[16]  Kai Wang,et al.  End-to-end scene text recognition , 2011, 2011 International Conference on Computer Vision.

[17]  John Wright,et al.  RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Simon M. Lucas,et al.  ICDAR 2003 robust reading competitions , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..

[19]  Anil K. Jain,et al.  Automatic caption localization in compressed video , 1999, Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348).

[20]  Chew Lim Tan,et al.  Convex hull based skew estimation , 2007, Pattern Recognit..

[21]  G. Sapiro,et al.  A collaborative framework for 3D alignment and classification of heterogeneous subvolumes in cryo-electron tomography. , 2013, Journal of structural biology.

[22]  Yi Ma,et al.  The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, Journal of structural biology.

[23]  Alejandro Héctor Toselli,et al.  Projection Profile Based Algorithm for Slant Removal , 2004, ICIAR.

[24]  Palaiahnakote Shivakumara,et al.  Efficient video text detection using edge features , 2008, 2008 19th International Conference on Pattern Recognition.

[25]  Andrew Y. Ng,et al.  Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning , 2011, 2011 International Conference on Document Analysis and Recognition.

[26]  Gonzalo Navarro,et al.  A guided tour to approximate string matching , 2001, CSUR.

[27]  Yi Ma,et al.  TILT: Transform Invariant Low-Rank Textures , 2010, ACCV 2010.

[28]  Yuxin Peng,et al.  Color-based clustering for text detection and extraction in image , 2007, ACM Multimedia.

[29]  Qifeng Liu,et al.  A stroke filter and its application to text localization , 2009, Pattern Recognit. Lett..

[30]  Luc Van Gool,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[31]  Sanghoon Sull,et al.  An Efficient Method for Text Detection in Video Based on Stroke Width Similarity , 2007, ACCV.

[32]  Wen Gao,et al.  Fast and robust text detection in images and video frames , 2005, Image Vis. Comput..

[33]  David S. Doermann,et al.  Automatic text detection and tracking in digital video , 2000, IEEE Trans. Image Process..

[34]  Edward M. Riseman,et al.  TextFinder: An Automatic System to Detect and Recognize Text In Images , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[35]  Jiri Matas,et al.  A Method for Text Localization and Recognition in Real-World Images , 2010, ACCV.

[36]  Anil K. Jain,et al.  Text information extraction in images and video: a survey , 2004, Pattern Recognit..

[37]  Palaiahnakote Shivakumara,et al.  Video text detection based on filters and edge features , 2009, 2009 IEEE International Conference on Multimedia and Expo.

[38]  Chucai Yi,et al.  Text String Detection From Natural Scenes by Structure-Based Partition and Grouping , 2011, IEEE Transactions on Image Processing.

[39]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.