Recognising text in real scenes

Abstract. We present two approaches to locating and recovering text in images of real scenes. Both techniques are invariant to the scale and 3D orientation of the text, and both allow text to be recovered from cluttered scenes. The first approach uses page edges and other rectangular boundaries around text to locate a surface containing text and to recover a fronto-parallel view of it. This is done through line detection, perceptual grouping, and comparison of candidate text regions using a confidence measure. The second approach uses low-level texture measures with a neural network classifier to locate regions of text in an image. We then recover a fronto-parallel view of each located paragraph by separating its individual lines of text and determining the vanishing points of the text plane. We illustrate our results on a number of images.
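
To make the idea of recovering a fronto-parallel view concrete, the sketch below shows one standard way to do it once the four corners of a rectangular text boundary are known: estimate the planar homography that maps the observed quadrilateral onto an upright rectangle. This is an illustration under stated assumptions, not the method described in the paper. The corner coordinates, the output rectangle size, and the function names `solve_linear`, `homography_from_corners`, and `apply_homography` are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations touch both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """Return the 3x3 homography (last entry fixed to 1) mapping src to dst.

    src and dst are lists of four (x, y) points in matching order.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the 8 unknowns.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(hm, point):
    """Map one (x, y) point through the homography hm."""
    x, y = point
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    u = (hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w
    v = (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w
    return (u, v)


# Hypothetical corners of a page seen at an angle (image pixels),
# listed clockwise from the top-left corner.
page_corners = [(120, 80), (520, 140), (480, 420), (90, 360)]
# Assumed target: an upright 400 x 300 rectangle.
upright = [(0, 0), (400, 0), (400, 300), (0, 300)]

H = homography_from_corners(page_corners, upright)
rectified = [apply_homography(H, p) for p in page_corners]
```

In a full pipeline the same homography would be used to resample every pixel inside the quadrilateral. The true aspect ratio of the page is not known from four image corners alone, so the 400 x 300 target here is only a placeholder.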
