Robust skew estimation using straight lines in document images

Abstract. We present a skew-estimation method that uses straight lines in document images. Unlike conventional approaches that exploit the properties of text, we formulate skew estimation as an estimation task over the straight lines in an image and focus on robust and accurate line detection. Specifically, we adopt a block-based edge detector followed by a progressive line detector in order to take cues from a variety of sources, such as text lines, boundaries of figures and tables, vertical and horizontal separators, and boundaries of text blocks. Extensive experiments on datasets of skewed images, together with competition results, show that the proposed method works robustly and yields accurate skew estimates.
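
The final step of such a pipeline, turning a set of detected line segments into a single skew angle, can be illustrated with a minimal sketch. It is not the estimator described in the paper: the block-based edge detector and the progressive line detector are not reproduced, the segments are assumed to be given already, and the function name `estimate_skew_degrees`, the fold to the range [-45°, 45°), and the length-weighted median are all assumptions chosen for illustration.

```python
import math


def estimate_skew_degrees(segments):
    """Estimate page skew from line segments given as (x1, y1, x2, y2).

    Illustrative sketch only (hypothetical helper, not the paper's method).
    Each segment votes with its orientation folded into [-45, 45) degrees,
    so near-horizontal and near-vertical lines support the same skew.
    The result is the length-weighted median, which tolerates a minority
    of outlier segments. Assumes the true skew is not close to +/-45 degrees.
    The sign of the result follows the coordinate system of the input.
    """
    votes = []
    for x1, y1, x2, y2 in segments:
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # a degenerate segment carries no orientation
        angle = math.degrees(math.atan2(dy, dx))
        folded = (angle + 45.0) % 90.0 - 45.0  # fold modulo 90 degrees
        votes.append((folded, length))

    if not votes:
        raise ValueError("no usable segments")

    # Weighted median: the first angle at which the cumulative length
    # reaches half of the total length.
    votes.sort(key=lambda v: v[0])
    half = sum(w for _, w in votes) / 2.0
    cumulative = 0.0
    for angle, weight in votes:
        cumulative += weight
        if cumulative >= half:
            return angle
    return votes[-1][0]


def make_segment(angle_deg, length):
    """Build a segment starting at the origin (helper for the demo only)."""
    a = math.radians(angle_deg)
    return (0.0, 0.0, length * math.cos(a), length * math.sin(a))


if __name__ == "__main__":
    demo = [
        make_segment(2.0, 100.0),   # text line
        make_segment(92.0, 80.0),   # vertical separator
        make_segment(2.0, 50.0),    # table border
        make_segment(30.0, 20.0),   # outlier, e.g. a diagonal in a figure
    ]
    print(estimate_skew_degrees(demo))
```

Weighting by segment length lets long, reliable lines such as separators and table borders dominate short spurious ones, and the median keeps a few off-axis segments from shifting the estimate.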
