Document skew detection using minimum-area bounding rectangle

Skew detection is an important step in document image analysis. This paper presents a new method for estimating document skew. The method applies a smoothing algorithm to form large connected components, then estimates the skew from the orientation of the minimum-area bounding rectangle of one of several connected components. Text that becomes connected to non-text during smoothing does not degrade the method's performance. The smoothing parameters are determined automatically, so no manual adjustment is necessary. The method places no limit on the range of detectable skew angles or on the achievable accuracy. Experimental results show high performance in detecting skew across a variety of documents with different levels of complexity.
