Skew detection and correction in document images based on straight-line fitting

During document scanning, skew is inevitably introduced into the scanned document image. Because algorithms for layout analysis and character recognition are generally very sensitive to page skew, skew detection and correction are critical steps that must precede layout analysis. This paper proposes a novel skew detection method based on straight-line fitting and introduces the concept of an Eigen-point. The relations between successive Eigen-points in each text line within a suitable sub-region are analyzed, and the Eigen-points most likely to lie on the baselines are selected as samples for straight-line fitting. The average of the fitted baseline directions is taken as the skew angle of the whole document image. A fast skew correction method based on the scanning line model is also presented. Experiments show that the proposed approaches are fast and accurate.
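
The fitting and averaging steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Eigen-point selection is not reproduced, the baseline sample points are assumed to be given as (x, y) pixel coordinates grouped by text line, ordinary least squares stands in for whatever fitting procedure the authors use, and the function names `fit_line_angle` and `estimate_skew` are hypothetical.

```python
import math


def fit_line_angle(points):
    """Fit y = a*x + b by ordinary least squares and return the line's
    direction in degrees. `points` is a list of (x, y) tuples assumed
    to lie near one text baseline (an assumption of this sketch)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centered sums; the slope of the fitted line is sxy / sxx.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("points have identical x; cannot fit y = a*x + b")
    return math.degrees(math.atan2(sxy, sxx))


def estimate_skew(text_lines):
    """Average the fitted baseline directions over all text lines.
    `text_lines` is a list of point lists, one per text line."""
    angles = [fit_line_angle(pts) for pts in text_lines if len(pts) >= 2]
    if not angles:
        raise ValueError("need at least one text line with two or more points")
    return sum(angles) / len(angles)


# Two synthetic baselines, both tilted by 2 degrees.
slope = math.tan(math.radians(2.0))
line_a = [(x, 100 + slope * x) for x in range(0, 500, 50)]
line_b = [(x, 140 + slope * x) for x in range(0, 500, 50)]
skew_degrees = estimate_skew([line_a, line_b])
```

The sign of the returned angle depends on the coordinate convention; in image coordinates, where y grows downward, a positive value here corresponds to baselines that descend toward the right.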
