Skew Angle Detection of Digitized Indian Script Documents

Skew angle detection of scanned documents containing the most popular Indian scripts (Devnagari and Bangla) is considered. Most characters in these scripts have a horizontal line at the top, called the head line. Within a word, the head lines of adjacent characters mostly join one another, so the word appears as a single component. In the proposed method, the components are first labeled. The upper envelope of each component is found by column-wise scanning from an imaginary line above the component. Portions of the upper envelope that satisfy the properties of a digital straight line are detected and clustered according to the text line they belong to. Estimates from the individual clusters are combined to obtain the skew angle. Apart from accuracy and efficiency, an advantage of the method is that character segmentation and zone detection can be readily done from the head line information, which is useful in optical character recognition of these scripts.
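
The envelope step described in the abstract can be illustrated with a short sketch. This is a simplified illustration, not the authors' implementation: it assumes each labeled component is given as a binary mask (a list of rows of 0/1 values), it substitutes an ordinary least-squares line fit for the digital straight line test, and it substitutes a median over components for the clustering into text lines. The function names `upper_envelope`, `envelope_angle`, and `combine_angles` are hypothetical.

```python
import math
from statistics import median


def upper_envelope(mask):
    """Scan each column from above and record the row of the first
    foreground pixel; None marks a column with no foreground pixel."""
    rows, cols = len(mask), len(mask[0])
    envelope = []
    for c in range(cols):
        top = None
        for r in range(rows):
            if mask[r][c]:
                top = r
                break
        envelope.append(top)
    return envelope


def envelope_angle(envelope):
    """Angle in degrees of a least-squares line through the envelope.

    Assumption: a plain line fit stands in for the digital straight
    line test. Rows grow downward, so a positive angle means the head
    line descends towards the right of the image.
    """
    pts = [(x, y) for x, y in enumerate(envelope) if y is not None]
    if len(pts) < 2:
        return None
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return math.degrees(math.atan2(sxy, sxx))


def combine_angles(angles):
    """Assumption: the median over components replaces the per-text-line
    clustering and combination step of the paper."""
    valid = [a for a in angles if a is not None]
    return median(valid) if valid else None


# Synthetic word-like component: a head line 3 pixels thick that
# drops one row every 10 columns (a slope of roughly 0.1).
height, width = 10, 40
word = [[0] * width for _ in range(height)]
for c in range(width):
    top = c // 10
    for r in range(top, top + 3):
        word[r][c] = 1

angle = envelope_angle(upper_envelope(word))
print(f"estimated skew: {angle:.2f} degrees")
```

The median is used here only because it tolerates a few components without a clean head line (punctuation, isolated vowel signs); the clustering described in the abstract serves a similar purpose by grouping envelope segments per text line before the estimates are combined.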
