An Improved Method for Text Segmentation and Skew Normalization of Handwriting Image

This paper proposes an off-line cursive handwriting segmentation method and an efficient skew normalization process for handwritten documents. The proposed segmentation method is based on horizontal and vertical projection, which has already been used for different purposes in handwriting analysis. To tolerate overlapping and multi-skewed text lines, the present work implements a modified version of horizontal and vertical projection that can segment text lines and words even when the text lines overlap. The present work also proposes a skew normalization method based on orthogonal projection toward the x-axis. The proposed method was tested on more than 550 text images from the IAM database and on sample handwriting images written by different writers on different backgrounds. The experimental results show that the proposed algorithm achieves more than 96% accuracy.
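For orientation, the plain projection technique that the proposed method builds on can be sketched as follows. This is a minimal illustration of standard horizontal and vertical projection, not the modified version described in this paper: it assumes a binarized image with ink pixels equal to 1, and it does not handle overlapping or skewed lines. The function names and the `min_count` parameter are hypothetical, chosen only for this example.

```python
import numpy as np

def segment_runs(profile, min_count=1):
    """Return (start, end) pairs of consecutive profile entries >= min_count.

    The end index is exclusive. Each run corresponds to a band of rows
    (or columns) that contains ink.
    """
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i                      # a run of ink begins
        elif value < min_count and start is not None:
            runs.append((start, i))        # the run ends at a gap
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run reaches the image border
    return runs

def segment_lines_and_words(binary):
    """Split a binary image (ink = 1) into line bands and column runs.

    Lines come from the horizontal projection (ink count per row);
    within each line band, column runs come from the vertical
    projection (ink count per column).
    """
    layout = []
    for top, bottom in segment_runs(binary.sum(axis=1)):
        band = binary[top:bottom]
        words = segment_runs(band.sum(axis=0))
        layout.append(((top, bottom), words))
    return layout

# Toy page: two "lines"; the first holds two separate blobs of ink.
page = np.zeros((8, 10), dtype=np.uint8)
page[1:3, 1:4] = 1
page[1:3, 6:9] = 1
page[5:7, 2:8] = 1

layout = segment_lines_and_words(page)
```

In this sketch every empty column ends a run, so on real handwriting the column runs would correspond to connected strokes rather than words; a practical word segmenter would additionally need a gap-width threshold to separate inter-word gaps from inter-character gaps.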
