SOUTH INDIAN TAMIL LANGUAGE HANDWRITTEN DOCUMENT TEXT LINE SEGMENTATION TECHNIQUE WITH AID OF SLIDING WINDOW AND SKEWING OPERATIONS

Text line segmentation is one of the key components of document image analysis. The segmentation logic provides essential information for skew correction, zone segmentation, and character recognition. Segmentation of printed document images into text lines has seen numerous contributions from researchers, yet there is still considerable scope for improvement. The key challenges for handwritten documents arise from the writer's movement, variability in inter-line distance, and inconsistent distances between components. The lines may be straight, straight by segments, or curved. Few models address handwritten text segmentation, and very few papers propose text line skew segmentation models, which motivates work on handwritten South Indian languages. Consequently, a better text line segmentation technique for the South Indian Tamil language is proposed in this paper. Tamil is difficult to process because its letters have complex shapes, which makes it harder to segment touching lines and letters in Tamil document images. The proposed method addresses the challenges of Tamil language processing and improves on existing text line segmentation methods by using two major techniques: sliding window and adaptive histogram equalization. The proposed technique first preprocesses the document images and then passes the preprocessed images to adaptive histogram equalization. During histogram equalization, the text characters in the document images are enhanced so that they can be seen more clearly. The text lines of the enhanced image are then segmented using the sliding window operation. For accurate line segmentation, the skewing operation is performed on the line-segmented result images.
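The sliding window step can be illustrated with a minimal sketch. The abstract does not specify the window shape or the decision rule, so the following is an assumption for illustration only: the page is already binarized (1 = ink), a window of `window` rows slides down the horizontal projection profile, and a line boundary is declared only where every row in the window is blank. The function name `segment_lines` and its parameters are hypothetical and not taken from the paper.

```python
def segment_lines(image, window=2):
    """Return inclusive (top, bottom) row ranges, one per detected text line.

    image: list of rows, each a list of 0/1 values (1 = ink).
    window: number of consecutive blank rows required to separate two lines
            (an assumed rule, not the paper's).
    """
    # Horizontal projection profile: amount of ink in each row.
    ink = [sum(row) for row in image]
    n = len(ink)

    # Slide the window down the page; rows are marked as a gap only when
    # the whole window is blank, so shorter blank runs stay inside a line.
    is_gap = [False] * n
    for top in range(n - window + 1):
        if all(ink[r] == 0 for r in range(top, top + window)):
            for r in range(top, top + window):
                is_gap[r] = True

    # Group consecutive non-gap rows into bands.
    bands = []
    start = None
    for r in range(n):
        if not is_gap[r] and start is None:
            start = r
        elif is_gap[r] and start is not None:
            bands.append((start, r - 1))
            start = None
    if start is not None:
        bands.append((start, n - 1))

    # Trim blank rows left at the edges of each band.
    lines = []
    for top, bottom in bands:
        while top <= bottom and ink[top] == 0:
            top += 1
        while bottom >= top and ink[bottom] == 0:
            bottom -= 1
        if top <= bottom:
            lines.append((top, bottom))
    return lines


# Toy page: rows 1-2 and row 4 are separated by a single blank row,
# rows 7-8 follow a two-row gap.
page = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(segment_lines(page, window=2))  # [(1, 4), (7, 8)]
print(segment_lines(page, window=1))  # [(1, 2), (4, 4), (7, 8)]
```

A taller window keeps components separated by a small vertical gap, such as a dot above a consonant, in the same line, at the cost of merging lines that are written very close together. Touching or skewed lines would need the additional skew handling that the abstract mentions and that this sketch does not attempt.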
The implementation results show the effectiveness of the proposed technique in segmenting handwritten text lines from the input document. The performance of the proposed technique is evaluated by comparing its results with those of a conventional text line segmentation technique. The results show that the proposed technique achieves better text line segmentation DR, RA, and F-Measure values on the tested documents than the conventional technique.
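The abstract does not define DR, RA, and F-Measure. A minimal sketch follows, assuming the convention commonly used in handwriting segmentation evaluation: DR (detection rate) is the number of one-to-one matches divided by the number of ground-truth lines, RA (recognition accuracy) is the number of one-to-one matches divided by the number of detected lines, and F-Measure is their harmonic mean. The function name and the sample counts are hypothetical and are not results from the paper.

```python
def segmentation_scores(one_to_one, ground_truth_lines, detected_lines):
    """Return (DR, RA, F-Measure) from match counts.

    one_to_one: detected lines matched to exactly one ground-truth line.
    ground_truth_lines: number of lines in the reference segmentation.
    detected_lines: number of lines produced by the method under test.
    """
    # Detection rate: share of ground-truth lines that were found.
    dr = one_to_one / ground_truth_lines if ground_truth_lines else 0.0
    # Recognition accuracy: share of detected lines that are correct.
    ra = one_to_one / detected_lines if detected_lines else 0.0
    # F-Measure: harmonic mean of DR and RA.
    fm = 2 * dr * ra / (dr + ra) if (dr + ra) else 0.0
    return dr, ra, fm


# Illustrative counts only: 18 matches, 20 reference lines, 24 detected lines.
dr, ra, fm = segmentation_scores(18, 20, 24)
print(f"DR={dr:.3f} RA={ra:.3f} FM={fm:.3f}")  # DR=0.900 RA=0.750 FM=0.818
```

Reporting both DR and RA separates missed lines from spurious ones: over-segmentation lowers RA, while under-segmentation lowers DR.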
