Simple and effective techniques for core-region detection and slant correction in offline script recognition

This paper presents two new preprocessing techniques for cursive script recognition: an enhanced algorithm for core-region detection and an effective method for uniform slant angle estimation. The reference lines that bound the core region are usually taken as the lines surrounding the highest density peaks, but this estimate is strongly affected by long horizontal strokes and erratic characters in the word. Such strokes are confused with the actual core region and lead to decisive errors in normalizing the word. To overcome this problem, quantiles are introduced into core-region detection to make the process robust. Separately, the slant-removal approaches proposed by the research community for cursive script are computationally heavy. A simple, formalized, and effective method is therefore presented for detecting and removing the slant angle of offline cursive handwritten words without heavy experimental effort. Moreover, words that are not slanted are not adversely affected by the algorithm. Core-region detection is based on statistical features, while slant angle estimation is based on structural features of the word image. The algorithms are tested on the IAM benchmark database of cursive handwritten words. Promising results for core-region detection and for slant angle estimation and removal are reported and compared with the widely applied Bozinovic and Srihari method (BSM).
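The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the abstract does not give the exact formulation. The assumptions are: the word image is a binary NumPy array with nonzero ink pixels; the core region is bounded by the rows at which the cumulative horizontal ink density reaches two quantiles (the names `core_region_by_quantile`, `q_low`, and `q_high` and their default values are hypothetical); and slant is removed with a plain horizontal shear for an angle that is already known (`shear_deslant` is likewise a hypothetical name, and the paper's structural slant estimator is not reproduced here).

```python
import numpy as np


def core_region_by_quantile(img, q_low=0.15, q_high=0.85):
    """Return (upper_row, lower_row) bounding the core region.

    Assumption: the bounds are the rows where the cumulative row-wise
    ink density first reaches q_low and q_high. The quantile levels
    are illustrative defaults, not values taken from the paper.
    """
    ink = np.asarray(img) > 0
    profile = ink.sum(axis=1).astype(float)  # ink pixels per row
    total = profile.sum()
    if total == 0:
        raise ValueError("image contains no ink pixels")
    cdf = np.cumsum(profile) / total  # cumulative share of ink, top to bottom
    upper = int(np.searchsorted(cdf, q_low))   # first row with cdf >= q_low
    lower = int(np.searchsorted(cdf, q_high))  # first row with cdf >= q_high
    return upper, lower


def shear_deslant(img, angle_deg):
    """Remove a uniform slant of angle_deg (measured from the vertical,
    positive = leaning right) with a horizontal shear about the bottom row.

    Assumption: the angle is supplied by some estimator; this function
    only applies the correction.
    """
    img = np.asarray(img)
    h, w = img.shape
    t = np.tan(np.deg2rad(angle_deg))
    # Leftward shift per row; the bottom row (y = h - 1) stays in place.
    shifts = np.round((h - 1 - np.arange(h)) * t).astype(int)
    out = np.zeros((h, w + shifts.max() - shifts.min()), dtype=img.dtype)
    for y in range(h):
        start = shifts.max() - shifts[y]  # always >= 0
        out[y, start:start + w] = img[y]
    return out
```

Working on the cumulative density rather than on density peaks means a single long stroke changes the estimate only in proportion to the ink it adds. A shear with a zero angle returns the image unchanged, which matches the requirement that words without slant are not degraded.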
