Word segmentation in handwritten Chinese text image based on component clustering techniques

Segmenting handwritten Chinese input into individual characters is a crucial step in many connected-handwriting recognition systems. In this paper, we propose a new method for segmenting off-line handwritten Chinese text images. In the first stage, we adopt an HMM-based method to produce candidate segmentation paths and apply two rules to remove redundant paths; the remaining candidate paths dissect the text line into radicals or pseudo-radicals (components). In the second stage, we propose three new criteria (aspect ratio, gap ratio, and longer edge) to compute the clustering cost matrix, and we use dynamic programming to find the optimal clustering scheme. A series of experiments shows that our method is effective for word segmentation of off-line handwritten Chinese text images.
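The second stage can be illustrated with a minimal sketch, under assumptions that go beyond the abstract. Components are taken to be bounding boxes sorted left to right. The cost function `merge_cost` is hypothetical: it combines an aspect-ratio term and a gap term in the spirit of the criteria named above, but it is not the paper's cost matrix, and the longer-edge criterion is omitted because the abstract does not define it. The names `Box`, `merge_cost`, `cluster_components`, and `max_group` are illustrative only.

```python
from typing import List, Tuple

# Hypothetical component representation: (left, top, right, bottom),
# with positive width and height, sorted left to right along the text line.
Box = Tuple[int, int, int, int]


def merge_cost(boxes: List[Box]) -> float:
    """Illustrative cost of treating consecutive components as one character.

    This is an assumed stand-in, not the cost defined in the paper.
    """
    left = min(b[0] for b in boxes)
    top = min(b[1] for b in boxes)
    right = max(b[2] for b in boxes)
    bottom = max(b[3] for b in boxes)
    width, height = right - left, bottom - top
    # Aspect-ratio term: assume a well-formed character is roughly square.
    aspect = abs(width / height - 1.0)
    # Gap term: largest horizontal gap inside the group, relative to its width.
    gaps = [max(0, boxes[k + 1][0] - boxes[k][2]) for k in range(len(boxes) - 1)]
    gap = max(gaps, default=0) / width
    return aspect + gap


def cluster_components(boxes: List[Box], max_group: int = 4) -> List[Tuple[int, int]]:
    """Group consecutive components into characters by dynamic programming.

    best[j] holds the minimum total cost of clustering the first j components;
    back[j] records where the last group of that optimal clustering starts.
    Returns half-open index ranges (start, end), one per character.
    """
    n = len(boxes)
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_group), j):
            cost = best[i] + merge_cost(boxes[i:j])
            if cost < best[j]:
                best[j] = cost
                back[j] = i
    # Walk the back-pointers from the end to recover the groups.
    groups = []
    j = n
    while j > 0:
        groups.append((back[j], j))
        j = back[j]
    return groups[::-1]


# Two narrow components that together form a square, followed by one square component.
example = [(0, 0, 4, 10), (5, 0, 10, 10), (14, 0, 24, 10)]
result = cluster_components(example)
```

With this toy cost, the first two components are merged into one character and the third is kept on its own, so `result` is `[(0, 2), (2, 3)]`. The bound `max_group` is an assumed limit on how many components one character may contain; it keeps the search quadratic at worst and linear for a fixed bound.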
