Offline handwritten Chinese character recognition by radical decomposition

Offline handwritten Chinese character recognition is a very hard pattern-recognition problem of considerable practical importance. Two popular approaches are to extract features holistically from the character image, or to decompose characters structurally into component parts, usually strokes. Here we take a novel approach: decomposing characters into radicals directly from image information, without first decomposing into strokes. During training, 60 examples of each radical were represented by "landmark" points, labeled semiautomatically, with the same radical in different characteristic positions treated as distinct radicals. Kernel principal-component analysis then captured the main (nonlinear) variations around the mean radical. During recognition, the dynamic tunneling algorithm was used to search for the shape parameters that minimize the chamfer distance between the radical model and the image. Treating character composition as a Markov process in which up to four radicals are combined in an assumed sequential order, we recognize complete, hierarchically composed characters with the Viterbi algorithm. This gave a writer-independent character recognition rate of 93.5% on a test set of 430,800 characters from 2,154 character classes composed of 200 radical categories, which is comparable to the best results reported in the literature. Although the initial semiautomatic landmark labeling is time consuming, the decomposition approach is theoretically well motivated and allows the different sources of variability in Chinese handwriting to be handled separately and by the most appropriate means, either learned from example data or incorporated as prior knowledge. Hence, high generalizability is obtained from small amounts of training data, and only simple prior knowledge needs to be incorporated, which promises robust recognition performance. There is therefore considerable potential for further development and improvement in the direction of larger character sets and less constrained writing conditions.
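
The final decoding step can be illustrated with a minimal sketch of the Viterbi recursion over a short radical sequence. It assumes that each radical position already has a log-score for every candidate radical (for example, derived from the chamfer distance after shape fitting) and that radical-to-radical transition probabilities are known. The function name `viterbi`, the array layout, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Return the best state path and its log-score for a first-order Markov chain.

    log_init  : (S,)   log prior of each radical at the first position
    log_trans : (S, S) log_trans[i, j] = log P(next radical j | current radical i)
    log_emit  : (T, S) log match score of radical s at position t (assumed given)
    """
    T, S = log_emit.shape
    score = log_init + log_emit[0]          # best log-score ending in each state
    back = np.zeros((T, S), dtype=int)      # back-pointers for path recovery
    for t in range(1, T):
        cand = score[:, None] + log_trans   # cand[i, j]: come from i, go to j
        back[t] = np.argmax(cand, axis=0)   # best predecessor for each j
        score = cand[back[t], np.arange(S)] + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(np.max(score))

# Toy example (hypothetical numbers): 3 candidate radicals, 2 positions.
init = np.log(np.array([0.6, 0.3, 0.1]))
trans = np.log(np.array([[0.1, 0.6, 0.3],
                         [0.4, 0.2, 0.4],
                         [0.3, 0.3, 0.4]]))
emit = np.log(np.array([[0.5, 0.4, 0.1],
                        [0.1, 0.2, 0.7]]))
best_path, best_log_score = viterbi(init, trans, emit)
```

Working in log space keeps the products of small probabilities numerically stable. In the setting described above the sequence length would be at most four, one step per radical position.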
