Statistical Character Structure Modeling and Its Application to Handwritten Chinese Character Recognition

This paper proposes a statistical method for modeling character structure. Each stroke is represented by the distribution of its feature points, and the character structure is represented by the joint distribution of its component strokes. In this model, stroke relationships are captured as statistical dependencies, so all kinds of stroke relationships can be represented in a systematic way. Based on this representation, a stroke neighbor selection method is also proposed. It measures the importance of a stroke relationship by the mutual information among the strokes and, with that measure, selects the important neighbor relationships using the nth-order probability approximation method. Neighbor selection reduces complexity significantly because the model reflects only the important relationships instead of all existing ones. The proposed modeling method was applied to a handwritten Chinese character recognition system. By combining a model-driven stroke extraction algorithm with a selective matching algorithm, the system analyzes degraded images better than conventional structural recognition systems. Experiments illustrate the effectiveness of the proposed methods: the proposed method successfully detected and reflected the stroke relationships that seemed intuitively important. The overall recognition rate was 98.45 percent, which confirms the effectiveness of the proposed methods.
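
To make the neighbor selection idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each stroke's feature points are jointly Gaussian, that strokes are processed in a fixed order, and that neighbors are ranked by pairwise mutual information with each earlier stroke; an actual nth-order approximation would score the neighbor set jointly. The function names `gaussian_mutual_information` and `select_neighbors` and the parameter `dims_per_stroke` are hypothetical.

```python
import numpy as np


def gaussian_mutual_information(cov, idx_a, idx_b):
    """Mutual information I(A;B) in nats between two blocks of a jointly
    Gaussian vector: 0.5 * (log|S_A| + log|S_B| - log|S_AB|)."""
    idx_ab = list(idx_a) + list(idx_b)
    _, logdet_a = np.linalg.slogdet(cov[np.ix_(idx_a, idx_a)])
    _, logdet_b = np.linalg.slogdet(cov[np.ix_(idx_b, idx_b)])
    _, logdet_ab = np.linalg.slogdet(cov[np.ix_(idx_ab, idx_ab)])
    return 0.5 * (logdet_a + logdet_b - logdet_ab)


def select_neighbors(samples, dims_per_stroke, order=1):
    """For each stroke, keep at most `order` earlier strokes with the highest
    mutual information (assumed simplification: pairwise ranking).

    samples: array of shape (n_samples, n_strokes * dims_per_stroke), where
    each consecutive group of `dims_per_stroke` columns holds one stroke's
    feature-point coordinates (hypothetical layout).
    """
    n_strokes = samples.shape[1] // dims_per_stroke
    cov = np.cov(samples, rowvar=False)
    blocks = [
        list(range(k * dims_per_stroke, (k + 1) * dims_per_stroke))
        for k in range(n_strokes)
    ]
    neighbors = {0: []}  # the first stroke has no earlier stroke to depend on
    for i in range(1, n_strokes):
        ranked = sorted(
            range(i),
            key=lambda j: gaussian_mutual_information(cov, blocks[i], blocks[j]),
            reverse=True,
        )
        neighbors[i] = ranked[:order]
    return neighbors
```

Under these assumptions the joint stroke distribution would be approximated by a product of conditional distributions, each conditioned only on the selected neighbors, so the number of modeled relationships grows with `order` rather than with the number of stroke pairs.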
