Post processing for offline Chinese handwritten character string recognition

Offline Chinese handwritten character string recognition is one of the most important research fields in pattern recognition. Free writing styles, large variability in character shapes, and differing geometric characteristics make it a challenging problem. Among current methods, the over-segmentation and merging approach, which integrates geometric information, character recognition information, and contextual information, shows promising results. Experiments show that a large share of its errors are segmentation errors, and that these occur mainly around non-Chinese characters. A Chinese character string contains not only wide characters (Chinese characters) but also narrow characters such as digits and letters of the alphabet. The segmentation errors are caused mainly by imposing a uniform geometric model on all segmented candidate characters. To solve this problem, post processing is employed to improve the recognition accuracy of narrow characters. On the one hand, separate geometric models are established for wide characters and for narrow characters; under these multi-geometric models, narrow characters are less prone to being merged. On the other hand, the top-ranked recognition results of the candidate paths are integrated to boost the final recognition of narrow characters. The post processing method is evaluated on two datasets comprising 1405 handwritten address strings in total. Wide character recognition accuracy improves slightly, and narrow character recognition accuracy increases by 10.41% and 10.03% on the two datasets, respectively. These results indicate that the post processing method is effective in improving the recognition accuracy of narrow characters.
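The effect of a uniform versus a class-specific geometric model can be illustrated with a small sketch. It is not the method of the paper: the Gaussian model of the width/height ratio, the parameter values, and the names `UNIFORM_MODEL`, `MULTI_MODEL`, `geometric_score`, and `path_score` are all assumptions made for illustration, and recognition and contextual scores are left out entirely.

```python
import math

# Hypothetical (mean, std) of the width/height ratio per character class.
# These numbers are illustrative only, not values from the paper.
UNIFORM_MODEL = {"wide": (1.0, 0.15), "narrow": (1.0, 0.15)}
MULTI_MODEL = {"wide": (1.0, 0.15), "narrow": (0.45, 0.12)}


def geometric_score(aspect_ratio, char_class, model):
    """Gaussian log-likelihood of a candidate's aspect ratio under its class model."""
    mean, std = model[char_class]
    z = (aspect_ratio - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def path_score(path, model):
    """Sum the geometric scores of all candidate characters on a path."""
    return sum(geometric_score(ratio, cls, model) for ratio, cls in path)


# Two adjacent digits, each about 0.45 times as wide as it is tall.
separate = [(0.45, "narrow"), (0.45, "narrow")]  # intended segmentation
merged = [(0.90, "wide")]                        # digits merged into one candidate

for name, model in (("uniform", UNIFORM_MODEL), ("multi", MULTI_MODEL)):
    s = path_score(separate, model)
    m = path_score(merged, model)
    winner = "separate" if s > m else "merged"
    print(f"{name}: separate={s:.2f} merged={m:.2f} -> {winner}")
```

With these assumed parameters, the uniform model prefers the merged candidate because each digit alone looks too narrow, while the class-specific models prefer keeping the two digits separate. This matches the qualitative behaviour described above, not any reported numbers.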
