A Hybrid Handwritten Chinese Address Recognition Approach

Handwritten Chinese address recognition is a difficult yet important pattern recognition task. The problem poses three difficulties: (1) Handwritten addresses are often written in free styles with high variation, which makes segmentation errors inevitable. (2) The number of Chinese characters is large, leading to a low recognition rate for single characters. (3) Chinese addresses are usually irregular, i.e., different people may write the same address in different formats. In this paper, we propose a comprehensive hybrid approach that addresses all three difficulties. To solve (1) and (2), we adopt an enhanced holistic scheme that recognizes the whole image of a word (defined as a place name) rather than the images of single characters. This facilitates the use of address knowledge and avoids the difficult problem of single-character segmentation. To attack (3), we propose a hybrid approach that combines a word-based language model with the holistic word matching scheme, and can therefore deal with various irregular addresses. We provide theoretical justifications, outline the detailed steps, and perform a series of experiments. The experimental results on various real addresses demonstrate the advantages of our novel approach.
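
The combination of a word-based language model with holistic word matching can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each word image already has a set of candidate place names with log-domain matching scores, uses a hypothetical bigram table over place names, and decodes with a standard Viterbi search. The names `viterbi_decode`, `lm_weight`, and `unseen_logp`, and all scores below, are illustrative assumptions.

```python
import math


def viterbi_decode(candidates, bigram_logp, lm_weight=1.0, unseen_logp=-10.0):
    """Pick the place-name sequence maximizing matching score plus LM score.

    candidates:  list of non-empty dicts {place_name: matching_log_score},
                 one dict per word image (hypothetical holistic matcher output).
    bigram_logp: dict {(previous_name, name): log_prob}; "<s>" marks the start.
    unseen_logp: assumed penalty for bigrams missing from the table.
    """
    # best[name] = (total score, path) of the best path ending in `name`
    best = {"<s>": (0.0, [])}
    for cands in candidates:
        new_best = {}
        for name, match_score in cands.items():
            options = []
            for prev, (score, path) in best.items():
                lm = bigram_logp.get((prev, name), unseen_logp)
                total = score + match_score + lm_weight * lm
                options.append((total, path + [name]))
            new_best[name] = max(options, key=lambda t: t[0])
        best = new_best
    return max(best.values(), key=lambda t: t[0])


# Toy data (invented): the matcher alone slightly prefers "Nanjing",
# but the language model knows "Haidian" follows "Beijing".
candidates = [
    {"Beijing": -0.3, "Nanjing": -0.2},
    {"Haidian": -0.1, "Gulou": -2.0},
]
bigram_logp = {
    ("<s>", "Beijing"): math.log(0.5),
    ("<s>", "Nanjing"): math.log(0.5),
    ("Beijing", "Haidian"): math.log(0.9),
    ("Nanjing", "Gulou"): math.log(0.9),
}

score, path = viterbi_decode(candidates, bigram_logp)
print(path)
```

In this toy case, picking the best match per word image independently would yield the inconsistent pair Nanjing, Haidian, while the joint decoding selects Beijing, Haidian. This shows in miniature how address knowledge can correct word-level matching errors.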
