Virtual Example Synthesis Based on PCA for Off-Line Handwritten Character Recognition

This paper proposes a method for improving off-line character classifiers learned from examples by using virtual examples synthesized from an on-line character database. Good classifiers usually require a large database that covers enough variations of handwritten characters, but in practice collecting that much data is time-consuming and costly. We therefore train an SVM for off-line character recognition on examples artificially augmented from on-line characters. Virtual examples are synthesized from on-line characters in two steps: (1) applying an affine transformation to each stroke of “real” characters, and (2) applying an affine transformation to each stroke of artificial characters, which are synthesized on the basis of PCA. The SVM classifiers are trained on samples containing both the artificially generated patterns and the real characters. We evaluate the proposed method in terms of recognition rate and the number of SVM support vectors through experiments on handwritten Japanese Hiragana character classification.
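The two synthesis steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names `affine_perturb_stroke` and `pca_synthesize`, the perturbation ranges, the choice of rotation, shear, scaling and translation as the affine components, and the representation of each on-line character as a fixed-length vector of resampled stroke coordinates.

```python
import numpy as np

def affine_perturb_stroke(stroke, rng, max_rot=0.1, max_scale=0.1,
                          max_shear=0.1, max_shift=0.02):
    """Apply a small random affine transformation to one stroke.

    stroke: array of shape (n_points, 2) holding (x, y) pen coordinates.
    The perturbation ranges are illustrative assumptions, not values
    taken from the paper.
    """
    theta = rng.uniform(-max_rot, max_rot)
    sx, sy = 1.0 + rng.uniform(-max_scale, max_scale, size=2)
    sh = rng.uniform(-max_shear, max_shear)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    shear = np.array([[1.0, sh], [0.0, 1.0]])
    scale = np.diag([sx, sy])
    A = rot @ shear @ scale
    t = rng.uniform(-max_shift, max_shift, size=2)
    # Transform about the stroke centroid so the stroke stays in place.
    center = stroke.mean(axis=0)
    return (stroke - center) @ A.T + center + t

def pca_synthesize(chars, n_components, n_samples, rng):
    """Draw artificial characters from a PCA model of one character class.

    chars: array of shape (n_chars, d); each row is one on-line character
    flattened to a fixed-length coordinate vector (an assumed encoding).
    """
    mean = chars.mean(axis=0)
    centered = chars - mean
    # Rows of vt are the principal axes; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    std = s[:k] / np.sqrt(max(len(chars) - 1, 1))
    # Sample coefficients along each axis with the observed spread.
    coeffs = rng.normal(size=(n_samples, k)) * std
    return mean + coeffs @ vt[:k]
```

Under these assumptions, step (1) corresponds to calling `affine_perturb_stroke` on every stroke of a real character, and step (2) to reshaping each row returned by `pca_synthesize` back into strokes and perturbing those in the same way. The resulting point sequences would then have to be rendered as images before being used as off-line training samples.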
