A Chinese minority script recognition method based on wavelet feature and multinomial naive Bayes

Existing OCR systems for Chinese minority scripts mainly work at the "literacy" level. Script recognition has not received the attention it deserves, and identifying which Chinese minority script a text is written in remains largely unexplored. This paper presents a method for identifying the kind of Chinese minority script based on wavelet analysis and Multinomial Naive Bayes. The method applies wavelet decomposition to obtain feature descriptors consisting of wavelet energy and the proportional distribution of wavelet energy. These descriptors are combined with the texture features of Chinese minority scripts, and classification is then performed with Multinomial Naive Bayes. In experiments on Chinese, English, and Chinese minority scripts such as Tibetan, Tai Lue, Naxi Pictographs, Uighur, Tai Le, and Yi, the recognition rate reaches up to 90%.
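The pipeline described above (wavelet decomposition, energy-proportion features, Multinomial Naive Bayes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the two decomposition levels, the quantisation of energy proportions into integer counts (`bins`), the Laplace smoothing constant `alpha`, and the synthetic stripe textures with the labels `"horizontal"` and `"vertical"` standing in for real script images and script classes.

```python
import math

def haar2d(img):
    """One level of 2-D Haar decomposition; returns (LL, LH, HL, HH)."""
    h, w = len(img) // 2, len(img[0]) // 2
    ll, lh, hl, hh = ([[0.0] * w for _ in range(h)] for _ in range(4))
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # local average
            lh[i][j] = (a + b - c - d) / 2.0  # change between rows
            hl[i][j] = (a - b + c - d) / 2.0  # change between columns
            hh[i][j] = (a - b - c + d) / 2.0  # diagonal change
    return ll, lh, hl, hh

def energy(band):
    """Sum of squared coefficients in one sub-band."""
    return sum(v * v for row in band for v in row)

def wavelet_features(img, levels=2, bins=20):
    """Proportion of detail energy per sub-band, quantised to integer counts.

    The quantisation is an assumption: it turns proportions into the
    non-negative counts that a multinomial model expects.
    """
    energies, cur = [], img
    for _ in range(levels):
        cur, lh, hl, hh = haar2d(cur)
        energies += [energy(lh), energy(hl), energy(hh)]
    total = sum(energies) or 1.0  # avoid division by zero on flat images
    return [int(round(bins * e / total)) for e in energies]

class MultinomialNB:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""
    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.log_prior, self.log_lik = {}, {}
        for c in self.classes:
            rows = [x for x, t in zip(X, y) if t == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            counts = [sum(r[k] for r in rows) + self.alpha
                      for k in range(len(X[0]))]
            total = sum(counts)
            self.log_lik[c] = [math.log(v / total) for v in counts]
        return self

    def predict(self, x):
        def score(c):
            return self.log_prior[c] + sum(
                n * l for n, l in zip(x, self.log_lik[c]))
        return max(self.classes, key=score)

def stripes(direction, period=1, amp=1.0, size=8):
    """Synthetic stripe texture (a hypothetical stand-in for a script image)."""
    return [[amp * (((i if direction == "horizontal" else j) // period) % 2)
             for j in range(size)] for i in range(size)]

# Train on fine and coarse stripes for each toy class.
train = [(d, p) for d in ("horizontal", "vertical") for p in (1, 2)]
X = [wavelet_features(stripes(d, p)) for d, p in train]
y = [d for d, _ in train]
clf = MultinomialNB().fit(X, y)

# Classify unseen textures with a different amplitude.
pred_h = clf.predict(wavelet_features(stripes("horizontal", amp=3.0)))
pred_v = clf.predict(wavelet_features(stripes("vertical", amp=3.0)))
print(pred_h, pred_v)
```

Because the features are energy proportions rather than raw energies, the test textures with a different amplitude map to the same feature vector as the training textures. In this toy setup that is what lets the classifier generalise across contrast.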
