A Recognition Method of the Similarity Character for Uchen Script Tibetan Historical Document Based on DNN

To improve the recognition of similar characters in Tibetan historical documents, this paper applies a deep neural network (DNN) to the task and proposes a deep-learning-based recognition method for similar characters in Uchen-script Tibetan. The DNN learns effective features and performs recognition automatically. We also introduce a method for labeling samples from Uchen-script Tibetan historical documents that uses unsupervised clustering to construct sample sets of similar characters. In simulation experiments, our method outperforms traditional methods based on gradient features, such as the support vector machine (SVM) and the naive Bayes classifier (NBC). The proposed method learns features effectively, avoids the disadvantages of manual feature selection and extraction, and greatly improves the recognition rate. The improvement in recognition rate becomes more pronounced as the number of training samples increases. The experimental results show that the proposed method achieves a higher recognition rate on similar characters in Uchen-script Tibetan historical documents.
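The clustering-based sample labeling mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the features used, so k-means, the function name `cluster_glyphs`, and the two-dimensional toy feature vectors below are all assumptions made for illustration only. The idea is that glyph images falling into the same cluster can be reviewed and labeled as a group when building a sample set.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_glyphs(features, k, iterations=20, seed=0):
    """Group glyph feature vectors with plain k-means (an assumed choice).

    `features` is a list of equal-length numeric lists, one per glyph.
    Returns (labels, centers), where labels[i] is the cluster index of
    features[i].
    """
    rng = random.Random(seed)
    # Start from k distinct samples chosen at random.
    centers = [list(p) for p in rng.sample(features, k)]

    def assign():
        # Each sample goes to its nearest center.
        return [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in features
        ]

    for _ in range(iterations):
        labels = assign()
        for c in range(k):
            members = [p for p, lab in zip(features, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(), centers


# Hypothetical toy features: two visually distinct groups of glyphs.
toy_features = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
]
toy_labels, toy_centers = cluster_glyphs(toy_features, k=2)
```

In practice the feature vectors would come from the character images themselves, and each resulting cluster would still need manual inspection before its members are accepted as one class.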
