Fusion of Spatio-temporal Information for Indic Word Recognition Combining Online and Offline Text Data

We present a novel Indic handwritten word recognition scheme that fuses spatio-temporal information extracted from handwritten word images. The main challenge in Indic word recognition is the complexity of the scripts, which arises from modifiers, touching characters, and compound characters. Hidden Markov Models (HMMs) have been used to model such data because of their ability to learn sequential data; however, their recognition performance is not satisfactory. We propose a Long Short-Term Memory (LSTM)-based architecture for offline Indic word recognition. Offline recognition methods usually rely on spatial data, yet online recognition schemes have been observed to outperform offline ones. Online information usually refers to the temporal information obtained from the strokes of the pen tip during writing, which is missing from offline word images. In this article, we extract pseudo-online temporal information from offline images using stroke recovery and combine it with spatial information in the LSTM architecture. For recognition, character models are trained separately on the offline data and on the extracted pseudo-online data. Finally, a novel fusion scheme combines the two. Experiments show that recognition performance on handwritten Indic words improves considerably when spatial and temporal data are fused.
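The abstract does not specify the fusion rule, so the following is only a minimal sketch of one common option: weighted-sum late fusion of word-level scores from two recognizers. The function names, the `weight` parameter, the example lexicon, and the score values are all hypothetical and are not taken from the article. The sketch assumes each recognizer (offline and pseudo-online) returns a log-likelihood-style score for every word in a shared lexicon.

```python
import math


def normalize(scores, words):
    """Softmax over the given words so each recognizer's scores sum to 1."""
    top = max(scores[w] for w in words)
    exps = {w: math.exp(scores[w] - top) for w in words}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def fuse_scores(offline_scores, online_scores, weight=0.5):
    """Weighted-sum late fusion (an assumed rule, not the article's scheme).

    `weight` is the share given to the offline recognizer; the
    pseudo-online recognizer receives `1 - weight`.
    """
    # Only words scored by both recognizers can be fused.
    words = sorted(set(offline_scores) & set(online_scores))
    if not words:
        raise ValueError("the two recognizers share no candidate words")
    p_off = normalize(offline_scores, words)
    p_on = normalize(online_scores, words)
    return {w: weight * p_off[w] + (1.0 - weight) * p_on[w] for w in words}


def recognize(offline_scores, online_scores, weight=0.5):
    """Return the lexicon word with the highest fused score."""
    fused = fuse_scores(offline_scores, online_scores, weight)
    return max(fused, key=fused.get)


# Hypothetical scores for three transliterated lexicon entries.
offline = {"kamal": -2.0, "kalam": -1.5, "kamar": -4.0}
online = {"kamal": -1.0, "kalam": -3.0, "kamar": -3.5}

best = recognize(offline, online)
```

In this made-up example the two recognizers disagree on the top word, and the fused score decides between them. In practice, `weight` would be tuned on a validation set, and other combination rules (product rule, rank-based voting) could be substituted without changing the surrounding code.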
