In-air handwritten English word recognition using attention recurrent translator

As a new mode of human–computer interaction, in-air handwriting allows users to write in the air in a natural, unconstrained way. Compared with conventional online handwriting on touch devices, in-air handwriting is much more challenging because of its unique characteristics. In-air handwriting is always completed in a single stroke and therefore lacks pen-down and pen-up information. Moreover, it is subject to less friction and fewer spatial constraints, so users tend to write more casually. In this paper, we present an in-air handwriting system for effectively recognizing handwritten English words. We propose an attention-based model, called the attention recurrent translator, for in-air handwritten English word recognition; it differs considerably from connectionist temporal classification (CTC). We evaluate the proposed approach on a newly collected dataset containing a total of 150,480 recordings that cover 2280 English words. The proposed approach achieves a word recognition accuracy of 97.74%. The experimental results show that the proposed recognizer is comparable with CTC and is highly effective for in-air handwritten English word recognition.
