Real-time air-writing recognition in motion stream

The main contribution of this work is an end-to-end air-writing recognition technique for real-time applications. We assume that the user writes in the air naturally and intuitively, without giving any explicit signal. To avoid a separate spotting process, this work adopts a segmentation-free approach based on an LSTM network trained with CTC loss. A fusion scheme models the writing trajectory using both spatial and temporal features. To extract writing information from the finger motion, we use a window-based technique that segments the stream data to generate the training features. Two features, the hand position and the path signature, are used to train the proposed network. To evaluate the proposed technique, we conduct experiments on a public dataset, namely the finger writing dataset. The results confirm that the fusion scheme improves recognition accuracy. The most suitable sliding-window size for the proposed structure is 0.25 seconds, with a skip size of 83 milliseconds. The proposed network recognizes air-written words with an accuracy of 75.81% without a language model. In terms of processing time, the recognizer predicts the written word within 6.37 milliseconds, which confirms that the proposed algorithm can be deployed in a real-time application.
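The window-based segmentation of the motion stream can be illustrated with a minimal sketch. It uses the 0.25-second window and 83-millisecond skip reported above, but the 60 Hz sampling rate, the `(x, y, z)` point layout, and the function name `sliding_windows` are assumptions made for illustration only and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical (x, y, z) fingertip position; the actual sensor layout is an assumption.
Point = Tuple[float, float, float]


def sliding_windows(
    stream: Sequence[Point],
    sample_rate_hz: float,
    window_s: float = 0.25,   # window size reported in the abstract
    skip_s: float = 0.083,    # skip size reported in the abstract
) -> List[Sequence[Point]]:
    """Cut a motion stream into overlapping fixed-length windows."""
    # Convert durations to sample counts; at least one sample each.
    window_len = max(1, round(window_s * sample_rate_hz))
    skip_len = max(1, round(skip_s * sample_rate_hz))

    windows = []
    # Only full windows are kept; a trailing partial window is dropped.
    for start in range(0, len(stream) - window_len + 1, skip_len):
        windows.append(stream[start:start + window_len])
    return windows


# One second of synthetic samples at an assumed 60 Hz sampling rate.
stream = [(float(i), 0.0, 0.0) for i in range(60)]
windows = sliding_windows(stream, sample_rate_hz=60.0)
print(len(windows), len(windows[0]))
```

Under the assumed 60 Hz rate, each window spans 15 samples and consecutive windows start 5 samples apart, so adjacent windows overlap by two thirds. In a pipeline of the kind described, features such as the hand position and the path signature would then be computed per window before being passed to the recurrent network.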
