Multi-view Discriminant Analysis for Dynamic Hand Gesture Recognition

Although there have been attempts to tackle hand gesture recognition “in the wild”, deploying such methods in practical applications still faces major issues such as viewpoint changes, cluttered backgrounds, and low resolution of hand regions. In this paper, we investigate these issues with a framework designed around both a variety of features and multi-view analysis. In the framework, we use both hand-crafted features and features learnt by a Convolutional Neural Network (CNN) to represent gestures in each single view. We then employ techniques based on multi-view discriminant analysis (MvDA) to build a discriminant common space by jointly learning multiple view-specific linear transforms from multiple views. To evaluate the effectiveness of the proposed framework, we construct a new multi-view dataset of twelve gestures. These gestures are captured by five cameras uniformly spaced on a half circle in front of the user, in a human-machine interaction context. We evaluate each scheme designed within the proposed framework, report its accuracy, and discuss the results with a view to developing practical applications. Experimental results show promising performance for developing natural and user-friendly hand-gesture-based applications.
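
To make the idea of a discriminant common space concrete, the following is a minimal sketch of a Fisher-style multi-view criterion of the kind MvDA optimizes. The notation is assumed here for illustration and is not taken from this abstract: x_{ijk} denotes the k-th sample of class i observed in view j, w_j is the linear transform learnt for view j, and c, v, n_{ij} are the numbers of classes, views, and per-class per-view samples. The exact objective and solver used in the paper may differ.

```latex
% Sketch under assumed notation (requires amsmath).
% n_i = \sum_j n_{ij} is the number of samples of class i over all views,
% n = \sum_i n_i is the total number of samples.
\begin{align}
y_{ijk} &= w_j^{\top} x_{ijk}, \\
\mu_i &= \frac{1}{n_i}\sum_{j=1}^{v}\sum_{k=1}^{n_{ij}} y_{ijk}, \qquad
\mu = \frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{v}\sum_{k=1}^{n_{ij}} y_{ijk}, \\
S_W^{y} &= \sum_{i=1}^{c}\sum_{j=1}^{v}\sum_{k=1}^{n_{ij}}
  (y_{ijk}-\mu_i)(y_{ijk}-\mu_i)^{\top}, \\
S_B^{y} &= \sum_{i=1}^{c} n_i\,(\mu_i-\mu)(\mu_i-\mu)^{\top}, \\
(w_1^{*},\dots,w_v^{*}) &= \arg\max_{w_1,\dots,w_v}
  \operatorname{Tr}\!\left( (S_W^{y})^{-1} S_B^{y} \right).
\end{align}
```

Because the class means are pooled over all views, samples of the same gesture captured by different cameras are pulled together in the common space, while different gestures are pushed apart regardless of the view they come from.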
