Recognizing User-Defined Subsequences in Human Motion Data

Motion capture technologies digitize human movement by tracking the 3D positions of specific skeleton joints over time. Such spatio-temporal multimedia data have enormous application potential in many fields, ranging from computer animation through security and sports to medicine, but processing them automatically is a difficult problem. In this paper, we focus on the important task of recognizing a user-defined motion based on a collection of labelled actions known in advance. We utilize current advances in deep feature learning and scalable similarity retrieval to build an effective and efficient k-nearest-neighbor recognition technique for 3D human motion data. The properties of the technique are demonstrated by a web application that allows a user to browse long motion sequences and specify any subsequence as the input for probabilistic recognition based on 130 predefined classes.
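
To make the idea of probabilistic k-nearest-neighbor recognition concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes each labelled action has already been mapped to a fixed-length feature vector, uses Euclidean distance as a stand-in for the actual similarity measure, and weights each neighbour's vote by 1 / (1 + distance). The function name `knn_class_probabilities`, the toy 2-D features, and the class labels are all hypothetical.

```python
import math
from collections import defaultdict


def knn_class_probabilities(query, labelled_features, k=3):
    """Return a class -> probability map from the k nearest labelled features."""
    # Assumption: Euclidean distance stands in for the real similarity measure.
    neighbours = sorted(
        (math.dist(query, vector), label) for vector, label in labelled_features
    )[:k]

    # Assumption: closer neighbours contribute larger votes.
    votes = defaultdict(float)
    for distance, label in neighbours:
        votes[label] += 1.0 / (1.0 + distance)

    # Normalize the votes so they sum to one.
    total = sum(votes.values())
    return {label: weight / total for label, weight in votes.items()}


# Toy 2-D "features"; learned motion features would have many more dimensions.
labelled = [
    ((0.0, 0.0), "walk"),
    ((0.1, 0.0), "walk"),
    ((1.0, 1.0), "jump"),
    ((1.1, 0.9), "jump"),
]

probabilities = knn_class_probabilities((0.05, 0.0), labelled, k=3)
best_class = max(probabilities, key=probabilities.get)
```

In this sketch the query subsequence would first be converted to a feature vector, and the returned map gives a ranked, normalized list of candidate classes rather than a single hard label. A linear scan is used here only for brevity; a scalable system would replace it with a similarity index.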
