Joint-Triplet Motion Image and Local Binary Pattern for 3D Action Recognition Using Kinect

This paper presents a new action recognition method that uses 3D skeletal motion data captured with the Kinect depth sensor. We propose a robust, view-invariant joint motion representation based on the spatio-temporal changes in the relative angles among different skeletal joint-triplets, namely the joint relative angle (JRA). The sequence of JRAs obtained for a particular joint-triplet intuitively represents the level of involvement of those joints in performing a specific action. The collection of all joint-triplet JRA sequences is then used to construct a holistic spatial description of action-specific motion patterns, namely the 2D joint-triplet motion image. The proposed method applies a local texture analysis method, the local binary pattern (LBP), to highlight micro-level texture details in the motion images, which isolates prototypical features for different actions. The LBP histogram features are then projected into a discriminant Fisher space, resulting in more compact and disjoint feature clusters representing individual actions. The performance of the proposed method is evaluated on two publicly available Kinect action databases. Extensive experiments show the advantage of the proposed joint-triplet motion image and LBP-based action recognition approach over existing methods.
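
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made here for illustration: the JRA of a triplet (a, b, c) is taken as the angle at the middle joint b, the motion image has one row per joint-triplet and one column per frame with angles scaled to 0..255, and the LBP is the basic 3x3, 8-neighbor variant. The function names `joint_relative_angle`, `motion_image`, and `lbp_histogram` are hypothetical, and the Fisher-space projection step is left out.

```python
from itertools import combinations

import numpy as np


def joint_relative_angle(a, b, c):
    """Angle in radians at joint b between vectors b->a and b->c (assumed JRA)."""
    u, v = a - b, c - b
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom == 0.0:
        return 0.0  # degenerate triplet: two joints coincide
    cos = np.clip(np.dot(u, v) / denom, -1.0, 1.0)
    return float(np.arccos(cos))


def motion_image(skeleton):
    """Map a (frames, joints, 3) array to a uint8 image of shape (triplets, frames)."""
    n_frames, n_joints, _ = skeleton.shape
    triplets = list(combinations(range(n_joints), 3))
    img = np.zeros((len(triplets), n_frames))
    for row, (i, j, k) in enumerate(triplets):
        for t in range(n_frames):
            img[row, t] = joint_relative_angle(
                skeleton[t, i], skeleton[t, j], skeleton[t, k]
            )
    # Angles lie in [0, pi]; scale them to gray levels 0..255.
    return np.round(img / np.pi * 255).astype(np.uint8)


def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 3x3 LBP codes over interior pixels."""
    img = img.astype(np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros_like(center)
    # Eight neighbors, visited clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

Because the angle depends only on the relative positions of the three joints, rotating or translating the whole skeleton leaves it unchanged, which is one way to read the view-invariance claim. Note that using every joint-triplet grows cubically with the number of joints, so a practical system would likely restrict the set of triplets.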
