A Human Activity Recognition System Based on Dynamic Clustering of Skeleton Data

Human activity recognition is an important area of computer vision, with a wide range of applications that includes ambient assisted living. This paper presents an activity recognition system based on skeleton data extracted from a depth camera. The system uses machine learning techniques to classify actions that are described by a small set of basic postures. The training phase creates several models, related to the number of clustered postures, by means of a multiclass Support Vector Machine (SVM) trained with Sequential Minimal Optimization (SMO). The classification phase adopts the X-means algorithm to find the optimal number of clusters dynamically. The contribution of the paper is twofold: first, it performs activity recognition using features based on a small number of informative postures, extracted independently from each activity instance; second, it assesses the minimum number of frames needed for an adequate classification. The system is evaluated on two publicly available datasets, the Cornell Activity Dataset (CAD-60) and the Telecommunication Systems Team (TST) Fall detection dataset. The number of clusters needed to model each instance ranges from two to four. The proposed approach achieves excellent performance using only about 4 s of input data (~100 frames) and outperforms the state of the art on the CAD-60 dataset when it uses approximately 500 frames. The results are promising for testing in real-world contexts.
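The idea of describing each activity instance with two to four clustered postures, where the number of clusters is chosen dynamically, can be illustrated with a minimal sketch. It is not the system described here: it substitutes plain k-means scored with a spherical-Gaussian BIC for X-means (which instead splits centroids recursively), it uses toy 2-D points rather than skeleton joint features, and it omits the SVM stage. All function names (`kmeans`, `bic`, `select_key_postures`, `toy_instance`) and the farthest-point initialisation are assumptions made for illustration.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-point initialisation
    (an illustrative choice, not taken from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, labels

def bic(points, centroids, labels):
    """BIC of a spherical-Gaussian mixture with a shared variance
    (higher is better); a simplified stand-in for the X-means criterion."""
    n, dim, k = len(points), len(points[0]), len(centroids)
    sse = sum(dist2(p, centroids[l]) for p, l in zip(points, labels))
    var = max(sse, 1e-12) / (dim * (n - k))
    sizes = [labels.count(c) for c in range(k)]
    log_lik = sum(s * math.log(s / n) for s in sizes if s > 0)
    log_lik -= 0.5 * n * dim * math.log(2 * math.pi * var)
    log_lik -= 0.5 * dim * (n - k)
    n_params = k * (dim + 1)  # centroids, mixing weights, shared variance
    return log_lik - 0.5 * n_params * math.log(n)

def select_key_postures(frames, k_min=2, k_max=4):
    """Try every k in [k_min, k_max] and keep the clustering with the best BIC."""
    best = None
    for k in range(k_min, k_max + 1):
        centroids, labels = kmeans(frames, k)
        score = bic(frames, centroids, labels)
        if best is None or score > best[0]:
            best = (score, k, centroids, labels)
    return best[1], best[2], best[3]

def toy_instance():
    """Synthetic activity instance: 30 frames that dwell on 3 toy 'postures'."""
    centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    frames = []
    for cx, cy in centres:
        for i in range(10):
            angle = 2 * math.pi * i / 10
            frames.append((cx + 0.1 * math.cos(angle),
                           cy + 0.1 * math.sin(angle)))
    return frames

frames = toy_instance()
k, key_postures, labels = select_key_postures(frames)
print("clusters chosen:", k)
```

In this sketch the centroids returned in `key_postures` play the role of the informative postures for one instance; in a full pipeline they would be turned into a feature vector and passed to the classifier trained for that number of clusters.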
