Skeleton Point Trajectories for Human Daily Activity Recognition

Automatic human action annotation is a challenging problem that overlaps with many computer vision fields, such as video surveillance, human-computer interaction, and video mining. In this work, we propose a skeleton-based algorithm to classify segmented human-action sequences. Our contribution is twofold. First, we propose and evaluate different trajectory descriptors on skeleton datasets. Six short-term trajectory features based on position, speed, or acceleration are introduced first. The last descriptor is the most original, since it extends the well-known bag-of-words approach to a bag-of-gestures approach for the 3D positions of articulations. All these descriptors are evaluated on two public databases with state-of-the-art machine learning algorithms. The second contribution is to measure the influence of missing data on skeleton-based algorithms. Indeed, skeleton extraction algorithms commonly fail on real sequences with side or back views and very complex postures. On such real data, we therefore compare image-based recognition methods with skeleton-based methods that must cope with a large amount of missing data.
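As an illustration of the kind of descriptors discussed above, the following is a minimal sketch, not the implementation evaluated in this work. It assumes a skeleton sequence stored as an array of shape (frames, joints, 3), approximates speed and acceleration by finite differences, and builds a bag-of-words style histogram by assigning each descriptor to its nearest codeword. The function names `short_term_features` and `bag_of_gestures`, the time step `dt`, and the pre-computed `codebook` are hypothetical choices made for this example only.

```python
import numpy as np


def short_term_features(joints, dt=1.0):
    """Finite-difference position, speed, and acceleration (illustrative only).

    joints: array of shape (T, J, 3) holding 3D joint positions per frame.
    dt: assumed constant time step between frames.
    """
    joints = np.asarray(joints, dtype=float)
    position = joints                           # shape (T, J, 3)
    speed = np.diff(joints, axis=0) / dt        # shape (T - 1, J, 3)
    acceleration = np.diff(speed, axis=0) / dt  # shape (T - 2, J, 3)
    return position, speed, acceleration


def bag_of_gestures(descriptors, codebook):
    """L1-normalized histogram of nearest-codeword assignments.

    descriptors: array of shape (N, D), one local descriptor per row.
    codebook: array of shape (K, D), assumed to be learned beforehand
              (for example by clustering training descriptors).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Pairwise Euclidean distances between descriptors and codewords: (N, K).
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

In this sketch the resulting histogram would be the fixed-length vector passed to a classifier. How missing joints are handled is deliberately left out, since the abstract does not specify it.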
