Online quality assessment of human motion from skeleton data

This work addresses the challenge of analysing the quality of human movements from visual information, which is useful in a broad range of applications, from diagnosis and rehabilitation to movement optimisation in sports science. Traditionally, such assessment is performed as a binary classification between normal and abnormal, by comparison against models of both normal and abnormal movement, e.g. [5]. Since a single model of abnormal movement cannot encompass the variety of abnormalities, another class of methods compares against a model of normal movement only, e.g. [4]. We adopt this latter strategy and propose a continuous assessment of movement quality, rather than a binary classification, by quantifying the deviation from a normal model. In addition, while most methods can only analyse a movement after its completion, e.g. [6], our assessment is performed on a frame-by-frame basis in order to allow a fast system response in case of an emergency, such as a fall.
Methods such as [4, 6] are specific to one type of movement, mostly because of the features they use. In this work, we aim to represent a large variety of movements by exploiting full-body information. We use a depth camera and a skeleton tracker [3] to obtain the positions of the main joints of the body, as seen in Fig. 1. We normalise this skeleton to remove the effects of the global position and orientation of the camera and of the varying height of the subjects, e.g. using Procrustes analysis. The normalised skeletons have high dimensionality and tend to contain outliers. We therefore reduce the dimensionality using Diffusion Maps [1], modified to include the extension that Gerber et al. [2] proposed for handling outliers in Laplacian Eigenmaps. The resulting high-level feature vector Y, obtained from the normalised skeleton at one frame, represents an individual pose and is used to build a statistical model of normal movement.
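As an illustration of the two preprocessing steps just described, the following is a minimal sketch in Python with NumPy, not the implementation used in this work. It assumes each skeleton is a (J, 3) array of joint positions and that a reference pose is available for alignment. The function names, the median heuristic for the kernel width and the parameters n_components and t are hypothetical choices. The embedding is the standard Diffusion Maps construction and does not include the outlier-handling extension of Gerber et al.

```python
import numpy as np


def procrustes_align(skeleton, reference):
    """Align a (J, 3) array of joint positions to a (J, 3) reference pose."""
    # Translation: centre both joint sets on their centroids.
    x = skeleton - skeleton.mean(axis=0)
    r = reference - reference.mean(axis=0)
    # Scale: divide by the Frobenius norm to factor out subject size.
    x = x / np.linalg.norm(x)
    r = r / np.linalg.norm(r)
    # Rotation: orthogonal Procrustes solution via SVD, with a sign
    # correction so the result is a proper rotation (no reflection).
    u, _, vt = np.linalg.svd(x.T @ r)
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    return x @ rotation


def diffusion_map(poses, n_components=3, sigma=None, t=1):
    """Embed (N, D) pose vectors into n_components diffusion coordinates."""
    poses = np.asarray(poses, dtype=float)
    # Pairwise squared Euclidean distances between all poses.
    sq = ((poses[:, None, :] - poses[None, :, :]) ** 2).sum(axis=-1)
    if sigma is None:
        # Heuristic kernel width (an assumption): median non-zero distance.
        sigma = np.sqrt(np.median(sq[sq > 0]))
    w = np.exp(-sq / (2.0 * sigma ** 2))
    deg = w.sum(axis=1)
    # Symmetric matrix sharing its eigenvalues with the Markov matrix D^-1 W.
    s = w / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(s)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Convert to right eigenvectors of the Markov matrix.
    psi = vecs / np.sqrt(deg)[:, None]
    # Skip the trivial constant eigenvector; weight the rest by eigenvalue^t.
    keep = slice(1, n_components + 1)
    return psi[:, keep] * vals[keep] ** t
```

In such a sketch, each tracked skeleton would be aligned to the reference with procrustes_align, flattened to a vector of length 3J, and the stacked vectors passed to diffusion_map to obtain one low-dimensional pose feature per frame.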
Our statistical model is made up of two components that describe the normal poses and the normal dynamics of the movement. The pose model takes the form of the probability density function (pdf) f_Y(y) of a random variable Y whose value y is our pose feature vector. The pdf is learnt from all the frames of training sequences that contain normal instances of the movement, using a Parzen window estimator. The quality of a new pose y_t at frame t is then assessed as its log-likelihood under the pose model, i.e. log f_Y(y_t).
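The pose model can be illustrated with a short Python sketch of a Parzen window estimate evaluated at a single pose. The isotropic Gaussian kernel, the bandwidth parameter and the function name are assumptions made for the example; the text above specifies only that a Parzen window estimator is used.

```python
import numpy as np


def pose_log_likelihood(train_poses, y_t, bandwidth):
    """Log of a Gaussian Parzen window estimate of f_Y at the pose y_t."""
    x = np.asarray(train_poses, dtype=float)  # (N, D) normal training poses
    y_t = np.asarray(y_t, dtype=float)        # (D,) pose feature at frame t
    n, d = x.shape
    # Log of an isotropic Gaussian kernel centred on each training pose.
    sq = ((x - y_t) ** 2).sum(axis=1)
    log_k = (-0.5 * sq / bandwidth ** 2
             - 0.5 * d * np.log(2.0 * np.pi * bandwidth ** 2))
    # log(mean(exp(log_k))), computed stably with the log-sum-exp trick.
    m = log_k.max()
    return m + np.log(np.exp(log_k - m).sum()) - np.log(n)
```

Calling this function once per incoming frame gives a frame-by-frame score: poses close to the normal training poses receive a higher log-likelihood than poses far from them.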

[1] Toby Sharp, et al. Real-time human pose recognition in parts from single depth images, 2011, CVPR.

[2] Ross T. Whitaker, et al. Robust non-linear dimensionality reduction using successive 1-dimensional Laplacian Eigenmaps, 2007, ICML '07.

[3] Luc Van Gool, et al. Exploiting simple hierarchies for unsupervised human behavior analysis, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4] Alex Mihailidis, et al. 3D Human Motion Analysis to Detect Abnormal Events on Stairs, 2012, 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission.

[5] Alexandros André Chaaraoui, et al. A review on vision techniques applied to Human Behaviour Analysis for Ambient-Assisted Living, 2012, Expert Syst. Appl.

[6] John A. Templer, et al. The Staircase: History and Theories; The Staircase: Studies of Hazards, Falls, and Safer Design, 1994.

[7] B. Nadler, et al. Diffusion maps, spectral clustering and reaction coordinates of dynamical systems, 2005, math/0503445.

[8] Ann B. Lee, et al. Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps, 2005, Proceedings of the National Academy of Sciences of the United States of America.

[9] Stéphane Lafon, et al. Diffusion maps, 2006.

[10] Guillermo Sapiro, et al. Connecting the Out-of-Sample and Pre-Image Problems in Kernel Methods, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Jesse Hoey, et al. Automated Detection of Unusual Events on Stairs, 2006, The 3rd Canadian Conference on Computer and Robot Vision (CRV'06).

[12] Tae-Seong Kim, et al. Depth video-based gait recognition for smart home using local directional pattern features and hidden Markov model, 2014.

[13] Gérard G. Medioni, et al. Dynamic Manifold Warping for view invariant action recognition, 2011, 2011 International Conference on Computer Vision.

[14] Alfred O. Hero, et al. Robust object pose estimation via statistical manifold modeling, 2011, 2011 International Conference on Computer Vision.

[15] Jessica K. Hodgins, et al. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion, 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16] Kejun Wang, et al. Video-Based Abnormal Human Behavior Recognition: A Review, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[17] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[18] Gérard G. Medioni, et al. Home Monitoring Musculo-skeletal Disorders with a Single 3D Sensor, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.