Action classification using a discriminative multilevel HDP-HMM

We classify human actions occurring in depth image sequences using features based on skeletal joint positions. Each action class is represented by a multi-level Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM). Because the HDP-HMM is non-parametric, the number of hidden states is inferred automatically from the training data. The model parameters of each class are formulated as transformations of a shared base distribution, which allows unlabelled examples to be used during training and lets information be shared across action classes. Further, the parameters are learnt discriminatively: we use a normalized gamma process representation of the HDP together with margin-based likelihood functions. We sample parameters from the complex posterior distribution induced by this discriminative likelihood using elliptical slice sampling. Experiments on two datasets show that action class models learnt with our technique produce good classification results.
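To make the sampling step concrete, the following is a minimal sketch of a generic elliptical slice sampling update, not the authors' implementation. It assumes a zero-mean Gaussian prior on the parameter vector and an arbitrary log-likelihood; the function names (`elliptical_slice_step`, `hinge_log_lik`) and the two-dimensional toy margin likelihood are hypothetical and stand in for the HDP-HMM-specific quantities, which the abstract does not spell out.

```python
import numpy as np


def elliptical_slice_step(f, nu, log_lik, rng):
    """One elliptical slice sampling update.

    f       : current state, assumed to have a zero-mean Gaussian prior
    nu      : a fresh draw from that same prior
    log_lik : callable returning the log-likelihood of a state
    rng     : numpy random Generator
    """
    # Log of the slice height: log(u) with u ~ U(0, 1) equals -Exp(1).
    log_y = log_lik(f) - rng.exponential()
    # Initial angle and bracket on the ellipse through f and nu.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    lo, hi = theta - 2.0 * np.pi, theta
    while True:
        proposal = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(proposal) > log_y:
            return proposal
        # Shrink the bracket towards theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)


def hinge_log_lik(f, margin=1.0):
    """Toy margin-based pseudo log-likelihood (an assumption for illustration):
    penalise states where the 'correct' score f[0] does not beat f[1] by `margin`."""
    return -max(0.0, margin - (f[0] - f[1]))


rng = np.random.default_rng(0)
f = np.zeros(2)
samples = []
for _ in range(2000):
    # Standard normal prior draw; a real model would use its own prior covariance.
    f = elliptical_slice_step(f, rng.standard_normal(2), hinge_log_lik, rng)
    samples.append(f)
samples = np.array(samples)
```

The update has no step size to tune and every iteration returns an accepted state, which is one reason it is convenient when a non-conjugate likelihood, such as a margin-based one, is combined with a Gaussian prior.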
