Human Action Recognition Using Supervised pLSA

Probabilistic latent semantic analysis (pLSA) has been widely used for human action recognition from video sequences. However, a major disadvantage of pLSA and its extensions is that the category labels of the training samples are not fully used in the model learning procedure for classification tasks. In this paper, a supervised pLSA (spLSA) model is proposed to overcome this drawback. By adding an observable category variable to the generative process of classic pLSA, spLSA is endowed with more discriminative power. The model thus provides a unified framework for semantic analysis and object classification, in which topic formation is guided toward more discriminative topics and the mapping between the topics and the action categories is described in a fully probabilistic manner. Experimental results show that spLSA substantially outperforms pLSA and achieves comparable or better performance than supervised models based on latent Dirichlet allocation and other state-of-the-art methods.
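
As a rough illustration of the idea, the first line below is the standard pLSA factorization of a document d and a visual word w over latent topics z. The second line is only one plausible way an observed category variable c could be tied to the topics; it is an assumption made here for illustration, since the abstract does not give the actual spLSA formulation.

```latex
% Classic pLSA: joint probability of document d and word w over latent topics z
P(d, w) = P(d) \sum_{z} P(w \mid z)\, P(z \mid d)

% Assumed for illustration only (not taken from the paper):
% a probabilistic mapping from topics to action categories c
P(c \mid d) = \sum_{z} P(c \mid z)\, P(z \mid d)
```

Under this assumed form, a test video would be assigned the category c that maximizes P(c | d) once its topic mixture P(z | d) has been estimated.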
