RBM-based Silhouette Encoding for Human Action Modelling

In this paper we evaluate the use of Restricted Bolzmann Machines (RBM) in the context of learning and recognizing human actions. The features used as basis are binary silhouettes of persons. We test the proposed approach on two datasets of human actions where binary silhouettes are available: ViHASi (synthetic data) and Weizmann (real data). In addition, on Weizmann dataset, we combine features based on optical flow with the associated binary silhouettes. The results show that thanks to the use of RBM-based models, very informative and shorter feature vectors can be obtained for the classification tasks, improving the classification performance.

[1]  Rama Chellappa,et al.  Machine Recognition of Human Activities: A Survey , 2008, IEEE Transactions on Circuits and Systems for Video Technology.

[2]  Eli Shechtman,et al.  Space-Time Behavior-Based Correlation-OR-How to Tell If Two Underlying Motion Fields Are Similar Without Computing Them? , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Serge J. Belongie,et al.  Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

[4]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[5]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[6]  Adrian Hilton,et al.  A survey of advances in vision-based human motion capture and analysis , 2006, Comput. Vis. Image Underst..

[7]  Greg Mori,et al.  Action recognition by learning mid-level motion features , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Rong Yan,et al.  Harmonium Models for Video Classification , 2008, Stat. Anal. Data Min..

[9]  Ronen Basri,et al.  Actions as space-time shapes , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[10]  Yoshua. Bengio,et al.  Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..

[11]  Tim J. Ellis,et al.  ViHASi: Virtual human action silhouette data for the performance evaluation of silhouette-based action recognition methods , 2008, 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras.

[12]  Manuel J. Marín-Jiménez,et al.  Learning action descriptors for recognition , 2009, 2009 10th Workshop on Image Analysis for Multimedia Interactive Services.

[13]  Geoffrey E. Hinton Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.