A Boosted Co-Training Algorithm for Human Action Recognition

This paper proposes a boosted co-training algorithm for human action recognition. To address the view-sufficiency and view-dependency issues in co-training, two new confidence measures, namely inter-view confidence and intra-view confidence, are proposed and dynamically fused into a semi-supervised learning process. Mutual information is employed to quantify the inter-view uncertainty and to measure the independence between the respective views. Intra-view confidence is estimated from boosted hypotheses to measure the total data inconsistency between labeled and unlabeled data. Given a small set of labeled videos and a large set of unlabeled videos, the proposed semi-supervised learning algorithm trains a classifier by maximizing the inter-view and intra-view confidences while dynamically incorporating unlabeled data into the labeled set. To evaluate the proposed boosted co-training algorithm, eigen-action and information saliency feature vectors are employed as the two input views. On the KTH and Weizmann human action databases, average recognition accuracies of 93.2% and 99.6% are obtained, respectively.
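The two ingredients described above can be sketched in miniature. The snippet below is a simplified illustration, not the paper's actual method: `mutual_information` computes the empirical mutual information (in bits) between two discrete sequences, which could quantify the dependence between the predictions of the two views (lower values indicate more independent views, the regime co-training theory favors); `co_train` is a toy co-training loop in which agreement between the two view classifiers stands in for the paper's fused inter-view/intra-view confidence, and agreed-upon unlabeled samples are promoted into the labeled set each round. The nearest-centroid learner and all names here are hypothetical stand-ins.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (bits) between two discrete sequences.

    A stand-in for the paper's inter-view dependence measure: xs and ys
    would be per-sample predictions (or quantized features) from the views.
    """
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def train_nearest_centroid(xs, ys):
    """Tiny 1-D nearest-centroid learner used only to make the loop runnable."""
    cents = {c: sum(x for x, yy in zip(xs, ys) if yy == c) /
                sum(1 for yy in ys if yy == c)
             for c in set(ys)}
    return lambda x: min(cents, key=lambda c: abs(x - cents[c]))

def co_train(view1, view2, labels, unlabeled1, unlabeled2, train, rounds=3):
    """Toy co-training loop: each round, both view classifiers label the
    unlabeled pool; samples on which the views agree are pseudo-labeled and
    moved into the labeled set (a crude proxy for confidence fusion)."""
    L1, L2, y = list(view1), list(view2), list(labels)
    U1, U2 = list(unlabeled1), list(unlabeled2)
    for _ in range(rounds):
        if not U1:
            break
        h1, h2 = train(L1, y), train(L2, y)
        keep1, keep2 = [], []
        for x1, x2 in zip(U1, U2):
            p1, p2 = h1(x1), h2(x2)
            if p1 == p2:  # views agree: promote with the shared pseudo-label
                L1.append(x1); L2.append(x2); y.append(p1)
            else:         # disagreement: keep the sample in the unlabeled pool
                keep1.append(x1); keep2.append(x2)
        U1, U2 = keep1, keep2
    return train(L1, y), train(L2, y)
```

In the paper's setting the agreement test would be replaced by the dynamically fused inter-view and intra-view confidences, and the base learners by boosted classifiers over eigen-action and information-saliency features.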
