Feature seeding for action recognition

Progress in action recognition has been in large part due to advances in the features that drive learning-based methods. However, the relative sparsity of training data and the risk of overfitting have made it difficult to directly search for good features. In this paper we suggest using synthetic data to search for robust features that can more easily take advantage of limited data, rather than using the synthetic data directly as a substitute for real data. We demonstrate that the features discovered by our selection method, which we call seeding, improve performance on an action classification task on real data, even though the synthetic data from which the features are seeded differs significantly from the real data, both in terms of appearance and the set of action classes.

[1]  Ying Wu,et al.  Discriminative subvolume search for efficient action detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Demetri Terzopoulos,et al.  Surveillance in Virtual Reality: System Design and Multi-Camera Control , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Ivor W. Tsang,et al.  Visual Event Recognition in Videos by Learning from Web Data , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Isabelle Guyon,et al.  An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..

[5]  David D. Cox,et al.  A High-Throughput Screening Approach to Discovering Good Forms of Biologically Inspired Visual Representation , 2009, PLoS Comput. Biol..

[6]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2006, BMVC.

[7]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[8]  Serge J. Belongie,et al.  Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

[9]  Padraig Cunningham,et al.  Overfitting in Wrapper-Based Feature Subset Selection: The Harder You Try the Worse it Gets , 2004, SGAI Conf..

[10]  Christopher Joseph Pal,et al.  Activity recognition using the velocity histories of tracked keypoints , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[11]  Martial Hebert,et al.  Representing Pairwise Spatial and Temporal Relations for Action Recognition , 2010, ECCV.

[12]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[13]  Fei-FeiLi,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2008 .

[14]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[15]  Terry Windeatt,et al.  Feature Ranking Ensembles for Facial Action Unit Classification , 2008, ANNPR.

[16]  Gustavo Carneiro The automatic design of feature spaces for local image descriptors using an ensemble of non-linear feature extractors , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[18]  Raghu Machiraju,et al.  Human Activity Recognition for Synthesis , 2006 .

[19]  Xiaogang Wang,et al.  Boosted multi-task learning for face verification with applications to web image and video search , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Qiang Yang,et al.  Boosting for transfer learning , 2007, ICML '07.

[21]  Cordelia Schmid,et al.  Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[23]  David Elliott,et al.  In the Wild , 2010 .

[24]  Dieter Fox,et al.  Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation , 2010, Int. J. Robotics Res..

[25]  Barbara Caputo,et al.  Recognizing human actions: a local SVM approach , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[26]  Zicheng Liu,et al.  Cross-dataset action detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[27]  Yihong Gong,et al.  Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks , 2008, ECCV.

[28]  Jason Weston,et al.  Large Scale Transductive SVMs , 2006, J. Mach. Learn. Res..

[29]  Jiebo Luo,et al.  Recognizing realistic actions from videos “in the wild” , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Koby Crammer,et al.  Analysis of Representations for Domain Adaptation , 2006, NIPS.