Relevance feedback for real-world human action retrieval

Content-based video retrieval is an increasingly popular research field, in large part because of the rapidly growing catalogue of multimedia data available online. Although a large portion of this data concerns humans, retrieval of human actions has received relatively little attention. This paper presents a video retrieval system that can perform content-based queries on a large database of videos very efficiently. Furthermore, it shows that ABRS-SVM (asymmetric bagging and random subspace support vector machine), a technique for incorporating relevance feedback (RF) on the search results, makes it possible to quickly achieve useful results even for very complex human action queries, such as those found in Hollywood movies.
