Zero-Example Event Search using MultiModal Pseudo Relevance Feedback

We propose a novel method, MultiModal Pseudo Relevance Feedback (MMPRF), for event search in video that requires no search examples from the user. Pseudo Relevance Feedback has shown great potential in retrieval tasks, but previous work is limited to unimodal tasks with only a single ranked list. To tackle event search, which is inherently multimodal, MMPRF takes advantage of multiple modalities and multiple ranked lists to enhance search performance in a principled way. The approach is unique in that it leverages not only semantic features but also non-semantic low-level features for event search in the absence of training data. Evaluated on the TRECVID MEDTest dataset, the approach improves over the baseline by up to 158% in terms of mean average precision. It also contributes significantly to CMU Team's final submission in TRECVID-13 Multimedia Event Detection.
