Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search

Reranking has been a focal technique in multimedia retrieval due to its efficacy in improving initial retrieval results. Current reranking methods, however, mainly rely on the heuristic weighting. In this paper, we propose a novel reranking approach called Self-Paced Reranking (SPaR) for multimodal data. As its name suggests, SPaR utilizes samples from easy to more complex ones in a self-paced fashion. SPaR is special in that it has a concise mathematical objective to optimize and useful properties that can be theoretically verified. It on one hand offers a unified framework providing theoretical justifications for current reranking methods, and on the other hand generates a spectrum of new reranking schemes. This paper also advances the state-of-the-art self-paced learning research which potentially benefits applications in other fields. Experimental results validate the efficacy and the efficiency of the proposed method on both image and video search tasks. Notably, SPaR achieves by far the best result on the challenging TRECVID multimedia event search task.

[1]  Jorge Nocedal,et al.  Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization , 1997, TOMS.

[2]  Kathrin Klamroth,et al.  Biconvex sets and optimization with biconvex functions: a survey and extensions , 2007, Math. Methods Oper. Res..

[3]  John D. Lafferty,et al.  A study of smoothing methods for language models applied to Ad Hoc information retrieval , 2001, SIGIR '01.

[4]  R Core Team,et al.  R: A language and environment for statistical computing. , 2014 .

[5]  Qi Tian,et al.  Learning to judge image search results , 2011, MM '11.

[6]  Shih-Fu Chang,et al.  Video search reranking via information bottleneck principle , 2006, MM '06.

[7]  Tao Mei,et al.  Learning to video search rerank via pseudo preference feedback , 2008, 2008 IEEE International Conference on Multimedia and Expo.

[8]  Maarten de Rijke,et al.  Content-Based Analysis Improves Audiovisual Archive Retrieval , 2012, IEEE Transactions on Multimedia.

[9]  Deva Ramanan,et al.  Self-Paced Learning for Long-Term Tracking , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Jason Weston,et al.  Curriculum learning , 2009, ICML '09.

[11]  Cees Snoek,et al.  Composite Concept Discovery for Zero-Shot Video Event Detection , 2014, ICMR.

[12]  Meng Wang,et al.  Harvesting visual concepts for image search with complex queries , 2012, ACM Multimedia.

[13]  Dong Liu,et al.  Event-Driven Semantic Concept Discovery by Exploiting Weakly Tagged Internet Images , 2014, ICMR.

[14]  Teruko Mitamura,et al.  Zero-Example Event Search using MultiModal Pseudo Relevance Feedback , 2014, ICMR.

[15]  Hui Cheng,et al.  Semantic pooling for complex event detection , 2013, MM '13.

[16]  Yang Gao,et al.  Self-paced dictionary learning for image classification , 2012, ACM Multimedia.

[17]  Qiang Wu,et al.  Adapting boosting for information retrieval measures , 2010, Information Retrieval.

[18]  Samy Bengio,et al.  A Discriminative Kernel-Based Approach to Rank Images from Text Queries , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[20]  P. Tseng Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization , 2001 .

[21]  Fei-Fei Li,et al.  Shifting Weights: Adapting Object Detectors from Image to Video , 2012, NIPS.

[22]  Nicu Sebe,et al.  We are not equally negative: fine-grained labeling for multimedia event detection , 2013, ACM Multimedia.

[23]  Thorsten Joachims,et al.  A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization , 1997, ICML.

[24]  Deyu Meng,et al.  Towards Efficient Learning of Optimal Spatial Bag-of-Words Representations , 2014, ICMR.

[25]  Florian Metze,et al.  CMU-Informedia @ TRECVID 2013 Multimedia Event Detection , 2013 .

[26]  Frédéric Jurie,et al.  Improving web image search results using query-relative classifiers , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[27]  Jingdong Wang,et al.  Robust visual reranking via sparsity and ranking constraints , 2011, ACM Multimedia.

[28]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[29]  Shih-Fu Chang How far we've come: Impact of 20 years of multimedia information retrieval , 2013, TOMCCAP.

[30]  Xian-Sheng Hua,et al.  Bayesian video search reranking , 2008, ACM Multimedia.

[31]  Rong Yan,et al.  Multimedia Search with Pseudo-relevance Feedback , 2003, CIVR.

[32]  Daphne Koller,et al.  Learning specific-class segmentation from diverse data , 2011, 2011 International Conference on Computer Vision.

[33]  Masoud Mazloom,et al.  Querying for video events by semantic signatures from few examples , 2013, MM '13.

[34]  Rong Yan,et al.  Video Retrieval Based on Semantic Concepts , 2008, Proceedings of the IEEE.

[35]  Cordelia Schmid,et al.  Action recognition by dense trajectories , 2011, CVPR 2011.

[36]  Chong-Wah Ngo,et al.  Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.

[37]  Daphne Koller,et al.  Self-Paced Learning for Latent Variable Models , 2010, NIPS.

[38]  Alexander G. Hauptmann,et al.  Leveraging high-level and low-level features for multimedia event detection , 2012, ACM Multimedia.

[39]  Alan Hanjalic,et al.  Supervised reranking for web image search , 2010, ACM Multimedia.

[40]  Andrew Zisserman,et al.  Efficient additive kernels via explicit feature maps , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[41]  Shih-Fu Chang,et al.  Video search reranking through random walk over document-level context graph , 2007, ACM Multimedia.