Latent topics-based relevance feedback for video retrieval

This paper presents a novel content-based video retrieval approach that uses latent topics to address the semantic gap. First, a supervised topic model is proposed to recast the classical retrieval problem as a class discovery problem. Next, a new probabilistic ranking function is derived from that model to bridge the semantic gap between low-level features and high-level concepts. Finally, a short-term relevance feedback scheme is defined in which queries can be initialised with samples from inside or outside the database. Several retrieval simulations were carried out using three databases and seven different ranking functions to test the performance of the presented approach. The experiments show that the proposed ranking function provides a competitive advantage within the content-based retrieval field.

Highlights: The retrieval problem is addressed as a class discovery problem by means of latent topics. A new ranking function is presented to deal with the semantic gap challenge. A retrieval framework is defined using the proposed ranking function. The results show the effectiveness of the proposed approach.
