Learning structured concept-segments for interactive video retrieval

With a large lexicon of over 300 semantic concepts now available for indexing, video retrieval can be made easier by leveraging the resulting semantic indices. However, any successful concept-based video retrieval approach must account for the following: although they continue to improve, the concept indexing results are still far from perfect, and many concepts remain undetected because of the limited amount of annotated training data. Moreover, a structured query formulation is preferable to a simple logical AND over a few chosen concepts for modeling complex information needs with a fixed concept lexicon. In this paper, we propose a concept-based interactive video retrieval approach to tackle these problems. To better represent the query information need, the proposed approach learns from relevance feedback a structured formulation consisting of multiple semantic concept combination terms. Instead of taking the top-ranked items from the selected concepts, it uses a simple mining algorithm to drill down to concept-segments in which positive examples are more densely populated than negative ones. We evaluate the proposed method on the large-scale TRECVid 2005 and 2006 data sets and achieve promising results. Retrieval at the concept-segment level yields a 14% improvement over the concept level. The structured query formulation improves performance by around 13% compared with a simple logical AND formulation. The learning and retrieval process takes only 300 ms, meeting the real-time requirement of interactive search.

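The abstract only outlines the mining step, so the following Python sketch is an interpretation rather than the paper's exact algorithm. It assumes each concept's detection scores are binned into fixed-width "segments", keeps segments where user-marked positive shots outnumber negatives by a smoothed ratio, and ranks shots by the summed weights of the segments they fall in. The bin granularity, the density measure, and all function and variable names are illustrative assumptions.

from collections import defaultdict

def mine_concept_segments(scores, positives, negatives, n_bins=10, min_ratio=2.0):
    """Sketch of concept-segment mining (assumed formulation, not the paper's).

    scores:    {concept: {shot_id: detection score in [0, 1]}}
    positives: set of shot ids marked relevant in feedback
    negatives: set of shot ids marked non-relevant in feedback
    Returns a list of (concept, bin_index, weight) segments.
    """
    segments = []
    for concept, shot_scores in scores.items():
        pos_counts = [0] * n_bins
        neg_counts = [0] * n_bins
        # histogram the feedback shots over score bins of this concept
        for shot, s in shot_scores.items():
            b = min(int(s * n_bins), n_bins - 1)
            if shot in positives:
                pos_counts[b] += 1
            elif shot in negatives:
                neg_counts[b] += 1
        for b in range(n_bins):
            # keep segments where positives clearly outnumber negatives
            density = (pos_counts[b] + 1) / (neg_counts[b] + 1)  # smoothed ratio
            if pos_counts[b] > 0 and density >= min_ratio:
                segments.append((concept, b, density))
    return segments

def rank_shots(scores, segments, n_bins=10):
    """Score each shot by the summed weights of the mined segments it falls in."""
    shot_weights = defaultdict(float)
    for concept, b, weight in segments:
        for shot, s in scores[concept].items():
            if min(int(s * n_bins), n_bins - 1) == b:
                shot_weights[shot] += weight
    return sorted(shot_weights.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    # toy example: two concepts, four shots, one round of feedback
    scores = {
        "sports":  {"s1": 0.91, "s2": 0.85, "s3": 0.40, "s4": 0.10},
        "outdoor": {"s1": 0.75, "s2": 0.20, "s3": 0.80, "s4": 0.95},
    }
    segs = mine_concept_segments(scores, positives={"s1", "s2"}, negatives={"s4"})
    print(rank_shots(scores, segs))

The mined segments act as the "concept combination terms" of the structured formulation: each contributes a weighted vote rather than a hard AND constraint, which is one plausible way to read the abstract's contrast with simple logical conjunction.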