Utilizing Related Samples to Enhance Interactive Concept-Based Video Search

One of the main challenges in interactive concept-based video search is the scarcity of relevant samples, especially for queries with complex semantics. In this paper, we exploit "related samples" to enhance interactive video search. Related samples are video segments that are relevant to part of the query rather than to the entire query. Whereas relevant samples may be rare, related samples are usually plentiful and easy to find in search results. Generally, related samples are visually similar and temporally adjacent to relevant samples. Based on these two characteristics, we develop a visual ranking model that simultaneously exploits relevant, related, and irrelevant samples, as well as a temporal ranking model that leverages the temporal relationship between related and relevant samples. An adaptive fusion method is then proposed to optimally combine the two ranking models when generating search results. We conduct extensive experiments on two real-world video datasets, TRECVID 2008 and YouTube. The experimental results show that our approach achieves performance improvements of at least 96% and 167% over state-of-the-art approaches on the TRECVID 2008 and YouTube datasets, respectively.
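To make the fusion step concrete, below is a minimal sketch in Python/NumPy, assuming each candidate shot already carries one score from the visual ranking model and one from the temporal ranking model. The function names (fuse_rankings, adaptive_weight), the min-max normalization, and the grid search that selects the weight by how highly it ranks the user-labeled relevant shots are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def fuse_rankings(visual_scores, temporal_scores, weight):
    """Linearly fuse per-shot scores from the two ranking models.

    Scores are min-max normalized first so the two models are
    comparable; `weight` trades off visual vs. temporal evidence.
    (Hypothetical fusion form, not the paper's exact method.)
    """
    v = (visual_scores - visual_scores.min()) / (np.ptp(visual_scores) + 1e-12)
    t = (temporal_scores - temporal_scores.min()) / (np.ptp(temporal_scores) + 1e-12)
    return weight * v + (1.0 - weight) * t

def adaptive_weight(visual_scores, temporal_scores, relevant_idx,
                    candidates=np.linspace(0.0, 1.0, 11)):
    """Pick, per query, the fusion weight that ranks the shots the
    user has already labeled relevant as highly as possible."""
    def mean_rank(scores):
        order = np.argsort(-scores)            # descending: best shot first
        ranks = np.empty(len(scores), dtype=int)
        ranks[order] = np.arange(len(scores))  # rank position of each shot
        return ranks[relevant_idx].mean()      # lower is better
    return min(candidates,
               key=lambda w: mean_rank(fuse_rankings(visual_scores,
                                                     temporal_scores, w)))

# Usage: scores for 6 candidate shots; shots 0 and 2 were marked relevant.
visual = np.array([0.9, 0.2, 0.7, 0.1, 0.4, 0.3])
temporal = np.array([0.5, 0.1, 0.8, 0.2, 0.1, 0.6])
w = adaptive_weight(visual, temporal, relevant_idx=np.array([0, 2]))
ranking = np.argsort(-fuse_rankings(visual, temporal, w))
print(w, ranking)
```

Selecting the weight per query, rather than fixing it globally, reflects the adaptive character of the fusion: how much the temporal model helps depends on how strongly related samples cluster around the relevant ones for that particular query.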
