Paper information - Refining video annotation by exploiting pairwise concurrent relation

Refining video annotation by exploiting pairwise concurrent relation

Video annotation is a promising and essential step for content-based video search and retrieval. Most state-of-the-art video annotation approaches detect multiple semantic concepts in isolation, neglecting the fact that video concepts are usually semantically correlated. In this paper, we propose to refine video annotation by leveraging the pairwise concurrent relation among video concepts. This concurrent relation is explicitly modeled by a concurrent matrix, and a propagation strategy is then adopted to refine the annotations. By iteratively spreading the scores of all related concepts to each other, the detection results become stable and approach the optimum. In contrast with existing concept fusion methods, the proposed approach is computationally more efficient and easier to implement, since it does not require constructing any contextual model. Furthermore, we show its intuitive connection with the PageRank algorithm. We conduct experiments on the TRECVID 2005 corpus and report superior performance compared to existing key approaches.
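The abstract does not give the update rule, so the following is only a minimal sketch of what PageRank-style score propagation over a concurrent matrix could look like. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `refine_scores`, the damping weight `alpha`, the row normalization, and the choice to give isolated concepts a self-loop so that they keep their initial score.

```python
import numpy as np


def refine_scores(initial_scores, concurrence, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iteratively spread detector scores between co-occurring concepts.

    Hypothetical update (an assumption, not the paper's formula):
        s <- alpha * W @ s + (1 - alpha) * s0
    where W is the row-normalized concurrent matrix and s0 holds the
    initial, independently detected scores for one video shot.
    """
    s0 = np.asarray(initial_scores, dtype=float)
    C = np.array(concurrence, dtype=float)  # copy, so the caller's matrix is untouched

    # Concepts with no co-occurring partner get a self-loop, so they
    # simply keep their initial score (illustrative design choice).
    isolated = np.flatnonzero(C.sum(axis=1) == 0)
    C[isolated, isolated] = 1.0

    # Row-normalize so each concept receives a weighted average of its neighbours.
    W = C / C.sum(axis=1, keepdims=True)

    s = s0.copy()
    for _ in range(max_iter):
        s_next = alpha * (W @ s) + (1 - alpha) * s0
        if np.abs(s_next - s).max() < tol:  # scores have stabilized
            return s_next
        s = s_next
    return s


# Toy example: concepts 0 and 1 co-occur strongly, concept 2 is isolated.
initial = [0.9, 0.2, 0.4]
concurrent_matrix = [
    [0, 3, 0],
    [3, 0, 0],
    [0, 0, 0],
]
refined = refine_scores(initial, concurrent_matrix)
```

In this toy setup the weakly detected concept 1 is pulled up by its strongly detected partner, concept 0 is pulled down somewhat, and the isolated concept 2 is left unchanged. With `alpha < 1` and a row-stochastic `W`, the iteration is a contraction, which is what makes the scores settle to a fixed point.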

Tao Mei | Xian-Sheng Hua | Zheng-Jun Zha | Zengfu Wang | Guo-Jun Qi
