MSRA-USTC-SJTU at TRECVID 2007: High-Level Feature Extraction and Search

This paper describes the MSRA-USTC-SJTU experiments for TRECVID 2007. We participated in the high-level feature extraction and automatic search tasks. For high-level feature extraction, we investigated three directions: the benefit of unlabeled data, exploited through semi-supervised learning; the multi-layer (ML) multi-instance (MI) relations embedded in video, modeled with an MLMI kernel; and the correlations between concepts, captured by correlative multi-label learning. For automatic search, we fused text, visual-example, and concept-based models, and used temporal consistency and face information for re-ranking and result refinement.
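
The abstract does not say how the text, visual-example, and concept-based models are combined. As an illustration only, the sketch below assumes a simple weighted linear late fusion over min-max normalized scores; the function names, the shot identifiers, and the weights are all hypothetical and are not taken from the paper.

```python
def min_max_normalize(scores):
    """Rescale a dict of shot_id -> raw score into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are identical, so the model gives no ranking signal.
        return {shot: 0.0 for shot in scores}
    return {shot: (s - lo) / (hi - lo) for shot, s in scores.items()}


def fuse_scores(model_scores, weights):
    """Return shot ids ranked by a weighted sum of normalized model scores.

    model_scores: dict of model name -> dict of shot_id -> raw score.
    weights: dict of model name -> non-negative weight (assumed values).
    A shot missing from a model contributes 0 for that model.
    """
    fused = {}
    for name, scores in model_scores.items():
        w = weights[name]
        for shot, s in min_max_normalize(scores).items():
            fused[shot] = fused.get(shot, 0.0) + w * s
    # Sort by descending fused score; break ties by shot id for determinism.
    return sorted(fused, key=lambda shot: (-fused[shot], shot))


# Hypothetical scores from three retrieval models for four shots.
example_scores = {
    "text": {"shot1": 2.0, "shot2": 0.5, "shot3": 1.0},
    "visual": {"shot1": 0.1, "shot2": 0.9, "shot4": 0.4},
    "concept": {"shot2": 0.3, "shot3": 0.8, "shot4": 0.2},
}
example_weights = {"text": 0.5, "visual": 0.3, "concept": 0.2}
ranking = fuse_scores(example_scores, example_weights)
```

A re-ranking step, such as one based on temporal consistency or face information, would then operate on the fused ranking; that step is not sketched here because the abstract gives no detail about it.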
