Microsoft Research Asia TRECVID 2006 High-Level Feature Extraction and Rushes Exploitation

In this paper, we describe the MSRA experiments for TRECVID 2006, including details of our approaches and performance analyses for the high-level feature extraction and rushes exploitation tasks. For high-level feature extraction, we mainly investigated the benefit of unlabeled data through semi-supervised learning methods, including adaptive semi-supervised learning with kernel density estimation, manifold ranking, and transductive graph. Moreover, we performed fusion at two levels: the modality level and the model level. We ranked in the top 10 among all participants in terms of mean average precision. For rushes exploitation, we detected duplicate content based on an ordinal video signature. We also performed video structuring (i.e., decomposing rushes into shots and sub-shots) and camera motion classification (i.e., classifying each sub-shot as static, pan, tilt, zoom, rotation, or object motion). Furthermore, we validated our approaches to concept modeling, detecting 39 concepts on the rushes data without re-training the visual models obtained in the high-level feature extraction task.
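
To make the idea of an ordinal signature concrete, the following is a minimal sketch of a generic block-rank ordinal measure for a single grayscale frame. It is an illustration under stated assumptions, not the exact signature used in our runs: the 3x3 grid, the L1 rank distance, and the function names `block_means`, `ordinal_signature`, and `signature_distance` are all hypothetical choices made for this example.

```python
def block_means(frame, grid=3):
    """Mean intensity of each cell in a grid x grid partition of a 2-D frame.

    `frame` is a list of rows of numeric intensities; it is assumed to have
    at least `grid` rows and `grid` columns.
    """
    h, w = len(frame), len(frame[0])
    means = []
    for by in range(grid):
        for bx in range(grid):
            y0, y1 = by * h // grid, (by + 1) * h // grid
            x0, x1 = bx * w // grid, (bx + 1) * w // grid
            cells = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(cells) / len(cells))
    return means


def ordinal_signature(frame, grid=3):
    """Rank of each block's mean intensity (0 = darkest block)."""
    means = block_means(frame, grid)
    order = sorted(range(len(means)), key=lambda i: means[i])
    ranks = [0] * len(means)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def signature_distance(sig_a, sig_b):
    """L1 distance between two rank vectors (assumed dissimilarity measure)."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))


# Toy frames: a gradient, a uniformly brightened copy, and a mirrored copy.
frame = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[v + 5 for v in row] for row in frame]
flipped = [row[::-1] for row in frame[::-1]]

sig = ordinal_signature(frame)
dist_brighter = signature_distance(sig, ordinal_signature(brighter))
dist_flipped = signature_distance(sig, ordinal_signature(flipped))
```

Because only the rank order of the block means is kept, a uniform brightness shift leaves the signature unchanged, so `dist_brighter` is zero while `dist_flipped` is large. A threshold on this distance, aggregated over frames, is one plausible way to flag candidate duplicates.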