Video Search and High-Level Feature Extraction

We participated in two TRECVID tasks in 2006: “High-Level Feature Extraction” and the automatic type of “Search”. Below we summarize our approaches, compare the performance of individual components, and present the insights drawn from that analysis. (Draft for TRECVID Workshop 2006, Nov. 01 2006)

2 Task: High-Level Feature Extraction

In TRECVID 2006, we explore several novel approaches to help detect high-level concepts: the context-based concept fusion method [13], which incorporates inter-conceptual relationships to help detect individual concepts; the lexicon-spatial pyramid matching method, which adopts pyramid matching with local features; event detection with Earth Mover’s distance, which uses information from multiple frames within one shot for concept detection; and text features from speech recognition and machine translation, which enhance visual concept detection. In the end, we find that each of these components provides a significant performance improvement for at least some, if not all, concepts.
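To make the idea of pyramid matching with local features more concrete, the following is a minimal sketch of the standard spatial pyramid match kernel from [11]. It is not the lexicon-spatial variant used in this work, whose details are not given here. The function names, the `(x, y, word)` feature format with coordinates normalized to [0, 1), and the default of two pyramid levels are all assumptions made for illustration.

```python
from collections import Counter


def cell_histogram(features, level):
    """Count visual words per grid cell at one pyramid level.

    features: iterable of (x, y, word), with x and y assumed to lie in [0, 1).
    Level l splits the image into 2**l by 2**l cells.
    """
    cells = 2 ** level
    hist = Counter()
    for x, y, word in features:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        cx = min(int(x * cells), cells - 1)
        cy = min(int(y * cells), cells - 1)
        hist[(cx, cy, word)] += 1
    return hist


def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(count, h2[key]) for key, count in h1.items())


def spatial_pyramid_kernel(feats_a, feats_b, levels=2):
    """Weighted sum of histogram intersections over pyramid levels.

    Level 0 gets weight 1 / 2**levels; level l >= 1 gets
    1 / 2**(levels - l + 1), so finer grids count more.
    """
    total = 0.0
    for level in range(levels + 1):
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        total += weight * intersection(
            cell_histogram(feats_a, level), cell_histogram(feats_b, level)
        )
    return total


# Two toy "images", each with two quantized local features (hypothetical data).
image_a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
image_b = [(0.1, 0.1, 0), (0.1, 0.9, 1)]
similarity = spatial_pyramid_kernel(image_a, image_b)
```

Because the level weights sum to one, an image matched against itself scores its own feature count, while features that share a visual word but fall in different cells contribute only at the coarser levels. A kernel of this kind would typically be passed to an SVM as a precomputed kernel matrix.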

[1] Shih-Fu Chang, et al. Detecting image near-duplicate by stochastic attributed relational graph matching with learning, 2004, MULTIMEDIA '04.

[2] Milind R. Naphade, et al. Learning the semantics of multimedia queries and concepts from a small number of examples, 2005, MULTIMEDIA '05.

[3] Shih-Fu Chang, et al. Columbia-IBM News Video Story Segmentation in TRECVID 2004, 2005.

[4] Shih-Fu Chang, et al. Learning Random Attributed Relational Graph for Part-based Object Detection, 2005.

[5] Shih-Fu Chang, et al. Automatic discovery of query-class-dependent models for multimodal search, 2005, MULTIMEDIA '05.

[6] Shih-Fu Chang, et al. Visual Cue Cluster Construction via Information Bottleneck Principle and Kernel Density Estimation, 2005, CIVR.

[7] David C. Gibbon, et al. AT&T Research at TRECVID 2006, 2006, TRECVID.

[8] Shih-Fu Chang, et al. Video search reranking via information bottleneck principle, 2006, MM '06.

[9] Winston H. Hsu, et al. Brief Descriptions of Visual Features for Baseline TRECVID Concept Detectors, 2006.

[10] Marcel Worring, et al. The challenge problem for automated detection of 101 semantic concepts in multimedia, 2006, MM '06.

[11] Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[12] Dong Xu, et al. Columbia University TRECVID-2006 Video Search and High-Level Feature Extraction, 2006, TRECVID.

[13] Shih-Fu Chang, et al. Active Context-Based Concept Fusion with Partial User Labels, 2006, 2006 International Conference on Image Processing.

[14] Winston H. Hsu, et al. Anchor Shot Detection in TRECVID-2005 Broadcast News Videos, 2006.