Columbia University TRECVID-2006 Video Search and High-Level Feature Extraction

A_CL1_1: Choose the best-performing classifier for each concept from all of the following runs plus an event detection method.
A_CL2_2 (visual-based): Choose the best-performing visual-based classifier for each concept from runs A_CL4_4, A_CL5_5, and A_CL6_6.
A_CL3_3 (visual + text): Weighted average fusion of the visual-based classifier A_CL4_4 with a text classifier.
A_CL4_4 (visual-based): Average fusion of A_CL5_5 with a lexicon-spatial pyramid matching approach that incorporates local features and pyramid matching.
A_CL5_5 (visual-based): Average fusion of A_CL6_6 with a context-based concept fusion approach that exploits inter-concept relationships to help detect individual concepts.
A_CL6_6 (visual-based): Average fusion of two SVM baseline classification results for each concept: (1) a fused result obtained by averaging three single-SVM results, where each SVM uses one type of color, texture, or edge feature and is trained on the whole training set with 40% of the negative samples; (2) a single SVM using color and texture features, trained on the whole training set (90 videos) with 20% of the negative samples. (A sketch of this late-fusion scheme appears after this list.)
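
The per-feature SVM averaging described for A_CL6_6 can be sketched as follows. This is a minimal illustration assuming scikit-learn SVMs with probability outputs over hypothetical synthetic feature matrices; the actual color/texture/edge keyframe features and training-set sampling used by the Columbia system are not reproduced here.

```python
# Minimal sketch of an A_CL6_6-style late fusion, under assumed synthetic features.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-ins for per-keyframe features of one concept's training set.
n_train, n_test = 200, 50
features = {
    "color":   rng.normal(size=(n_train, 32)),
    "texture": rng.normal(size=(n_train, 16)),
    "edge":    rng.normal(size=(n_train, 8)),
}
labels = rng.integers(0, 2, size=n_train)  # 1 = concept present
test_features = {k: rng.normal(size=(n_test, v.shape[1])) for k, v in features.items()}

def train_svm(x, y):
    """Train one SVM baseline on a single feature type (RBF kernel assumed)."""
    return SVC(kernel="rbf", probability=True).fit(x, y)

# (1) One SVM per feature type; average their posterior scores (late fusion).
per_feature_scores = [
    train_svm(features[name], labels).predict_proba(test_features[name])[:, 1]
    for name in ("color", "texture", "edge")
]
fused_a = np.mean(per_feature_scores, axis=0)

# (2) A single SVM over concatenated color + texture features.
concat_train = np.hstack([features["color"], features["texture"]])
concat_test = np.hstack([test_features["color"], test_features["texture"]])
fused_b = train_svm(concat_train, labels).predict_proba(concat_test)[:, 1]

# Final A_CL6_6-style score: average of the two baseline results.
final_scores = 0.5 * (fused_a + fused_b)
print(final_scores[:5])
```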

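The run-combination steps of A_CL1_1/A_CL2_2 (pick the best run per concept) and A_CL3_3 (weighted visual-text fusion) reduce to the small sketch below. The fusion weight, the validation average-precision criterion, and all variable names are assumptions for illustration, not details taken from the original submission.

```python
# Minimal sketch of per-concept run selection and weighted visual-text fusion.
from typing import Dict
import numpy as np
from sklearn.metrics import average_precision_score

FUSION_WEIGHT = 0.7  # assumed weight on the visual classifier (A_CL3_3-style)

def fuse_visual_text(visual_scores: np.ndarray, text_scores: np.ndarray,
                     w: float = FUSION_WEIGHT) -> np.ndarray:
    """Weighted average late fusion of visual and text classifier scores."""
    return w * visual_scores + (1.0 - w) * text_scores

def pick_best_run(run_scores: Dict[str, np.ndarray],
                  val_labels: np.ndarray) -> str:
    """Return the run with the highest validation average precision for
    one concept (A_CL1_1 / A_CL2_2 style selection)."""
    return max(run_scores,
               key=lambda run: average_precision_score(val_labels, run_scores[run]))

# Toy usage on synthetic validation data for a single concept.
rng = np.random.default_rng(1)
val_labels = rng.integers(0, 2, size=100)
runs = {name: rng.random(100) for name in ("A_CL4_4", "A_CL5_5", "A_CL6_6")}
print("best visual run for this concept:", pick_best_run(runs, val_labels))
print(fuse_visual_text(runs["A_CL4_4"], rng.random(100))[:3])
```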