Columbia University/VIREO-CityU/IRIT TRECVID2008 High-Level Feature Extraction and Interactive Video Search

! A_CU-run6: local feature alone – average fusion of 3 SVM classification results for each concept using various feature representation choices. ! A_CU-run5: linear weighted fusion of A_CU-run6 with two grid-based global features (color moment and wavelet texture). ! A_CU-run4: linear weighted fusion of A_CU-run5 with a SVM classification result using detection scores of CU-VIREO374 as features. ! C_CU-run3: linear weighted fusion of A_CU-run4 with a SVM classification result using web images. ! A_CU-run2: re-rank the results of “two_people” and “singing” from A_CU-run4 with concept-specific detectors. ! C_CU-run1: linear weighted fusion of A_CU-run2 with a SVM classification result using web images.

[1]  Liang Gu,et al.  Robust singing detection in speech/music discriminator design , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[2]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[3]  Jiri Matas,et al.  Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[4]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[5]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[6]  Julien Pinquier,et al.  A fusion study in speech / music classification , 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).

[7]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[8]  James M. Rehg,et al.  Statistical Color Models with Application to Skin Detection , 2004, International Journal of Computer Vision.

[9]  G. Jaffré,et al.  Costume: a new feature for automatic video content indexing , 2004 .

[10]  Cordelia Schmid,et al.  Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.

[11]  Katsuhiko Shirai,et al.  Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals , 2005, INTERSPEECH.

[12]  Cordelia Schmid,et al.  A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.

[13]  Shih-Fu Chang,et al.  Automatic discovery of query-class-dependent models for multimodal search , 2005, MULTIMEDIA '05.

[14]  Shih-Fu Chang,et al.  Video search reranking via information bottleneck principle , 2006, MM '06.

[15]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[16]  Dong Xu,et al.  Columbia University TRECVID-2006 Video Search and High-Level Feature Extraction , 2006, TRECVID.

[17]  Rong Yan,et al.  Exploring the Synergy of Humans and Machines in Extreme Video Retrieval , 2006, CIVR.

[18]  Bernhard Schölkopf,et al.  Correcting Sample Selection Bias by Unlabeled Data , 2006, NIPS.

[19]  Alexander G. Hauptmann,et al.  LSCOM Lexicon Definitions and Annotations (Version 1.0) , 2006 .

[20]  Cordelia Schmid,et al.  Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[21]  Rong Yan,et al.  How many high-level concepts will fill the semantic gap in news video retrieval? , 2007, CIVR '07.

[22]  Jun Yang Learning to Adapt Across Multimedia Domains , 2007 .

[23]  Chong-Wah Ngo,et al.  Towards optimal bag-of-features for object categorization and semantic video retrieval , 2007, CIVR '07.

[24]  James J. Jiang A Literature Survey on Domain Adaptation of Statistical Classifiers , 2007 .

[25]  Yu-Gang Jiang,et al.  VIREO-374 : LSCOM Semantic Concept Detectors Using Local Keypoint Features , 2007 .

[26]  Julien Pinquier,et al.  Singing voice characterization for audio indexing , 2007, 2007 15th European Signal Processing Conference.

[27]  Shih-Fu Chang,et al.  Columbia University’s Baseline Detectors for 374 LSCOM Semantic Visual Concepts , 2007 .

[28]  David C. Gibbon,et al.  A Fast, Comprehensive Shot Boundary Determination System , 2007, 2007 IEEE International Conference on Multimedia and Expo.

[29]  Shih-Fu Chang,et al.  CU-VIREO 374 : Fusing Columbia 374 and VIREO 374 for Large Scale Semantic Concept Detection , 2008 .

[30]  Shih-Fu Chang,et al.  CuZero: embracing the frontier of interactive visual search for informed users , 2008, MIR '08.

[31]  Shih-Fu Chang,et al.  Cross-domain learning methods for high-level visual concept classification , 2008, 2008 15th IEEE International Conference on Image Processing.