Columbia University’s Baseline Detectors for 374 LSCOM Semantic Visual Concepts

emantic concept detection represents a key requirement in accessing large collections of digital mages/videos. Automatic detection of presence of a large number of semantic concepts, such as person,” or “waterfront,” or “explosion”, allows intuitive indexing and retrieval of visual content t the semantic level. Development of effective concept detectors and systematic evaluation ethods has become an active research topic in recent years. For example, a major video retrieval enchmarking event, NIST TRECVID[1], has contributed to this emerging area through (1) the rovision of large sets of common data and (2) the organization of common benchmark tasks to erform over this data.

[1]  John R. Smith,et al.  IBM Research TRECVID-2009 Video Retrieval System , 2009, TRECVID.

[2]  Shih-Fu Chang,et al.  Context-Based Concept Fusion with Boosted Conditional Random Fields , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[3]  Dong Xu,et al.  Columbia University TRECVID-2006 Video Search and High-Level Feature Extraction , 2006, TRECVID.

[4]  Christian Petersohn Fraunhofer HHI at TRECVID 2004: Shot Boundary Detection System , 2004, TRECVID.

[5]  Shih-Fu Chang,et al.  A reranking approach for context-based concept fusion in video indexing and retrieval , 2007, CIVR '07.

[6]  John R. Smith,et al.  Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.

[7]  Paul Over,et al.  TREC video retrieval evaluation TRECVID 2010 (slides) , 2009 .

[8]  Marcel Worring,et al.  The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.

[9]  Alexander G. Hauptmann,et al.  LSCOM Lexicon Definitions and Annotations (Version 1.0) , 2006 .

[10]  Winston H. Hsu,et al.  Brief Descriptions of Visual Features for Baseline TRECVID Concept Detectors , 2006 .

[11]  John Platt,et al.  Probabilistic Outputs for Support vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[12]  John R. Smith,et al.  Normalized classifier fusion for semantic visual concept detection , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[13]  Dong Xu,et al.  Visual Event Recognition in News Video using Kernel Methods with Multi-Level Temporal Alignment , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.