Collective Media Annotation using Undirected Random Field Models

We present methods for semantic annotation of multimedia data. The goal is to detect semantic attributes (also referred to as concepts) in video clips by analyzing a single keyframe or a set of frames. The proposed methods integrate high-performance discriminative single-concept detectors into a random field model for collective detection of multiple concepts. Furthermore, we describe a generic framework for semantic media classification that can capture arbitrarily complex dependencies between the semantic concepts. Finally, we present initial experimental results comparing the proposed approach to existing methods.
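To make the idea of collective detection concrete, the following is a minimal sketch, not the method of the paper. It assumes binary concept labels, one log-odds score per concept from an independent detector, and a pairwise weight that rewards two concepts being present together. The function name `collective_marginals`, the form of the potentials, and the brute-force enumeration are all illustrative assumptions; enumeration is only feasible for a handful of concepts, and a real system would likely need approximate inference.

```python
import itertools
import math


def collective_marginals(unary_logits, pairwise):
    """Exact marginals P(y_i = 1) in a small pairwise random field.

    unary_logits: per-concept log-odds from independent detectors (assumed input).
    pairwise: dict {(i, j): w}; w is added to the score when y_i = y_j = 1
              (an assumed, illustrative form of the interaction potential).
    """
    n = len(unary_logits)
    scores = {}
    # Enumerate every joint labeling; cost grows as 2**n, so keep n small.
    for labels in itertools.product((0, 1), repeat=n):
        s = sum(u * y for u, y in zip(unary_logits, labels))
        s += sum(w * labels[i] * labels[j] for (i, j), w in pairwise.items())
        scores[labels] = s
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    marginals = [0.0] * n
    for labels, s in scores.items():
        p = math.exp(s - top) / z
        for i, y in enumerate(labels):
            if y:
                marginals[i] += p
    return marginals


# Hypothetical scores: concept 0 is detected confidently, concept 1 weakly.
independent = collective_marginals([2.0, -0.5], {})
coupled = collective_marginals([2.0, -0.5], {(0, 1): 1.5})
```

With no pairwise terms the marginals reduce to the sigmoid of each detector score. With a positive coupling, the confident detection of concept 0 raises the marginal of the weakly detected concept 1, which is the kind of effect that motivates modeling concepts jointly.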
