Semantic representation: search and mining of multimedia content

Semantic understanding of multimedia content is critical to enabling effective access to all forms of digital media data. By making large media repositories searchable, semantic content descriptions greatly enhance the value of such data. Automatic semantic understanding is a very challenging problem, however, and most media databases resort to describing content in terms of low-level features or manually ascribed annotations. Recent techniques focus on detecting semantic concepts in video, such as indoor, outdoor, face, people, and nature. This approach works for a fixed lexicon for which annotated training examples exist. In this paper we consider the problem of using such semantic concept detection to map video clips into semantic spaces. This is done by constructing a model vector that acts as a compact semantic representation of the underlying content. We then present experiments in the semantic spaces that leverage this information for enhanced semantic retrieval, classification, visualization, and data mining. We evaluate these ideas on a large video corpus and demonstrate significant performance gains in retrieval effectiveness.
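
As a rough illustration of the model-vector idea, the sketch below arranges per-concept detector confidences in a fixed lexicon order and ranks clips by distance in the resulting semantic space. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names (model_vector, distance, retrieve), the confidence range [0, 1], the Euclidean metric, and all scores are hypothetical; only the five concept names come from the abstract.

```python
import math

# Fixed lexicon; the concept names are the examples given in the abstract.
LEXICON = ["indoor", "outdoor", "face", "people", "nature"]


def model_vector(scores, lexicon=LEXICON):
    """Arrange detector confidences in lexicon order.

    `scores` maps concept name -> confidence (assumed to lie in [0, 1]).
    Concepts with no score default to 0.0 (an assumption of this sketch).
    """
    return [float(scores.get(concept, 0.0)) for concept in lexicon]


def distance(u, v):
    """Euclidean distance between two model vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query_vec, corpus, k=2):
    """Return the ids of the k clips whose model vectors are nearest the query."""
    ranked = sorted(corpus, key=lambda clip_id: distance(query_vec, corpus[clip_id]))
    return ranked[:k]


# Made-up detector outputs for three hypothetical clips.
clips = {
    "clip_a": model_vector({"indoor": 0.9, "face": 0.8, "people": 0.7}),
    "clip_b": model_vector({"outdoor": 0.9, "nature": 0.8}),
    "clip_c": model_vector({"outdoor": 0.7, "people": 0.6, "nature": 0.5}),
}

# Query for outdoor nature footage, expressed in the same semantic space.
query = model_vector({"outdoor": 1.0, "nature": 1.0})
top = retrieve(query, clips, k=2)
print(top)
```

Because every clip is reduced to one short vector with a dimension per lexicon concept, the same representation could in principle be fed to a classifier, a clustering routine, or a low-dimensional projection for visualization.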
