ITI-CERTH participation to TRECVID 2009 HLFE and Search

This paper provides an overview of the tasks submitted to TRECVID 2009 by ITI-CERTH. ITI-CERTH participated in the high-level feature extraction (HLFE) task and the search task. In the high-level feature extraction task, techniques are developed that combine motion information with existing well-performing descriptors such as SIFT and Bag-of-Words for shot representation. In a separate run, the use of compressed video information to form a Bag-of-Words model for shot representation is studied. The search task is based on an interactive retrieval application that combines retrieval functionalities in various modalities (i.e., textual, visual and concept search) with a user interface supporting interactive search over all submitted queries. Evaluation results for the runs submitted to this task allow a comparison of the involved retrieval functionalities, as well as of the strategies used in interactive video search.
