Paper information - Combining textual and visual information processing for interactive video retrieval: SCHEMA's participation in TRECVID 2004

This paper presents two applications based on the Schema Reference System that the SCHEMA NoE developed for its participation in the search task of TRECVID 2004. The first, "Schema-Text", is an interactive retrieval application that uses only textual information. The second, "Schema-XM", extends the former with algorithms and methods for combining textual, visual and higher-level information. Two runs were submitted for each application: I_A_2_SCHEMA-Text_3 and I_A_2_SCHEMA-Text_4 for Schema-Text, and I_A_2_SCHEMA-XM_1 and I_A_2_SCHEMA-XM_2 for Schema-XM. Comparing the two applications in terms of retrieval efficiency showed that combining information from different data sources can provide higher efficiency for retrieval systems. Experimental testing additionally showed that first performing a text-based query and then running a visual similarity search, with one of the returned relevant keyframes as the example image, is a good scheme for combining visual and textual information.
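The two-stage scheme described in the abstract (text query first, then visual similarity search from a returned keyframe) can be illustrated with a minimal Python sketch. This is not the SCHEMA implementation: the `Shot` record, the term-overlap text score, the two-dimensional feature vectors and the Euclidean distance are all assumptions chosen for brevity, standing in for whatever text index and visual descriptors a real system would use.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Shot:
    shot_id: str
    transcript: str          # hypothetical: text associated with the shot
    features: list[float]    # hypothetical: keyframe visual descriptor


def text_search(shots, query):
    """Rank shots by how many query terms occur in their transcript."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(s.transcript.lower().split())), s) for s in shots]
    hits = [(score, s) for score, s in scored if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in hits]


def visual_search(shots, example, top_k=3):
    """Rank the other shots by Euclidean distance to the example keyframe."""
    others = [s for s in shots if s.shot_id != example.shot_id]
    others.sort(key=lambda s: dist(s.features, example.features))
    return others[:top_k]


# Toy collection (invented data, for illustration only).
shots = [
    Shot("s1", "the president speaks at the podium", [0.9, 0.1]),
    Shot("s2", "weather report for the weekend", [0.1, 0.9]),
    Shot("s3", "", [0.85, 0.15]),  # no transcript, but visually close to s1
]

text_hits = text_search(shots, "president podium")      # stage 1: text query
example = text_hits[0]                                   # user picks a relevant keyframe
similar = visual_search(shots, example, top_k=1)         # stage 2: visual similarity
```

In this toy collection the second stage surfaces a shot that has no transcript at all, which is the kind of result a text-only query cannot reach.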

Noel E. O'Connor | Stephan Herrmann | Vasileios Mezaris | Bart Lehane | Haralambos Doulaverakis
