Dublin City University Video Track Experiments for TREC 2002

Dublin City University participated in the Feature Extraction task and the Search task of the TREC-2002 Video Track. In the Feature Extraction task, we submitted results for 3 features: Face, Speech, and Music. In the Search task, we developed an interactive video retrieval system that incorporated the 40 hours of the video search test collection and supported searching with our own feature extraction data, together with the feature data and ASR transcript donated by other Video Track groups. The system allows a user to specify a query based on the 10 features and the ASR transcript, and returns a ranked list of videos that can be browsed further at the shot level. To evaluate the usefulness of feature-based querying, we developed a second system interface that provides only ASR transcript-based querying, and we conducted an experiment with 12 test users to compare the 2 systems. Results were submitted to NIST, and we are currently conducting further analysis of user performance with the 2 systems.
