K-Space at TRECVid 2006

In this paper we describe the K-Space participation in TRECVid 2006. K-Space participated in two tasks: high-level feature extraction and search. We present our approach to each task and provide a brief analysis of our results. Our high-level feature submission made use of support vector machines (SVMs) created with low-level MPEG-7 visual features, fused with specific concept detectors. Our search submissions included both manual and automatic runs and made use of both low- and high-level features. In the high-level feature extraction submission, four of our six runs achieved performance above the TRECVid median, whilst our search submission performed around the median. The K-Space team consisted of eight partner institutions from the EU-funded K-Space Network, and our submissions made use of tools and techniques from each partner. As such, this paper provides an overview of each partner's contributions and gives appropriate references for detailed descriptions of individual components.
