Concept detectors: how good is good enough?

Today, semantic concept based video retrieval systems often show insufficient performance for real-life applications. Clearly, a big share of the reason is the lacking performance of the detectors of these concepts. While concept detectors are on their endeavor to improve, following important questions need to be addressed: "How good do detectors need to be to produce usable search systems?" and "How does the detector performance influence different concept combination methods?". We use Monte Carlo Simulations to provide answers to the above questions. The main contribution of this paper is a probabilistic model of detectors which outputs confidence scores to indicate the likelihood of a concept to occur. This score is also converted into a posterior probability and a binary classification. We investigate the influence of changes to the model's parameters on the performance of multiple concept combination methods. Current web search engines produce a mean average precision (MAP) of around 0.20. Our simulation reveals that the best performing video search method achieve this performance using detectors with 0.60 MAP and is therefore usable in real-life. Furthermore, perfect detection allows the best performing combination method to produce 0.39 search MAP in an artificial environment with Oracle settings. We also find that MAP is not necessarily a good evaluation measure for concept detectors since it is not always correlated with search performance.

[1]  Rong Yan,et al.  The combination limit in multimedia retrieval , 2003, MULTIMEDIA '03.

[2]  Rong Yan,et al.  Can High-Level Concepts Fill the Semantic Gap in Video Retrieval? A Case Study With Broadcast News , 2007, IEEE Transactions on Multimedia.

[3]  Stephen E. Robertson,et al.  Probabilistic models of indexing and searching , 1980, SIGIR '80.

[4]  Paul Over,et al.  Evaluation campaigns and TRECVid , 2006, MIR '06.

[5]  Christoph Arndt,et al.  Information Measures: Information and its Description in Science and Engineering , 2001 .

[6]  Dennis Koelma,et al.  The MediaMill TRECVID 2008 Semantic Video Search Engine , 2008, TRECVID.

[7]  John R. Smith,et al.  Large-scale concept ontology for multimedia , 2006, IEEE MultiMedia.

[8]  Anawach Sangswang,et al.  Justification of a stochastic model for a DC-DC boost converter , 2003, IECON'03. 29th Annual Conference of the IEEE Industrial Electronics Society (IEEE Cat. No.03CH37468).

[9]  F. A. Seiler,et al.  Numerical Recipes in C: The Art of Scientific Computing , 1989 .

[10]  Djoerd Hiemstra,et al.  Reusing annotation labor for concept selection , 2009, CIVR '09.

[11]  Jun Yang,et al.  (Un)Reliability of video concept detection , 2008, CIVR '08.

[12]  Rong Yan,et al.  Probabilistic models for combining diverse knowledge sources in multimedia retrieval , 2006 .

[13]  Djoerd Hiemstra,et al.  A probabilistic ranking framework using unobservable binary events for video search , 2008, CIVR '08.

[14]  Rong Yan,et al.  Semantic concept-based query expansion and re-ranking for multimedia retrieval , 2007, ACM Multimedia.

[15]  M. de Rijke,et al.  UvA-DARE ( Digital Academic Repository ) The MediaMill TRECVID 2008 semantic video search engine , 2008 .

[16]  Javed A. Aslam,et al.  Models for metasearch , 2001, SIGIR '01.

[17]  John A. Bather,et al.  Decision Theory: An Introduction to Dynamic Programming and Sequential Decisions , 2000 .

[18]  Marcel Worring,et al.  The challenge problem for automated detection of 101 semantic concepts in multimedia , 2006, MM '06.

[19]  Bo Zhang,et al.  Using High-Level Semantic Features in Video Retrieval , 2006, CIVR.

[20]  Paul Over,et al.  High-level feature detection from video in TRECVid: a 5-year retrospective of achievements , 2009 .

[21]  N. Metropolis,et al.  The Monte Carlo method. , 1949 .

[22]  Robert Tibshirani,et al.  Classification by Pairwise Coupling , 1997, NIPS.

[23]  John R. Kender,et al.  Visual concepts for news story tracking: analyzing and exploiting the NIST TRESVID video annotation experiment , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[24]  Rong Yan,et al.  A review of text and image retrieval approaches for broadcast news video , 2007, Information Retrieval.

[25]  D. J. White,et al.  Decision Theory , 2018, Behavioral Finance for Private Banking.

[26]  S. Robertson The probability ranking principle in IR , 1997 .

[27]  Hsuan-Tien Lin,et al.  A note on Platt’s probabilistic outputs for support vector machines , 2007, Machine Learning.

[28]  John Platt,et al.  Probabilistic Outputs for Support vector Machines and Comparisons to Regularized Likelihood Methods , 1999 .

[29]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.