Study of context influence on classifiers trained under different video-document representations

Content-based video retrieval continues to challenge the research community, and the performance of video retrieval systems remains low because of the semantic gap. In this paper we consider whether taking advantage of context can aid the video retrieval process by making relevance easier to predict. The premise is that if a classification system finds it easier to predict the relevance of a video shot under a given context, then that context also has the potential to improve retrieval, since the underlying features better differentiate relevant from non-relevant video shots. We use an operational definition of context, under which a dataset can be split into disjoint sub-collections, each reflecting a particular context. The contexts considered include task difficulty and user expertise, among others. In the classification process, four main types of features are used to represent video shots: conventional low-level visual features representing physical properties of the video shots, behavioral features based on user interaction with the video shots, and two different bag-of-words features obtained by applying Automatic Speech Recognition to the audio of the video. We first measure how well each kind of video representation performs; then, for each representation, we split our datasets into different contexts to discover which contexts affect the performance of a number of trained classifiers. We thus aim to discover contexts that improve classifier performance, and to determine whether this improvement is consistent regardless of the kind of representation. Experimental results show which of the tested document representations works best for each type of feature; following on from this, we identify the contexts under which the classifiers perform better.
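The experimental design described above (split a dataset into disjoint context sub-collections, then compare classifier performance per context and per representation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy shots, the field names (`task`, `visual`, `behavioral`, `relevant`), the nearest-centroid classifier, and the leave-one-out evaluation are stand-ins, not the features, classifiers, or protocol used in the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: each shot carries a context value ("task"),
# two illustrative representations, and a relevance label.
SHOTS = [
    {"task": "easy", "visual": [0.9, 0.1], "behavioral": [5.0], "relevant": True},
    {"task": "easy", "visual": [0.8, 0.2], "behavioral": [4.0], "relevant": True},
    {"task": "easy", "visual": [0.1, 0.9], "behavioral": [1.0], "relevant": False},
    {"task": "easy", "visual": [0.2, 0.8], "behavioral": [0.5], "relevant": False},
    {"task": "hard", "visual": [0.5, 0.5], "behavioral": [3.0], "relevant": True},
    {"task": "hard", "visual": [0.4, 0.6], "behavioral": [2.5], "relevant": False},
    {"task": "hard", "visual": [0.6, 0.4], "behavioral": [0.4], "relevant": False},
    {"task": "hard", "visual": [0.45, 0.55], "behavioral": [2.8], "relevant": True},
]


def split_by_context(shots, context_key):
    """Partition shots into disjoint sub-collections, one per context value."""
    groups = defaultdict(list)
    for shot in shots:
        groups[shot[context_key]].append(shot)
    return dict(groups)


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_nearest_centroid(shots, feature_key):
    """Stand-in classifier: one mean feature vector per relevance label."""
    by_label = defaultdict(list)
    for shot in shots:
        by_label[shot["relevant"]].append(shot[feature_key])
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}


def predict(model, vector):
    """Return the label whose centroid is closest to the vector."""
    return min(model, key=lambda label: sq_dist(model[label], vector))


def leave_one_out_accuracy(shots, feature_key):
    """Hold out each shot in turn, train on the rest, count correct predictions."""
    hits = 0
    for i, held_out in enumerate(shots):
        model = train_nearest_centroid(shots[:i] + shots[i + 1:], feature_key)
        hits += predict(model, held_out[feature_key]) == held_out["relevant"]
    return hits / len(shots)


def evaluate_contexts(shots, context_key, feature_keys):
    """Accuracy for every (representation, context) pair."""
    results = {}
    for context, subset in split_by_context(shots, context_key).items():
        for feature_key in feature_keys:
            results[(feature_key, context)] = leave_one_out_accuracy(subset, feature_key)
    return results


if __name__ == "__main__":
    for key, acc in sorted(evaluate_contexts(SHOTS, "task", ["visual", "behavioral"]).items()):
        print(key, round(acc, 2))
```

Comparing the per-context accuracies for a given representation is the kind of evidence the study looks for: a context in which relevance is consistently easier to predict across representations is a candidate for improving retrieval.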
