New grand challenge for multimedia information retrieval: bridging the utility gap

The needs and expectations regarding multimedia content access have grown rapidly with the fast development of multimedia technology and the explosion of multimedia content around us. This has placed high demands on the sophistication of multimedia information retrieval (MIR) solutions. Although the potential to develop MIR technology that meets these demands has also grown rapidly over the years, adequate solutions are not yet available. This paper argues that a significant step forward could become possible if the MIR field moves towards a utility-centered research focus, in which criteria related to utility are deployed to help bridge the critical remaining gap: the utility gap, that is, the gap between the expected and the de facto usefulness of MIR systems. Utility criteria reach beyond the objective relevance of MIR results to also consider their informativeness and how helpful they are for the user's further actions. Bridging the utility gap can therefore be seen as the next grand challenge in the MIR research field. To pursue this challenge, we propose a utility-by-design approach, by which utility is targeted explicitly and embedded deep in the foundations of MIR solutions. The paper first motivates this new MIR grand challenge and positions it with respect to current efforts in the field. It then highlights some possibilities for realizing the utility-by-design approach and translates them into a number of recommended research directions.
