Mining User Queries with Markov Chains: Application to Online Image Retrieval

We propose a novel method for the automatic annotation, indexing, and annotation-based retrieval of images. The new method, which we call Markovian Semantic Indexing (MSI), is presented in the context of an online image retrieval system. In such a system, the users' queries are used to construct an Aggregate Markov Chain (AMC), through which the relevance between the keywords seen by the system is defined. The users' queries are also used to automatically annotate the images. A stochastic distance between images, based on their annotations and the keyword relevance captured in the AMC, is then introduced. Geometric interpretations of the proposed distance are provided, and its relation to a clustering in the keyword space is investigated. By means of a new measure of Markovian state similarity, the mean first cross passage time (CPT), optimality properties of the proposed distance are proved. Images are modeled as points in a vector space, and their similarity is measured with MSI. The new method is shown to possess certain theoretical advantages and to achieve better precision versus recall results than Latent Semantic Indexing (LSI) and probabilistic Latent Semantic Indexing (pLSI) in Annotation-Based Image Retrieval (ABIR) tasks.
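To make the idea of building a Markov chain from user queries concrete, the following is a minimal sketch and not the construction used in the paper, which the abstract does not specify. It assumes that each query is an ordered list of keywords, that a transition is counted between consecutive keywords of the same query, and that counts are row-normalized into transition probabilities. The function name `build_transition_matrix`, the self-loop rule for keywords with no outgoing transition, and the sample queries are all hypothetical.

```python
from collections import defaultdict


def build_transition_matrix(queries):
    """Estimate keyword-to-keyword transition probabilities from queries.

    Assumption (not from the paper): a transition a -> b is counted whenever
    keyword b immediately follows keyword a inside one query.
    """
    counts = defaultdict(lambda: defaultdict(float))
    vocabulary = set()
    for query in queries:
        vocabulary.update(query)
        # Count each pair of consecutive keywords in the query.
        for current, following in zip(query, query[1:]):
            counts[current][following] += 1.0

    matrix = {}
    for keyword in sorted(vocabulary):
        total = sum(counts[keyword].values())
        if total == 0:
            # Hypothetical choice: a keyword that is never followed by another
            # gets a self-loop so that its row is still a valid distribution.
            matrix[keyword] = {keyword: 1.0}
        else:
            matrix[keyword] = {
                target: count / total
                for target, count in counts[keyword].items()
            }
    return matrix


# Illustrative queries, invented for this sketch.
sample_queries = [
    ["sunset", "beach"],
    ["beach", "palm", "sunset"],
    ["sunset", "beach", "sea"],
]
transition = build_transition_matrix(sample_queries)
for source, row in transition.items():
    print(source, row)
```

Each row of the resulting dictionary is a probability distribution over the keywords that follow the row's keyword. A relevance measure between keywords, or a distance between annotated images, would be defined on top of such a chain; the abstract does not give those definitions, so they are left out here.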
