Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization

In web search, users' queries are typically formulated with only a few terms, so term-matching retrieval functions can fail to retrieve relevant documents. Given a user query, query expansion (QE) consists of selecting related terms that could increase the likelihood of retrieving relevant documents. Selecting such expansion terms is challenging and requires a computational framework capable of encoding complex semantic relationships. In this paper, we propose a novel method for learning, in a supervised way, semantic representations for words and phrases. By embedding queries and documents in special matrices, our model has greater representational power than existing approaches that adopt a vector representation. We show that our model produces high-quality query expansion terms. Our expansions improve IR measures beyond expansions from current word-embedding models and well-established traditional QE methods.
