Using Topic Information to Improve Non-exact Keyword-Based Search for Mobile Applications

Given the wide range of mobile applications available today, effective search engines are essential for a user to find applications that provide a specific desired functionality. Retrieval approaches that leverage topic similarity between queries and applications have shown promising results in previous studies. However, the search engines used by most app stores are based on keyword matching and boosting. In this paper, we explore ways to include topic information in such approaches in order to improve their ability to retrieve relevant applications for non-exact queries without impairing their computational performance. More specifically, we create topic models specialized in application descriptions and explore how the most relevant terms for each topic covered by an application can be used to complement the information provided by its description. Our experiments show that, although these topic keywords cannot convey all the information in the topic model, they provide a sufficiently informative summary of the topics covered by the descriptions, leading to improved retrieval performance.
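As an illustration of the idea of complementing a description with topic keywords, the following is a minimal sketch and not the implementation evaluated in the paper. The toy topic-term weights, the per-application topic distribution, the threshold `min_prob`, the number of terms `k`, and the `boost` value are all hypothetical; in practice the weights would come from a topic model trained on application descriptions, and the scoring would be done by the keyword-based search engine itself.

```python
from collections import Counter

# Hypothetical toy topic model: topic id -> {term: weight}.
# Assumption: real weights would come from a topic model (e.g. LDA)
# trained on application descriptions.
TOPIC_TERMS = {
    0: {"photo": 0.30, "camera": 0.25, "filter": 0.20, "edit": 0.15, "share": 0.10},
    1: {"run": 0.30, "workout": 0.25, "fitness": 0.20, "calories": 0.15, "track": 0.10},
}


def top_terms(topic_id, k=3):
    """Return the k highest-weighted terms of a topic."""
    terms = TOPIC_TERMS[topic_id]
    return sorted(terms, key=terms.get, reverse=True)[:k]


def topic_keywords(doc_topics, min_prob=0.2, k=3):
    """Collect top terms of every topic the application covers.

    doc_topics maps topic id -> probability for one application.
    min_prob and k are illustrative parameters, not values from the paper.
    """
    keywords = []
    for topic_id, prob in sorted(doc_topics.items()):
        if prob >= min_prob:
            keywords.extend(top_terms(topic_id, k))
    return keywords


def keyword_score(query, description, keywords, boost=0.5):
    """Simplified keyword-matching score with a boosted topic-keyword field.

    Each query term counts its occurrences in the description and adds
    a fixed boost if it also appears among the topic keywords.
    """
    desc_counts = Counter(description.lower().split())
    keyword_set = set(keywords)
    score = 0.0
    for term in query.lower().split():
        score += desc_counts[term]
        if term in keyword_set:
            score += boost
    return score


description = "Jogging companion that logs your routes"
doc_topics = {0: 0.05, 1: 0.95}  # hypothetical topic distribution for this app
keywords = topic_keywords(doc_topics)

query = "fitness workout"
score_plain = keyword_score(query, description, [])
score_augmented = keyword_score(query, description, keywords)
```

In this toy case the query shares no words with the description, so plain keyword matching scores zero, while the topic keywords let the application match. The topic keywords are computed once per application and stored as an extra field, so no topic inference is needed at query time.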
