Mobile App Retrieval for Social Media Users via Inference of Implicit Intent in Social Media Text

People often express their needs on social media, implicitly or explicitly, in the form of "user status text". Such text can help service providers and product manufacturers proactively offer relevant services or products that satisfy people's immediate needs. In this paper, we study how to infer a user's intent from the user's status text and retrieve relevant mobile apps that may satisfy the user's needs. We frame this problem as a new entity retrieval task in which the query is a user's status text and the entities to be retrieved are mobile apps. We propose a novel approach that generates a new representation for each query. Our key idea is to leverage social media to build parallel corpora that contain implicit intention text and the corresponding explicit intention text. Specifically, we model various user intentions in social media text using topic models, and we predict the user intention in a query that contains implicit intention. We then retrieve relevant mobile apps using the predicted user intention. We evaluate the mobile app retrieval task on a new data set that we created. Experimental results indicate that the proposed model is effective and outperforms state-of-the-art retrieval models.
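The overall pipeline (parallel corpus, intent prediction, app retrieval) can be illustrated with a minimal sketch. This is not the model proposed in the paper: as a simplifying assumption it replaces the topic models with plain word co-occurrence counts over the parallel corpus, and it ranks apps by query likelihood with Dirichlet smoothing. The corpus, the app descriptions, the stopword list, and all function names are hypothetical and exist only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy parallel corpus: (implicit intention text, explicit intention text).
PARALLEL = [
    ("i am so hungry right now", "i want to order food delivery"),
    ("starving and nothing in the fridge", "i need food delivery tonight"),
    ("my legs hurt after sitting all day", "i want a workout plan"),
    ("feeling out of shape lately", "i need a fitness workout tracker"),
]

# Hypothetical app descriptions standing in for the entities to retrieve.
APPS = {
    "FoodNow": "order food delivery from local restaurants",
    "FitTrack": "fitness workout tracker with a daily workout plan",
    "NoteIt": "take notes and make lists",
}

STOP = {"i", "a", "to", "the", "in", "and", "so", "am", "my", "of",
        "with", "from", "all", "after", "now", "right"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOP]


def build_translation_table(pairs):
    """Count how often each explicit word co-occurs with each implicit word.

    Assumption: co-occurrence counts stand in for the paper's topic models.
    """
    table = defaultdict(Counter)
    for implicit, explicit in pairs:
        for iw in set(tokens(implicit)):
            for ew in set(tokens(explicit)):
                table[iw][ew] += 1
    return table


def predict_intent(query, table):
    """Return a normalized distribution over explicit-intent words for a query."""
    scores = Counter()
    for iw in tokens(query):
        row = table.get(iw, {})
        total = sum(row.values())
        for ew, count in row.items():
            scores[ew] += count / total
    norm = sum(scores.values())
    return {w: s / norm for w, s in scores.items()} if norm else {}


def rank_apps(intent, apps, mu=10.0):
    """Rank apps by expected log query likelihood under Dirichlet smoothing."""
    docs = {name: Counter(tokens(desc)) for name, desc in apps.items()}
    collection = Counter()
    for counts in docs.values():
        collection.update(counts)
    coll_len = sum(collection.values())
    ranked = []
    for name, counts in docs.items():
        doc_len = sum(counts.values())
        score = 0.0
        for word, weight in intent.items():
            # Add-one smoothing keeps words unseen in every app from giving log(0).
            p_coll = (collection[word] + 1) / (coll_len + len(collection) + 1)
            p_doc = (counts[word] + mu * p_coll) / (doc_len + mu)
            score += weight * math.log(p_doc)
        ranked.append((score, name))
    return [name for _, name in sorted(ranked, reverse=True)]


if __name__ == "__main__":
    table = build_translation_table(PARALLEL)
    print(rank_apps(predict_intent("so hungry", table), APPS))
```

On this toy data, the status text "so hungry" shares no words with any app description, yet the predicted explicit-intent words ("order", "food", "delivery") place the food-delivery app first. That is the vocabulary gap the new query representation is meant to bridge.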
