Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing

Millions of mobile apps are available in app stores such as Apple's App Store and Google Play. It is increasingly challenging for a mobile app to stand out from its enormous number of competitors and become popular among users. Good user experience and well-designed functionality are the keys to a successful app, and to achieve them, popular apps usually release updates frequently. If the critical app issues faced by users can be captured in a timely and accurate manner, developers can update their apps promptly and preserve a good user experience. Prior studies have analyzed user reviews to detect emerging app issues, usually with topic modeling or clustering techniques. However, they have not considered two characteristics of user reviews: their short length and their sentiment. In this paper, we propose a novel emerging issue detection approach named MERIT that takes these two characteristics into consideration. Specifically, we propose an Adaptive Online Biterm Sentiment-Topic (AOBST) model that jointly models topics and their corresponding sentiments while taking app versions into account. Based on the AOBST model, we infer the topics negatively reflected in the user reviews of one app version and automatically interpret the meaning of these topics with the most relevant phrases and sentences. Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT in identifying emerging app issues, improving the state-of-the-art method by 22.3% in terms of F1-score. In terms of efficiency, MERIT returns results within an acceptable time.
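
To make the "biterm" in AOBST concrete: in biterm topic models for short texts, a biterm is an unordered pair of words that co-occur in the same short document, which compensates for the sparsity of individual short reviews. The following is a minimal sketch of biterm extraction only, not the AOBST model itself; the function name `extract_biterms`, the whitespace tokenizer, and the stopword handling are illustrative assumptions rather than details from the paper.

```python
from itertools import combinations


def extract_biterms(review, stopwords=frozenset()):
    """Return all unordered word pairs (biterms) from one short review.

    Hypothetical helper for illustration: tokenization is a plain
    lowercase whitespace split, which a real pipeline would replace.
    """
    tokens = [t for t in review.lower().split() if t not in stopwords]
    # Sort each pair so that (a, b) and (b, a) map to the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


biterms = extract_biterms("app crashes after update", stopwords={"after"})
for b in biterms:
    print(b)
```

A joint sentiment-topic model would then assign each such pair a topic and a sentiment label; that inference step is outside the scope of this sketch.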
