Mobile App Tagging

Mobile app tagging aims to assign a list of keywords indicating core functionalities, main contents, key features or concepts of a mobile app. Mobile app tags can be potentially useful for app ecosystem stakeholders or other parties to improve app search, browsing, categorization, and advertising, etc. However, most mainstream app markets, e.g., Google Play, Apple App Store, etc., currently do not explicitly support such tags for apps. To address this problem, we propose a novel auto mobile app tagging framework for annotating a given mobile app automatically, which is based on a search-based annotation paradigm powered by machine learning techniques. Specifically, given a novel query app without tags, our proposed framework (i) first explores online kernel learning techniques to retrieve a set of top-N similar apps that are semantically most similar to the query app from a large app repository; and (ii) then mines the text data of both the query app and the top-N similar apps to discover the most relevant tags for annotating the query app. To evaluate the efficacy of our proposed framework, we conduct an extensive set of experiments on a large real-world dataset crawled from Google Play. The encouraging results demonstrate that our technique is effective and promising.

[1]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[2]  Ying He,et al.  Retrieval-Based Face Annotation by Weak Label Regularized Local Coordinate Coding , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Wang-Chien Lee,et al.  App recommendation: a contest between satisfaction and temptation , 2013, WSDM.

[4]  Rong Jin,et al.  Online Multiple Kernel Similarity Learning for Visual Search. , 2013, IEEE transactions on pattern analysis and machine intelligence.

[5]  Ying He,et al.  Mining social images with distance metric learning for automated image tagging , 2011, WSDM '11.

[6]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[7]  Boris Polyak,et al.  Acceleration of stochastic approximation by averaging , 1992 .

[8]  Long Jin,et al.  Semantic Matching in APP Search , 2015, WSDM.

[9]  Chunyan Miao,et al.  Online multimodal deep similarity learning with application to image retrieval , 2013, ACM Multimedia.

[10]  Ying He,et al.  Mining Weakly Labeled Web Facial Images for Search-Based Face Annotation , 2011, IEEE Transactions on Knowledge and Data Engineering.

[11]  Léon Bottou,et al.  Stochastic Gradient Descent Tricks , 2012, Neural Networks: Tricks of the Trade.

[12]  Wei Xu,et al.  Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent , 2011, ArXiv.

[13]  Nenghai Yu,et al.  Distance metric learning from uncertain side information with application to automated photo tagging , 2009, ACM Multimedia.

[14]  Hui Xiong,et al.  Mobile app recommendations with security and privacy awareness , 2014, KDD.

[15]  Mehryar Mohri,et al.  Two-Stage Learning Kernel Algorithms , 2010, ICML.

[16]  Ning Chen,et al.  SimApp: A Framework for Detecting Similar Mobile Applications by Online Kernel Learning , 2015, WSDM.

[17]  Ning Chen,et al.  AR-miner: mining informative reviews for developers from mobile app marketplace , 2014, ICSE.

[18]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[19]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[20]  Rong Jin,et al.  Online Multiple Kernel Classification , 2013, Machine Learning.

[21]  Wei-Ying Ma,et al.  AnnoSearch: Image Auto-Annotation by Search , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[22]  Chong-Wah Ngo,et al.  On the Annotation of Web Videos by Efficient Near-Duplicate Search , 2010, IEEE Transactions on Multimedia.

[23]  Steven C. H. Hoi,et al.  LIBOL: a library for online learning algorithms , 2014, J. Mach. Learn. Res..

[24]  Christopher D. Manning,et al.  Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger , 2000, EMNLP.

[25]  Rada Mihalcea,et al.  TextRank: Bringing Order into Text , 2004, EMNLP.

[26]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[27]  Vincent Ng,et al.  Automatic Keyphrase Extraction: A Survey of the State of the Art , 2014, ACL.

[28]  Nello Cristianini,et al.  Classification using String Kernels , 2000 .