Towards Permission Request Prediction on Mobile Apps via Structure Feature Learning

The popularity of mobile apps poses severe privacy risks to users because many apps over-claim permissions. In this work, we explore techniques that automatically predict the permission requests of a new mobile app from its functionality and textual description, which can help users become aware of the app's privacy risks. Our framework formalizes permission prediction as a multi-label learning problem, in which a regularized structure feature learning framework automatically captures the relations among textual descriptions, permissions, and app categories, so that the permission predictions are learned directly from data. We evaluate our approach on 173 permission requests from 11,067 mobile apps across 30 categories. Extensive experimental results indicate that our method consistently outperforms other state-of-the-art methods, improving the F1 score by 3%-5%.
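
The multi-label formulation can be illustrated with a minimal sketch. It is not the method proposed here: as a simple stand-in for the structured regularizer, it uses a row-sparse l2,1-norm penalty, solved by proximal gradient descent on a squared loss. The vocabulary, the two permission names, the toy data, and the choice of micro-averaged F1 are all assumptions made for illustration.

```python
import math

def group_soft_threshold(row, tau):
    """Proximal operator of tau * ||row||_2: shrink a whole row toward zero."""
    norm = math.sqrt(sum(w * w for w in row))
    if norm <= tau:
        return [0.0] * len(row)
    scale = 1.0 - tau / norm
    return [scale * w for w in row]

def fit_l21(X, Y, lam=0.01, iters=500):
    """Minimize (1/2n)||XW - Y||_F^2 + lam * sum_j ||W[j]||_2 by proximal gradient."""
    n, d, k = len(X), len(X[0]), len(Y[0])
    # Safe step size 1/L: the Lipschitz constant L is bounded by trace(X^T X) / n.
    step = n / sum(x * x for row in X for x in row)
    W = [[0.0] * k for _ in range(d)]
    for _ in range(iters):
        # Residual R = XW - Y, computed once per iteration.
        R = [[sum(X[i][j] * W[j][c] for j in range(d)) - Y[i][c]
              for c in range(k)] for i in range(n)]
        for j in range(d):
            # Gradient row j of the smooth part: (X^T R)[j] / n.
            grad = [sum(X[i][j] * R[i][c] for i in range(n)) / n for c in range(k)]
            W[j] = group_soft_threshold(
                [W[j][c] - step * grad[c] for c in range(k)], step * lam)
    return W

def predict(X, W, threshold=0.5):
    """Predict label c for a sample when its linear score reaches the threshold."""
    d, k = len(W), len(W[0])
    return [[1 if sum(x[j] * W[j][c] for j in range(d)) >= threshold else 0
             for c in range(k)] for x in X]

def micro_f1(Y_true, Y_pred):
    """Micro-averaged F1 over all (app, permission) pairs."""
    pairs = [(t, p) for rt, rp in zip(Y_true, Y_pred) for t, p in zip(rt, rp)]
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical toy data: binary bag-of-words over ["map", "photo", "chat", "free"];
# labels are two made-up permissions ["LOCATION", "CAMERA"].
X = [[1, 0, 0, 1], [0, 1, 0, 1], [1, 1, 0, 0],
     [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 1, 1]]
Y = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]]

W = fit_l21(X, Y)
preds = predict(X, W)
print(micro_f1(Y, preds))
```

Each row of `W` corresponds to one description word and each column to one permission, so penalizing whole rows encourages words that are uninformative for every permission to be dropped together. A structured regularizer would instead group rows or columns according to known relations, such as app category.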
