AC-Net: Assessing the Consistency of Description and Permission in Android Apps

As Android applications (apps) become increasingly popular, app marketplaces carry substantial risk, because much malicious software attempts to collect users’ private information without their awareness. Although apps request users’ authorization for permissions, users can still suffer privacy leakage because they have limited knowledge for telling permissions apart. Accurate, automatic permission checking is therefore important for protecting users’ privacy. Previous studies have shown that analyzing app descriptions helps determine whether particular permissions are required by an app. Unlike those studies, we treat app permissions at a finer granularity and aim to predict the multiple permissions that correspond to a single sentence of an app description. In this paper, we propose an end-to-end framework for assessing the consistency between descriptions and permissions, named Assessing Consistency based on neural Network (AC-Net). For evaluation, we built a new dataset containing the description-to-permission correspondences of 1415 popular Android apps. The experiments demonstrate that AC-Net significantly outperforms the state-of-the-art method, by over 24.5%, in accurately predicting permissions from descriptions.
