Towards using unstructured user input request for malware detection

Abstract Privacy analysis techniques for mobile apps are mostly based on system-centric data originating from well-defined system API calls. But these apps may also collect sensitive information via their unstructured input sources that elude privacy analysis. The consequence is that users are unable to determine the extent to which apps may impact on their privacy when downloaded and installed on mobile devices. To this end, we present a privacy analysis framework for unstructured input. Our approach leverages app meta-data descriptions and taxonomy of sensitive information, to identify sensitive unstructured user input. The outcome is an understanding of the level of user privacy risk posed by an app based on its unstructured user input request. Subsequently, we evaluate the usefulness of the unstructured sensitive user input for malware detection. We evaluate our methods using 175K benign apps and 175K malware APKs. The outcome highlights that malicious app detector built on unstructured sensitive user achieve an average balance accuracy of 0.996 demonstrated with Trojan-Banker and Trojan-SMS when the malware family and target applications are known and balanced accuracy of 0.70 with generic malware.

[1]  Zhemin Yang,et al.  LeakMiner: Detect Information Leakage on Android with Static Taint Analysis , 2012, 2012 Third World Congress on Software Engineering.

[2]  Adam Doupé,et al.  Deep Android Malware Detection , 2017, CODASPY.

[3]  William Enck,et al.  AppsPlayground: automatic security analysis of smartphone applications , 2013, CODASPY.

[4]  Yvonne Rogers,et al.  From spaces to places: emerging contexts in mobile privacy , 2009, UbiComp.

[5]  Shirley Gregor,et al.  The Nature of Theory in Information Systems , 2006, MIS Q..

[6]  Peng Wang,et al.  Finding Unknown Malice in 10 Seconds: Mass Vetting for New Threats at the Google-Play Scale , 2015, USENIX Security Symposium.

[7]  Xiangliang Zhang,et al.  Characterizing Android apps' behavior for effective detection of malapps at large scale , 2017, Future Gener. Comput. Syst..

[8]  Jan Muntermann,et al.  A method for taxonomy development and its application in information systems , 2013, Eur. J. Inf. Syst..

[9]  Yajin Zhou,et al.  Systematic Detection of Capability Leaks in Stock Android Smartphones , 2012, NDSS.

[10]  Zhenlong Yuan,et al.  DroidDetector: Android Malware Characterization and Detection Using Deep Learning , 2016 .

[11]  Mila Dalla Preda,et al.  Testing android malware detectors against code obfuscation: a systematization of knowledge and unified methodology , 2016, Journal of Computer Virology and Hacking Techniques.

[12]  T. Holt,et al.  Exploring stolen data markets online: products and market forces , 2010 .

[13]  Jacques Klein,et al.  IccTA: Detecting Inter-Component Privacy Leaks in Android Apps , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[14]  Wenke Lee,et al.  CHEX: statically vetting Android apps for component hijacking vulnerabilities , 2012, CCS.

[15]  G. Loewenstein,et al.  Privacy and human behavior in the age of information , 2015, Science.

[16]  Isil Dillig,et al.  Apposcopy: semantics-based detection of Android malware through static analysis , 2014, SIGSOFT FSE.

[17]  Patrick D. McDaniel,et al.  On lightweight mobile phone application certification , 2009, CCS.

[18]  Danqi Chen,et al.  A Fast and Accurate Dependency Parser using Neural Networks , 2014, EMNLP.

[19]  John B. Goodenough,et al.  Contextual correlates of synonymy , 1965, CACM.

[20]  Juan E. Tapiador,et al.  AndrODet: An adaptive Android obfuscation detector , 2019, Future Gener. Comput. Syst..

[21]  Avik Chaudhuri,et al.  SCanDroid: Automated Security Certification of Android , 2009 .

[22]  Xingquan Zhu,et al.  Machine Learning for Android Malware Detection Using Permission and API Calls , 2013, 2013 IEEE 25th International Conference on Tools with Artificial Intelligence.

[23]  Peter Carey,et al.  Data Protection: A Practical Guide to UK and EU Law , 2004 .

[24]  Yajin Zhou,et al.  Detecting repackaged smartphone applications in third-party android marketplaces , 2012, CODASPY '12.

[25]  Latifur Khan,et al.  SMV-Hunter: Large Scale, Automated Detection of SSL/TLS Man-in-the-Middle Vulnerabilities in Android Apps , 2014, NDSS.

[26]  Xiangliang Zhang,et al.  Detecting Android malicious apps and categorizing benign apps with ensemble of classifiers , 2018, Future Gener. Comput. Syst..

[27]  Sankardas Roy,et al.  Amandroid: A Precise and General Inter-component Data Flow Analysis Framework for Security Vetting of Android Apps , 2014, CCS.

[28]  Xiangyu Zhang,et al.  SUPOR: Precise and Scalable Sensitive User Input Detection for Android Apps , 2015, USENIX Security Symposium.

[29]  Konrad Rieck,et al.  DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket , 2014, NDSS.

[30]  Byung-Gon Chun,et al.  TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones , 2010, OSDI.

[31]  Sakir Sezer,et al.  A New Android Malware Detection Approach Using Bayesian Classification , 2013, 2013 IEEE 27th International Conference on Advanced Information Networking and Applications (AINA).

[32]  Ross J. Anderson,et al.  Aurasium: Practical Policy Enforcement for Android Applications , 2012, USENIX Security Symposium.

[33]  Eul Gyu Im,et al.  A Multimodal Deep Learning Method for Android Malware Detection Using Various Features , 2019, IEEE Transactions on Information Forensics and Security.

[34]  Zhen Huang,et al.  PScout: analyzing the Android permission specification , 2012, CCS.

[35]  Sankardas Roy,et al.  Deep Ground Truth Analysis of Current Android Malware , 2017, DIMVA.

[36]  Bashar Nuseibeh,et al.  Distilling privacy requirements for mobile applications , 2014, ICSE.

[37]  G. Annas HIPAA regulations - a new era of medical-record privacy? , 2003, The New England journal of medicine.

[38]  J. Ross Quinlan,et al.  Induction of Decision Trees , 1986, Machine Learning.

[39]  Ting Su,et al.  FSMdroid: Guided GUI Testing of Android Apps , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C).

[40]  Eric Bodden,et al.  A Machine-learning Approach for Classifying and Categorizing Android Sources and Sinks , 2014, NDSS.

[41]  Wenke Lee,et al.  Checking More and Alerting Less: Detecting Privacy Leakages via Enhanced Data-flow Analysis and Peer Voting , 2015, NDSS.

[42]  Mansour Ahmadi,et al.  DroidSieve: Fast and Accurate Classification of Obfuscated Android Malware , 2017, CODASPY.

[43]  Vitor Monte Afonso,et al.  Identifying Android malware using dynamically obtained features , 2014, Journal of Computer Virology and Hacking Techniques.

[44]  Vasant Raval,et al.  PCI DSS: Payment Card Industry Data Security Standards in Context , 2008, Comput. Law Secur. Rev..

[45]  Sam Malek,et al.  Lightweight, Obfuscation-Resilient Detection and Family Identification of Android Malware , 2018, ACM Trans. Softw. Eng. Methodol..

[46]  Bernd Freisleben,et al.  Why eve and mallory love android: an analysis of android SSL (in)security , 2012, CCS.

[47]  Alessandra Gorla,et al.  Mining Apps for Abnormal Usage of Sensitive Data , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[48]  Xiangliang Zhang,et al.  Exploring Permission-Induced Risk in Android Applications for Malicious Application Detection , 2014, IEEE Transactions on Information Forensics and Security.

[49]  Yajin Zhou,et al.  Dissecting Android Malware: Characterization and Evolution , 2012, 2012 IEEE Symposium on Security and Privacy.

[50]  Robert H. Deng,et al.  Comparing Mobile Privacy Protection through Cross-Platform Applications , 2013, NDSS.

[51]  Gianluca Stringhini,et al.  MaMaDroid , 2019, ACM Trans. Priv. Secur..

[52]  Peng Wang,et al.  AsDroid: detecting stealthy behaviors in Android applications by user interface and program behavior contradiction , 2014, ICSE.

[53]  Carl A. Gunter,et al.  Free for All! Assessing User Data Exposure to Advertising Libraries on Android , 2016, NDSS.

[54]  Yang Liu,et al.  From UI Design Image to GUI Skeleton: A Neural Machine Translator to Bootstrap Mobile GUI Implementation , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE).

[55]  Yajin Zhou,et al.  Owner-Centric Protection of Unstructured Data on Smartphones , 2014, TRUST.

[56]  Martin C. Libicki,et al.  Markets for Cybercrime Tools and Stolen Data: Hackers' Bazaar , 2014 .

[57]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[58]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[59]  Haipeng Cai,et al.  On the Deterioration of Learning-Based Malware Detectors for Android , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Companion Proceedings (ICSE-Companion).

[60]  Yidong Li,et al.  DroidEnsemble: Detecting Android Malicious Applications With Ensemble of String and Structural Static Features , 2018, IEEE Access.

[61]  Xiangyu Zhang,et al.  Detecting sensitive data disclosure via bi-directional text correlation analysis , 2016, SIGSOFT FSE.

[62]  Hahn-Ming Lee,et al.  DroidMat: Android Malware Detection through Manifest and API Calls Tracing , 2012, 2012 Seventh Asia Joint Conference on Information Security.

[63]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[64]  Stefan Savage,et al.  An analysis of underground forums , 2011, IMC '11.

[65]  Jacques Klein,et al.  AndroZoo: Collecting Millions of Android Apps for the Research Community , 2016, 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR).

[66]  Jeff H. Perkins,et al.  Information Flow Analysis of Android Applications in DroidSafe , 2015, NDSS.

[67]  Aristide Fattori,et al.  CopperDroid: Automatic Reconstruction of Android Malware Behaviors , 2015, NDSS.

[68]  Heng Yin,et al.  DroidAPIMiner: Mining API-Level Features for Robust Malware Detection in Android , 2013, SecureComm.

[69]  Hao Chen,et al.  AndroidLeaks: Automatically Detecting Potential Privacy Leaks in Android Applications on a Large Scale , 2012, TRUST.

[70]  Petr Sojka,et al.  Software Framework for Topic Modelling with Large Corpora , 2010 .

[71]  Jacques Klein,et al.  Reflection-aware static analysis of Android apps , 2016, 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).

[72]  Daniel J. Solove A Taxonomy of Privacy , 2006 .

[73]  Andreas Christmann,et al.  Support vector machines , 2008, Data Mining and Knowledge Discovery Handbook.

[74]  Inah Omoronyia,et al.  Security-oriented view of app behaviour using textual descriptions and user-granted permission requests , 2020, Comput. Secur..

[75]  Heng Yin,et al.  DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis , 2012, USENIX Security Symposium.

[76]  Mu Zhang,et al.  Semantics-Aware Android Malware Classification Using Weighted Contextual API Dependency Graphs , 2014, CCS.

[77]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[78]  Xiaofeng Wang,et al.  UIPicker: User-Input Privacy Identification in Mobile Applications , 2015, USENIX Security Symposium.

[79]  Yang Liu,et al.  Guided, stochastic model-based GUI testing of Android apps , 2017, ESEC/SIGSOFT FSE.

[80]  Dengfeng Li,et al.  UiRef: analysis of sensitive user inputs in Android applications , 2017, WISEC.

[81]  Yuan Zhang,et al.  AppIntent: analyzing sensitive data transmission in android for privacy leakage detection , 2013, CCS.

[82]  Jacques Klein,et al.  FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps , 2014, PLDI.

[83]  Zhenlong Yuan,et al.  Droid-Sec: deep learning in android malware detection , 2015, SIGCOMM 2015.

[84]  John C. S. Lui,et al.  TaintART: A Practical Multi-level Information-Flow Tracking System for Android RunTime , 2016, CCS.

[85]  Yajin Zhou,et al.  RiskRanker: scalable and accurate zero-day android malware detection , 2012, MobiSys '12.