EASEAndroid: Automatic Policy Analysis and Refinement for Security Enhanced Android via Large-Scale Semi-Supervised Learning

Mandatory protection systems such as SELinux and SEAndroid harden operating system integrity. Unfortunately, policy development is error prone and requires lengthy refinement using audit logs from deployed systems. While prior work has studied SELinux policy in detail, SEAndroid is relatively new and has received little attention. SEAndroid policy engineering differs significantly from SELinux: Android fundamentally differs from traditional Linux; the same policy is used on millions of devices for which new audit logs are continually available; and audit logs contain a mix of benign and malicious accesses. In this paper, we propose EASEAndroid, the first SEAndroid analytic platform for automatic policy analysis and refinement. Our key insight is that the policy refinement process can be modeled and automated using semi-supervised learning. Given an existing policy and a small set of known access patterns, EASEAndroid continually expands the knowledge base as new audit logs become available, producing suggestions for policy refinement. We evaluate EASEAndroid on 1.3 million audit logs from real-world devices. EASEAndroid successfully learns 2,518 new access patterns and generates 331 new policy rules. During this process, EASEAndroid discovers eight categories of attack access patterns in real devices, two of which are new attacks directly against the SEAndroid MAC mechanism.
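The refinement loop described above, in which a small set of known access patterns grows as new audit logs arrive, can be illustrated with a toy self-training sketch. This is not EASEAndroid's actual learner: the `benign`/`malicious` labels, the field-overlap similarity, the `threshold` parameter, and the function names are all assumptions made for illustration. Only the general shape of an SELinux `avc: denied` audit line (`scontext`, `tcontext`, `tclass`) is taken as given.

```python
import re

# Matches the context fields of an SELinux/SEAndroid AVC denial line.
AVC_RE = re.compile(
    r"avc:\s+denied.*?scontext=(\S+)\s+tcontext=(\S+)\s+tclass=(\S+)"
)


def parse_avc(line):
    """Return (source_type, target_type, tclass), or None if not an AVC denial."""
    m = AVC_RE.search(line)
    if m is None:
        return None
    scontext, tcontext, tclass = m.groups()
    # A security context looks like user:role:type:level; keep the type.
    return (scontext.split(":")[2], tcontext.split(":")[2], tclass)


def overlap(a, b):
    """Hypothetical similarity: number of fields two patterns share (0 to 3)."""
    return sum(x == y for x, y in zip(a, b))


def expand(known, patterns, threshold=2):
    """Self-training loop: label a pattern when all sufficiently similar
    known patterns agree on one label; repeat until nothing new is learned."""
    known = dict(known)
    pending = set(patterns) - set(known)
    while True:
        newly = {}
        for p in pending:
            labels = {lab for q, lab in known.items() if overlap(p, q) >= threshold}
            if len(labels) == 1:  # unanimous neighbours only
                newly[p] = labels.pop()
        if not newly:
            return known
        known.update(newly)
        pending -= set(newly)
```

Patterns labeled in one round become evidence for the next, which is the sense in which the knowledge base expands. Patterns with no close, unanimous neighbours stay unlabeled rather than being guessed.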
