ReACP: A Semi-Automated Framework for Reverse-engineering and Testing of Access Control Policies of Web Applications

This technical report details a semi-automated framework for the reverse-engineering and testing of access control (AC) policies for web-based applications. In practice, AC specifications are often missing or poorly documented, leading to AC vulnerabilities. Our goal is to learn and recover AC policies from the implementation and to assess them to find AC issues. Built on top of a suite of security tools, our framework automatically explores a system under test, mines domain input specifications from access request logs, and then generates and executes more access requests using combinatorial test generation. We apply machine learning to the obtained data to characterise the relevant attributes that influence access control and to learn policies from them. Finally, the inferred policies are used for detecting AC issues, whether vulnerabilities or implementation errors. We have evaluated our framework on three open-source applications with respect to correctness and completeness. The results are very promising in terms of the quality of the inferred policies: more than 94% of them are correct with respect to the implemented AC mechanisms. The remaining incorrect policies are mainly due to our unrefined permission classification. Moreover, a careful analysis of these policies has revealed 92 vulnerabilities, many of which are new.
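The combinatorial test generation step mentioned above can be illustrated with a minimal sketch. The abstract does not say which generation algorithm or interaction strength the framework uses, so the greedy all-pairs strategy below is a stand-in chosen for illustration. The function name `pairwise_suite` and the attribute domains (`role`, `method`, `resource`) are hypothetical and are not taken from the report.

```python
from itertools import combinations, product


def pairwise_suite(domains):
    """Greedy all-pairs generation (illustrative, not the report's algorithm).

    `domains` maps an attribute name to its list of values. Returns a list of
    access requests (dicts) such that every pair of values drawn from two
    different attributes appears together in at least one request.
    """
    names = sorted(domains)
    # Every (attribute, value, attribute, value) pair that must be covered.
    uncovered = {
        (a, va, b, vb)
        for a, b in combinations(names, 2)
        for va in domains[a]
        for vb in domains[b]
    }
    # Exhaustive candidate pool; acceptable only for small domains.
    candidates = [
        dict(zip(names, values))
        for values in product(*(domains[n] for n in names))
    ]

    def pairs_of(request):
        return {(a, request[a], b, request[b]) for a, b in combinations(names, 2)}

    suite = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda r: len(pairs_of(r) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical input domains, as might be mined from access request logs.
domains = {
    "role": ["anonymous", "editor", "admin"],
    "method": ["GET", "POST"],
    "resource": ["/posts", "/users", "/settings"],
}
suite = pairwise_suite(domains)
for request in suite:
    print(request)
```

Each generated request would then be executed against the system under test and its outcome (permitted or denied) recorded as training data for the policy-learning step. The exhaustive candidate pool keeps the sketch short but does not scale to large domains; practical generators build requests incrementally.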
