Automatic Top-Down Role Engineering Framework Using Natural Language Processing Techniques

A challenging problem in managing large networks is the complexity of security administration. Role-Based Access Control (RBAC) is the most widely known access control model in enterprises of all sizes because of its ease of administration and the economic benefits it provides. Deploying such a system requires identifying a complete set of roles that are correct and efficient. This process, called role engineering, has been identified as one of the most expensive tasks in migrating to RBAC. Growing interest in role engineering in recent years has produced numerous bottom-up, top-down, and hybrid role mining approaches. In this paper, we propose a new top-down role engineering approach and take the first step towards extracting access control policies from unrestricted natural language requirements documents. Most organizations have high-level requirement specifications that include a set of access control policies describing the allowable operations for the system. Manually sifting through these natural language documents to identify and extract access control policies is time-consuming, labor-intensive, and error-prone. We propose to use natural language processing techniques, more specifically Semantic Role Labeling (SRL), to automatically extract access control policies from these documents, define roles, and build an RBAC system. By applying semantic role labeling to identify predicate-argument structure, and using a set of predefined rules on the extracted arguments, we were able to correctly identify access control policies with a precision of 79%, recall of 88%, and $F_1$ score of 82%.
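
The pipeline described above (predicate-argument structure in, access control policies and roles out) can be illustrated with a minimal sketch. This is not the paper's implementation: the `Frame` and `Policy` types, the two filtering rules, and the hand-written frames are all hypothetical, and the SRL step itself is assumed to have been run already by some external labeler that emits PropBank-style argument labels (`A0` for the agent, `A1` for the thing acted upon, `AM-NEG` for negation).

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    """One predicate with its labeled arguments, as an SRL tool might emit it."""
    predicate: str
    args: dict  # PropBank-style label -> text span (assumed format)


class Policy(NamedTuple):
    """An access control policy as a (subject, action, object) triple."""
    subject: str
    action: str
    obj: str


def frame_to_policy(frame: Frame) -> Optional[Policy]:
    """Apply two illustrative rules to one frame.

    Rule 1 (hypothetical): a policy needs both an agent (A0) and a theme (A1).
    Rule 2 (hypothetical): negated frames are skipped rather than turned
    into permissions.
    """
    agent = frame.args.get("A0")
    theme = frame.args.get("A1")
    if agent is None or theme is None:
        return None
    if "AM-NEG" in frame.args:
        return None
    return Policy(agent.lower(), frame.predicate.lower(), theme.lower())


def build_roles(policies):
    """Group permissions (action, object) by subject to form candidate roles."""
    roles = defaultdict(set)
    for p in policies:
        roles[p.subject].add((p.action, p.obj))
    return dict(roles)


# Hand-written frames standing in for SRL output on sentences such as
# "A nurse can read patient records." (invented examples, not from the paper).
frames = [
    Frame("read", {"A0": "Nurse", "A1": "patient records"}),
    Frame("update", {"A0": "Nurse", "A1": "vital signs"}),
    Frame("prescribe", {"A0": "Doctor", "A1": "medication"}),
    Frame("delete", {"A0": "Nurse", "A1": "patient records", "AM-NEG": "not"}),
    Frame("log", {"A1": "all access attempts"}),  # no agent, so no policy
]

policies = [p for p in map(frame_to_policy, frames) if p is not None]
roles = build_roles(policies)
```

Here each distinct subject becomes a candidate role and its (action, object) pairs become the role's permissions. A real system would also need to merge synonymous subjects and handle prohibitions explicitly, which this sketch leaves out.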
