Automated extraction of attributes from natural language attribute-based access control (ABAC) policies

The National Institute of Standards and Technology (NIST) has identified natural language policies as the preferred expression of policy and has implicitly called for automated translation of natural language access control policies (NLACPs) for ABAC into a machine-readable form. To study this automation process, we take the hierarchical ABAC model as our reference model, since it better reflects the requirements of real-world organizations. This paper therefore addresses two questions: how can we automatically infer the hierarchical structure of an ABAC model from NLACPs, and how can we extract and define the set of authorization attributes based on the resulting structure? To address these questions, we propose an approach built upon recent advances in natural language processing and machine learning. For such a solution, the lack of appropriate data often poses a bottleneck. We therefore divide the primary contributions of this work into two parts: (1) developing a practical framework to extract the authorization attributes of a hierarchical ABAC system from natural language artifacts, and (2) generating a set of realistic synthetic NLACPs to evaluate the proposed framework. Our experimental results are promising: on average, we achieved an F1-score of 0.96 when extracting the attribute values of subjects and 0.91 when extracting the attribute values of objects’ attributes from NLACPs.
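For readers unfamiliar with the reported metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it over sets of extracted attribute values. It is only an illustration: the function name `f1_score`, the set-based exact matching, and the sample values are assumptions, and the evaluation protocol actually used in this work may differ (for example, token-level scoring or a different averaging scheme).

```python
def f1_score(predicted, gold):
    """Set-based F1 between predicted and gold attribute values.

    Assumption: a prediction counts as correct only on an exact match
    with a gold value; the paper's matching criterion may differ.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical subject attribute values for one policy sentence.
predicted_values = ["nurse", "doctor", "senior"]
gold_values = ["nurse", "doctor", "junior"]
score = f1_score(predicted_values, gold_values)
```

Here two of the three predicted values match the gold set, so precision and recall are both 2/3 and the F1-score is also 2/3.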
