An Accurate and Scalable Role Mining Algorithm based on Graph Embedding and Unsupervised Feature Learning

Role-based access control (RBAC) is one of the most widely used authorization models in organizations. In RBAC, access is controlled based on the roles of users within the organization. The flexibility and usability of RBAC have encouraged organizations to migrate from traditional discretionary access control (DAC) models to RBAC. The most challenging step in this migration is role mining, the process of extracting meaningful roles from existing access control lists. Although various approaches to this NP-complete problem have been proposed in the literature, they either scale poorly or rely on heuristics with low accuracy. In this paper, we propose an accurate and scalable approach to the role mining problem. To this end, we represent user-permission assignments as a bipartite graph whose nodes are users and permissions and whose edges are user-permission assignments. Next, we introduce an efficient deep learning algorithm based on random walk sampling to learn low-dimensional representations of the graph, such that permissions assigned to similar users are close together in the embedding space. We then use k-means and Gaussian mixture model (GMM) clustering to group permission nodes into roles. We demonstrate the effectiveness of the approach on several datasets. Experimental results show that it performs accurate role mining, even on large datasets.
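
The pipeline described above (bipartite graph, random walk sampling, low-dimensional permission representations, clustering into roles) can be illustrated with a minimal standard-library sketch. It is not the paper's implementation: normalized walk co-occurrence counts stand in for the learned embedding, a small deterministic k-means stands in for the k-means/GMM step, and every function name, parameter, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_bipartite_graph(assignments):
    """Adjacency map over user nodes ('u', name) and permission nodes ('p', name)."""
    graph = defaultdict(set)
    for user, perm in assignments:
        graph[("u", user)].add(("p", perm))
        graph[("p", perm)].add(("u", user))
    return graph


def random_walks(graph, walks_per_node, walk_length, rng):
    """Uniform random walks; on a bipartite graph they alternate user/permission."""
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(graph):
            walk = [start]
            while len(walk) < walk_length:
                # Sort neighbors so results do not depend on set ordering.
                walk.append(rng.choice(sorted(graph[walk[-1]])))
            walks.append(walk)
    return walks


def permission_vectors(walks, window):
    """Assumed stand-in for a learned embedding: unit-length counts of the
    users that appear within `window` steps of each permission in the walks."""
    users = sorted({n[1] for w in walks for n in w if n[0] == "u"})
    index = {u: i for i, u in enumerate(users)}
    counts = defaultdict(lambda: [0.0] * len(users))
    for walk in walks:
        for i, node in enumerate(walk):
            if node[0] != "p":
                continue
            for other in walk[max(0, i - window): i + window + 1]:
                if other[0] == "u":
                    counts[node[1]][index[other[1]]] += 1.0
    vectors = {}
    for perm, row in counts.items():
        norm = sum(x * x for x in row) ** 0.5
        vectors[perm] = [x / norm for x in row]
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization."""
    names = sorted(vectors)
    centers = [vectors[names[0]]]
    while len(centers) < k:
        centers.append(max((vectors[n] for n in names),
                           key=lambda v: min(sq_dist(v, c) for c in centers)))
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for n in names:
            nearest = min(range(k), key=lambda i: sq_dist(vectors[n], centers[i]))
            clusters[nearest].append(n)
        centers = [
            [sum(col) / len(members) for col in zip(*(vectors[n] for n in members))]
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return clusters


def mine_roles(assignments, k, walks_per_node=20, walk_length=6, window=2, seed=0):
    """Cluster permissions into k candidate roles (each a list of permissions)."""
    graph = build_bipartite_graph(assignments)
    walks = random_walks(graph, walks_per_node, walk_length, random.Random(seed))
    return kmeans(permission_vectors(walks, window), k)


# Hypothetical toy data: two user groups with disjoint permission sets.
assignments = [
    ("alice", "read_db"), ("alice", "write_db"),
    ("bob", "read_db"), ("bob", "write_db"),
    ("carol", "deploy"), ("carol", "restart"),
    ("dave", "deploy"), ("dave", "restart"),
]
roles = mine_roles(assignments, k=2)
```

On this toy input the permissions held by the same users end up in the same cluster. A learned embedding (for example a skip-gram model trained on the walks) and a GMM would replace `permission_vectors` and `kmeans` in a fuller version; choosing `k` is left open here.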
