Guided learning for role discovery (GLRD): framework, algorithms, and applications

Role discovery in graphs is an emerging area that allows complex graphs to be analyzed in an intuitive way. In contrast to community discovery, which finds groups of highly connected nodes, role discovery finds groups of nodes that share similar topological structure in the graph and hence a common role (or function), such as being a broker or a periphery node. However, existing work is entirely unsupervised, which is undesirable for a number of reasons. We provide an alternating least squares framework that allows convex constraints to be placed on the role discovery problem; these constraints can act as a useful form of supervision. In particular, we explore supervision that enforces (i) sparsity, (ii) diversity, and (iii) alternativeness in the roles. We illustrate the usefulness of this supervision on various data sets and applications.
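To make the idea of alternating least squares under convex constraints concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a nonnegative node-by-feature matrix V that is factored as V ≈ G F, where G holds node-role memberships and F holds role definitions. The function name `constrained_als`, its parameters, and the toy matrix are hypothetical. Nonnegativity is enforced by clipping after each unconstrained solve, which is a common heuristic rather than an exact constrained solve, and the optional `l1_penalty` is one simple way to encourage sparsity.

```python
import numpy as np

def constrained_als(V, r, n_iter=50, l1_penalty=0.0, seed=0):
    """Approximate V (nodes x features) as G @ F with G >= 0 and F >= 0.

    Hypothetical sketch: each half-step solves a regularized least squares
    problem in closed form, then clips negatives to zero (a heuristic
    projection onto the nonnegative orthant).
    """
    rng = np.random.default_rng(seed)
    n, f = V.shape
    G = rng.random((n, r))          # node-by-role memberships
    F = rng.random((r, f))          # role-by-feature definitions
    ridge = 1e-8 * np.eye(r)        # small ridge term keeps the solves well posed
    for _ in range(n_iter):
        # Fix G and solve (G^T G) F = G^T V - l1_penalty, then clip.
        F = np.linalg.solve(G.T @ G + ridge, G.T @ V - l1_penalty)
        F = np.maximum(F, 0.0)
        # Fix F and solve (F F^T) G^T = F V^T - l1_penalty, then clip.
        G = np.linalg.solve(F @ F.T + ridge, F @ V.T - l1_penalty).T
        G = np.maximum(G, 0.0)
    return G, F

# Toy feature matrix (an assumption for illustration): the first three
# nodes and the last three nodes follow two different feature patterns.
V = np.array([
    [5.0, 1.0, 0.0, 0.0],
    [4.0, 1.0, 0.0, 0.0],
    [5.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 4.0],
    [0.0, 0.0, 2.0, 4.0],
    [0.0, 0.0, 3.0, 5.0],
])
G, F = constrained_als(V, r=2, l1_penalty=0.01)
rel_err = np.linalg.norm(V - G @ F) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
```

Other convex constraints could be handled in the same loop by swapping the clipping step for a different projection; the choice of clipping here is only for brevity.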
