Unsupervised Link Selection in Networks

Real-world networks are often noisy, and the existing linkage structure may not be reliable. For example, a link that connects nodes from different communities may distort the group assignment of those nodes. In this paper, we study a new problem called link selection, which can be seen as the network counterpart of the traditional feature selection problem in machine learning. More specifically, we investigate unsupervised link selection: given a network, select a subset of informative links from the original network that enhances the quality of its community structure. To achieve this goal, we use the Ratio Cut size of a network as the quality measure. The resulting link selection problem can be formulated as a semi-definite programming problem. To solve it efficiently, we propose a backward elimination algorithm using sequential optimization. Experiments on benchmark network datasets illustrate the effectiveness of our method.
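The idea of backward elimination against a Ratio Cut objective can be illustrated with a minimal sketch. It is not the semi-definite program or the sequential optimization procedure summarized above. It assumes the standard spectral relaxation of Ratio Cut (the sum of the k smallest eigenvalues of the graph Laplacian) as the score and uses a brute-force greedy loop. The function names, the number of communities `k`, and the removal budget `n_remove` are hypothetical choices made for this illustration.

```python
import numpy as np


def laplacian(n, edges):
    """Unnormalized Laplacian L = D - A of an undirected, unweighted graph."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L


def relaxed_ratio_cut(n, edges, k):
    """Sum of the k smallest Laplacian eigenvalues (spectral relaxation of Ratio Cut)."""
    eigvals = np.linalg.eigvalsh(laplacian(n, edges))  # returned in ascending order
    return float(eigvals[:k].sum())


def backward_eliminate(n, edges, k, n_remove):
    """Greedily drop the link whose removal lowers the relaxed Ratio Cut the most.

    Removing any link can only lower the score, so a fixed removal budget
    (n_remove, an assumption of this sketch) is needed to stop the loop.
    """
    kept = list(edges)
    removed = []
    for _ in range(n_remove):
        scores = [
            relaxed_ratio_cut(n, kept[:i] + kept[i + 1:], k)
            for i in range(len(kept))
        ]
        best = int(np.argmin(scores))
        removed.append(kept.pop(best))
    return kept, removed


# Toy network: two triangles joined by a single cross-community link (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
kept, removed = backward_eliminate(6, edges, k=2, n_remove=1)
print("removed:", removed)
print("kept:", kept)
```

On this toy network the only link joining the two triangles is the one eliminated, since removing it disconnects the communities and drives the second-smallest eigenvalue to zero. Re-solving the eigenproblem for every candidate link is far too expensive for real networks, which is the kind of cost an efficient algorithm has to avoid.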
