Robust Classification of Information Networks by Consistent Graph Learning

Graph regularization-based methods have achieved great success in network classification by making the label-link consistency assumption: if two nodes are linked, they are likely to belong to the same class. In real-world networks, however, some links connect nodes of different classes. These inconsistent links pose a major challenge for graph regularization and significantly degrade classification performance. To address this problem, we propose a novel algorithm, Consistent Graph Learning, which is robust to the inconsistent links of a network. Given a network and a small number of labeled nodes, we aim to learn a consistent network with more consistent links and fewer inconsistent links than the original network. Since the link information of a network is naturally represented by a set of relation matrices, learning a consistent network reduces to learning consistent relation matrices under some constraints. More specifically, we combine graph regularization with nuclear norm minimization of the consistent relation matrices and ℓ1-norm minimization of the difference matrices between the original relation matrices and the learned consistent ones, subject to certain constraints. Experiments on both homogeneous and heterogeneous network datasets show that the proposed method outperforms state-of-the-art methods.
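The three ingredients named in the abstract (a graph regularization term, a nuclear norm penalty, and an ℓ1-norm penalty) each have a standard computational building block. The sketch below illustrates those building blocks with NumPy; it is not the paper's algorithm. The function names, the symmetric relation matrix R, the label-score matrix F, and the threshold tau are all assumptions introduced here for illustration, and the paper's constraints and solver are not reproduced.

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of tau * ||X||_1: shrink every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Proximal operator of tau * ||X||_*: shrink every singular value by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def graph_regularizer(R, F):
    """Smoothness penalty tr(F^T L F) with L = D - R (unnormalized Laplacian).

    For a symmetric R this equals 0.5 * sum_ij R_ij * ||F_i - F_j||^2,
    so linked nodes with different label scores are penalized.
    """
    L = np.diag(R.sum(axis=1)) - R
    return float(np.trace(F.T @ L @ F))

def inconsistent_weight(R, labels):
    """Total weight of entries R_ij whose endpoints have different labels.

    An undirected link stored symmetrically is counted once per direction.
    """
    differ = labels[:, None] != labels[None, :]
    return float(R[differ].sum())

# Hypothetical toy network: two classes of three nodes each, fully linked
# within each class, plus one inconsistent link between nodes 2 and 3.
labels = np.array([0, 0, 0, 1, 1, 1])
R = np.zeros((6, 6))
R[:3, :3] = 1.0
R[3:, 3:] = 1.0
R[2, 3] = R[3, 2] = 1.0

F = np.eye(2)[labels]  # one-hot label scores, one row per node
print("inconsistent link weight:", inconsistent_weight(R, labels))
print("graph regularizer value:", graph_regularizer(R, F))
```

In a low-rank plus sparse decomposition of this kind, an iterative solver would typically alternate between applying singular value thresholding to the estimate of the consistent relation matrix and soft thresholding to the difference matrix. How the paper couples these updates with the graph regularization term and its constraints is not stated in the abstract.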
