Exploit Label Embeddings for Enhancing Network Classification

Learning representations for networks has attracted great research interest in recent years. Existing approaches embed vertices into a low-dimensional continuous space that encodes local or global network structure. While these methods show improvements over traditional representations, they do not use label information until the learned embeddings are fed to a classifier; that is, the representation learning process is separated from the labels and is thus unsupervised. In this paper, we propose a novel method that learns vertex embeddings under the supervision of labels. The key idea is to regard the label as part of a vertex's context. More specifically, we attach a true or virtual label node to each training or test sample, and update the embeddings of vertices and labels to maximize the probability of observing both the neighbors and the labels in the context. We conduct extensive experiments on three real-world datasets. The results demonstrate that our method outperforms state-of-the-art approaches by a large margin.
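
To make the idea concrete, below is a minimal sketch of how a label node could be folded into a DeepWalk-style skip-gram with negative sampling, so that each vertex is trained to predict its window neighbors and its (true or virtual) label node. This is an illustration under stated assumptions, not the authors' implementation: the function name, the uniform negative sampler, the choice of one virtual label node per unlabeled vertex, and all hyperparameters are assumptions introduced here.

```python
import random
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_label_context_embeddings(walks, labels, dim=64, window=5,
                                   negatives=5, lr=0.025, epochs=5, seed=0):
    """Skip-gram with negative sampling over random walks, where a label node
    is appended to every vertex's context: the true label node for labeled
    (training) vertices, and a per-vertex virtual label node otherwise.

    walks  -- list of random walks, each a list of vertex ids
    labels -- dict mapping *training* vertex ids to their class labels
    (Sketch only: names and sampling choices are assumptions, not the paper's code.)
    """
    rng = random.Random(seed)
    np.random.seed(seed)

    vertices = sorted({v for walk in walks for v in walk})
    v_index = {v: i for i, v in enumerate(vertices)}

    # Context vocabulary = all vertices + true label nodes + virtual label nodes.
    true_labels = sorted(set(labels.values()))
    label_index = {l: len(vertices) + i for i, l in enumerate(true_labels)}
    virtual_index = {}
    for v in vertices:
        if v not in labels:
            virtual_index[v] = len(vertices) + len(true_labels) + len(virtual_index)
    n_ctx = len(vertices) + len(true_labels) + len(virtual_index)

    emb = (np.random.rand(len(vertices), dim) - 0.5) / dim   # vertex (input) vectors
    ctx = np.zeros((n_ctx, dim))                              # context (output) vectors

    def label_node(v):
        return label_index[labels[v]] if v in labels else virtual_index[v]

    def sgns_update(center, context_id):
        # One positive context plus `negatives` uniformly sampled negatives.
        pairs = [(context_id, 1.0)] + \
                [(rng.randrange(n_ctx), 0.0) for _ in range(negatives)]
        grad = np.zeros(dim)
        for t, y in pairs:
            g = lr * (y - sigmoid(emb[center] @ ctx[t]))
            grad += g * ctx[t]
            ctx[t] += g * emb[center]
        emb[center] += grad

    for _ in range(epochs):
        for walk in walks:
            for pos, v in enumerate(walk):
                center = v_index[v]
                lo, hi = max(0, pos - window), min(len(walk), pos + window + 1)
                for j in range(lo, hi):               # neighbors within the window
                    if j != pos:
                        sgns_update(center, v_index[walk[j]])
                sgns_update(center, label_node(v))    # the label as extra context
    return emb, v_index
```

In such a setup, the rows of emb belonging to labeled vertices would then be used to train an ordinary (e.g. linear) classifier, which in turn classifies the remaining vertices from their embeddings.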
