Joint embedding of structure and features via graph convolutional networks

The creation of social ties is largely determined by the entangled effects of people’s similarities in terms of individual characteristics and friends. However, the feature and structural characteristics of people usually appear to be correlated, making it difficult to determine which is more responsible for the formation of the emergent network structure. We propose AN2VEC, a node embedding method which ultimately aims at disentangling the information shared by the structure of a network and the features of its nodes. Building on recent developments in Graph Convolutional Networks (GCN), we develop a multitask GCN Variational Autoencoder where different dimensions of the generated embeddings can be dedicated to encoding feature information, network structure, and shared feature-network information. We explore the interaction between these disentangled characteristics by comparing the embedding reconstruction performance to a baseline case where no shared information is extracted. We use synthetic datasets with different levels of interdependency between feature and network characteristics and show (i) that shallow embeddings relying on shared information perform better than the corresponding reference with unshared information, (ii) that this performance gap increases with the correlation between network and feature structure, and (iii) that our embedding is able to capture joint information of structure and features. Our method can be relevant for the analysis and prediction of any featured network structure, ranging from online social systems to network medicine.
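The idea of dedicating embedding dimensions to structure, features, and shared information can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random untrained weights, the inner-product adjacency decoder, the linear feature decoder, and the particular slicing scheme (the first dimensions feed the adjacency decoder, the last feed the feature decoder, and the middle block is seen by both) are all assumptions made for illustration, as are the function and parameter names.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} commonly used by GCN layers."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode_decode(adj, feats, dim_adj, dim_shared, dim_feat, hidden=8, seed=0):
    """One forward pass of a toy multitask GCN-VAE with untrained random weights.

    Hypothetical layout of the embedding z (an assumption, not from the source):
      z[:, :dim_adj]                    -> used only to reconstruct the adjacency
      z[:, dim_adj:dim_adj+dim_shared]  -> shared by both reconstruction tasks
      z[:, dim_adj+dim_shared:]         -> used only to reconstruct the features
    """
    rng = np.random.default_rng(seed)
    n_feats = feats.shape[1]
    dim_total = dim_adj + dim_shared + dim_feat
    a_norm = normalize_adjacency(adj)

    # Encoder: one hidden GCN layer, then two GCN heads for mean and log-variance.
    w_hidden = rng.normal(scale=0.1, size=(n_feats, hidden))
    w_mu = rng.normal(scale=0.1, size=(hidden, dim_total))
    w_logvar = rng.normal(scale=0.1, size=(hidden, dim_total))
    h = np.maximum(a_norm @ feats @ w_hidden, 0.0)   # ReLU
    mu = a_norm @ h @ w_mu
    logvar = a_norm @ h @ w_logvar

    # Reparameterization: z = mu + sigma * eps with eps ~ N(0, I).
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

    # Each decoder sees its dedicated dimensions plus the shared block.
    z_for_adj = z[:, :dim_adj + dim_shared]
    z_for_feat = z[:, dim_adj:]
    adj_rec = sigmoid(z_for_adj @ z_for_adj.T)       # inner-product decoder
    w_dec = rng.normal(scale=0.1, size=(z_for_feat.shape[1], n_feats))
    feat_rec = sigmoid(z_for_feat @ w_dec)           # linear decoder with sigmoid
    return z, adj_rec, feat_rec

# Toy usage: a 4-node path graph with 3 binary features per node.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]], dtype=float)
z, adj_rec, feat_rec = encode_decode(adj, feats, dim_adj=2, dim_shared=1, dim_feat=2)
```

In this layout, setting `dim_shared=0` gives a baseline in which no dimension is read by both decoders, which is the kind of unshared reference the comparison above relies on. Training would additionally require reconstruction losses for both tasks and a KL term, which this sketch omits.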
