Scale-Free, Attributed and Class-Assortative Graph Generation to Facilitate Introspection of Graph Neural Networks

Semi-supervised node classification on graphs is a complex interplay between graph structure, node features and class-assortative (homophilic) properties, and the flexibility of a model to capture these nuances. Modern datasets used to push the frontier for such tasks exhibit diverse properties across these aspects, making it challenging to study how these properties individually and jointly influence performance of modern methods like graph neural networks (GNNs). In this work, we propose an intuitive and flexible scale-free graph generation model, CaBaM, which enables simulation of class-assortative and attributed graphs via the well-known Barabasi-Albert model. We show empirically and theoretically how our model can easily describe a variety of graph types, while imbuing the generated graphs with the necessary ingredients for attribute, topology, and label-aware semi-supervised node-classification. We hope our work illustrates the need for graph generation and provides a stepping stone compensating for the lack of manipulability offered in common public graph dataset benchmarks. We also hope this inspires future work towards (a) more principled evaluation and study of GNNs, specifically their sensitivity to varying assortativity and attribute distributions, and (b) development of GNN architectures which facilitate graph context-awareness in line with these properties.
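To make the idea of class-assortative preferential attachment concrete, the following is a minimal sketch, not the CaBaM formulation itself. It assumes that a new node attaches to an existing node with probability proportional to that node's degree times a class-affinity weight, and that node attributes are drawn from a class-conditional Gaussian. The function name `class_assortative_ba`, the `affinity` matrix, and the `sep` parameter are hypothetical names introduced only for this illustration.

```python
import random


def class_assortative_ba(n, m, affinity, sep=2.0, seed=0):
    """Grow a labeled, attributed graph by class-biased preferential attachment.

    Assumption (illustrative only): affinity[a][b] weights how strongly a new
    node of class a prefers an existing node of class b.
    """
    rng = random.Random(seed)
    num_classes = len(affinity)
    labels = [rng.randrange(num_classes) for _ in range(n)]
    degree = [0] * n
    edges = []

    # Seed graph: a path over the first m + 1 nodes, so each has nonzero degree.
    for u in range(m):
        edges.append((u, u + 1))
        degree[u] += 1
        degree[u + 1] += 1

    for new in range(m + 1, n):
        # Attachment weight = degree (Barabasi-Albert term) x class affinity.
        weights = [
            degree[v] * affinity[labels[new]][labels[v]] for v in range(new)
        ]
        for _ in range(m):
            if sum(weights) <= 0:
                break  # no admissible target remains
            v = rng.choices(range(new), weights=weights, k=1)[0]
            edges.append((v, new))
            degree[v] += 1
            degree[new] += 1
            weights[v] = 0.0  # sample targets without replacement

    # Assumed attribute model: one-hot class mean scaled by sep, unit Gaussian noise.
    features = [
        [sep * (labels[v] == c) + rng.gauss(0.0, 1.0) for c in range(num_classes)]
        for v in range(n)
    ]
    return labels, edges, features


if __name__ == "__main__":
    homophilous = [[1.0, 0.1], [0.1, 1.0]]
    labels, edges, features = class_assortative_ba(200, 2, homophilous)
    same = sum(labels[u] == labels[v] for u, v in edges) / len(edges)
    print(f"{len(edges)} edges, same-class edge fraction {same:.2f}")
```

Setting all entries of `affinity` equal removes the class bias and leaves plain degree-proportional attachment, while raising the off-diagonal entries above the diagonal would yield a disassortative (heterophilic) graph under the same assumptions.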
