Generating a Doppelganger Graph: Resembling but Distinct

Since their inception, deep generative models have become increasingly capable of generating novel and perceptually realistic signals (e.g., images and sound waves). With the emergence of deep models for graph-structured data, there is natural interest in extending these generative models to graphs. Successful extensions have recently appeared for learning from a collection of graphs (e.g., protein data banks), but learning from a single graph has been largely underexplored. The latter case, however, is important in practice. For example, graphs in financial and healthcare systems contain so much confidential information that making them publicly accessible is nearly impossible, yet open science in these fields can advance only when similar data are available for benchmarking. In this work, we propose an approach to generating a doppelganger graph that resembles a given graph in many graph properties but can hardly be used to reverse engineer the original, in the sense of a near-zero edge overlap. The approach is an orchestration of graph representation learning, generative adversarial networks, and graph realization algorithms. Through comparison with several graph generative models (both those parameterized by neural networks and those that are not), we demonstrate that our result barely reproduces the given graph but closely matches its properties. We further show that downstream tasks, such as node classification, reach similar performance on the generated graphs as on the original ones.
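To make the notion of edge overlap concrete, the following is a minimal Python sketch. It assumes one common definition, namely the fraction of the original graph's undirected edges that also appear in the generated graph; the abstract does not spell out the exact formula, so this definition, the function name `edge_overlap`, and the toy edge lists are illustrative assumptions rather than the authors' implementation.

```python
def edge_overlap(original_edges, generated_edges):
    """Fraction of the original graph's undirected edges found in the generated graph.

    Assumed definition (not taken from the source): |E_orig & E_gen| / |E_orig|.
    Edges are (u, v) pairs; direction and self-loops are ignored.
    """
    def canon(edges):
        # Represent each undirected edge as a frozenset so (u, v) == (v, u).
        return {frozenset((u, v)) for u, v in edges if u != v}

    orig = canon(original_edges)
    gen = canon(generated_edges)
    if not orig:
        return 0.0
    return len(orig & gen) / len(orig)


# Hypothetical toy graphs: a 4-cycle and a generated graph sharing one edge.
original = [(0, 1), (1, 2), (2, 3), (3, 0)]
doppelganger = [(0, 2), (1, 3), (1, 0), (2, 4)]

overlap = edge_overlap(original, doppelganger)
print(overlap)
```

Under this definition, a value near 0 means the generated graph reuses almost none of the original edges, while a value of 1 means every original edge reappears.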
