Identifying networks with common organizational principles

Many complex systems can be represented as networks, and the problem of network comparison is becoming increasingly relevant. Techniques for network comparison range from simple comparisons of network summary statistics to sophisticated but computationally costly alignment-based approaches. Yet it remains challenging to accurately cluster networks that differ in size and density but are hypothesized to be structurally similar. In this paper, we address this problem by introducing a new network comparison methodology aimed at identifying common organizational principles in networks. The methodology is simple, intuitive, and applicable in a wide variety of settings, ranging from the functional classification of proteins to tracking the evolution of a world trade network.
