Classification on Large Networks: A Quantitative Bound via Motifs and Graphons

When each data point is a large graph, graph statistics such as the densities of certain subgraphs (motifs) can be used as feature vectors for machine learning. While intuitive, motif counts are expensive to compute and difficult to analyze theoretically. Using graphon theory, we give an explicit quantitative bound on the ability of motif homomorphism densities to distinguish large networks under both generative and sampling noise. Furthermore, we give similar bounds for the graph spectrum and connect the spectrum to the homomorphism densities of cycles. This results in an easily computable classifier on graph data with a theoretical performance guarantee. Our method yields competitive results on classification tasks for the autoimmune disease lupus erythematosus.
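
The link between the spectrum and cycle densities mentioned above rests on a standard identity: for a simple graph G on n vertices with adjacency matrix A and eigenvalues λ_1, ..., λ_n, the number of homomorphisms from the k-cycle C_k into G equals tr(A^k), so t(C_k, G) = tr(A^k) / n^k = Σ_i (λ_i / n)^k. The sketch below checks this numerically. It is only an illustration under stated assumptions, not the paper's classifier: the function names, the choice of cycle lengths, and the random test graph are all hypothetical.

```python
import numpy as np


def cycle_hom_density_trace(adj, k):
    """t(C_k, G) computed as tr(A^k) / n^k (closed walks of length k)."""
    n = adj.shape[0]
    return float(np.trace(np.linalg.matrix_power(adj, k)) / n**k)


def cycle_hom_density_spectrum(adj, k):
    """Same quantity from the eigenvalues: sum_i (lambda_i / n)^k."""
    n = adj.shape[0]
    eigvals = np.linalg.eigvalsh(adj)  # adj is symmetric, so eigvalsh applies
    return float(np.sum((eigvals / n) ** k))


def cycle_features(adj, ks=(3, 4, 5, 6)):
    """Hypothetical feature vector: cycle homomorphism densities for k in ks."""
    return np.array([cycle_hom_density_spectrum(adj, k) for k in ks])


# Illustrative input (an assumption, not data from the paper):
# a random simple graph on 50 vertices with edge probability 0.3.
rng = np.random.default_rng(0)
n = 50
upper = np.triu(rng.random((n, n)) < 0.3, k=1)
adj = (upper | upper.T).astype(float)

features = cycle_features(adj)

# Small exact check: the triangle K_3 has hom(C_3, K_3) = 6, so t = 6 / 27.
triangle = np.ones((3, 3)) - np.eye(3)
triangle_density = cycle_hom_density_spectrum(triangle, 3)
```

Computing the features from the eigenvalues needs one eigendecomposition per graph, after which every cycle length is a cheap power sum. A feature vector like `features` could then be passed to any off-the-shelf classifier.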
