Network analysis using entropy component analysis

Structural complexity measures have found widespread use in network analysis. For instance, entropy can be used to distinguish between different network structures. We recently reported an approximate network von Neumann entropy measure, which can be conveniently expressed in terms of the degrees of the vertices that define each edge, in both undirected and directed graphs. However, that analysis was posed at the global level and did not consider in detail how the entropy is distributed across edges. The aim of this paper is to use our previous analysis to define a new characterization of network structure that captures the distribution of entropy across the edges of a network. Since our entropy is defined in terms of the degrees of the two vertices defining an edge, we can histogram the edge entropy using a multi-dimensional array for both undirected and directed networks. Each edge in a network increments the contents of the appropriate histogram bin, indexed by the degree pair for an undirected graph or by the in/out-degree quadruple for a directed graph. We normalize the resulting histograms and vectorize them to give network feature vectors that reflect the distribution of entropy across the edges of the network. By performing principal component analysis (PCA) on the feature vectors of a sample of graphs, we embed populations of graphs into a low-dimensional space. We explore a number of variants of this method: fixed versus adaptive binning over edge vertex degree combinations, entropy-weighted versus raw bin contents, and multi-linear principal component analysis (MPCA), which aims to extract the tensorial structure of high-dimensional data, as an alternative to classical PCA. We apply the resulting methods to the problem of graph classification and compare the results with those of alternative state-of-the-art methods on real-world data.
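
The undirected, fixed-binning variant of this pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `degree_pair_histogram` and `pca_embed`, the `max_degree` clipping rule, and the per-edge weight 1/(d_u d_v) used for the entropy-weighted option are all assumptions made for illustration, and the graph is assumed to be simple (no self-loops or repeated edges).

```python
import numpy as np


def degree_pair_histogram(edges, max_degree, entropy_weighted=False):
    """Histogram the edges of a simple undirected graph by endpoint degree pair.

    Returns a normalized, vectorized (max_degree * max_degree) feature vector.
    """
    # Compute vertex degrees from the edge list.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    hist = np.zeros((max_degree, max_degree))
    for u, v in edges:
        # Clip large degrees into the last bin (fixed binning, an assumption),
        # and sort the pair so that (2, 3) and (3, 2) share one bin.
        a, b = sorted((min(degree[u], max_degree), min(degree[v], max_degree)))
        # Assumed entropy weighting: each edge contributes 1 / (d_u * d_v);
        # otherwise each edge simply increments its bin by one.
        weight = 1.0 / (degree[u] * degree[v]) if entropy_weighted else 1.0
        hist[a - 1, b - 1] += weight

    # Normalize so the bins sum to one, then flatten to a feature vector.
    total = hist.sum()
    if total > 0:
        hist /= total
    return hist.ravel()


def pca_embed(features, n_components=2):
    """Project feature vectors onto their leading principal components."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)  # center the data
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T


# Three toy graphs: a path, a star, and a triangle.
path = [(0, 1), (1, 2)]
star = [(0, 1), (0, 2), (0, 3)]
triangle = [(0, 1), (1, 2), (2, 0)]

features = [degree_pair_histogram(g, max_degree=3) for g in (path, star, triangle)]
embedding = pca_embed(features, n_components=2)
```

Here each toy graph places all of its edges in a single degree-pair bin, so the three feature vectors are distinct one-hot vectors and the embedding separates them. A directed version would index a four-dimensional array by the in/out-degree quadruple in the same way.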
