Evaluating relevance and redundancy to quantify how binary node metadata interplay with the network structure

Networks are real systems modelled as mathematical objects made up of nodes and links arranged into peculiar and deliberate (or partially deliberate) topologies. Studying these real-world topologies reveals several properties of interest. In real networks, nodes are also characterized by a number of non-structural features, or metadata. Given that massive quantities of such metadata can now be collected, it becomes crucial to identify automatically which are the most relevant for the observed structure. We propose a new method that, independently of the network size, not only reports the relevance of binary node metadata but also ranks them. The method can be applied to networks from any domain, and we apply it to two heterogeneous cases: a temporal network of technology transfer and a protein-protein interaction network. Together with the relevance of node metadata, we investigate their redundancy, displaying the results on a Redundancy-Relevance diagram, which highlights the differences among vectors of metadata from both a structural and a non-structural point of view. The results provide practical insights into the importance of the observed node metadata for the actual network structure.
