Inferring monopartite projections of bipartite networks: an entropy-based approach

Bipartite networks provide major insight into the organization of many real-world systems, unveiling the mechanisms that drive the interactions between distinct groups of nodes. One of the most important issues in modeling bipartite networks is devising a way to obtain a (monopartite) projection on the layer of interest that preserves as much as possible of the information encoded in the original bipartite structure. In the present paper we propose an algorithm to obtain statistically validated projections of bipartite networks, in which any two nodes sharing a statistically significant number of neighbors are linked. Since assessing the statistical significance of node similarity requires a proper statistical benchmark, we consider a set of four null models, defined within the exponential random graph framework. Our algorithm outputs a matrix of link-specific p-values, from which a validated projection is straightforwardly obtained by running a multiple hypothesis testing procedure. Finally, we test our method on an economic network (the countries-products representation of the World Trade Web) and a social network (MovieLens, which collects users' ratings of a list of movies). In both cases non-trivial communities are detected. Projecting the World Trade Web on the countries layer reveals modules of similarly industrialized nations, while projecting it on the products layer reveals communities characterized by an increasing level of complexity. Projecting MovieLens on the films layer reveals clusters of movies whose affinity cannot be fully accounted for by genre similarity.
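The pipeline described above (count shared neighbors, compute a p-value for each node pair under a benchmark, then run a multiple hypothesis test) can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the benchmark is a uniform-probability random graph in which every bipartite link exists independently with probability equal to the observed link density, so the number of shared neighbors is binomially distributed; and the multiple-testing step is the Benjamini-Hochberg false discovery rate procedure. The function names `binomial_sf` and `validated_projection` and the toy matrix are hypothetical, and this is not necessarily one of the four null models considered in the paper.

```python
from itertools import combinations
from math import comb


def binomial_sf(k, n, q):
    """Return P(X >= k) for X ~ Binomial(n, q)."""
    return sum(comb(n, x) * q**x * (1 - q) ** (n - x) for x in range(k, n + 1))


def validated_projection(biadjacency, alpha=0.05):
    """Return (validated links on the row layer, p-value for every row pair)."""
    n_rows = len(biadjacency)
    n_cols = len(biadjacency[0])
    # Assumed benchmark: each bipartite link exists independently with the
    # same probability p, estimated as the observed link density.
    p = sum(map(sum, biadjacency)) / (n_rows * n_cols)

    pvalues = {}
    for i, j in combinations(range(n_rows), 2):
        # Observed number of neighbors shared by rows i and j.
        shared = sum(a * b for a, b in zip(biadjacency[i], biadjacency[j]))
        # Under the benchmark each column is shared with probability p**2,
        # so the shared-neighbor count is Binomial(n_cols, p**2).
        pvalues[(i, j)] = binomial_sf(shared, n_cols, p * p)

    # Benjamini-Hochberg step-up (assumed choice of multiple-testing
    # procedure): keep the r smallest p-values, where r is the largest
    # rank satisfying p_(r) <= r * alpha / m.
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, pval) in enumerate(ranked, start=1):
        if pval <= rank * alpha / m:
            cutoff = rank
    return {pair for pair, _ in ranked[:cutoff]}, pvalues


# Toy biadjacency matrix (4 row nodes, 20 column nodes), made up for
# illustration: rows 0 and 1 share six neighbors, the other pairs share none.
toy = [
    [1] * 6 + [0] * 14,
    [1] * 6 + [0] * 14,
    [0] * 10 + [1] * 3 + [0] * 7,
    [0] * 15 + [1] * 3 + [0] * 2,
]
links, pvalues = validated_projection(toy)
print(sorted(links))
```

In this toy case only the pair of rows with six shared neighbors survives validation. A degree-preserving benchmark would replace the single probability `p` with pair-specific link probabilities, in which case the shared-neighbor count is no longer binomial.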
