A comprehensive statistical study of metabolic and protein–protein interaction network properties

Abstract Understanding the mathematical properties of the graphs underlying biological systems could offer hints about the evolutionary mechanisms behind these structures. In this article we perform a complete statistical analysis of thousands of graphs representing metabolic and protein–protein interaction (PPI) networks. First, we investigate the quality of power-law fits to the node degree distributions. This analysis suggests that a power-law distribution describes the data poorly, except for the far right tail in the case of PPI networks. Next, we obtain descriptive statistics for the main graph parameters and try to identify the properties that deviate from the values expected had the networks been built by randomly linking nodes while keeping the same degree distribution. This survey identifies the properties of biological networks that are not solely the result of their degree distribution but emerge from as yet unidentified mechanisms other than those that drive these distributions. The findings suggest that, while the properties of PPI networks differ from the expected values in their randomized versions with great statistical significance, the differences for metabolic networks are less statistically significant, though some drift can still be identified.
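The comparison against randomized networks with the same degree distribution can be illustrated with a minimal sketch. The abstract does not say which randomization procedure or which graph parameters were used, so everything below is an assumption made for illustration: the null model is a double edge swap (one common way to preserve every node's degree), the measured property is transitivity, and the function names `degree_preserving_rewire`, `transitivity`, and `z_score` are hypothetical.

```python
import random
import statistics


def degree_preserving_rewire(edges, n_swaps, rng):
    """Randomize an undirected simple graph with double edge swaps.

    Assumed null model (not taken from the abstract): pick two edges
    (a, b) and (c, d) and replace them with (a, d) and (c, b), which
    leaves every node's degree unchanged.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # the swap would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # the swap would create a duplicate edge
        present.discard(frozenset((a, b)))
        present.discard(frozenset((c, d)))
        present.update((new1, new2))
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def transitivity(edges):
    """Fraction of connected triples that are closed into triangles."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    closed = triples = 0
    for nbrs in adj.values():
        nb = list(nbrs)
        k = len(nb)
        triples += k * (k - 1) // 2
        for x in range(k):
            for y in range(x + 1, k):
                if nb[y] in adj[nb[x]]:
                    closed += 1
    return closed / triples if triples else 0.0


def z_score(edges, n_null=200, seed=0):
    """Compare the observed transitivity with its degree-preserving null."""
    rng = random.Random(seed)
    observed = transitivity(edges)
    null = [
        transitivity(degree_preserving_rewire(edges, 10 * len(edges), rng))
        for _ in range(n_null)
    ]
    mu = statistics.mean(null)
    sigma = statistics.pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else float("nan")
    return observed, mu, z
```

A large absolute z-score for a property would indicate that the degree distribution alone does not account for it, which is the kind of deviation the abstract describes. The number of swaps per randomized graph (ten times the number of edges) and the number of null samples are arbitrary choices made for this sketch.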
