Discriminative topological features reveal biological network mechanisms

Background: Recent genomic and bioinformatic advances have motivated the development of numerous network models intending to describe graphs of biological, technological, and sociological origin. In most cases the success of a model has been evaluated by how well it reproduces a few key features of the real-world data, such as degree distributions, mean geodesic lengths, and clustering coefficients. Often pairs of models can reproduce these features with indistinguishable fidelity despite being generated by vastly different mechanisms. In such cases, these few target features are insufficient to distinguish which of the different models best describes real world networks of interest; moreover, it is not clear a priori that any of the presently-existing algorithms for network generation offers a predictive description of the networks inspiring them.

Results: We present a method to assess systematically which of a set of proposed network generation algorithms gives the most accurate description of a given biological network. To derive discriminative classifiers, we construct a mapping from the set of all graphs to a high-dimensional (in principle infinite-dimensional) "word space". This map defines an input space for classification schemes which allow us to state unambiguously which models are most descriptive of a given network of interest. Our training sets include networks generated from 17 models either drawn from the literature or introduced in this work. We show that different duplication-mutation schemes best describe the E. coli genetic network, the S. cerevisiae protein interaction network, and the C. elegans neuronal network, out of a set of network models including a linear preferential attachment model and a small-world model.

Conclusions: Our method is a first step towards systematizing network models and assessing their predictability, and we anticipate its usefulness for a number of communities.
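The pipeline described in the Results (generate training networks from candidate models, map each graph to a feature vector, then classify) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the three features (walks of length two per node, closed walks of length three per node, maximum degree) merely stand in for the "word space", a nearest-centroid rule stands in for the classification scheme, and only two models are used (a linear preferential attachment model and a uniform random graph with the same number of edges). All function and constant names are hypothetical.

```python
import random


def preferential_attachment_graph(n, m, rng):
    """Grow a graph in which each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    adj = {i: set() for i in range(n)}
    stubs = []  # node u appears here once per incident edge, i.e. degree(u) times
    for u in range(m + 1):  # seed: a clique on m + 1 nodes
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            stubs += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj


def uniform_random_graph(n, num_edges, rng):
    """Place num_edges distinct edges uniformly at random (contrast model)."""
    adj = {i: set() for i in range(n)}
    placed = 0
    while placed < num_edges:
        u, v = rng.sample(range(n), 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            placed += 1
    return adj


def edge_count(adj):
    return sum(len(nb) for nb in adj.values()) // 2


def features(adj):
    """Illustrative stand-in for the word-space map (assumed, not from the paper)."""
    n = len(adj)
    degrees = [len(nb) for nb in adj.values()]
    walks2 = sum(d * d for d in degrees) / n  # sum of entries of A^2, per node
    # trace(A^3) per node: common neighbours summed over ordered adjacent pairs
    closed3 = sum(len(adj[u] & adj[v]) for u in adj for v in adj[u]) / n
    return (walks2, closed3, float(max(degrees)))


def centroid(vectors):
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def train(models, n_graphs, rng):
    """One feature centroid per generative model."""
    return {name: centroid([features(gen(rng)) for _ in range(n_graphs)])
            for name, gen in models.items()}


def classify(adj, centroids):
    """Return the name of the model whose centroid is nearest in feature space."""
    x = features(adj)
    return min(centroids,
               key=lambda name: sum((a - b) ** 2 for a, b in zip(x, centroids[name])))


N, M = 300, 2  # hypothetical sizes chosen only to keep the sketch fast
NUM_EDGES = M * (M + 1) // 2 + (N - M - 1) * M  # both models get the same edge count
MODELS = {
    "preferential_attachment": lambda rng: preferential_attachment_graph(N, M, rng),
    "uniform_random": lambda rng: uniform_random_graph(N, NUM_EDGES, rng),
}


def held_out_accuracy(seed=0, n_train=10, n_test=10):
    """Fraction of freshly generated graphs assigned to their true model."""
    rng = random.Random(seed)
    centroids = train(MODELS, n_train, rng)
    trials = [(name, gen(rng)) for name, gen in MODELS.items() for _ in range(n_test)]
    hits = sum(classify(g, centroids) == name for name, g in trials)
    return hits / len(trials)
```

Calling `held_out_accuracy()` reports how often fresh graphs are attributed to the model that generated them. Under the same assumptions, a real network stored as an adjacency dictionary could be passed to `classify` to ask which candidate model it most resembles in this small feature space.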
