A mathematical model for generating bipartite graphs and its application to protein networks

Complex systems arise in many different contexts, from large communication systems and transportation infrastructures to molecular biology. Most of these systems can be represented as networks composed of nodes and the edges that connect them. Here, we present a theoretical model that constructs bipartite networks whose degree distribution can be tuned by adjusting the probability rates of the fundamental processes that build the network. We then use this model to investigate protein-domain networks. A protein can be composed of up to hundreds of domains, each of which is a conserved sequence segment with a specific functional task. We analyze the distribution of domains in Homo sapiens and Arabidopsis thaliana, and the statistical analysis shows that (a) the number of domain types shared by k proteins follows a power-law distribution, whereas (b) the number of proteins composed of k types of domains decays exponentially. The proposed mathematical model generates bipartite graphs and predicts the emergence of this mixture of (a) power-law and (b) exponential distributions. Our theoretical and computational results show that the model requires (1) a growth process and (2) a copy mechanism.
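To make the two ingredients concrete, the following is a minimal Python sketch of a growing bipartite graph with a copy step. It is an illustration under stated assumptions, not the model defined in the paper: the function names `grow_bipartite` and `degree_histograms`, the parameters `p_copy` and `p_extend`, and the specific update rules are all hypothetical. In this toy version, each new protein (growth) receives a geometrically distributed number of domain draws, and each draw either copies a domain from a randomly chosen existing protein or creates a new domain type.

```python
import random
from collections import Counter


def grow_bipartite(n_proteins, p_copy=0.8, p_extend=0.5, seed=0):
    """Toy growth-plus-copy process (assumed rules, not the paper's model).

    Each protein is stored as a set of integer domain-type ids.
    """
    rng = random.Random(seed)
    proteins = [{0}]        # seed graph: one protein containing domain type 0
    n_domain_types = 1
    while len(proteins) < n_proteins:
        new_protein = set()
        while True:
            if rng.random() < p_copy:
                # Copy step: take one domain from a uniformly chosen protein.
                # Widespread domains are more likely to be picked this way.
                template = rng.choice(proteins)
                new_protein.add(rng.choice(sorted(template)))
            else:
                # Innovation step: introduce a brand-new domain type.
                new_protein.add(n_domain_types)
                n_domain_types += 1
            # Stop adding domains with probability 1 - p_extend.
            if rng.random() >= p_extend:
                break
        proteins.append(new_protein)   # growth: one new protein per step
    return proteins, n_domain_types


def degree_histograms(proteins):
    """Return (protein-degree histogram, domain-degree histogram)."""
    # Protein degree: number of distinct domain types in the protein.
    protein_deg = Counter(len(p) for p in proteins)
    # Domain degree: number of proteins that contain the domain type.
    proteins_per_domain = Counter(d for p in proteins for d in p)
    domain_deg = Counter(proteins_per_domain.values())
    return protein_deg, domain_deg


proteins, n_domain_types = grow_bipartite(2000)
protein_deg, domain_deg = degree_histograms(proteins)
```

Plotting `domain_deg` on log-log axes and `protein_deg` on semi-log axes is one way to compare the two tails qualitatively. Whether this toy process reproduces the distributions reported for the real proteomes depends on the assumed rules and parameter values.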
