A generative model for protein contact networks

In this paper, we present a generative model for protein contact networks (PCNs). We assess the soundness of the proposed model primarily through mesoscopic properties derived from the spectrum of the graph Laplacian. To complement the analysis, we also study classical topological descriptors, such as shortest-path statistics and modularity. Our experiments show that the proposed model considerably improves on two suitably chosen generative mechanisms, approximating real PCNs more closely in terms of diffusion properties derived from the normalized Laplacian spectrum. However, like the other network models, it does not reproduce the shortest-path structure with sufficient accuracy. To compensate for this drawback, we designed a second step involving a targeted edge reconfiguration process. The resulting ensemble of reconfigured networks shows further, statistically significant improvements. As an important byproduct of our study, we demonstrate that modularity, a well-known property of proteins, does not entirely explain the network architecture that characterizes PCNs. We therefore conclude that modularity, understood as a quantification of an underlying community structure, should be regarded as an emergent property of the structural organization of proteins. Interestingly, this property is optimized in PCNs together with path efficiency.
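To make the spectral quantities mentioned above concrete, the following minimal sketch computes the normalized Laplacian spectrum of an unweighted graph and the corresponding heat kernel trace, one common diffusion-related descriptor. It is an illustration only, not the procedure used in the paper: the function names, the toy ring graph, and the choice of the heat trace as the descriptor are assumptions made here, and every vertex is assumed to have at least one contact.

```python
import numpy as np


def normalized_laplacian_spectrum(adj):
    """Eigenvalues of L = I - D^{-1/2} A D^{-1/2}, sorted in ascending order.

    Assumes `adj` is a symmetric 0/1 adjacency matrix with no isolated
    vertices (illustrative assumption, not taken from the paper).
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    # eigvalsh is appropriate because the normalized Laplacian is symmetric.
    return np.linalg.eigvalsh(lap)


def heat_kernel_trace(eigenvalues, t):
    """Heat trace sum_i exp(-t * lambda_i) at diffusion time t."""
    return float(np.sum(np.exp(-t * np.asarray(eigenvalues))))


def ring_graph(n):
    """Adjacency matrix of an n-vertex ring, a hypothetical stand-in for a PCN."""
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, (i + 1) % n] = 1.0
        adj[(i + 1) % n, i] = 1.0
    return adj


if __name__ == "__main__":
    spectrum = normalized_laplacian_spectrum(ring_graph(6))
    print(spectrum)
    print(heat_kernel_trace(spectrum, t=1.0))
```

For an actual PCN, the adjacency matrix would be built from residue contacts in the resolved structure. Comparing spectra or heat traces between real and generated networks is one plausible way to quantify how well a generative mechanism reproduces diffusion properties.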
