Supervised Approaches for Function Prediction of Proteins Contact Networks from Topological Structure Information

The function a protein performs is directly connected to its physico-chemical structure, yet how structure shapes the behaviour of these molecules is still an open research topic. In this paper we consider a subset of the Escherichia coli proteome in which each protein is represented by the spectral characteristics of its residue contact network and its physiological function is encoded by a class label. By casting the problem as a machine learning task, we assess whether a relation exists between these spectral properties and the protein's function. To this end we adopt a set of supervised learning techniques, optionally optimised by means of genetic algorithms. Preliminary results are promising: they suggest that this high-level spectral representation carries enough information to discriminate among functional classes. Our experiments pave the way for further research and analysis.
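
To make the idea of a spectral representation concrete, the following is a minimal sketch of how a fixed-length feature vector could be derived from a residue contact network. It is not the pipeline used in the paper: the function names, the 4 to 8 Å contact window, the choice of the normalized Laplacian, and the histogram summary are all assumptions made for illustration.

```python
import numpy as np


def contact_network(coords, d_min=4.0, d_max=8.0):
    """Build an unweighted contact network from residue coordinates.

    Two residues are linked when their Euclidean distance lies in
    [d_min, d_max]. The default window (in angstroms) is an assumption,
    not a value taken from the abstract.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = ((dist >= d_min) & (dist <= d_max)).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-contacts
    return adj


def normalized_laplacian_spectrum(adj):
    """Return the eigenvalues of the normalized Laplacian, in ascending order."""
    deg = adj.sum(axis=1)
    connected = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    # L = I - D^{-1/2} A D^{-1/2}; isolated residues get a zero row and column.
    lap = np.diag(connected.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    return np.linalg.eigvalsh(lap)


def spectral_features(eigenvalues, n_bins=20):
    """Summarise a spectrum as a normalized histogram over [0, 2].

    Normalized-Laplacian eigenvalues always fall in [0, 2], so proteins of
    different lengths map to vectors of the same size (hypothetical choice).
    """
    clipped = np.clip(eigenvalues, 0.0, 2.0)  # guard against round-off
    counts, _ = np.histogram(clipped, bins=n_bins, range=(0.0, 2.0))
    return counts / counts.sum()


# Toy example: four residues on a line, 5 angstroms apart (a path graph).
toy_coords = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 0.0, 0.0], [15.0, 0.0, 0.0]]
toy_adj = contact_network(toy_coords)
toy_spectrum = normalized_laplacian_spectrum(toy_adj)
toy_features = spectral_features(toy_spectrum, n_bins=4)
```

Feature vectors of this kind, paired with the functional class labels, could then be passed to any standard supervised classifier.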
