Identification of structurally important amino acids in proteins by graph-theoretic measures

Identifying the residues that are important for maintaining a protein structure is a non-trivial problem in computational biology. In this paper, we present results based on a graph model of protein structures. The model represents a structure as a set of residue-residue interactions in order to capture protein stability. We propose the application of approximate minimum vertex cover (MVC) algorithms as a novel approach for identifying the structurally important residues, which we refer to as key residues. We establish that MVC-based algorithms capture the essence of protein structural stability through correlation analysis with ΔΔG, the change in protein free energy due to amino acid variations. We also benchmark our approach against two popular measures for analyzing large complex networks: betweenness centrality and eigenvector centrality. We find that these centrality measures do not correlate well with ΔΔG. We give explanations from the free-energy point of view, which should benefit the future development of measures of protein structure stability.
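As a rough illustration of the idea, the sketch below builds a residue interaction graph and extracts an approximate vertex cover from it. The abstract does not specify how the graph is constructed or which approximation algorithm is used, so everything here is an assumption: the function names `contact_graph` and `approx_vertex_cover`, the 7.0 Å distance cutoff, the toy coordinates, and the choice of the textbook maximal-matching 2-approximation.

```python
from itertools import combinations
from math import dist


def contact_graph(coords, cutoff=7.0):
    """Return edges between residues whose representative atoms lie
    within `cutoff` of each other (cutoff value is an assumption)."""
    edges = set()
    for (i, a), (j, b) in combinations(sorted(coords.items()), 2):
        if dist(a, b) <= cutoff:
            edges.add((i, j))
    return edges


def approx_vertex_cover(edges):
    """Textbook 2-approximation: greedily build a maximal matching and
    take both endpoints of every matched edge."""
    cover = set()
    for u, v in sorted(edges):
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Hypothetical toy input: residue number -> representative atom position.
toy_coords = {
    1: (0.0, 0.0, 0.0),
    2: (3.8, 0.0, 0.0),
    3: (7.6, 0.0, 0.0),
    4: (11.4, 0.0, 0.0),
    5: (30.0, 0.0, 0.0),  # isolated residue, no contacts
}
toy_edges = contact_graph(toy_coords)
key_residues = approx_vertex_cover(toy_edges)
print(sorted(key_residues))
```

The returned set touches every interaction in the graph and is at most twice the size of a true minimum cover. A residue without contacts, such as residue 5 in the toy input, is never selected.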
