Proteins QSAR with Markov average electrostatic potentials.

Classic physicochemical and topological indices have been widely used in small-molecule QSAR but much less in protein QSAR. In this study, a Markov model is used to calculate, for the first time, average electrostatic potentials ξ_k for indirect interactions between amino acids placed at topological distance k within a given protein backbone. The short-term average stochastic potential ξ_1 of 53 Arc repressor mutants was used to model the effect of alanine scanning on thermal stability. The Arc repressor is a model protein relevant to biochemical studies in bioorganic and medicinal chemistry. The linear discriminant analysis model developed here correctly classified 43 of 53 proteins (81.1%) according to their thermal stability. More specifically, the model correctly classified 20 of 28 proteins (71.4%) with near wild-type stability and 23 of 25 proteins (92.0%) with reduced stability. Moreover, predictability in cross-validation procedures was 81.0%. Expanding the electrostatic potential in the series ξ_0, ξ_1, ξ_2, and ξ_3 justified the abrupt truncation approach: the overall accuracy was >70.0% for ξ_0 and equal for ξ_1, ξ_2, and ξ_3. The ξ_1 model compared favorably with others based on the D-Fire potential, surface area, volume, partition coefficient, and molar refractivity, all of which had an accuracy below 77.0% [Ramos de Armas, R.; González-Díaz, H.; Molina, R.; Uriarte, E. Proteins: Struct. Funct. Bioinf. 2004, 56, 715]. The ξ_1 model also has a more tractable interpretation than models based on Markovian negentropies and stochastic moments. Finally, the model is notably simpler than the two models based on quadratic and linear indices reported by Marrero-Ponce et al., both of which use four to five times more descriptors. The introduction of average stochastic potentials may be useful for QSAR applications, since ξ_k is both amenable to physical interpretation and very effective.
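The abstract does not give the formula for the average potentials, so the following is only a minimal Python sketch of one plausible reading: a Markov chain over the backbone whose k-step residue probabilities are used to average a per-residue potential. The neighbour rule (each residue interacts with itself and its chain neighbours), the uniform starting distribution, the weights in `phi`, and all function names are assumptions made for illustration, not the authors' definitions.

```python
def transition_matrix(phi):
    """Row-stochastic matrix over a linear backbone (assumed model).

    Residue i may pass to itself or to its chain neighbours, with
    probability proportional to the neighbour's positive weight phi[j].
    """
    n = len(phi)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nbrs = [j for j in (i - 1, i, i + 1) if 0 <= j < n]
        total = sum(phi[j] for j in nbrs)
        for j in nbrs:
            P[i][j] = phi[j] / total
    return P


def step(p, P):
    """Advance a probability vector by one Markov step: p' = p P."""
    n = len(p)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]


def average_potentials(phi, k_max):
    """Return [xi_0, ..., xi_kmax], where xi_k = sum_j p_k(j) * phi[j]."""
    n = len(phi)
    p = [1.0 / n] * n          # assumed uniform initial distribution
    P = transition_matrix(phi)
    xi = []
    for _ in range(k_max + 1):
        xi.append(sum(pj * fj for pj, fj in zip(p, phi)))
        p = step(p, P)
    return xi


# Hypothetical positive per-residue weights for a five-residue chain.
phi = [0.5, 1.2, 0.8, 2.0, 0.3]
xi = average_potentials(phi, 3)
print(xi)
```

Under these assumptions, the first value is simply the mean of the weights, and each later value reweights the residues by how the chain redistributes probability after k steps. In a setting like the one described, one such value per mutant (for example the k = 1 term) would then serve as the single descriptor fed to a linear discriminant classifier.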
