Proteins Markovian 3D-QSAR with spherically-truncated average electrostatic potentials.

Protein 3D-QSAR is an emerging field of bioorganic chemistry. However, the large size of the structures involved may become a bottleneck when scaling classic QSAR problems up to proteins. A truncation approach, as used in molecular dynamics, could keep such calculations tractable. Spherical truncation of the electrostatic field with different functions cuts off long-range interactions at a given cutoff distance (r(off)), leaving only short-range ones. Consequently, a Markov chain (MC) model can approximate the average electrostatic potentials of the spatial distribution of charges within the protein backbone. These average electrostatic potentials can be used to predict protein properties. Herein, we explore the effect of abrupt, shifting, force-shifting, and switching truncation functions on 3D-QSAR models classifying 26 proteins with different functions: lysozymes, dihydrofolate reductases, and alcohol dehydrogenases. Almost all methods showed overall accuracies higher than 73%. This result points to an acceptable robustness of the MC model across different truncation schemes and r(off) values. The best accuracy, 92%, obtained with abrupt truncation, coincides with our recent communication. We also developed models with the same accuracy for other truncation functions; however, those functions are more complex. Principal component analysis (PCA) of 152 non-homologous proteins showed that five main eigenvalues explain more than 87% of the variance of the studied properties. The present molecular descriptors may encode structural information not fully accounted for by previous ones, so they could be expected to succeed where classic descriptors fail. The present results confirm the utility of our Markov models, combined with the truncation approach, for generating molecular descriptors of protein structure for bioorganic QSAR.
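To make the four truncation schemes concrete, the following minimal Python sketch writes them as scaling factors applied to a pairwise electrostatic term. The functional forms are the ones commonly given for CHARMM-style truncation and are assumed here, since the abstract does not state them. The `r_on` parameter, the `transition_matrix` helper, the `|q_j| / d_ij` weights, and the self term are likewise illustrative assumptions and should not be read as the authors' published formulation.

```python
import math

def abrupt(r, r_off):
    """Abrupt truncation: keep the interaction unchanged up to r_off, then drop it."""
    return 1.0 if r <= r_off else 0.0

def shift(r, r_off):
    """Shifting function (assumed CHARMM-style form): the potential reaches zero at r_off."""
    return (1.0 - (r / r_off) ** 2) ** 2 if r <= r_off else 0.0

def force_shift(r, r_off):
    """Force-shifting function (assumed form): the force reaches zero at r_off."""
    return (1.0 - r / r_off) ** 2 if r <= r_off else 0.0

def switch(r, r_on, r_off):
    """Switching function (assumed form): 1 below r_on, smooth decay to 0 at r_off."""
    if r <= r_on:
        return 1.0
    if r > r_off:
        return 0.0
    num = (r_off**2 - r**2) ** 2 * (r_off**2 + 2 * r**2 - 3 * r_on**2)
    return num / (r_off**2 - r_on**2) ** 3

def transition_matrix(charges, coords, r_off, trunc=abrupt):
    """Hypothetical row-stochastic matrix built from truncated pairwise weights.

    Assumption: the weight of site j seen from site i is trunc(d_ij) * |q_j| / d_ij,
    and the self term is |q_i|. Each row is normalised to sum to 1.
    """
    n = len(charges)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(abs(charges[i]))
            else:
                d = math.dist(coords[i], coords[j])
                row.append(trunc(d, r_off) * abs(charges[j]) / d)
        total = sum(row)
        if total > 0:
            matrix.append([w / total for w in row])
        else:
            # Isolated, uncharged site: stay in place (arbitrary fallback).
            matrix.append([1.0 if j == i else 0.0 for j in range(n)])
    return matrix

# Toy data (made up for illustration): three charged sites, the third beyond the cutoff.
charges = [0.5, -0.5, 0.3]
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
P = transition_matrix(charges, coords, r_off=10.0, trunc=force_shift)
```

With a 10 Å cutoff, the third site lies beyond r(off) from the other two, so its row of `P` reduces to a self-transition. A switching variant can be passed as `trunc=lambda r, r_off: switch(r, 8.0, r_off)`.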
