Protein quadratic indices of the "macromolecular pseudograph's alpha-carbon atom adjacency matrix". 1. Prediction of Arc repressor alanine-mutant's stability.

This report describes a new set of macromolecular descriptors relevant to protein QSAR/QSPR studies: the protein quadratic indices. These descriptors are calculated from the macromolecular pseudograph's alpha-carbon atom adjacency matrix. A study of the protein stability effects of a complete set of alanine substitutions in Arc repressor illustrates the approach. Quantitative Structure-Stability Relationship (QSSR) models discriminate between near wild-type stability and reduced-stability A-mutants. A linear discriminant function correctly classifies 85.4% (35/41) and 91.67% (11/12) of the near wild-type stability/reduced-stability mutants in the training and test series, respectively. The model's overall predictability ranges from 80.49% to 82.93% as n varies from 2 to 10 in leave-n-out cross-validation procedures, and stabilizes around 80.49% for n > 6. Additionally, canonical regression analysis corroborates the statistical quality of the classification model (Rcanc = 0.72, p-level < 0.0001). This analysis was also used to compute biological stability canonical scores for each Arc A-mutant. For predicting the melting temperature (tm) of the Arc A-mutants, a nonlinear piecewise regression model compares favorably with the linear regression model. The linear model explains almost 72% of the variance of the experimental tm (R = 0.85 and s = 5.64), and leave-one-out (LOO) press statistics evidence its predictive ability (q2 = 0.55 and scv = 6.24). However, this linear regression model fails to resolve the tm predictions of Arc A-mutants in the external prediction series, so nonlinear piecewise models were required. The piecewise model calculates the tm values of A-mutants in the training (R = 0.94) and test (R = 0.91) sets with a high degree of precision. A break-point value of 51.32 °C characterizes two clusters of mutants and coincides perfectly with the experimental scale.
For this reason, linear discriminant analysis and piecewise models can be used in combination to classify and predict the stability of the mutant Arc homodimers. These models also permit interpretation of the driving forces of the folding process. The models include protein quadratic indices accounting for hydrophobic (z1), bulk-steric (z2), and electronic (z3) features of the studied molecules. The preponderance of z1 and z3 over z2 indicates the greater importance of the hydrophobic and electronic side-chain terms in the folding of the Arc dimer. The developed equations involve short-reaching (k ≤ 3), middle-reaching (3 < k ≤ 7), and far-reaching (k ≥ 8) z1, z2, and z3 protein quadratic indices. This points to topologic/topographic protein backbone interactions controlling the stability profile of wild-type Arc and its A-mutants. Consequently, the present approach represents a novel and very promising avenue for mathematical research in the biological sciences.
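
To make the role of the order k more concrete, the following is a minimal sketch of how a total quadratic index of order k could be computed. It assumes the usual quadratic-form definition q_k(x) = Σ_i Σ_j (M^k)_ij x_i x_j, where M is the alpha-carbon adjacency matrix and x holds one side-chain property value per residue. The function names, the plain chain adjacency without loops or extra contacts, and the numeric weights are illustrative assumptions, not the authors' implementation or real z-scale values.

```python
def chain_adjacency(n):
    """Adjacency matrix of a linear backbone: residue i is bonded to residue i+1.

    Assumption: a plain chain. The pseudograph in the paper may also carry
    loops or non-covalent contacts, which are not modeled here.
    """
    m = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        m[i][i + 1] = 1.0
        m[i + 1][i] = 1.0
    return m


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def quadratic_indices(weights, adjacency, k_max):
    """Return [q_0, q_1, ..., q_k_max] with q_k = x^T M^k x."""
    n = len(weights)
    # M^0 is the identity matrix.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = []
    for _ in range(k_max + 1):
        q = sum(weights[i] * power[i][j] * weights[j]
                for i in range(n) for j in range(n))
        result.append(q)
        power = mat_mul(power, adjacency)  # advance to the next power of M
    return result


# Hypothetical three-residue fragment with made-up property values.
example_weights = [1.0, 2.0, 3.0]
print(quadratic_indices(example_weights, chain_adjacency(3), 3))
```

Under this definition, low orders of k only couple residues that are close along the chain, while higher orders mix in contributions from more distant residues, which matches the short-, middle-, and far-reaching reading of the indices.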
