Generalized Molecular Descriptors Derived From Event-Based Discrete Derivative.

In the present study, a generalized approach for molecular structure characterization is introduced, based on the relation frequency matrix (F) representation of the molecular graph and the subsequent calculation of the corresponding discrete derivative (finite difference) over a pair of elements (atoms). In earlier publications (22-24), a single event, named connected subgraphs (based on the Kier-Hall subgraphs), was systematically employed for the computation of the matrix F. The present report generalizes this notion by introducing eleven additional events for F generation, classified into three categories: topological (terminal paths, vertex path incidence, quantum subgraphs, walks of length k, Sachs subgraphs), fingerprints (MACCS, E-state and substructure fingerprints) and atomic contributions (Ghose and Crippen atom types for hydrophobicity and refractivity). The events are intended to capture diverse information by generating or searching for different kinds of substructures in the graph representation of a molecule. The discrete derivatives over duplex atom relations are calculated for each event, and from them the local vertex invariants (LOVIs) are obtained. These LOVIs are subsequently employed as the basis for calculating global indices and local indices over groups of atoms (heteroatoms, halogens, methyl carbons, etc.), using norms, means, statistics and classical algorithms as aggregation (fusion) operators. These indices were implemented in our in-house software DIVATI (Derivative Type Indices, a new module of the TOMOCOMD-CARDD system). DIVATI provides a user-friendly, cross-platform graphical user interface developed in the Java programming language and is freely available at http://www.tomocomd.com. Factor analysis shows that the presented events are largely orthogonal and capture diverse information about the chemical structure.
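The aggregation step described above can be illustrated with a minimal Python sketch. It assumes the LOVIs are already available as one number per atom; the function name `aggregate_lovis`, the chosen operators (Manhattan norm, Euclidean norm, mean, standard deviation) and the numeric values are illustrative assumptions, not the DIVATI implementation or its full operator set.

```python
import math
import statistics


def aggregate_lovis(lovis, atoms=None):
    """Fuse per-atom LOVI values into a few illustrative indices.

    lovis: list of floats, one (hypothetical) LOVI per atom.
    atoms: optional list of atom positions defining a local group
           (e.g. heteroatoms); None means all atoms (global index).
    """
    values = list(lovis) if atoms is None else [lovis[i] for i in atoms]
    if not values:
        raise ValueError("empty atom group")
    return {
        "n1": sum(abs(v) for v in values),            # Manhattan norm
        "n2": math.sqrt(sum(v * v for v in values)),  # Euclidean norm
        "mean": statistics.fmean(values),             # arithmetic mean
        "sd": statistics.pstdev(values),              # population std. deviation
    }


# Made-up LOVIs for a five-atom molecule; positions 1 and 4 stand in
# for heteroatoms purely for illustration.
example_lovis = [0.5, 1.5, 2.0, 1.0, 3.0]
global_indices = aggregate_lovis(example_lovis)
hetero_indices = aggregate_lovis(example_lovis, atoms=[1, 4])
```

Restricting the same operators to a subset of atoms is what turns a global index into a local (group-based) one in this sketch.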
Finally, QSPR models were built to describe the logP and logK of 34 furylethylene derivatives using the eleven events. In general, the equations obtained with these events showed high correlations, with the Sachs subgraphs and Multiplicity events performing best in the description of logK (Q² LOO value of 99.06%) and logP (Q² LOO value of 98.1%), respectively. These results show that the new event-based indices constitute a powerful approach for chemoinformatics studies.
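For readers unfamiliar with the leave-one-out statistic quoted above, the following Python sketch shows one common way to compute it: each compound is left out in turn, the model is refitted on the rest, and Q² = 1 - PRESS/SS_total, with SS_total taken around the mean of all observations. A one-descriptor least-squares line is assumed for brevity, and the data are made up; the models in the study may use several descriptors and a different fitting procedure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(xs, ys):
    """Leave-one-out cross-validated Q2 for a one-descriptor linear model."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and property values (not data from the study).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [2.9, 5.2, 6.8, 9.1, 11.0]
q2 = q2_loo(descriptor, observed)
```

Multiplying the returned value by 100 gives the percentage form used in the text.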

[1] R. García-Domenech, et al. Novel 2D TOMOCOMD-CARDD molecular descriptors: atom-based stochastic and non-stochastic bilinear indices and their QSPR applications, 2008.

[2] Y. Marrero-Ponce, et al. Discrete Derivatives for Atom‐Pairs as a Novel Graph‐Theoretical Invariant for Generating New Molecular Descriptors: Orthogonality, Interpretation and QSARs/QSPRs on Benchmark Databases, 2014, Molecular informatics.

[3] Enrique Molina, et al. 3D connectivity indices in QSPR/QSAR studies, 2001.

[4] Lourdes Santana, et al. Proteomics, networks and connectivity indices, 2008, Proteomics.

[5] Reisel Millán Cabrera, et al. Extending Graph (Discrete) Derivative Descriptors to N-Tuple Atom-Relations, 2015.

[6] Lemont B. Kier, et al. An Electrotopological-State Index for Atoms in Molecules, 1990, Pharmaceutical Research.

[7] V. A. Gorbatov. Fundamentos de la matemática discreta, 1988.

[8] Alexander Basilevsky, et al. Statistical Factor Analysis and Related Methods, 1994.

[9] E Uriarte, et al. Recent advances on the role of topological indices in drug discovery research, 2001, Current medicinal chemistry.

[10] James G. Nourse, et al. Reoptimization of MDL Keys for Use in Drug Discovery, 2002, J. Chem. Inf. Comput. Sci.

[11] Yovani Marrero-Ponce, et al. Examining the predictive accuracy of the novel 3D N-linear algebraic molecular codifications on benchmark datasets, 2016, Journal of Cheminformatics.

[12] Lorentz Jäntschi. Graph Theory. 1. Fragmentation of Structural Graphs, 2002.

[13] Francisco Torrens, et al. Bond-based linear indices of the non-stochastic and stochastic edge-adjacency matrix. 1. Theory and modeling of ChemPhys properties of organic molecules, 2010, Molecular Diversity.

[14] A. Balaban, et al. Topological Indices and Related Descriptors in QSAR and QSPR, 2003.

[15] Yovani Marrero-Ponce, et al. Derivatives in discrete mathematics: a novel graph-theoretical invariant for generating new 2/3D molecular descriptors. I. Theory and QSPR application, 2012, Journal of Computer-Aided Molecular Design.

[16] Francisco Torrens, et al. 3D-chiral quadratic indices of the 'molecular pseudograph's atom adjacency matrix' and their application to central chirality codification: classification of ACE inhibitors and prediction of sigma-receptor antagonist activities, 2004, Bioorganic & medicinal chemistry.

[17] Lourdes Santana, et al. Medicinal chemistry and bioinformatics--current trends in drugs discovery with networks topological indices, 2007, Current topics in medicinal chemistry.

[18] Francisco Torrens, et al. Atom, atom-type, and total linear indices of the "molecular pseudograph's atom adjacency matrix": application to QSPR/QSAR studies of organic compounds, 2004, Molecules.

[19] Lemont B. Kier, et al. Electrotopological State Indices for Atom Types: A Novel Combination of Electronic, Topological, and Valence State Information, 1995, J. Chem. Inf. Comput. Sci.

[20] Gordon M. Crippen, et al. Atomic physicochemical parameters for three-dimensional-structure-directed quantitative structure-activity relationships. 2. Modeling dispersive and hydrophobic interactions, 1987, J. Chem. Inf. Comput. Sci.

[21] Francisco Torrens, et al. A new topological descriptors based model for predicting intestinal epithelial transport of drugs in Caco-2 cell culture, 2004, Journal of pharmacy & pharmaceutical sciences : a publication of the Canadian Society for Pharmaceutical Sciences, Societe canadienne des sciences pharmaceutiques.

[22] J. Aihara, et al. General rules for constructing Hueckel molecular orbital characteristic polynomials, 1977.

[23] Roberto Todeschini, et al. Molecular descriptors for chemoinformatics, 2009.

[24] A. Crum Brown. 1. On an Application of Mathematics to Chemistry, 1869.

[25] E Estrada, et al. Novel local (fragment-based) topological molecular descriptors for QSPR/QSAR and molecular design, 2001, Journal of molecular graphics & modelling.

[26] L. Hall, et al. Molecular Structure Description: The Electrotopological State, 1999.

[27] Francisco Torrens, et al. Bond-based 2D TOMOCOMD-CARDD approach for drug discovery: aiding decision-making in ‘in silico’ selection of new lead tyrosinase inhibitors, 2007, J. Comput. Aided Mol. Des.

[28] M. M. Martins Alho, et al. A new type of quinoxalinone derivatives affects viability, invasion, and intracellular growth of Toxoplasma gondii tachyzoites in vitro, 2016, Parasitology Research.