Distributed and multicore QuBiLS‐MIDAS software v2.0: Computing chiral, fuzzy, weighted and truncated geometrical molecular descriptors based on tensor algebra

Advances in the distributed, multi‐core and fully cross‐platform QuBiLS‐MIDAS software v2.0 (http://tomocomd.com/qubils-midas) since the v1.0 release are reported in this article. QuBiLS‐MIDAS is the only software that computes atom‐pair and alignment‐free geometrical MDs (3D‐MDs) from several distance metrics other than the Euclidean distance, as well as alignment‐free 3D‐MDs that codify structural information on the relations among three and four atoms of a molecule. The most recent features added in v2.0 concern (a) the calculation of atomic weightings from indices based on the vertex‐degree invariant (e.g., the Alikhanidi index); (b) the consideration of central chirality during the molecular encoding; (c) the use of measures based on clustering methods and statistical functions to codify structural information among more than two atoms; (d) the use of a novel method based on fuzzy membership functions to spherically truncate inter‐atomic relations; and (e) the use of weighted and fuzzy aggregation operators to compute global 3D‐MDs according to the importance and/or interrelation of the atoms of a molecule during the molecular encoding. A novel module that computes QuBiLS‐MIDAS 3D‐MDs from their headings was also developed. This module can be used either through the graphical user interface or through the software library. By using the library, both the calculation of the QuBiLS‐MIDAS 3D‐MDs and the predictive models built with them can be embedded in other tools. A set of predefined QuBiLS‐MIDAS 3D‐MDs with high information content and low redundancy on a set of 20,469 compounds is also provided for use in further cheminformatics tasks.
This set of predefined 3D‐MDs showed better performance than the entire universe of Dragon (v5.5) and PaDEL 0D‐to‐3D MDs in variability studies, whereas a linear independence study showed that these QuBiLS‐MIDAS 3D‐MDs codify chemical information orthogonal to the Dragon 0D‐to‐3D MDs. The set will be updated periodically as new results are obtained. Overall, this report highlights our continued efforts to provide a better tool for a more suitable characterization of compounds and, in this way, to contribute to better outcomes in future applications.
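To make two of the ideas named above more concrete, the following is a minimal Python sketch of (i) a fuzzy membership function used as a soft spherical cutoff on inter‐atomic distances and (ii) an ordered weighted averaging (OWA) operator used to aggregate atom‐level contributions into a single global value. It is an illustration only and not the QuBiLS‐MIDAS implementation: the function names, the sigmoid form of the membership function, and the parameters `center` and `steepness` are all assumptions chosen for this example.

```python
import math


def fuzzy_cutoff_weight(distance, center=5.0, steepness=2.0):
    """Hypothetical sigmoid membership function for a soft spherical cutoff.

    Returns a weight near 1 for distances well below `center`, exactly 0.5
    at `center`, and a weight near 0 for distances well above it.
    """
    return 1.0 / (1.0 + math.exp(steepness * (distance - center)))


def hard_cutoff_weight(distance, center=5.0):
    """Crisp spherical truncation, shown for comparison with the fuzzy one."""
    return 1.0 if distance <= center else 0.0


def owa(values, weights):
    """Ordered weighted averaging: weights apply to values sorted descending."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


if __name__ == "__main__":
    # Illustrative distances and atom-level contributions (made-up numbers).
    for d in (2.0, 5.0, 8.0):
        print(d, hard_cutoff_weight(d), round(fuzzy_cutoff_weight(d), 4))
    contributions = [1.0, 3.0, 2.0]
    print(owa(contributions, [1.0, 0.0, 0.0]))  # behaves like max
    print(owa(contributions, [0.0, 0.0, 1.0]))  # behaves like min
```

Unlike the crisp cutoff, the fuzzy weight decays smoothly, so atom pairs just beyond the threshold still contribute a little. The OWA weight vector interpolates between taking the maximum, the minimum, and the arithmetic mean of the atom‐level contributions.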
