Localized Heuristic Inverse Quantitative Structure Activity Relationship with Bulk Descriptors Using Numerical Gradients

State-of-the-art quantitative structure-activity relationship (QSAR) models are often based on nonlinear machine learning algorithms, which are difficult to interpret. From a pharmaceutical perspective, QSARs are used to enhance the chemical design process. Ultimately, they should not only provide a prediction but also contribute to a mechanistic understanding and guide modifications to the chemical structure, promoting compounds with desirable biological activity profiles. Global ranking of descriptor importance and inverse QSAR have been used for these purposes. This paper introduces localized heuristic inverse QSAR, which assesses the relative ability of the descriptors to influence the biological response in a region localized around the predicted compound. The method is based on numerical gradients, with parameters optimized using data sets sampled from analytical functions. The heuristic character of the method reduces the computational requirements and makes it applicable not only to fragment-based methods but also to QSARs based on bulk descriptors. The application of the method is illustrated on congeneric QSAR data sets, and it is shown that the predicted influential descriptors can be used to guide structural modifications that affect the biological response in the desired direction. The method is implemented in the AZOrange open source QSAR package. The current implementation of localized heuristic inverse QSAR is a step toward a generally applicable method for elucidating the structure-activity relationship specifically for a congeneric region of chemical space when using QSARs based on bulk properties. Consequently, this method could contribute to accelerating the chemical design process in pharmaceutical projects, as well as provide information that could enhance the mechanistic understanding for individual scaffolds.
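
The idea of ranking descriptors by a numerical gradient around one compound can be illustrated with a minimal Python sketch. This is not the AZOrange implementation: the function names, the central-difference scheme, the fixed step size, the toy model, and the descriptor names (logP, PSA, MW) are all assumptions chosen for illustration, whereas the paper optimizes its gradient parameters on data sets sampled from analytical functions.

```python
import numpy as np

def local_descriptor_gradient(predict, x, rel_step=0.05, scale=None):
    """Central-difference gradient of a black-box prediction at point x.

    predict  : callable mapping a 1-D descriptor vector to a scalar response
    x        : descriptor vector of the compound of interest
    rel_step : step size in units of `scale` (hypothetical default)
    scale    : per-descriptor scale, e.g. a training-set standard deviation
    """
    x = np.asarray(x, dtype=float)
    scale = np.ones_like(x) if scale is None else np.asarray(scale, dtype=float)
    grad = np.empty_like(x)
    for i in range(x.size):
        h = rel_step * scale[i]
        up, down = x.copy(), x.copy()
        up[i] += h      # perturb one descriptor upward
        down[i] -= h    # and downward, keeping the others fixed
        grad[i] = (predict(up) - predict(down)) / (2.0 * h)
    return grad

def rank_descriptors(grad, names):
    """Sort descriptors by the magnitude of their local gradient."""
    order = np.argsort(-np.abs(grad))
    return [(names[i], float(grad[i])) for i in order]

# Toy stand-in for a trained nonlinear QSAR model (assumption, not real data).
def toy_model(d):
    return 3.0 * d[0] - 0.5 * d[1] ** 2

x0 = np.array([1.0, 2.0, 3.0])
grad = local_descriptor_gradient(toy_model, x0)
ranking = rank_descriptors(grad, ["logP", "PSA", "MW"])
```

In this sketch the sign of each gradient component suggests whether increasing the descriptor would raise or lower the predicted response near the compound, and the magnitude gives a local, rather than global, measure of influence.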
