Improved SAR Models - Exploiting the Target-Ligand Relationships

Small organic molecules, by binding to different proteins, can be used to modulate (inhibit or activate) protein function for therapeutic purposes and to elucidate the molecular mechanisms underlying biological processes. Over the decades, structure-activity relationship (SAR) models have been developed to quantify the relationship between a chemical compound's structure and its bioactivity against a target protein, with advances focusing on compound representations and statistical learning methods. We have developed approaches that improve the performance of SAR models by using compound activity information from different targets. These methods aim to determine whether one target is a suitable candidate to help another, that is, whether supplying supplemental activity information from the first target improves the SAR model of the second. Having identified a helping target, we also develop methods to identify a subset of compounds that improves the sensitivity of the SAR model. Helping targets and helping compounds are identified with various nearest-neighbor approaches, using similarity measures derived from the targets as well as from their active compounds. We also developed methods that cross-train a series of SVM-based models to identify the helping set of targets. Our experimental results show that our methods achieve statistically significant results and incorporate the
