Robust QSAR Models from Novel Descriptors and Bayesian Regularised Neural Networks

Abstract: The QSAR method, which uses multivariate statistics, was developed by Hansch and Fujita and has been applied successfully to many drug and agrochemical design problems. Besides speed and simplicity, QSAR has the advantage of being able to account for some of the transport and metabolic processes that occur once the compound is administered. Until recently, QSAR analyses used relatively simple molecular descriptors based on substituent constants (e.g., Hammett constants, π, or molar refractivities), physicochemical properties (e.g., partition coefficients), or topological indices (e.g., Randic and Wiener indices). Recently, several new representations have been devised: atomistic representations; molecular eigenvalues and the BCUT indices derived from them; E-state fields; topological autocorrelation vectors; and various molecular fragment-based hash codes. Compared with traditional representations, these are faster to compute, represent the molecular properties most relevant to activity more accurately, or apply more generally to diverse chemical classes acting at a common receptor. Historically, linear regression methods such as multiple linear regression (MLR) and partial least squares (PLS) have been used to develop QSAR models. Regression is an “ill-posed” problem in statistics, which sometimes makes QSAR models unstable when they are trained on noisy data. In addition, traditional regression techniques often require the investigator to make subjective decisions about the likely non-linear form of the relationship between structure and activity, and about whether cross-terms are present. Regression methods based on neural networks offer some advantages over MLR because they can account for non-linear SARs and can cope with the linear dependencies that sometimes appear in real SAR problems. However, some problems remain in developing SAR models with conventional backpropagation neural networks.
We have used a specific type of neural network, the Bayesian Regularized Artificial Neural Network (BRANN), to develop SAR models. The advantage of BRANN is that the models are robust and that the validation process, which scales as O(N²) in normal regression methods, is unnecessary. These networks have the potential to solve a number of problems that arise in QSAR modelling, such as choice of model, robustness of model, choice of validation set, size of validation effort, and optimization of network architecture. The application of the methods to QSAR of compounds active at the benzodiazepine and muscarinic receptors will be illustrated.
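
To make the idea of Bayesian regularisation concrete, the following is a minimal sketch, not the authors' BRANN implementation. It assumes a model that is linear in its weights (a deliberate simplification of a neural network) and applies MacKay-style evidence re-estimation to the objective F = β·E_D + α·E_W, where E_D is half the sum of squared residuals and E_W is half the sum of squared weights. The function name `fit_evidence_regression` and all parameter names are hypothetical, chosen for illustration only.

```python
import numpy as np


def fit_evidence_regression(X, y, n_iter=200, tol=1e-8):
    """Bayesian-regularised linear regression (illustrative sketch).

    Minimises F = beta * E_D + alpha * E_W and re-estimates the two
    precisions alpha (weight decay) and beta (inverse noise variance)
    from the training data alone, with no validation set.
    """
    n, k = X.shape
    XtX = X.T @ X
    Xty = X.T @ y
    # Eigenvalues of X^T X; clipped to guard against tiny negative round-off.
    eig = np.clip(np.linalg.eigvalsh(XtX), 0.0, None)
    alpha, beta = 1.0, 1.0  # arbitrary starting guesses

    for _ in range(n_iter):
        # Most probable weights for the current alpha and beta.
        A = alpha * np.eye(k) + beta * XtX
        w = beta * np.linalg.solve(A, Xty)

        # Effective number of well-determined parameters (0 <= gamma <= k).
        gamma = float(np.sum(beta * eig / (alpha + beta * eig)))

        sum_w2 = float(w @ w)                      # equals 2 * E_W
        sum_res2 = float(np.sum((y - X @ w) ** 2))  # equals 2 * E_D

        # Evidence re-estimation of the two precisions.
        alpha_new = gamma / sum_w2
        beta_new = (n - gamma) / sum_res2

        converged = (abs(alpha_new - alpha) < tol * alpha
                     and abs(beta_new - beta) < tol * beta)
        alpha, beta = alpha_new, beta_new
        if converged:
            break

    return w, alpha, beta, gamma
```

Calling `fit_evidence_regression(X, y)` on a descriptor matrix `X` (one row per compound) and an activity vector `y` returns the weights together with the inferred precisions and the effective number of parameters `gamma`. In this sketch the regularisation strength is set from the training data itself rather than by cross-validation, which is the property the abstract attributes to BRANN; a full BRANN would apply the same re-estimation to the weights of a non-linear network.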
