Prediction of Biological Activity for High-Throughput Screening Using Binary Kernel Discrimination

High-throughput screening has had a significant impact on drug discovery, but there is an acknowledged need for quantitative methods to analyze screening results and predict the activity of further compounds. In this paper we introduce one such method, binary kernel discrimination, and investigate its performance on two datasets: the first is a set of 1650 monoamine oxidase inhibitors, and the second is a set of 101 437 compounds from an in-house enzyme assay. We compare the performance of binary kernel discrimination with that of a simple procedure that we call "merged similarity search" and with that of a feedforward neural network. Binary kernel discrimination is shown to perform robustly with varying quantities of training data and in the presence of noisy data. We conclude by highlighting the importance of the judicious use of general pattern recognition techniques for compound selection.
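The abstract names binary kernel discrimination without spelling out the computation. As a rough illustration only, the sketch below scores a binary fingerprint by the ratio of kernel density estimates over known actives and known inactives. It assumes the Aitchison-Aitken kernel for binary vectors, K = λ^(n−d) (1−λ)^d, where n is the fingerprint length, d the Hamming distance, and λ a smoothing parameter. The function names, the default λ, the use of means rather than sums, and the toy fingerprints are all hypothetical choices for this example and are not taken from the paper.

```python
import numpy as np

def aitchison_aitken_kernel(x, train, lam):
    """Kernel value between binary vector x and each row of train.

    Assumed form: K = lam**(n - d) * (1 - lam)**d, with n the vector
    length, d the Hamming distance, and 0.5 < lam <= 1 the smoothing.
    """
    n = train.shape[1]
    d = np.sum(train != x, axis=1)  # Hamming distance to each training row
    return lam ** (n - d) * (1.0 - lam) ** d

def bkd_score(x, actives, inactives, lam=0.9):
    """Ratio of the mean kernel value over actives to that over inactives.

    Higher scores suggest the compound resembles the actives more than
    the inactives. The tiny constant only guards against division by zero.
    """
    num = aitchison_aitken_kernel(x, actives, lam).mean()
    den = aitchison_aitken_kernel(x, inactives, lam).mean()
    return num / (den + 1e-300)

# Toy 4-bit fingerprints (illustrative data, not from the paper).
actives = np.array([[1, 1, 0, 0], [1, 1, 1, 0]])
inactives = np.array([[0, 0, 1, 1], [0, 1, 1, 1]])

score_like_active = bkd_score(np.array([1, 1, 0, 0]), actives, inactives)
score_like_inactive = bkd_score(np.array([0, 0, 1, 1]), actives, inactives)
```

Ranking untested compounds by such a score, highest first, gives a selection order. For realistic fingerprint lengths the kernel values underflow quickly, so a practical version would likely work in log space and tune λ by cross-validation on the training set.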
