Bayesian methods in virtual screening and chemical biology.

The Naïve Bayesian Classifier, along with related classification and regression approaches based on Bayes' theorem, has received increasing attention in cheminformatics in recent years. In this contribution, we first review the mathematical framework on which Bayesian methods are built. We then discuss the implications of this framework, together with practical experience of the conditions under which Bayesian methods perform best in virtual screening settings. Finally, we present an overview of applications of Bayesian methods in both virtual screening and chemical biology, ranging from bridging the phenotypic and mechanistic space of drug action to predicting ligand-target interactions.
