Machine learning in virtual screening.

In this review, we highlight recent applications of machine learning to virtual screening, focusing on supervised techniques that train statistical learning algorithms to rank the molecules in a database by their likelihood of being active against a particular protein target. Both ligand-based similarity searching and structure-based docking have benefited from machine learning algorithms, including naïve Bayesian classifiers, support vector machines, neural networks, and decision trees, as well as more traditional regression techniques. Applying these methods effectively requires an appreciation of data preparation, validation, optimization, and search strategies, and we also survey developments in these areas.
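To make the supervised prioritization idea concrete, the following is a minimal sketch of a naïve Bayesian classifier trained on binary molecular fingerprints and then used to rank a small screening library. It is an illustration only, not the procedure of any particular study: the function names (`train_bernoulli_nb`, `log_odds_active`, `rank_library`), the four-bit fingerprints, the molecule names, and the Laplace smoothing constant `alpha` are all assumptions made for this example.

```python
import math


def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary fingerprints.

    fingerprints: list of equal-length lists of 0/1 bits (hypothetical data).
    labels: 1 for active, 0 for inactive.
    Returns (per-class bit probabilities, log prior odds of activity).
    """
    n_bits = len(fingerprints[0])
    counts = {0: [0] * n_bits, 1: [0] * n_bits}
    totals = {0: 0, 1: 0}
    for fp, y in zip(fingerprints, labels):
        totals[y] += 1
        for i, bit in enumerate(fp):
            counts[y][i] += bit
    # Laplace-smoothed probability that bit i is set, for each class.
    probs = {
        y: [(counts[y][i] + alpha) / (totals[y] + 2 * alpha) for i in range(n_bits)]
        for y in (0, 1)
    }
    log_prior_odds = math.log((totals[1] + alpha) / (totals[0] + alpha))
    return probs, log_prior_odds


def log_odds_active(fp, model):
    """Log-odds that a fingerprint belongs to the active class."""
    probs, score = model
    for i, bit in enumerate(fp):
        if bit:
            score += math.log(probs[1][i] / probs[0][i])
        else:
            score += math.log((1 - probs[1][i]) / (1 - probs[0][i]))
    return score


def rank_library(library, model):
    """Return molecule names sorted from most to least likely active."""
    return sorted(library, key=lambda name: log_odds_active(library[name], model),
                  reverse=True)


# Toy training set (assumed): three known actives, three known inactives.
train_fps = [
    [1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0],   # actives
    [0, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1],   # inactives
]
train_labels = [1, 1, 1, 0, 0, 0]

# Toy screening library (assumed) to be prioritized.
library = {
    "mol_A": [1, 1, 0, 0],
    "mol_B": [0, 0, 0, 1],
    "mol_C": [1, 0, 0, 1],
}

model = train_bernoulli_nb(train_fps, train_labels)
ranking = rank_library(library, model)
print(ranking)
```

On this toy data the ranking places `mol_A` first and `mol_B` last, because the first bit is enriched among the actives and the last bit among the inactives. In practice the fingerprints would come from a cheminformatics toolkit and the model would be assessed on held-out data, in line with the validation concerns raised above.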
