In Search of Predictive Models for Inhibitors of 5-alpha Reductase 2 Based on the Integration of Bioactivity and Molecular Descriptors Data

5-alpha reductase (5α-reductase) is a microsomal protein that converts testosterone into dihydrotestosterone (DHT). Changes in the function of this enzyme can give rise to disorders such as pseudohermaphroditism, baldness, benign prostatic hyperplasia and prostate cancer. Only two drugs, finasteride and dutasteride, are currently marketed for the therapy of benign prostatic hyperplasia, and both have long-term side effects, which stresses the need for better inhibitors. In the present study, we used a dataset of compounds with known inhibitory activity against 5α-reductase isozyme 2 (5α-R2), obtained from the ChEMBL database, and employed two machine learning methods (random forests and support vector machines) to build classifiers that can help prioritise molecules for further analysis in high-throughput virtual screening campaigns. The performance of the classification models was evaluated in terms of sensitivity, specificity, precision, F-score and accuracy. Our results show that, overall, the classification models produced by the two algorithms perform similarly. Furthermore, the classifiers perform well at identifying potent inhibitors and discriminating them from weak ones.
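
The five evaluation measures named above can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch in plain Python, assuming that potent inhibitors are labelled 1 and weak inhibitors 0; the function name `evaluate`, the `Metrics` container and the example labels are illustrative assumptions and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    sensitivity: float  # TP / (TP + FN), recall on the positive class
    specificity: float  # TN / (TN + FP), recall on the negative class
    precision: float    # TP / (TP + FP)
    f_score: float      # harmonic mean of precision and sensitivity
    accuracy: float     # (TP + TN) / all predictions


def _ratio(numerator, denominator):
    # Return 0.0 instead of failing when a class is absent.
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive=1):
    """Compute binary classification metrics from true and predicted labels.

    Assumption: `positive` marks the potent-inhibitor class (hypothetical labelling).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    f_score = _ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = _ratio(tp + tn, len(pairs))
    return Metrics(sensitivity, specificity, precision, f_score, accuracy)


# Made-up labels for illustration only: 4 potent (1) and 6 weak (0) compounds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
result = evaluate(y_true, y_pred)
print(result)
```

With these made-up labels the counts are TP = 3, FN = 1, FP = 1 and TN = 5, so sensitivity, precision and F-score are 0.75, specificity is 5/6 and accuracy is 0.8. Reporting sensitivity and specificity alongside accuracy matters when the potent and weak classes are unbalanced, since accuracy alone can hide poor recall on the smaller class.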
