SWEETLEAD: an In Silico Database of Approved Drugs, Regulated Chemicals, and Herbal Isolates for Computer-Aided Drug Discovery

In the face of drastically rising drug discovery costs, strategies promising to reduce development timelines and expenditures are being pursued. Computer-aided virtual screening and the repurposing of approved drugs are two such strategies that have shown recent success. Herein, we report the creation of a highly curated in silico database of chemical structures representing approved drugs, chemical isolates from traditional medicinal herbs, and regulated chemicals, termed the SWEETLEAD database. The motivation for SWEETLEAD stems from the observation of conflicting information in publicly available chemical databases and the lack of a highly curated database of chemical structures for globally approved drugs. A consensus-building scheme surveying information from several publicly accessible databases was employed to identify the correct structure for each chemical. The resulting structures were filtered for the active pharmaceutical ingredient and standardized, and differing formulations of the same drug were combined in the final database. The publicly available release of SWEETLEAD (https://simtk.org/home/sweetlead) provides an important tool for carrying out computer-aided repurposing and drug discovery campaigns.
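
The consensus-building step described above can be illustrated with a minimal majority-vote sketch. This is not the SWEETLEAD implementation: the function name, the `min_votes` threshold, the tie handling, and the source labels are all assumptions made for illustration, and the structure strings are assumed to have been converted to one canonical representation beforehand so that identical structures compare equal.

```python
from collections import Counter
from typing import Dict, Optional


def consensus_structure(candidates: Dict[str, str], min_votes: int = 2) -> Optional[str]:
    """Pick the structure reported by the most sources (hypothetical helper).

    `candidates` maps a source label to the structure string that source
    reports for one chemical. Strings are assumed to be canonicalized already.
    Returns None when no structure reaches `min_votes` or when the top two
    structures tie, so that the record can be flagged for manual review.
    """
    # Count how many sources agree on each non-empty structure string.
    votes = Counter(s for s in candidates.values() if s)
    if not votes:
        return None

    ranked = votes.most_common(2)
    top_structure, top_count = ranked[0]

    # Require a minimum level of agreement between sources.
    if top_count < min_votes:
        return None
    # Reject ties between the two most frequent structures.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_structure


if __name__ == "__main__":
    # Hypothetical source labels and placeholder structure strings.
    reported = {
        "source_a": "STRUCT_1",
        "source_b": "STRUCT_1",
        "source_c": "STRUCT_2",
    }
    print(consensus_structure(reported))
```

Returning None for ties and weakly supported structures, rather than picking one arbitrarily, keeps ambiguous records visible for later curation.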
