The ChEMBL database in 2017

ChEMBL is an open, large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 and 2014 Nucleic Acids Research Database Issues. Since then, alongside the continued extraction of data from the medicinal chemistry literature, new sources of bioactivity data have been added to the database. These include deposited data sets from neglected disease screening; crop protection data; drug metabolism and disposition data; and bioactivity data from patents. A number of improvements and new features have also been incorporated. These include the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, the addition of metabolic pathways for drugs and the calculation of structural alerts. The ChEMBL data can be accessed via a web interface, an RDF distribution, data downloads and RESTful web services.
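As a rough illustration of the RESTful access route mentioned above, the sketch below builds request URLs for a ChEMBL-style web service. The base URL, the resource names (`molecule`, `activity`), the `.json` suffix and the filter parameter names are assumptions drawn from the public ChEMBL web services rather than from this abstract, and the helper `chembl_url` is hypothetical; check the current service documentation before relying on any of them.

```python
from urllib.parse import urlencode

# Assumed base URL of the ChEMBL data web services (not stated in the abstract).
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def chembl_url(resource, chembl_id=None, fmt="json", **filters):
    """Build a URL for a resource, optionally for one record or with filters.

    Hypothetical helper: `resource`, `fmt` and the filter names are passed
    through unchanged, so their validity depends on the live service.
    """
    url = f"{BASE_URL}/{resource}"
    if chembl_id is not None:
        # Single-record lookup, e.g. one compound by its ChEMBL identifier.
        url += f"/{chembl_id}"
    url += f".{fmt}"
    if filters:
        # Sort the filters so the generated query string is deterministic.
        url += "?" + urlencode(sorted(filters.items()))
    return url


if __name__ == "__main__":
    # One compound record, and a filtered list of activities for a target.
    print(chembl_url("molecule", "CHEMBL25"))
    print(chembl_url("activity", target_chembl_id="CHEMBL240", limit=5))
```

The helper only constructs URLs and makes no network request; the resulting string can be passed to any HTTP client (for example `urllib.request.urlopen`) to retrieve the data.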
