Pharos: Collating protein information to shed light on the druggable genome

The ‘druggable genome’ encompasses several protein families, but only a subset of the targets within them has attracted significant research attention and thus has publicly available information. The Illuminating the Druggable Genome (IDG) program, initiated in 2014, has the goal of developing experimental techniques and a Knowledge Management Center (KMC) that would collect and organize information about protein targets from four families, representing the most common druggable targets, with an emphasis on understudied proteins. Here, we describe two resources developed by the KMC: the Target Central Resource Database (TCRD), which collates many heterogeneous gene/protein datasets, and Pharos (https://pharos.nih.gov), a multimodal web interface that presents the data from TCRD. We briefly describe the types and sources of data considered by the KMC and then highlight features of the Pharos interface designed to enable intuitive access to the IDG knowledgebase. The aim of Pharos is to encourage ‘serendipitous browsing’, whereby related, relevant information is made easily discoverable. We conclude by describing two use cases that highlight the utility of Pharos and TCRD.
