Literature information in PubChem: associations between PubChem records and scientific articles

Background
PubChem is an open archive consisting of a set of three primary public databases (BioAssay, Compound, and Substance). It contains information on a broad range of chemical entities, including small molecules, lipids, carbohydrates, and (chemically modified) amino acid and nucleic acid sequences (including siRNA and miRNA). Currently (as of Nov. 2015), PubChem contains more than 150 million depositor-provided chemical substance descriptions, 60 million unique chemical structures, and 225 million biological activity test results provided from over 1 million biological assay records.

Description
Many PubChem records (substances, compounds, and assays) include depositor-provided cross-references to scientific articles in PubMed. Some PubChem contributors provide bioactivity data extracted from scientific articles. Literature-derived bioactivity data complement high-throughput screening (HTS) data from the concluded NIH Molecular Libraries Program and other HTS projects. Some journals provide PubChem with information on chemicals that appear in their newly published articles, enabling concurrent publication of scientific articles in journals and associated data in public databases. In addition, PubChem links records to PubMed articles indexed with the Medical Subject Heading (MeSH) controlled vocabulary thesaurus.

Conclusion
Literature information, both provided by depositors and derived from MeSH annotations, can be accessed using PubChem’s web interfaces, enabling users to explore information available in literature related to PubChem records beyond typical web search results.

Graphical abstract
Literature information for PubChem records is derived from various sources.
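
The PubMed cross-references described above can also be read programmatically. The following is a minimal sketch, not the authors' method: it assumes PubChem's PUG-REST service exposes these links through an `xrefs/PubMedID` operation on compound records, and that the JSON reply nests the identifiers under `InformationList` / `Information` / `PubMedID`. The function names are hypothetical, and the response layout should be checked against the current PUG-REST documentation before relying on it.

```python
import json
from typing import List
from urllib.request import urlopen

# Base URL of the PUG-REST service (assumed here; verify against PubChem docs).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubmed_xref_url(cid: int) -> str:
    """Build the request URL for the PubMed cross-references of one compound."""
    return f"{PUG_REST_BASE}/compound/cid/{int(cid)}/xrefs/PubMedID/JSON"


def extract_pubmed_ids(payload: dict) -> List[int]:
    """Collect PubMed IDs from a decoded reply, de-duplicated and sorted.

    The key names below reflect the assumed response layout; missing keys
    simply yield an empty result instead of raising.
    """
    ids = set()
    for info in payload.get("InformationList", {}).get("Information", []):
        ids.update(int(pmid) for pmid in info.get("PubMedID", []))
    return sorted(ids)


def fetch_pubmed_ids(cid: int, timeout: float = 30.0) -> List[int]:
    """Query the service over the network and return the linked PubMed IDs."""
    with urlopen(pubmed_xref_url(cid), timeout=timeout) as response:
        return extract_pubmed_ids(json.load(response))
```

Keeping the URL construction and the parsing separate from the network call lets the parsing be checked against a saved reply without contacting the service. A call such as `fetch_pubmed_ids(2244)` would then return the article identifiers linked to that compound record, provided the assumptions above hold.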
