A Cloud-Based Metabolite and Chemical Prioritization System for the Biology/Disease-Driven Human Proteome Project.

Targeted metabolomics and biochemical studies complement the ongoing investigations led by the Human Proteome Organization (HUPO) Biology/Disease-Driven Human Proteome Project (B/D-HPP). However, identifying and prioritizing metabolite and chemical targets is challenging. Literature-mining-based approaches have been proposed for targeted proteomics studies, but text mining methods for metabolite and chemical prioritization are hindered by the large number of synonyms and nonstandardized names for each entity. In this study, we developed a cloud-based literature mining and summarization platform that maps metabolites and chemicals in the literature to unique identifiers and summarizes the copublication trends of metabolites/chemicals and B/D-HPP topics using Protein Universal Reference Publication-Originated Search Engine (PURPOSE) scores. We successfully prioritized metabolites and chemicals associated with the B/D-HPP targeted fields and validated the results against expert-curated associations and through enrichment analyses. Compared with existing algorithms, our system achieved better precision and recall in retrieving chemicals related to B/D-HPP focused areas. Our cloud-based platform enables queries on all biological terms in multiple species, which will contribute to B/D-HPP and targeted metabolomics/chemical studies.
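The two steps named above, mapping synonymous names to a unique identifier and counting copublications with a topic, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synonym table, the `CHEM:` identifiers, the toy corpus, and the function names are hypothetical, and the counting shown here is a plain document count, not the PURPOSE score used by the platform.

```python
from collections import Counter

# Hypothetical synonym table: every surface form maps to one canonical ID.
# The IDs are placeholders, not identifiers from any real database.
SYNONYMS = {
    "vitamin c": "CHEM:0001",
    "ascorbic acid": "CHEM:0001",
    "l-ascorbate": "CHEM:0001",
    "glucose": "CHEM:0002",
    "dextrose": "CHEM:0002",
}

# Toy corpus: each document is (topic label, chemical mentions as written).
DOCS = [
    ("cardiovascular", ["Vitamin C", "glucose"]),
    ("cardiovascular", ["ascorbic acid"]),
    ("cardiovascular", ["Dextrose", "dextrose"]),
    ("liver", ["L-ascorbate"]),
]


def normalize(mention):
    """Return the canonical ID for a mention, or None if it is unknown."""
    return SYNONYMS.get(mention.strip().lower())


def copublication_counts(docs, topic):
    """Count, per canonical ID, the documents on `topic` that mention it."""
    counts = Counter()
    for doc_topic, mentions in docs:
        if doc_topic != topic:
            continue
        # A set ensures each chemical is counted once per document.
        ids = {normalize(m) for m in mentions}
        ids.discard(None)
        counts.update(ids)
    return counts


if __name__ == "__main__":
    for chem_id, n in sorted(copublication_counts(DOCS, "cardiovascular").items()):
        print(chem_id, n)
```

Normalizing before counting is what lets "Vitamin C" and "ascorbic acid" contribute to the same tally; without that step, each synonym would be ranked as a separate, weaker candidate.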