Bioprospecting the Bibleome: Adding Evidence to Support the Inflammatory Basis of Cancer.

BACKGROUND CANCER SIGNIFICANCE AND QUESTION BioProspecting is a novel approach that enabled our team to mine genetic-marker-related data from the New England Journal of Medicine (NEJM) using the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and the Human Gene Ontology (HUGO). Genes were associated with disorders using the Multi-threaded Clinical Vocabulary Server (MCVS) Natural Language Processing (NLP) engine, whose output was represented as an ontology network incorporating the semantic encodings of the literature. Metabolic functions were used to identify potentially novel relationships between genes or proteins on one side and diseases or drugs on the other. To identify genes important to the transformation of normal tissue into a malignancy, we identified the genes linked to multiple cancers and then mapped those genes to metabolic and signaling pathways.

FINDINGS Ten genes were related to 30 or more cancers, 72 genes were related to 20 or more cancers, and 191 genes were related to 10 or more cancers. The three pathways most often associated with the top 200 novel cancer markers were the Acute Phase Response Signaling pathway, the Glucocorticoid Receptor Signaling pathway, and the Hepatic Fibrosis/Hepatic Stellate Cell Activation pathway.

MEANING AND IMPLICATIONS OF THE ADVANCE This association highlights the role of inflammation in the induction, and perhaps the transformation, of mortal cells into cancers.

MAJOR FINDINGS BioProspecting can speed the identification and understanding of synergies between articles in the biomedical literature. In this case we found considerable synergy between the oncology literature and the sepsis literature. By mapping these associations to known metabolic, regulatory, and signaling pathways, we were able to identify further evidence for the inflammatory basis of cancer.
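The counting step behind the findings (genes linked to N or more cancers, then pathways ranked by how many of the top genes they contain) can be sketched as follows. This is a minimal illustration, not the MCVS pipeline: the input format (gene, cancer pairs), the gene-to-pathway mapping, the function names, and the toy data are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def cancers_per_gene(associations):
    """Map each gene to the set of distinct cancers it is linked to.

    `associations` is assumed to be an iterable of (gene, cancer) pairs,
    e.g. as might be extracted from NLP-coded literature.
    """
    linked = defaultdict(set)
    for gene, cancer in associations:
        linked[gene].add(cancer)  # a set ignores duplicate mentions
    return linked


def count_genes_at_threshold(linked, min_cancers):
    """Count the genes linked to at least `min_cancers` distinct cancers."""
    return sum(1 for cancers in linked.values() if len(cancers) >= min_cancers)


def rank_pathways(genes, gene_to_pathways):
    """Rank pathways by how many of the given genes they contain."""
    tally = Counter()
    for gene in genes:
        tally.update(gene_to_pathways.get(gene, ()))
    return tally.most_common()


# Hypothetical toy data; real gene, cancer, and pathway names are not used.
toy_associations = [
    ("GENE_A", "cancer_1"), ("GENE_A", "cancer_2"), ("GENE_A", "cancer_3"),
    ("GENE_B", "cancer_1"), ("GENE_B", "cancer_2"),
    ("GENE_C", "cancer_3"),
    ("GENE_A", "cancer_1"),  # repeated mention, counted once
]
toy_pathways = {
    "GENE_A": ["pathway_X", "pathway_Y"],
    "GENE_B": ["pathway_X"],
    "GENE_C": ["pathway_Z"],
}

linked = cancers_per_gene(toy_associations)
for threshold in (1, 2, 3):
    print(threshold, count_genes_at_threshold(linked, threshold))

# Genes linked to two or more cancers, then their most common pathways.
top_genes = [g for g, cancers in linked.items() if len(cancers) >= 2]
print(rank_pathways(top_genes, toy_pathways))
```

In the study itself the thresholds were 10, 20, and 30 cancers and the ranking was applied to the top 200 markers; the small thresholds here only keep the toy data readable.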
