Introducing the Big Knowledge to Use (BK2U) challenge

The purpose of the Big Data to Knowledge initiative is to develop methods for discovering new knowledge from large amounts of data. However, if the resulting knowledge is itself so large that it resists comprehension, which we refer to here as Big Knowledge (BK), how can it be used properly and creatively? We call this secondary challenge Big Knowledge to Use (BK2U). Without a high‐level mental representation of the kinds of knowledge in a BK knowledgebase, effective or innovative use of that knowledge may be limited. We describe summarization and visualization techniques that capture the big picture of a BK knowledgebase, possibly created from Big Data. In this research, we distinguish between assertion BK and rule‐based BK (rule BK). We demonstrate the usefulness of summarization and visualization techniques for assertion BK in clinical phenotyping: as an example, we illustrate how a summary of many intracranial bleeding concepts can improve phenotyping compared to the traditional approach. We also demonstrate the usefulness of these techniques for rule BK in drug–drug interaction discovery.
