Wikidata: A large-scale collaborative ontological medical database

Created in October 2012, Wikidata is a large-scale, human-readable, machine-readable, multilingual, multidisciplinary, centralized, editable, structured, and linked knowledge base with an increasingly diverse range of use cases. Here, we raise awareness of Wikidata's potential as a resource for biomedical data integration and for semantic interoperability between biomedical computer systems. We describe the data model and characteristics of Wikidata and explain how the database can be processed both by human users and automatically by computer programs. We then give an overview of the medical entities and relations it provides and of how they can support medical purposes such as clinical decision support.
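
As an illustration of the automatic processing mentioned above, the following is a minimal sketch, not the authors' method, of how a program might prepare a SPARQL request for the public Wikidata Query Service and read the standard SPARQL JSON result format. The helper names (`build_query`, `build_url`, `extract_pairs`) are hypothetical, and the chosen pattern (items that are an instance of disease, Q12136, with a "drug or therapy used for treatment" statement, P2176) is an assumption made for the example; many diseases are modeled differently in Wikidata and would not match it.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=5):
    """Return a SPARQL query for disease/drug pairs (illustrative pattern only)."""
    return f"""
SELECT ?disease ?diseaseLabel ?drug ?drugLabel WHERE {{
  ?disease wdt:P31 wd:Q12136 ;      # instance of: disease (assumed modeling)
           wdt:P2176 ?drug .        # drug or therapy used for treatment
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


def build_url(query):
    """Encode the query as a GET URL that asks for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def extract_pairs(response_json):
    """Turn a SPARQL JSON result into (disease label, drug label) tuples."""
    rows = response_json["results"]["bindings"]
    return [(r["diseaseLabel"]["value"], r["drugLabel"]["value"]) for r in rows]
```

The sketch deliberately stops short of sending the request: the URL from `build_url(build_query())` can be fetched with any HTTP client (ideally with a descriptive User-Agent header), and the decoded JSON body passed to `extract_pairs`.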
