Design and Development of a Linked Open Data-Based Health Information Representation and Visualization System: Potentials and Preliminary Evaluation

Background: Healthcare organizations around the world are under pressure to reduce cost, improve coordination and outcomes, and provide more with less. This requires effective planning and evidence-based practice, which depend on generating useful information from available data. Flexible and user-friendly ways to represent, query, and visualize health data are therefore becoming increasingly important. International organizations such as the World Health Organization (WHO) regularly publish vital data on priority health topics that can be used for public health policy and health service development. However, most portals present the data in Excel or PDF format, which makes information discovery and reuse difficult. Linked Open Data (LOD), a set of Semantic Web best practices and standards for publishing and linking heterogeneous data, can be applied to the representation and management of public-level health data to alleviate such challenges. However, the technologies behind building LOD systems and their effectiveness for health data are yet to be assessed.

Objective: The objective of this study is to evaluate whether Linked Data technologies are viable options for developing health information representation, visualization, and retrieval systems, and to identify the available tools and methodologies for building Linked Data-based health information systems.

Methods: We used the Resource Description Framework (RDF) for data representation, the Fuseki triple store for data storage, and Sgvizler for information visualization. Additionally, we integrated a SPARQL query interface for interacting with the data. We primarily used the WHO Global Health Observatory dataset to test the system. All the data were represented using RDF and interlinked with other related datasets on the Web of Data using Silk, a link discovery framework for the Web of Data. A preliminary usability assessment was conducted following the System Usability Scale (SUS) method.

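
To make the representation and querying steps in the Methods more concrete, the following is a minimal sketch, not the authors' implementation. It serializes one health observation as RDF in N-Triples syntax and builds a SPARQL query URL of the kind a Fuseki query service accepts. The `http://example.org/gho/` namespace, the property names, the endpoint URL, and the sample values are all assumptions made for illustration; none of them come from the article or from WHO data.

```python
from urllib.parse import urlencode

# Hypothetical namespace and endpoint, chosen for illustration only.
EX = "http://example.org/gho/"
FUSEKI_QUERY_URL = "http://localhost:3030/gho/query"
XSD = "http://www.w3.org/2001/XMLSchema#"


def observation_triples(obs_id, country, year, indicator, value):
    """Serialize one health observation as a list of N-Triples lines."""
    subject = f"<{EX}observation/{obs_id}>"
    return [
        f"{subject} <{EX}country> <{EX}country/{country}> .",
        f"{subject} <{EX}indicator> <{EX}indicator/{indicator}> .",
        f'{subject} <{EX}year> "{year}"^^<{XSD}gYear> .',
        f'{subject} <{EX}value> "{value}"^^<{XSD}decimal> .',
    ]


def indicator_query(indicator):
    """Build a SPARQL SELECT query returning country, year, and value for one indicator."""
    return (
        f"PREFIX ex: <{EX}>\n"
        "SELECT ?country ?year ?value WHERE {\n"
        f"  ?obs ex:indicator <{EX}indicator/{indicator}> ;\n"
        "       ex:country ?country ;\n"
        "       ex:year ?year ;\n"
        "       ex:value ?value .\n"
        "}\n"
        "ORDER BY ?country ?year"
    )


def query_url(sparql):
    """Encode the query as a GET request URL (SPARQL 1.1 Protocol, `query` parameter)."""
    return f"{FUSEKI_QUERY_URL}?{urlencode({'query': sparql})}"


if __name__ == "__main__":
    # Placeholder values, not real WHO figures.
    for line in observation_triples("1", "XX", 2012, "hiv_prevalence", 1.5):
        print(line)
    print(query_url(indicator_query("hiv_prevalence")))
```

Once data are stored as triples like these, the same query text could be issued from a SPARQL query interface or embedded in a visualization wrapper, so one store can serve both ad hoc retrieval and charting.
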
Results: We developed an LOD-based health information representation, querying, and visualization system using Linked Data tools. We imported more than 20,000 HIV-related data elements on mortality, prevalence, incidence, and related variables, which are freely available from the WHO Global Health Observatory database. Additionally, we automatically linked 5312 data elements from DBpedia, Bio2RDF, and LinkedCT using the Silk framework. Users of the system can retrieve and visualize health information according to their interests. For users who are not familiar with SPARQL queries, we integrated a Linked Data search engine interface to search and browse the data. We used the system to represent and store the data, facilitating flexible queries and different kinds of visualizations. The preliminary user evaluation score given by public health data managers and users was 82 on the SUS. The need to write queries in the interface was the main difficulty that end users reported with LOD-based systems.

Conclusions: The system introduced in this article shows that current LOD technologies are a promising alternative for representing heterogeneous health data in a flexible and reusable manner, so that they can serve intelligent queries and ultimately support decision-making. However, advanced text-based search engines need to be developed to increase usability, especially for nontechnical users. Further research with large datasets is recommended to unfold the potential of Linked Data and the Semantic Web for health information systems development.
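
For readers unfamiliar with how a SUS score such as the one reported in the Results is derived, the following is a minimal sketch of the standard SUS scoring rule: ten items answered on a 1 to 5 scale, odd-numbered items contributing the response minus 1, even-numbered items contributing 5 minus the response, and the sum multiplied by 2.5 to give a value from 0 to 100. The function names and the sample responses are illustrative assumptions; the article does not publish per-respondent answers.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one respondent's ten answers on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, r in enumerate(responses, start=1):
        # Odd items are positively worded: contribution is response - 1.
        # Even items are negatively worded: contribution is 5 - response.
        total += (r - 1) if item_number % 2 == 1 else (5 - r)
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-respondent SUS scores across a group of respondents."""
    return sum(sus_score(r) for r in all_responses) / len(all_responses)


if __name__ == "__main__":
    # Made-up answers from two hypothetical respondents.
    print(mean_sus([[5, 1] * 5, [3] * 10]))
```

A study-level figure is typically the mean of the per-respondent scores, which is what `mean_sus` computes here.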
