Harvest: an open platform for developing web-based biomedical data discovery and reporting applications

Biomedical researchers share a common challenge: making complex data understandable and accessible as they look for relationships between attributes across disparate data types. Data discovery in this context is limited by a lack of query systems that efficiently show relationships between individual variables without requiring users to navigate the underlying data models. We addressed this need by developing Harvest, an open-source framework of modular components, and by using it for the rapid development and deployment of custom data discovery software applications. Harvest incorporates visualizations of high-dimensional data in a web-based interface that promotes rapid exploration and export of any type of biomedical information, without exposing researchers to the underlying data models. We evaluated Harvest with two cases: clinical data from pediatric cardiology and demonstration data from the OpenMRS project. Harvest's architecture and public open-source code offer a set of rapid application development tools for building data discovery applications for domain-specific biomedical data repositories. All resources, including the OpenMRS demonstration, can be found at http://harvest.research.chop.edu.
