Metadata-driven creation of data marts from an EAV-modeled clinical research database

Generic clinical study data management systems can record data on an arbitrary number of parameters in an arbitrary number of clinical studies without requiring modification of the database schema. They achieve this by using an Entity-Attribute-Value (EAV) model for clinical data. While the EAV model is very flexible for building transaction-oriented systems for data entry and browsing of individual forms, EAV-modeled data is unsuitable for direct analytical processing, which is the focus of data marts. For analysis, such data must be extracted and restructured appropriately. This paper describes how this process, which is non-trivial and highly error-prone when performed ad hoc, can be automated by judicious use of the study metadata: the descriptions of measured parameters and their higher-level grouping. In addition to driving the process, the metadata is exported along with the data to facilitate human interpretation of the data.
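
To make the restructuring concrete, the following is a minimal sketch of metadata-driven pivoting, not the system described in the paper. It assumes a toy in-memory layout: the attribute IDs, the names `METADATA`, `EAV_ROWS`, and `build_marts`, and the `group` field standing in for the higher-level grouping of parameters are all hypothetical. Each EAV row is a triple of entity, attribute, and value, and the metadata decides which conventional table a value lands in, which column it fills, and how the stored text is converted.

```python
from collections import defaultdict

# Hypothetical study metadata: attribute id -> column name, data type, group.
# "group" stands in for the higher-level grouping (e.g., a form) of parameters.
METADATA = {
    101: {"name": "systolic_bp", "type": int, "group": "vitals"},
    102: {"name": "diastolic_bp", "type": int, "group": "vitals"},
    201: {"name": "hemoglobin", "type": float, "group": "labs"},
}

# Hypothetical EAV rows: (entity, attribute id, value stored as text).
# Here an entity is a (patient id, visit number) pair.
EAV_ROWS = [
    (("P1", 1), 101, "120"),
    (("P1", 1), 102, "80"),
    (("P1", 1), 201, "13.5"),
    (("P2", 1), 101, "135"),
]


def build_marts(eav_rows, metadata):
    """Pivot EAV triples into one wide table per metadata group.

    Returns {group: {entity: {column name: typed value or None}}}.
    """
    # Derive the column list of each output table from the metadata alone.
    columns = defaultdict(list)
    for meta in metadata.values():
        columns[meta["group"]].append(meta["name"])

    marts = defaultdict(dict)
    for entity, attr_id, raw in eav_rows:
        meta = metadata[attr_id]
        group = meta["group"]
        # Start each row with every column set to None (value not recorded).
        row = marts[group].setdefault(entity, dict.fromkeys(columns[group]))
        # Convert the stored text using the data type named in the metadata.
        row[meta["name"]] = meta["type"](raw)
    return dict(marts)


marts = build_marts(EAV_ROWS, METADATA)
```

Because the column lists come from the metadata rather than from the data, a parameter that was never recorded for an entity still appears as an empty (`None`) column, and adding a new parameter to a study requires only a new metadata entry, not a change to the extraction code.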
