Portal of medical data models: information infrastructure for medical research and healthcare

Introduction: Information systems are a key success factor for medical research and healthcare. Currently, most of these systems apply heterogeneous and proprietary data models, which impede data exchange and integrated data analysis for scientific purposes. Due to the complexity of medical terminology, the overall number of medical data models is very high. At present, the vast majority of these models are not available to the scientific community. The objective of the Portal of Medical Data Models (MDM, https://medical-data-models.org) is to foster sharing of medical data models.
Methods: MDM is a registered European information infrastructure. It provides a multilingual platform for exchange and discussion of data models in medicine, both for medical research and healthcare. The system is developed in collaboration with the University Library of Münster to ensure sustainability. A web front-end enables users to search, view, download and discuss data models. Eleven different export formats are available (ODM, PDF, CDA, CSV, MACRO-XML, REDCap, SQL, SPSS, ADL, R, XLSX). MDM contents were analysed with descriptive statistics.
Results: MDM contains 4387 current versions of data models (in total 10 963 versions). Of these models, 2475 belong to oncology trials. The most common keyword (n = 3826) is ‘Clinical Trial’; the most frequent diseases are breast cancer, leukemia, lung and colorectal neoplasms. The most common languages of data elements are English (n = 328 557) and German (n = 68 738). Semantic annotations (UMLS codes) are available for 108 412 data items, 2453 item groups and 35 361 code list items. Overall, 335 087 UMLS code assignments were made, covering 21 847 unique codes. A few UMLS codes are used several thousand times each, but the frequency distribution has a long tail of rarely used codes.
Discussion: Expected benefits of the MDM portal are improved and accelerated design of medical data models by sharing best practice, more standardised data models with semantic annotation and better information exchange between information systems, in particular Electronic Data Capture (EDC) and Electronic Health Records (EHR) systems. Contents of the MDM portal need to be further expanded to reach broad coverage of all relevant medical domains. Database URL: https://medical-data-models.org
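
The descriptive statistics reported in the Results (total versus unique UMLS code assignments, and a frequency distribution with a long tail of rarely used codes) can be illustrated with a minimal sketch. This is not the analysis code used for MDM; the function name `summarize_code_usage` and the code identifiers in the toy list are hypothetical placeholders, and the counts it produces have no relation to the portal's real figures.

```python
from collections import Counter


def summarize_code_usage(assignments):
    """Summarize a flat list of code assignments (one entry per annotated element).

    Returns the total number of assignments, the number of unique codes,
    the codes ranked by frequency, and how many codes are used exactly once.
    """
    counts = Counter(assignments)
    return {
        "total_assignments": sum(counts.values()),
        "unique_codes": len(counts),
        "ranked": counts.most_common(),  # most frequent first
        "singletons": sum(1 for n in counts.values() if n == 1),
    }


# Toy input with made-up identifiers (assumption: NOT real MDM data).
toy_assignments = (
    ["C0000001"] * 5
    + ["C0000002"] * 3
    + ["C0000003", "C0000004", "C0000005"]
)

summary = summarize_code_usage(toy_assignments)
print(summary["total_assignments"], summary["unique_codes"], summary["singletons"])
# prints: 11 5 3
```

In this toy input, two codes account for most assignments while three codes appear only once, which is the same head-and-long-tail shape the abstract describes at a much larger scale.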
