Data quality aspects of a database for abdominal septic shock patients

For many years, medical researchers have investigated the mechanisms that may cause septic shock. Although many studies have analyzed smaller parts of the relevant data or single variables, no larger database containing all potentially relevant data existed. Our work aims to bridge this gap: we built a large database for abdominal septic shock patients. While building it, we faced many problems concerning the realization of the database and the quality of the data. We therefore describe how we built the database and how we assured data quality. This is of interest to medical and computer scientists who build medical databases from retrospective data, e.g. for data mining purposes.
