A rough sets approach to patient classification in medical records.

This study used the original model of rough sets for data analysis of objective clinical findings from pneumonia patients. Rough sets data analysis was used to derive data dependencies, data reductions, approximate set classifications, and rule inductions from records in a clinical database. This study utilized Pawlak's rough classification algorithm [1] to examine the relationship between a constellation of findings and the ICD-9-CM classification scheme used for disease coding in order to generate a reduct. The reduct is a logical construct of the most information-preserving findings from a decision table. The Iliad expert system, which is based on Bayes' theorem, was used to validate the results we obtained using rough sets analysis to discriminate between the different types of pneumonia in the patient population. From a set of 25 objective clinical findings, chosen because they had a positive or negative predictive value for probability of death for pneumonia patients, the rough sets analysis constructed a logical classifier of six attributes that was as discriminatory as the statistical classifier in Iliad, which utilized 11 of those 25 attributes. The rough sets methods are capable of identifying a minimal set of attribute values that are associated with a disease label. These methods could be applied to develop more consistent labeling methods and, potentially, to flag activities which do not contribute to diagnostic labeling or health results.
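
The notion of a reduct can be illustrated with a minimal sketch. The Python below is not the study's implementation and does not reproduce Pawlak's algorithm as applied to the clinical database; it is a brute-force search, under stated assumptions, for the minimal attribute subsets that preserve the positive region of a decision table. The table, the finding names (fever, cough, xray), and the labels A, B, and C are hypothetical and are not drawn from the 25 findings or the ICD-9-CM codes used in the study.

```python
from itertools import combinations

# Hypothetical toy decision table; not data from the study.
TOY_ROWS = [
    {"fever": "high", "cough": "yes", "xray": "infiltrate", "label": "A"},
    {"fever": "high", "cough": "no",  "xray": "infiltrate", "label": "A"},
    {"fever": "low",  "cough": "yes", "xray": "infiltrate", "label": "B"},
    {"fever": "low",  "cough": "no",  "xray": "clear",      "label": "B"},
    {"fever": "high", "cough": "yes", "xray": "clear",      "label": "C"},
]
CONDITIONS = ("fever", "cough", "xray")


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over attrs."""
    blocks = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, []).append(i)
    return list(blocks.values())


def positive_region(rows, attrs, decision):
    """Indices of rows whose indiscernibility class has a single decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos.update(block)
    return pos


def reducts(rows, conditions, decision):
    """Minimal subsets of conditions with the same positive region as all of them.

    Exhaustive search, smallest subsets first; practical only for small tables.
    """
    target = positive_region(rows, conditions, decision)
    found = []
    for size in range(1, len(conditions) + 1):
        for subset in combinations(conditions, size):
            # Skip supersets of a reduct already found: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if positive_region(rows, subset, decision) == target:
                found.append(subset)
    return found


print(reducts(TOY_ROWS, CONDITIONS, "label"))
```

In this toy table no single finding separates the labels, while fever and xray together do, so cough is dispensable. The exhaustive search is exponential in the number of attributes; it is shown only to make the definition concrete.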