Identifying patient subgroups with simple Bayes'

Medical records can form the basis of retrospective studies, be used to evaluate hospital practices and guidelines, and provide examples for teaching medicine. Each of these tasks presumes the ability to accurately identify patient subgroups. We describe a method for selecting patient subgroups based on the text of their medical records and demonstrate its effectiveness. We also describe a modification of the basic system that does not assume the existence of a preclassified training set, and illustrate its effectiveness in one retrieval task.