Machine Learning of Text Analysis Rules for Clinical Records

Automatically extracting information from clinical free text can open up an information resource that remains largely untapped. This paper describes the BADGER text analysis system, which identifies concepts in a text based on their linguistic context. A key component of BADGER is the CRYSTAL dictionary induction system, which automatically learns text analysis rules from a set of training documents. Each rule is generalized as far as possible without producing errors, so that a minimum number of dictionary entries covers the positive training instances.
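
The idea of generalizing each rule as far as possible without producing errors can be illustrated with a minimal sketch. This is not CRYSTAL's actual procedure: it swaps in a simple greedy scheme that starts from one positive training instance and drops constraints while the rule still covers no negative instance. The feature names (`verb`, `object_class`, `tense`), the toy instances, and the function names are all hypothetical, and a greedy cover of this kind is not guaranteed to yield the minimum number of rules.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: an instance is a set of feature/value pairs,
# and a rule is a set of feature/value constraints that must all match.
Instance = Dict[str, str]
Rule = Dict[str, str]


def covers(rule: Rule, instance: Instance) -> bool:
    """A rule covers an instance when every constraint in the rule matches."""
    return all(instance.get(k) == v for k, v in rule.items())


def generalize(seed: Instance, negatives: List[Instance]) -> Rule:
    """Start from the most specific rule (the seed itself) and drop
    constraints one at a time, keeping a drop only if the relaxed rule
    still covers no negative instance."""
    rule = dict(seed)
    for feature in sorted(seed):
        trial = {k: v for k, v in rule.items() if k != feature}
        if not any(covers(trial, neg) for neg in negatives):
            rule = trial
    return rule


def induce_rules(examples: List[Tuple[Instance, bool]]) -> List[Rule]:
    """Greedy covering loop: generalize a rule from an uncovered positive,
    then discard every positive that the new rule covers."""
    positives = [x for x, label in examples if label]
    negatives = [x for x, label in examples if not label]
    rules: List[Rule] = []
    uncovered = list(positives)
    while uncovered:
        rule = generalize(uncovered[0], negatives)
        rules.append(rule)
        # The rule always covers its own seed, so this list shrinks each pass.
        uncovered = [p for p in uncovered if not covers(rule, p)]
    return rules


# Toy training set (invented for illustration): True marks instances of
# the target concept, False marks instances that must not be covered.
EXAMPLES: List[Tuple[Instance, bool]] = [
    ({"verb": "reports", "object_class": "symptom", "tense": "present"}, True),
    ({"verb": "reports", "object_class": "symptom", "tense": "past"}, True),
    ({"verb": "denies", "object_class": "symptom", "tense": "present"}, False),
    ({"verb": "reports", "object_class": "medication", "tense": "present"}, False),
]

if __name__ == "__main__":
    for learned in induce_rules(EXAMPLES):
        print(learned)
```

On this toy data the `tense` constraint can be dropped without covering a negative instance, while `verb` and `object_class` cannot, so a single generalized rule covers both positive instances.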
