Application of Formal Concept Analysis to Information Retrieval Using a Hierarchically Structured Thesaurus

Our trial uses a set of 9,000 patient medical discharge summaries from the Thoracic unit at the Royal Adelaide Hospital. The discharge summaries are indexed using SNOMED (Systematized Nomenclature of Medicine). The documents are semi-structured free-text documents whose structure is described using SGML (Standard Generalized Markup Language). The discharge summaries are used as a training set from which a concept lattice is synthesized. The structure of this lattice reflects both the specialization/generalization information present in SNOMED and the combinations in which the SNOMED concepts appear in the documents. The lattice is synthesized using formal concept analysis, where the documents are treated as the objects and the medical concepts contained in SNOMED are treated as the attributes. This research is a first step towards an information retrieval (IR) system that uses conceptual graphs for semantic analysis of medical discharge summaries. The next phase requires a natural language parser to extract semantic connections between concepts. Both the concepts and their connections will be described using conceptual graphs.
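As a rough illustration of the lattice construction described above, the following Python sketch builds a formal context from a toy thesaurus and a few toy documents, then enumerates the formal concepts. The thesaurus terms, document identifiers, and function names are all hypothetical and are not taken from SNOMED or from the trial data. Closing each document's terms under the parent relation is one simple way to bring the generalization hierarchy into the context; it is an assumption made for this sketch, not necessarily the construction used in the trial.

```python
# Hypothetical toy thesaurus (child -> parent); not real SNOMED content.
PARENT = {
    "asthma": "lung disease",
    "pneumonia": "lung disease",
    "lung disease": "disease",
}

# Hypothetical documents, each indexed by a set of thesaurus terms.
DOCS = {
    "d1": {"asthma"},
    "d2": {"pneumonia"},
    "d3": {"asthma", "pneumonia"},
}


def ancestors_closure(terms):
    """Return the terms together with all of their ancestors in PARENT."""
    closed = set()
    for term in terms:
        while term is not None and term not in closed:
            closed.add(term)
            term = PARENT.get(term)
    return frozenset(closed)


def build_context(docs):
    """Formal context: each object (document) maps to its attribute set."""
    return {doc: ancestors_closure(terms) for doc, terms in docs.items()}


def concepts(context):
    """Enumerate all formal concepts as (extent, intent) pairs.

    Every concept intent is an intersection of object intents, or the
    full attribute set, so we close the rows under intersection.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for row in context.values():
        intents |= {row & intent for intent in intents}
    result = []
    for intent in intents:
        extent = frozenset(d for d, row in context.items() if intent <= row)
        result.append((extent, intent))
    # Largest extents (most general concepts) first.
    return sorted(result, key=lambda c: (-len(c[0]), sorted(c[1])))


if __name__ == "__main__":
    for extent, intent in concepts(build_context(DOCS)):
        print(sorted(extent), sorted(intent))
```

In this toy context the most general concept contains every document and carries only the shared ancestor terms, while more specific concepts group the documents that mention a particular diagnosis. The brute-force intersection closure is adequate for a handful of documents; a collection of thousands of summaries would call for a dedicated lattice-construction algorithm.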