Semantics-Based Automatic Literal Reconstruction of Dictations

This paper describes a method for the automatic literal reconstruction of dictations in the domain of medical reports. The reconstruction algorithm takes two inputs: the raw output of an automatic speech recognition system and the final report edited by a professional medical transcriptionist. Reconstruction is based on an automatic alignment of the speech recognition result with the edited report. Semantic representations are assigned to terms and phrases using an ontology (UMLS) and lexical resources (WordNet and an inventory of spoken variants for each concept). The alignment takes into account two kinds of scores: semantic similarity scores, computed from the semantic representations of the two sources, and phonetic similarity scores. The paper explains how the speech recognition output is compared and aligned with the edited written documents, and how the two input sources complement each other in reconstructing a literal transcript.
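To make the alignment idea concrete, the following is a minimal sketch, not the method of the paper itself. It aligns a recognized token sequence with an edited one by global dynamic programming, scoring each token pair with a weighted mix of a semantic and a phonetic similarity. Everything named here is an assumption for illustration: the toy `CONCEPTS` table stands in for UMLS and WordNet lookups, character overlap from `difflib` stands in for a real phonetic comparison, and the weights, threshold and gap penalty are arbitrary.

```python
from difflib import SequenceMatcher

# Hypothetical toy inventory mapping spoken and written variants to a concept id.
# It stands in for the ontology and lexical resources; it is not real UMLS data.
CONCEPTS = {
    "milligrams": "C_MILLIGRAM",
    "milligram": "C_MILLIGRAM",
    "mg": "C_MILLIGRAM",
    "hypertension": "C_HYPERTENSION",
    "htn": "C_HYPERTENSION",
}


def semantic_similarity(a, b):
    """Return 1.0 for identical tokens or tokens sharing a concept, else 0.0."""
    if a == b:
        return 1.0
    concept_a, concept_b = CONCEPTS.get(a), CONCEPTS.get(b)
    return 1.0 if concept_a is not None and concept_a == concept_b else 0.0


def phonetic_similarity(a, b):
    """Crude stand-in: character overlap ratio instead of a phoneme comparison."""
    return SequenceMatcher(None, a, b).ratio()


def similarity(a, b, w_sem=0.6, w_phon=0.4):
    """Weighted combination of both scores (weights are assumed values)."""
    return w_sem * semantic_similarity(a, b) + w_phon * phonetic_similarity(a, b)


def align(asr, report, gap=-0.25, threshold=0.5):
    """Globally align two token lists; None marks a token with no counterpart."""
    n, m = len(asr), len(report)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Pairs below the threshold contribute a negative score.
            pair = similarity(asr[i - 1], report[j - 1]) - threshold
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two tokens
                score[i - 1][j] + gap,       # recognized token has no counterpart
                score[i][j - 1] + gap,       # report token has no counterpart
            )

    # Trace back from the bottom-right cell, preferring aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = similarity(asr[i - 1], report[j - 1]) - threshold
            if score[i][j] == score[i - 1][j - 1] + pair:
                pairs.append((asr[i - 1], report[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((asr[i - 1], None))
            i -= 1
        else:
            pairs.append((None, report[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    recognized = ["take", "ten", "milligrams", "um", "daily"]
    edited = ["take", "ten", "mg", "daily"]
    for spoken, written in align(recognized, edited):
        print(spoken, "<->", written)
```

In this toy run the spoken form "milligrams" is paired with the written "mg" through the shared concept, while the filler "um" is left without a counterpart in the edited text. This shows, on a small scale, how each source can supply material the other lacks when a literal transcript is reconstructed.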