Recognition and Understanding Simulation for a Spoken Dialog Corpus Acquisition

Designing and acquiring a new dialog corpus is a complex task, so methods that facilitate it are needed. In this paper, we present a methodology that builds on our previous work on dialog systems to acquire a dialog corpus for a new domain. The main idea is to simulate recognition and understanding errors during the acquisition of the new dialog corpus. This simulation is based on an analysis of such errors in a previously acquired corpus and on a correspondence table between the concepts and attributes of the two tasks. The correspondence table is based on similarity of semantic meaning and frequencies. Finally, we illustrate the application of this methodology with some examples.
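
As a rough illustration of the idea, the following minimal Python sketch corrupts a semantic frame of the new task using error rates taken from the corresponding concepts of an old task. Everything in it is an assumption made for illustration and does not come from the paper: the concept names, the error rates, the restriction to deletion and substitution errors, and the function name `simulate_errors`.

```python
import random

# Hypothetical per-concept error rates, standing in for figures that would be
# estimated from a previously acquired corpus (values invented for illustration).
OLD_TASK_ERRORS = {
    "DEPARTURE_TIME": {"delete": 0.10, "substitute": 0.05},
    "DESTINATION": {"delete": 0.05, "substitute": 0.15},
}

# Hypothetical correspondence table: new-task concept -> old-task concept.
CORRESPONDENCE = {
    "BOOKING_HOUR": "DEPARTURE_TIME",
    "SPORT": "DESTINATION",
}


def simulate_errors(frame, correspondence, old_errors, rng):
    """Return a noisy copy of `frame`, a list of (concept, value) pairs.

    Each concept is deleted or relabelled with the error rates of the
    old-task concept it corresponds to; concepts without a correspondence
    are passed through unchanged.
    """
    noisy = []
    for concept, value in frame:
        rates = old_errors.get(correspondence.get(concept), {})
        p_delete = rates.get("delete", 0.0)
        p_substitute = rates.get("substitute", 0.0)
        draw = rng.random()  # uniform in [0, 1)
        if draw < p_delete:
            continue  # simulated deletion: the concept is lost
        if draw < p_delete + p_substitute:
            others = [c for c in correspondence if c != concept]
            if others:
                # simulated substitution: wrong concept label, same value
                noisy.append((rng.choice(others), value))
                continue
        noisy.append((concept, value))
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    user_frame = [("BOOKING_HOUR", "10:00"), ("SPORT", "tennis")]
    print(simulate_errors(user_frame, CORRESPONDENCE, OLD_TASK_ERRORS, rng))
```

In this sketch the correspondence table is written by hand; in the methodology described above it would instead be defined from semantic similarity and frequencies.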