Using Multimodal Information to Support Spoken Dialogue Interaction between Humans and Robots without Intrusive Language Processing

This position paper expounds our recent views on autonomous spoken dialogue processing without (or with only minimal use of) automatic speech recognition (ASR). It argues that autonomous systems can be trained to read non-verbal signals that facilitate transitions through a pre-prepared or stored utterance sequence, so that only minimal processing of the actual spoken content is needed. Of course, no system, machine or human, can sustain an extended conversation without understanding its meaning, but we claim that it is not necessary to process each and every spoken word in order to complete an everyday spoken interaction satisfactorily.