Preferred modalities in dialogue systems

This study investigates which input modalities users prefer in particular contexts when interacting with a multimodal dialogue system. The trade-off between three factors is examined: (i) speech recognition performance, (ii) the efficiency of the input modality and (iii) the system’s output modality. Four versions of a multimodal examiner for use in elementary schools were developed. The versions differed in recognition performance (‘perfect’ vs. realistic) and in output modality (speech vs. text). In all versions, subjects could provide input by either speaking or typing. The length of an answer in characters was used as the measure of efficiency. Results show that both speech recognition performance and efficiency have a strong impact on the preferred input modality. No effect of the system’s output modality was found.
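As an illustration of the efficiency measure, the following minimal sketch computes answer length in characters and tallies how often speech is chosen for short and long answers. The function names, the `short_max` threshold, and the example trials are assumptions introduced here for illustration; they are not taken from the study and do not reproduce its data or analysis.

```python
from collections import defaultdict


def answer_length(answer: str) -> int:
    """Efficiency proxy: number of characters in the answer."""
    return len(answer)


def speech_share_by_length(trials, short_max=10):
    """Return the fraction of spoken inputs for short and long answers.

    `trials` is an iterable of (answer, modality) pairs, where modality is
    "speech" or "typing". The `short_max` cut-off is an arbitrary assumption.
    """
    counts = defaultdict(lambda: {"speech": 0, "typing": 0})
    for answer, modality in trials:
        bin_name = "short" if answer_length(answer) <= short_max else "long"
        counts[bin_name][modality] += 1
    return {
        bin_name: c["speech"] / (c["speech"] + c["typing"])
        for bin_name, c in counts.items()
    }


# Made-up trials for illustration only; not data from the study.
example_trials = [
    ("Paris", "typing"),
    ("7", "typing"),
    ("the Atlantic Ocean", "speech"),
    ("photosynthesis", "speech"),
    ("blue", "speech"),
]
shares = speech_share_by_length(example_trials)
```

Binning answers into two length classes is only one possible way to relate answer length to modality choice; a regression on the raw character count would be an equally plausible alternative.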