MVA: The Multimodal Virtual Assistant

The Multimodal Virtual Assistant (MVA) is an application that enables users to plan an outing through an interactive multimodal dialog with a mobile device. MVA demonstrates how a cloud-based multimodal language processing infrastructure can support mobile multimodal interaction. This demonstration will highlight incremental recognition, multimodal speech and gesture input, contextually aware language understanding, and the targeted clarification of potentially incorrect segments within user input.
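As a rough illustration of how speech and gesture inputs might be fused in a system like this, the following Python sketch grounds a deictic location reference ("here") in a map gesture and flags low-confidence hypotheses for targeted clarification. This is not the MVA implementation; the class names, fields, and the 0.5 confidence threshold are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpeechHypothesis:
    text: str          # partial or final ASR transcript
    confidence: float  # recognizer confidence in [0, 1]

@dataclass
class Gesture:
    kind: str   # e.g. "point" or "circle" drawn on the map
    lat: float
    lon: float

def fuse(speech: SpeechHypothesis, gesture: Optional[Gesture]) -> dict:
    """Combine a speech hypothesis with an optional map gesture.

    If the utterance contains a deictic location term and a gesture is
    present, ground the term in the gesture's coordinates. Low-confidence
    hypotheses are flagged so a dialog manager could ask a targeted
    clarification question instead of guessing.
    """
    query = speech.text
    if gesture is not None and "here" in query:
        query = query.replace("here", f"({gesture.lat}, {gesture.lon})")
    return {
        "query": query,
        "needs_clarification": speech.confidence < 0.5,  # illustrative threshold
    }

if __name__ == "__main__":
    hyp = SpeechHypothesis("restaurants near here", confidence=0.82)
    circle = Gesture(kind="circle", lat=40.7506, lon=-73.9935)
    print(fuse(hyp, circle))
    # {'query': 'restaurants near (40.7506, -73.9935)', 'needs_clarification': False}
```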