XM-flow: An Extensible Micro-flow for Multimodal Interaction

This paper presents a synchronization module for a multimodal dialog system architecture based on the model-view-controller (MVC) pattern for human-computer interaction. The MVC pattern cleanly separates objects into three categories: the model, which defines and maintains data; the view, which renders interactions based on that data; and the controller, which coordinates the actions and events that affect the model and view(s). As part of our layered multimodal dialog system architecture, the synchronization module controls the synchronization of multiple modalities, such as speech, mouse, and keyboard, by interpreting an XML document that incorporates SMIL and EMMA. It isolates the dialog model from the complex presentations associated with different channels and user interfaces through a generic object binding mechanism. This flexibility gives greater design freedom in multimodal dialog system architecture, supporting client-based, server-based, and distributed solutions.
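The MVC separation and the binding idea described above can be illustrated with a minimal sketch. It is not the XM-flow implementation: the names `DialogModel`, `Controller`, `bind`, and `handle` are hypothetical, the views are stand-in callbacks, and no SMIL or EMMA document is parsed. The sketch only assumes that views are attached to model slots through a generic binding, so that input from any modality updates the model once and every bound presentation follows.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout; this is an illustration of the MVC idea,
# not the module described in the paper.
Render = Callable[[str, str], None]


class DialogModel:
    """Model: holds dialog data and notifies bound views when a slot changes."""

    def __init__(self) -> None:
        self._slots: Dict[str, str] = {}
        self._bindings: Dict[str, List[Render]] = {}

    def bind(self, slot: str, render: Render) -> None:
        # Generic binding: any callable can act as a view for a slot.
        self._bindings.setdefault(slot, []).append(render)

    def set(self, slot: str, value: str) -> None:
        self._slots[slot] = value
        for render in self._bindings.get(slot, []):
            render(slot, value)

    def get(self, slot: str) -> str:
        return self._slots[slot]


class Controller:
    """Controller: routes input events from any modality to the model."""

    def __init__(self, model: DialogModel) -> None:
        self.model = model

    def handle(self, modality: str, slot: str, value: str) -> None:
        # The modality is ignored here; a real synchronizer would use it
        # (together with timing information) to order or merge inputs.
        self.model.set(slot, value)


log: List[str] = []
model = DialogModel()
model.bind("city", lambda s, v: log.append(f"gui:{s}={v}"))  # visual view
model.bind("city", lambda s, v: log.append(f"tts:{s}={v}"))  # speech view

controller = Controller(model)
controller.handle("speech", "city", "Boston")
controller.handle("keyboard", "city", "Austin")
```

Because the model knows its views only as bound callables, a presentation channel can be added or removed without touching the dialog data or the controller.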