Paper information - Review of a framework for audiovisual dialog-based in human computer interaction


This paper reviews a practical system that aims to detect a user's intent to speak to a computer. The system uses speech recognized from both audio and visual information as contextual information, with the goal of making communication between users and computers more human-like. It employs an adaptive module to select a grammar suited to the program. In addition to audio, the system uses the visual modality to increase word accuracy.

Hasanudin | Sri Supadmini | Marina Agathya | S. Muhammad Brilliant | N. Rizki Akbar
