Multimodal Human-Agent Communication
In this paper I present my approach to merging the HCI field with the growing field of agent technology, particularly from the point of view of the use of, and need for, a personal assistant. Multimodality is a must if we are ever to implement a user-friendly, intuitive and effective interface through which a user can interact with a system (agent or otherwise). Selecting a modality for presentation can be done by mapping the values of the modalities' characteristics onto each other and setting up constraints to select the most appropriate one at runtime. When designing a system, the same constraints can be used to determine what kind of multimodality should be supported to allow a user to enter the input that the user wants, or that the system requires. When a user interacts with a system through several modalities, the input must also be synchronized and interpreted, which can be done by merging input from different modalities based on their temporal relation and the expected input. As with all technologies, especially new ones, standards are an important issue. FIPA (the Foundation for Intelligent Physical Agents) is working in several areas within this field, both on an agent communication language (ACL) and on a standard for human-agent communication. The PIM project (Personal Information Management system) at Telia Research AB, Software Tech., is an attempt to determine the requirements of a personal assistant. Even though it is just a prototype, with some minor extensions it will serve as the basis for this paper when determining the needed modalities and architecture. The modality needs of this system rest heavily on speech: in most cases a simple word spotter is enough, but when dictating a letter over the phone a more sophisticated natural-language recognizer is needed. Another needed modality is pen-based gestures and written text, which are invaluable when interacting with the graphical interface to the user's calendar.
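The merging of input by temporal relation described above can be illustrated with a minimal sketch. The abstract does not specify the fusion criteria, so the fixed time window, the `ModalityEvent` type and the `fuse` function below are all hypothetical: events from different modalities whose time spans fall close enough together are grouped into one multimodal input for joint interpretation.

```python
from dataclasses import dataclass

@dataclass
class ModalityEvent:
    modality: str   # e.g. "speech" or "pen" (illustrative labels)
    content: str    # recognized content of the event
    start: float    # start time in seconds
    end: float      # end time in seconds

def fuse(events, window=1.0):
    """Group events whose start lies within `window` seconds of the
    previous group's end -- a hypothetical temporal-proximity rule
    standing in for the paper's unspecified fusion criteria."""
    events = sorted(events, key=lambda e: e.start)
    groups = []
    for ev in events:
        if groups and ev.start - groups[-1][-1].end <= window:
            groups[-1].append(ev)   # close in time: same multimodal act
        else:
            groups.append([ev])     # gap too large: start a new act
    return groups
```

For example, the spoken phrase "schedule a meeting here" followed 0.3 seconds later by a pen tap on a calendar slot would be fused into one group, while an utterance several seconds later would start a new one.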
Choosing an architecture for this particular type of application was not a very difficult task. The OAA has been used in similar applications, and it follows the basic notion of being a system of cooperating agents. Nevertheless, it still lacks some parts of the multimodal communication support, e.g. user modeling, which cannot go unmentioned. Therefore I have outlined the basics of my …