Smart Sight: a tourist assistant system

In this paper, we present our efforts towards developing an intelligent tourist assistant system. The system combines a unique set of sensors and software: the hardware includes two computers, a GPS receiver, a lapel microphone with an earphone, a video camera, and a head-mounted display. On top of this hardware, the system provides a multimodal interface that exploits speech and gesture input to assist a tourist. The software supports natural language processing, speech recognition, machine translation, handwriting recognition, and multimodal fusion. A vision module is trained to locate and read written language, can adapt to new environments, and can interpret cues offered by the user, such as a spoken clarification or a pointing gesture. We illustrate the applications of the system with two examples.
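To make the multimodal fusion idea concrete, the sketch below shows one simple way speech and gesture input could be combined: a deictic word in an utterance ("this", "that") is resolved against the pointing gesture closest in time. This is an illustrative simplification, not the paper's actual fusion algorithm; all names (`SpeechEvent`, `GestureEvent`, `fuse`, the 1.5 s window) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SpeechEvent:
    text: str   # recognized utterance
    t: float    # timestamp in seconds

@dataclass
class GestureEvent:
    target: str  # object under the pointing gesture
    t: float

DEICTICS = {"this", "that", "here", "there"}

def fuse(speech: SpeechEvent, gestures: list, window: float = 1.5) -> str:
    """Resolve a deictic reference by pairing the utterance with the
    pointing gesture closest in time, within a fixed window."""
    if not any(w in DEICTICS for w in speech.text.lower().split()):
        return speech.text  # nothing to resolve
    nearby = [g for g in gestures if abs(g.t - speech.t) <= window]
    if not nearby:
        return speech.text  # no gesture close enough in time
    best = min(nearby, key=lambda g: abs(g.t - speech.t))
    # Substitute the gesture target for each deictic word.
    words = [best.target if w.lower() in DEICTICS else w
             for w in speech.text.split()]
    return " ".join(words)
```

For example, the utterance "What is this" at t = 10.0 s, paired with a pointing gesture at a landmark at t = 10.3 s, would resolve to a fully specified query about that landmark.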
