Paper information - The Spoken Language Component of the Mask Kiosk

The aim of the Multimodal-Multimedia Automated Service Kiosk (MASK) project is to pave the way for more advanced public service applications through user interfaces that employ multimodal, multimedia input and output. The project has analyzed the technological requirements in the context of users and the tasks they perform in carrying out travel enquiries, and has developed a prototype information kiosk that will be installed in the Gare St. Lazare in Paris. The kiosk will improve the effectiveness of such services by enabling interaction through the coordinated use of multimodal inputs (speech and touch) and multimedia output (sound, video, text, and graphics), and in doing so will create the opportunity for new public services. Vocal input is managed by a spoken language system, which aims to provide a natural interface between the user and the computer through simple, natural dialogs. This paper describes the architecture and capabilities of the spoken language system, with emphasis on the speaker-independent, large-vocabulary continuous speech recognizer, the natural language component (including semantic analysis and dialog management), and the response generator. We also describe our data collection and evaluation activities, which are crucial to system development.

L. F. Lamel | J. L. Gauvain
