Handling Knowledge Sources in Human-Machine Interaction

This article describes the knowledge sources that are generally required to handle multimodal human-machine interaction efficiently: the task, user, dialogue, environment and system models. The first part discusses the content of these models, with special emphasis on the problems that arise when speech is combined with other modalities. The second part focuses on the characteristics of spoken language and proposes an adapted semantic representation for the task model; it also describes a stochastic method for collecting and processing the information related to this model. The conclusion discusses how such a stochastic method can be extended to multimodality.
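As a rough illustration of the stochastic approach mentioned above, the sketch below decodes a word sequence into task-model concepts (slots) with a simple HMM-style Viterbi search, in the spirit of hidden understanding models. The concept inventory, vocabulary and probabilities are illustrative assumptions, not values from the article; in practice they would be estimated from an annotated corpus such as ATIS.

    # Minimal sketch: decoding words into task-model concepts with Viterbi.
    # All concepts, words and probabilities are illustrative assumptions.
    import math

    CONCEPTS = ["null", "departure_city", "arrival_city", "date"]

    # Toy transition and emission probabilities (estimated from data in practice).
    TRANS = {
        "null": {"null": 0.5, "departure_city": 0.2, "arrival_city": 0.2, "date": 0.1},
        "departure_city": {"null": 0.4, "arrival_city": 0.4, "date": 0.1, "departure_city": 0.1},
        "arrival_city": {"null": 0.5, "date": 0.3, "departure_city": 0.1, "arrival_city": 0.1},
        "date": {"null": 0.7, "departure_city": 0.1, "arrival_city": 0.1, "date": 0.1},
    }
    EMIT = {
        "null": {"i": 0.2, "want": 0.2, "a": 0.2, "flight": 0.2, "from": 0.1, "to": 0.1},
        "departure_city": {"boston": 0.5, "paris": 0.5},
        "arrival_city": {"denver": 0.5, "london": 0.5},
        "date": {"monday": 0.5, "tomorrow": 0.5},
    }

    def viterbi(words):
        """Return the most likely concept sequence for the given words."""
        # Initialise with uniform priors over concepts.
        scores = [{c: math.log(1.0 / len(CONCEPTS)) +
                   math.log(EMIT[c].get(words[0], 1e-6)) for c in CONCEPTS}]
        back = []
        for w in words[1:]:
            col, ptr = {}, {}
            for c in CONCEPTS:
                best_prev = max(CONCEPTS,
                                key=lambda p: scores[-1][p] + math.log(TRANS[p].get(c, 1e-6)))
                col[c] = (scores[-1][best_prev] +
                          math.log(TRANS[best_prev].get(c, 1e-6)) +
                          math.log(EMIT[c].get(w, 1e-6)))
                ptr[c] = best_prev
            scores.append(col)
            back.append(ptr)
        # Trace back the best-scoring concept path.
        last = max(CONCEPTS, key=lambda c: scores[-1][c])
        path = [last]
        for ptr in reversed(back):
            path.append(ptr[path[-1]])
        return list(reversed(path))

    if __name__ == "__main__":
        utterance = "i want a flight from boston to denver monday".split()
        print(list(zip(utterance, viterbi(utterance))))

On this toy input the decoder aligns "boston", "denver" and "monday" with the departure_city, arrival_city and date concepts while mapping the remaining words to the null concept, which is the kind of concept-level annotation such a stochastic task model would produce.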
