How may I help you?

Abstract

We are interested in providing automated services via natural spoken dialog systems. By natural, we mean that the machine understands and acts upon what people actually say, in contrast to what one would like them to say. There are many issues that arise when such systems are targeted for large populations of non-expert users. In this paper, we focus on the task of automatically routing telephone calls based on a user's fluently spoken response to the open-ended prompt of “How may I help you?”. We first describe a database generated from 10,000 spoken transactions between customers and human agents. We then describe methods for automatically acquiring language models for both recognition and understanding from such data. Experimental results evaluating call classification from speech are reported for that database. These methods have been embedded within a spoken dialog system, with subsequent processing for information retrieval and form filling.
