A scalable architecture for Directory Assistance automation

We present a novel architecture for automated telephone Directory Assistance (DA). The architecture couples a large-vocabulary speech recognition engine, driven by a statistical n-gram language model, with a statistical retrieval system. The n-gram language model allows unconstrained spoken queries to be recognized, while the retrieval engine tolerates an inexact match between a particular spoken query and the training data. Together, unconstrained recognition and inexact matching provide the framework for high levels of automation. Once the retrieval engine returns a ranked set of frequently requested telephone numbers (FRNs), a rejection module uses a classifier to compute a confidence-like score on which the automation decision is based. Evaluated on actual customer calls into an operational, automated DA call center with an FRN set of 25,000 numbers, the new architecture delivers more than 17% correct automation at a false accept rate of 0.76%.
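The retrieval and rejection stages described above can be illustrated with a minimal Python sketch. It is not the system from the paper. It assumes the recognizer has already produced a text transcript, it stands in for the statistical retrieval engine with a smoothed unigram query-likelihood scorer, and it replaces the rejection classifier with a simple score-margin threshold. The listings, phone numbers, function names, and the `alpha` and `min_margin` parameters are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical training data: each FRN maps to transcribed queries that
# operators resolved to that number. All values are made up for illustration.
TRAINING = {
    "555-0100": ["pizza palace on main street", "pizza palace main"],
    "555-0101": ["city hospital emergency room", "city hospital"],
    "555-0102": ["main street hardware store", "hardware store on main"],
}

def build_models(training):
    """Build per-FRN word counts plus background counts over all queries."""
    models = {frn: Counter(w for q in queries for w in q.split())
              for frn, queries in training.items()}
    background = Counter()
    for counts in models.values():
        background.update(counts)
    return models, background

def log_likelihood(query, counts, background, alpha=0.7):
    """Log P(query | FRN) under a unigram model mixed with a background model.

    The background term (add-one smoothed) keeps unseen words such as "uh"
    from zeroing the score, which is what permits an inexact match.
    """
    total = sum(counts.values())
    bg_total = sum(background.values())
    vocab = len(background) + 1  # one extra slot for unseen words
    score = 0.0
    for word in query.split():
        p_frn = counts[word] / total
        p_bg = (background[word] + 1) / (bg_total + vocab)
        score += math.log(alpha * p_frn + (1 - alpha) * p_bg)
    return score

def retrieve(query, models, background):
    """Return (frn, score) pairs ranked from best to worst."""
    scored = [(log_likelihood(query, counts, background), frn)
              for frn, counts in models.items()]
    return [(frn, score) for score, frn in sorted(scored, reverse=True)]

def decide(ranked, min_margin=1.0):
    """Stand-in for the rejection classifier: automate only when the top
    candidate beats the runner-up by at least min_margin in log score."""
    (best_frn, best_score), (_, second_score) = ranked[0], ranked[1]
    return best_frn if best_score - second_score >= min_margin else None

models, background = build_models(TRAINING)
print(decide(retrieve("uh pizza palace please", models, background)))
print(decide(retrieve("main street", models, background)))
```

In this toy setup the first query is automated despite the filler words, while the second is rejected (and would go to an operator) because two listings score equally. Raising `min_margin` trades correct automation for a lower false accept rate, which is the trade-off the reported figures describe.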
