Geo-Centric Language Models for Local Business Voice Search

Voice search is increasingly popular, especially for local business directory assistance. However, speech recognition accuracy on business listing names remains low, which frustrates users. In this paper, we present a new algorithm for generating geo-centric language models for local business voice search by mobile users. Our algorithm has several advantages: it provides a language model for any user in any location; the geographic area covered by each language model is adapted to the local business density, giving high recognition accuracy; and the language models can be pre-compiled, giving fast recognition time. In an experiment using spoken business listing name queries from a business directory assistance service, geo-centric language models achieve a 16.8% absolute improvement in recognition accuracy and a 3-fold speedup in recognition time compared with a nationwide language model.
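The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a density-adapted coverage area could work, not the authors' method. It assumes that each language model is built from a fixed number of listings nearest to a center point, so the covered radius shrinks where businesses are dense and grows where they are sparse, and it uses a plain unigram model over listing-name words as a stand-in for a real recognizer language model. All names (`haversine_km`, `listings_for_center`, `unigram_lm`) and the sample listings are hypothetical.

```python
import math
from collections import Counter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def listings_for_center(center, listings, target_count):
    """Pick the target_count listings nearest to center.

    Returns the chosen listings and the radius (km) needed to cover them.
    Assumption: a fixed listing count per model is what adapts the area
    to local business density.
    """
    lat, lon = center
    ranked = sorted(listings, key=lambda b: haversine_km(lat, lon, b[1], b[2]))
    chosen = ranked[:target_count]
    radius = haversine_km(lat, lon, chosen[-1][1], chosen[-1][2]) if chosen else 0.0
    return chosen, radius


def unigram_lm(chosen):
    """Relative word frequencies over the chosen listing names (illustrative only)."""
    counts = Counter(word for name, _, _ in chosen for word in name.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Hypothetical listings: (name, latitude, longitude).
LISTINGS = [
    ("Joe's Pizza", 40.7300, -73.9900),
    ("Main Street Pizza", 40.7310, -73.9910),
    ("Corner Cafe", 40.7320, -73.9890),
    ("Farm Supply", 41.5000, -75.0000),
    ("Lakeside Diner", 41.9000, -75.6000),
]

dense_listings, dense_radius = listings_for_center((40.7305, -73.9900), LISTINGS, 3)
sparse_listings, sparse_radius = listings_for_center((41.5000, -75.0000), LISTINGS, 3)
dense_lm = unigram_lm(dense_listings)
```

Because each center's listing set depends only on the listing database, models of this kind could be built offline for a grid of centers and looked up by the user's location at query time, which is consistent with the pre-compilation advantage described above.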
