Translating natural language utterances to search queries for SLU domain detection using query click logs

Logs of user queries from a search engine (such as Bing or Google), together with the links clicked, provide valuable implicit feedback for improving statistical spoken language understanding (SLU) models. However, natural language utterances in spoken interactions with a computer differ stylistically from keyword search queries. In this paper, we propose a machine translation approach that learns a mapping from natural language utterances to search queries. We train statistical translation models on task- and domain-independent pairs of semantically equivalent natural language and keyword search queries mined from search query click logs. We then extend our previous work on enriching classification feature sets for utterance domain detection: each user utterance is automatically translated into a query, and features are computed from that query's click distribution over a set of clicked URLs in the search engine query click logs. This approach yields significant improvements in domain detection, especially for user utterances formulated as natural language queries, and effectively complements the earlier work using syntactic transformations.
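As an illustration of the click-distribution features mentioned above, the following is a minimal sketch, not the paper's implementation. It assumes the utterance has already been translated into a keyword query, that the click log is available as (query, URL, click count) records, and that clicks are aggregated per host. The names `click_features`, `url_hosts`, and the sample log entries are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def click_features(click_log, query, url_hosts):
    """Return the click distribution of `query` over a fixed list of hosts.

    click_log: iterable of (query, url, clicks) records (assumed format).
    url_hosts: ordered list of hosts that define the feature dimensions.
    Clicks on hosts outside `url_hosts` still count toward the total, so
    the returned values may sum to less than 1.
    """
    counts = Counter()
    for logged_query, url, clicks in click_log:
        if logged_query == query:
            counts[urlparse(url).netloc] += clicks

    total = sum(counts.values())
    if total == 0:
        # Query not found in the log: no click evidence, all-zero features.
        return [0.0] * len(url_hosts)
    return [counts[host] / total for host in url_hosts]


# Hypothetical click log entries, for illustration only.
sample_log = [
    ("avatar cast", "https://www.imdb.com/title/x", 3),
    ("avatar cast", "https://example.org/a", 1),
    ("pizza near me", "https://www.opentable.com/s", 2),
]
hosts = ["www.imdb.com", "www.opentable.com"]
features = click_features(sample_log, "avatar cast", hosts)
```

A feature vector of this kind could then be appended to the existing features of a domain classifier. How the features are actually combined with the classifier is not specified in the abstract.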
