Leveraging semantic web search and browse sessions for multi-turn spoken dialog systems

Training statistical dialog models in spoken dialog systems (SDS) requires large amounts of annotated data. The lack of scalable methods for data mining and annotation poses a significant hurdle for state-of-the-art statistical dialog managers. This paper presents an approach that directly leverages billions of web search and browse sessions to overcome this hurdle. The key insight is that task completion through web search and browse sessions is (a) predictable and (b) generalizes to spoken dialog task completion. The new method automatically mines behavioral search and browse patterns from web logs and translates them into spoken dialog models. We experiment with naturally occurring spoken dialogs and large-scale web logs. Our session-based models outperform the state-of-the-art method on the entity extraction task in SDS. We also achieve better performance on both entity and relation extraction from web search queries when compared with nontrivial baselines.
