Two-stage multi-intent detection for spoken language understanding

This paper presents a system that detects multiple intents (MIs) in an input sentence when only single-intent (SI)-labeled training data are available. To solve the problem, this paper categorizes input sentences into three types and uses a two-stage approach in which each stage attempts to detect MIs in different types of sentences. In the first stage, the system generates MI hypotheses based on conjunctions in the input sentence, evaluates the hypotheses, and selects the best one that satisfies specified conditions. In the second stage, the system applies sequence labeling to mark intents on the input sentence; the sequence labeling model is trained on the SI-labeled training data. In experiments, the proposed two-stage MI detection method reduced errors for written and spoken input by 20.54% and 17.34%, respectively.
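The first stage described above can be illustrated with a minimal sketch: split the sentence at conjunctions to form MI hypotheses, score each segment with a single-intent classifier, and keep the best hypothesis whose segments all satisfy a confidence condition. Everything here is an assumption for illustration (the conjunction lexicon, the keyword-based `toy_score_intent` stand-in, and the selection rule are not from the paper, which does not specify its classifier or conditions).

```python
# Illustrative sketch of conjunction-based MI hypothesis generation.
# All names and thresholds are hypothetical, not from the paper.

CONJUNCTIONS = {"and", "then"}  # assumed conjunction lexicon


def toy_score_intent(segment):
    """Stand-in for a single-intent classifier trained on SI-labeled data.
    Returns (intent, confidence) via simple keyword matching."""
    keywords = {"play": "PlayMusic", "weather": "GetWeather", "alarm": "SetAlarm"}
    for word in segment.split():
        if word in keywords:
            return keywords[word], 0.9
    return "Unknown", 0.1


def generate_hypotheses(sentence):
    """Form candidate segmentations: the unsplit sentence (SI hypothesis)
    plus one two-segment hypothesis per internal conjunction."""
    tokens = sentence.lower().split()
    hypotheses = [[tokens]]
    for i, tok in enumerate(tokens):
        if tok in CONJUNCTIONS and 0 < i < len(tokens) - 1:
            hypotheses.append([tokens[:i], tokens[i + 1:]])
    return hypotheses


def best_hypothesis(sentence, min_conf=0.5):
    """Reject hypotheses containing a low-confidence segment; among the
    survivors, prefer the one with more detected intents."""
    best = None
    for hyp in generate_hypotheses(sentence):
        scored = [toy_score_intent(" ".join(seg)) for seg in hyp]
        if min(conf for _, conf in scored) < min_conf:
            continue  # a segment failed the confidence condition
        if best is None or len(scored) > len(best):
            best = [intent for intent, _ in scored]
    return best
```

On a sentence like "play music and check the weather", the split hypothesis scores well on both segments and yields two intents, while "play some music" has no internal conjunction and falls through to the single-intent case.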
