Dialog state tracking for interview coaching using two-level LSTM

This study presents an approach to dialog state tracking (DST) in interview conversations using long short-term memory (LSTM) networks and an artificial neural network (ANN). First, each word is represented as a word embedding obtained from the word2vec model. Then, an LSTM-based sentence model maps each input sentence to a sentence hidden vector. The sentence hidden vectors of an interviewee's answer are fed, in order, to an LSTM-based answer model, which maps the whole answer to an answer hidden vector. Finally, an ANN-based dialog state detection model uses the answer hidden vector to detect the dialog state. To evaluate the proposed method, an interview conversation system was constructed, and an average accuracy of 89.93% was obtained for dialog state detection.
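The pipeline described above (word vectors, then a sentence-level LSTM, then an answer-level LSTM, then a classifier) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class and function names, all dimensions, the random untrained weights, and the random vectors standing in for word2vec embeddings are assumptions made for illustration, and the ANN is reduced to a single softmax layer for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTM:
    """Plain LSTM (hypothetical helper) that returns its final hidden vector."""

    def __init__(self, rng, input_dim, hidden_dim):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # applied to the concatenation [x; h_prev]. Biases start at zero.
        self.W = {g: rand_matrix(rng, hidden_dim, input_dim + hidden_dim) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def encode(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            z = list(x) + h
            pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])] for g in "ifoc"}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def detect_state(answer, sentence_lstm, answer_lstm, W_out, b_out):
    """answer: list of sentences; each sentence is a list of word vectors."""
    # Level 1: one sentence hidden vector per sentence.
    sentence_vecs = [sentence_lstm.encode(words) for words in answer]
    # Level 2: one answer hidden vector for the whole answer.
    answer_vec = answer_lstm.encode(sentence_vecs)
    # Classifier: a single softmax layer stands in for the ANN (assumption).
    logits = [s + b for s, b in zip(matvec(W_out, answer_vec), b_out)]
    return softmax(logits)


rng = random.Random(0)
EMBED_DIM, SENT_DIM, ANS_DIM, NUM_STATES = 8, 6, 5, 4  # illustrative sizes only
sentence_lstm = LSTM(rng, EMBED_DIM, SENT_DIM)
answer_lstm = LSTM(rng, SENT_DIM, ANS_DIM)
W_out = rand_matrix(rng, NUM_STATES, ANS_DIM)
b_out = [0.0] * NUM_STATES

# Toy answer: two sentences of 3 and 5 words; random vectors replace word2vec.
answer = [[[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(n)] for n in (3, 5)]
probs = detect_state(answer, sentence_lstm, answer_lstm, W_out, b_out)
predicted_state = max(range(NUM_STATES), key=probs.__getitem__)
```

Because the weights are untrained, the resulting distribution over dialog states carries no meaning; the sketch only shows how sentences of different lengths collapse into fixed-size vectors at each level before classification.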
