Semi-supervised Learning for Information Extraction from Dialogue

In this work we present a method for semi-supervised learning from transcripts of dialogue between humans. We consider the scenario in which a large number of transcripts is available and we would like to extract semantic information from them, but only a small number of the transcripts have been labeled with this information. Our method leverages the unlabeled data to learn a better model than could be learned from the labeled data alone. First, a recurrent neural network (RNN) encoder-decoder is trained on the full dialogue corpus on the unsupervised task of predicting nearby turns; next, the trained RNN encoder is reused as a feature representation for the supervised learning problem. While previous work has explored pre-training on non-dialogue corpora, our method is geared specifically toward the dialogue use case. We demonstrate an improvement on a clinical documentation task, particularly in the regime of small amounts of labeled data. We compare several types of encoders, both on the classification task and in a human evaluation of their learned representations, and show that our method significantly improves classification performance when only a small amount of labeled data is available.
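The two-stage recipe above lends itself to a compact illustration. Below is a minimal PyTorch sketch of the idea, assuming a single-layer LSTM encoder and decoder and a simple linear classifier; all names, dimensions, and the random stand-in batches are illustrative assumptions rather than the paper's actual implementation.

```python
# Minimal sketch: pre-train an encoder-decoder to predict a nearby turn,
# then reuse the encoder as a fixed feature extractor for classification.
# Vocabulary size, dimensions, and data are illustrative placeholders.
import torch
import torch.nn as nn

VOCAB_SIZE, EMB_DIM, HID_DIM, NUM_LABELS = 10000, 128, 256, 5

class TurnEncoder(nn.Module):
    """Encodes one dialogue turn (a sequence of token ids) into a vector."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.rnn = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True)

    def forward(self, tokens):            # tokens: (batch, seq_len)
        _, (h, _) = self.rnn(self.embed(tokens))
        return h[-1]                      # final hidden state: (batch, HID_DIM)

class NextTurnDecoder(nn.Module):
    """Generates a nearby turn, conditioned on the encoded context turn."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.rnn = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True)
        self.out = nn.Linear(HID_DIM, VOCAB_SIZE)

    def forward(self, context, target_tokens):
        h0 = context.unsqueeze(0)         # encoder vector as initial hidden state
        c0 = torch.zeros_like(h0)
        out, _ = self.rnn(self.embed(target_tokens), (h0, c0))
        return self.out(out)              # logits: (batch, seq_len, VOCAB_SIZE)

encoder, decoder = TurnEncoder(), NextTurnDecoder()
loss_fn = nn.CrossEntropyLoss()

# --- Stage 1: unsupervised pre-training on the full (unlabeled) corpus ---
pretrain_opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))
turn = torch.randint(0, VOCAB_SIZE, (32, 20))         # stand-in for a real batch
nearby_turn = torch.randint(0, VOCAB_SIZE, (32, 20))
logits = decoder(encoder(turn), nearby_turn[:, :-1])  # teacher forcing
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), nearby_turn[:, 1:].reshape(-1))
loss.backward()
pretrain_opt.step()

# --- Stage 2: reuse the encoder as features for the small labeled set ---
classifier = nn.Linear(HID_DIM, NUM_LABELS)
clf_opt = torch.optim.Adam(classifier.parameters())   # encoder kept frozen here
labels = torch.randint(0, NUM_LABELS, (32,))
with torch.no_grad():                                 # fixed feature representation
    features = encoder(turn)
clf_loss = loss_fn(classifier(features), labels)
clf_loss.backward()
clf_opt.step()
```

In practice each stage would iterate over real minibatches, and the encoder could be fine-tuned on the labeled data rather than frozen; freezing it here is an assumption for the sketch, not a detail the abstract specifies.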
