Semi-Supervised Neural Text Generation by Joint Learning of Natural Language Generation and Natural Language Understanding Models

In Natural Language Generation (NLG), End-to-End (E2E) systems trained through deep learning have recently attracted strong interest. Such deep models need a large amount of carefully annotated data to reach satisfactory performance. However, acquiring such datasets for every new NLG application is a tedious and time-consuming task. In this paper, we propose a semi-supervised deep learning scheme that can learn from non-annotated data and from annotated data when available. It jointly learns an NLG and a Natural Language Understanding (NLU) sequence-to-sequence model to compensate for the lack of annotation. Experiments on two benchmark datasets show that, with a limited amount of annotated data, the method achieves very competitive results without any pre-processing or re-scoring tricks. These findings open the way to exploiting non-annotated datasets and thus address the current bottleneck in developing E2E NLG systems for new applications.
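
The joint scheme described above can be made concrete with a small sketch. The following is a minimal, hypothetical PyTorch implementation, not the authors' code: both models are reduced to plain GRU encoder-decoders (the paper's sequence-to-sequence models may include attention and other details omitted here), and the cycle through the intermediate discrete sequence is made differentiable with a straight-through Gumbel-Softmax relaxation, which is one common choice rather than a confirmed detail of the paper. All names, sizes, and hyperparameters are illustrative.

```python
# Minimal sketch (not the authors' code) of joint semi-supervised
# NLG/NLU training: supervised losses on annotated (MR, text) pairs,
# plus a text -> pseudo-MR -> text reconstruction loss on unannotated
# text. The straight-through Gumbel-Softmax relaxation is an assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Seq2Seq(nn.Module):
    """Deliberately small GRU encoder-decoder (no attention)."""
    def __init__(self, src_vocab, tgt_vocab, dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt_in, src_soft=None):
        # src_soft: optional (B, T, src_vocab) relaxed one-hot input,
        # used when the input comes from the other model's sampler.
        enc_in = (src_soft @ self.src_emb.weight if src_soft is not None
                  else self.src_emb(src))
        _, h = self.encoder(enc_in)
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)
        return self.out(dec_out)  # (B, T, tgt_vocab) logits

    def generate_soft(self, src, steps, bos, tau=1.0):
        # Autoregressive decoding emitting straight-through
        # Gumbel-Softmax samples: one-hot forward, soft gradients.
        _, h = self.encoder(self.src_emb(src))
        tok = self.tgt_emb(torch.full((src.size(0), 1), bos, dtype=torch.long))
        outs = []
        for _ in range(steps):
            o, h = self.decoder(tok, h)
            y = F.gumbel_softmax(self.out(o), tau=tau, hard=True)
            outs.append(y)
            tok = y @ self.tgt_emb.weight  # feed the sample back in
        return torch.cat(outs, dim=1)      # (B, steps, tgt_vocab)

MR_V, TXT_V = 50, 100          # toy vocabulary sizes (MR tokens / words)
nlg = Seq2Seq(MR_V, TXT_V)     # meaning representation -> text
nlu = Seq2Seq(TXT_V, MR_V)     # text -> meaning representation
opt = torch.optim.Adam(list(nlg.parameters()) + list(nlu.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()

def supervised_loss(mr, txt):
    # Annotated (MR, text) pairs train both directions directly.
    nlg_logits = nlg(mr, txt[:, :-1])
    nlu_logits = nlu(txt, mr[:, :-1])
    return (ce(nlg_logits.reshape(-1, TXT_V), txt[:, 1:].reshape(-1))
            + ce(nlu_logits.reshape(-1, MR_V), mr[:, 1:].reshape(-1)))

def unsupervised_loss(txt, mr_len=6, bos=1):
    # Cycle text -> pseudo-MR -> text: reconstruction is the only
    # signal, so unannotated text becomes usable for training.
    pseudo_mr = nlu.generate_soft(txt, steps=mr_len, bos=bos)
    rec_logits = nlg(None, txt[:, :-1], src_soft=pseudo_mr)
    return ce(rec_logits.reshape(-1, TXT_V), txt[:, 1:].reshape(-1))

# Toy batches: a few annotated pairs plus extra unannotated sentences.
mr = torch.randint(2, MR_V, (4, 7))
txt = torch.randint(2, TXT_V, (4, 12))
txt_only = torch.randint(2, TXT_V, (8, 12))
for step in range(3):
    loss = supervised_loss(mr, txt) + unsupervised_loss(txt_only)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: loss = {loss.item():.3f}")
```

A symmetric cycle MR -> text -> MR could be added in the same way when unpaired meaning representations are available; whether and how the paper uses such a dual cycle is not stated in the abstract.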
