Transformer-based Natural Language Understanding and Generation

Facilitating the sharing of information between the two complementary tasks of Natural Language Understanding (NLU) and Natural Language Generation (NLG) is crucial to the study of Natural Language Processing (NLP). NLU extracts the core semantics from a given utterance, while NLG, in contrast, constructs the corresponding sentence from given semantics. However, model training for both tasks relies on manually annotated data, and the complexity of the annotation process makes such data costly to acquire at scale. Moreover, existing research has rarely treated NLU and NLG as dual tasks. Indeed, both can be framed as translation problems: NLU translates natural language into formal representations, while NLG converts formal representations into natural language. In this paper, we propose a Transformer-based Natural Language Understanding and Generation (T-NLU&G) model that jointly models NLU and NLG by introducing a shared latent variable. The model helps us explore the intrinsic connection between the natural language space and the formal representation space, and uses this latent variable to facilitate information sharing between the two spaces. Experiments show that our model achieves performance gains on both the E2E and Weather datasets, validating the feasibility and effectiveness of the T-NLU&G model for the respective tasks, and that it is competitive with current state-of-the-art methods.
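The abstract does not specify implementation details, but the idea of bridging Transformer-based NLU and NLG through a shared latent variable can be illustrated with a minimal, hypothetical PyTorch sketch. Everything below is an assumption for illustration only: the class name `TNLUGSketch`, the Gaussian latent variable with reparameterization, the mean-pooled encoder output, and all hyperparameters are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TNLUGSketch(nn.Module):
    """Hypothetical sketch: two Transformer encoders (natural language and
    formal representation), a shared Gaussian latent variable z, and a
    decoder per direction. Not the paper's actual architecture."""

    def __init__(self, nl_vocab, mr_vocab, d_model=256, nhead=4, nlayers=2, z_dim=64):
        super().__init__()
        self.nl_embed = nn.Embedding(nl_vocab, d_model)
        self.mr_embed = nn.Embedding(mr_vocab, d_model)

        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.nl_encoder = nn.TransformerEncoder(enc_layer, nlayers)  # NLU input side
        self.mr_encoder = nn.TransformerEncoder(enc_layer, nlayers)  # NLG input side

        # Posterior parameters of the shared latent variable z.
        self.to_mu = nn.Linear(d_model, z_dim)
        self.to_logvar = nn.Linear(d_model, z_dim)
        self.from_z = nn.Linear(z_dim, d_model)

        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.mr_decoder = nn.TransformerDecoder(dec_layer, nlayers)  # emits formal representation
        self.nl_decoder = nn.TransformerDecoder(dec_layer, nlayers)  # emits natural language
        self.mr_out = nn.Linear(d_model, mr_vocab)
        self.nl_out = nn.Linear(d_model, nl_vocab)

    def encode(self, tokens, embed, encoder):
        h = encoder(embed(tokens))            # (batch, seq, d_model)
        pooled = h.mean(dim=1)                # crude pooling, illustrative only
        return self.to_mu(pooled), self.to_logvar(pooled)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps, keeping the sampling differentiable.
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def nlu(self, nl_tokens, mr_inputs):
        """Natural language -> formal representation, conditioned on z."""
        mu, logvar = self.encode(nl_tokens, self.nl_embed, self.nl_encoder)
        z = self.reparameterize(mu, logvar)
        memory = self.from_z(z).unsqueeze(1)  # z serves as the decoder memory
        h = self.mr_decoder(self.mr_embed(mr_inputs), memory)
        return self.mr_out(h), mu, logvar

    def nlg(self, mr_tokens, nl_inputs):
        """Formal representation -> natural language, conditioned on z."""
        mu, logvar = self.encode(mr_tokens, self.mr_embed, self.mr_encoder)
        z = self.reparameterize(mu, logvar)
        memory = self.from_z(z).unsqueeze(1)
        h = self.nl_decoder(self.nl_embed(nl_inputs), memory)
        return self.nl_out(h), mu, logvar
```

In a sketch of this kind, both directions would be trained jointly, e.g. with reconstruction losses for NLU and NLG plus a KL regularizer on z, so that the shared latent variable carries information between the natural language space and the formal representation space; the exact objective used in the paper is not given here.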
