TransSent: Towards Generation of Structured Sentences with Discourse Marker

This paper focuses on generating long structured sentences with explicit discourse markers, by proposing a new task, Sentence Transfer, and a novel model architecture, TransSent. Previous work on text generation fuses semantic and structural information into a single mixed hidden representation, which makes the structure difficult to maintain as the generated sentence grows longer. In this work, we explicitly separate the modeling of semantic information from the modeling of structural information. Intuitively, humans produce long sentences by directly connecting discourses with discourse markers such as and or but. We thus define a new task called Sentence Transfer: a long sentence is represented as a triple (head discourse, discourse marker, tail discourse), and the goal is to generate the tail discourse given the head discourse and the discourse marker. Connecting the original head discourse and the generated tail discourse with the discourse marker then yields a long structured sentence. We also propose a model architecture called TransSent, which models the relation between two discourses by interpreting it as a transfer from one discourse to the other in embedding space. Experimental results show that our model achieves better performance in automatic evaluations and can generate structured sentences of high quality. The datasets can be accessed at this https URL.
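The transfer mechanism echoes translation-based knowledge-graph embeddings such as TransE, where a relation vector r links a head h and a tail t via t ≈ h + r. Below is a minimal, hypothetical PyTorch sketch of that idea applied to discourses: an encoder summarizes the head discourse, a learned marker embedding translates that summary, and a decoder generates the tail discourse from the translated state. All module names, layer choices, and dimensions are illustrative assumptions, not the authors' exact architecture.

```python
# Hypothetical sketch of a TransE-style "sentence transfer" model.
# Assumes PyTorch; names and dimensions are illustrative only.
import torch
import torch.nn as nn

class TransSentSketch(nn.Module):
    def __init__(self, vocab_size, hidden_size, num_markers):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        # Encoder summarizes the head discourse into a vector h.
        self.encoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        # Each discourse marker (and, but, because, ...) acts as a
        # translation vector r, so the tail state is modeled as t = h + r.
        self.marker = nn.Embedding(num_markers, hidden_size)
        # Decoder generates the tail discourse from the transferred state.
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, head_ids, marker_id, tail_ids):
        # head_ids: (B, T_head), marker_id: (B,), tail_ids: (B, T_tail)
        _, (h, c) = self.encoder(self.embed(head_ids))
        # Translate the head summary by the marker embedding: t = h + r.
        t = h + self.marker(marker_id).unsqueeze(0)
        # Teacher-forced decoding of the tail discourse.
        dec_out, _ = self.decoder(self.embed(tail_ids), (t, c))
        return self.out(dec_out)  # (B, T_tail, vocab) logits
```

At inference time, one would decode the tail autoregressively from the transferred state and splice the head discourse, the marker word, and the generated tail into the final structured sentence (e.g., "the food was great" + but + "the service was slow").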
