Improved Training of Mixture-of-Experts Language GANs

Despite their dramatic success in image generation, Generative Adversarial Networks (GANs) still face great challenges in synthesizing sequences of discrete elements, in particular human language. The difficulty in generator training arises from the generator's limited representation capacity and the uninformative learning signals obtained from the discriminator. In this work, we (1) empirically show that a mixture-of-experts approach enhances the representation capacity of the generator for language GANs and (2) harness the Feature Statistics Alignment (FSA) paradigm to render fine-grained learning signals that advance generator training. Specifically, FSA forces the mean statistics of the distribution of fake data to match those of real samples as closely as possible in a finite-dimensional feature space. Empirical studies on synthetic and real benchmarks show superior quantitative performance and demonstrate the effectiveness of our approach to adversarial text generation.
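To make the FSA objective concrete, the sketch below shows one plausible reading of the mean-matching loss the abstract describes, written in PyTorch. The function name `fsa_loss`, the choice of the L2 norm (the abstract does not specify a distance), and the use of batch estimates of the means are illustrative assumptions, not the authors' implementation; the feature vectors are assumed to be extracted from the discriminator.

```python
import torch


def fsa_loss(real_feats: torch.Tensor, fake_feats: torch.Tensor) -> torch.Tensor:
    """Feature Statistics Alignment loss (a sketch): match mean feature statistics.

    real_feats, fake_feats: (batch, d) feature matrices extracted, e.g. from
    the discriminator, for real and generated samples respectively.
    """
    # Batch estimates of the mean statistics of each distribution in the
    # d-dimensional feature space.
    mu_real = real_feats.mean(dim=0)  # shape: (d,)
    mu_fake = fake_feats.mean(dim=0)  # shape: (d,)
    # Penalize the distance between the two means, so minimizing this loss
    # drives the fake-feature mean toward the real-feature mean.
    return torch.norm(mu_real - mu_fake, p=2)
```

In a training loop, such a term would typically be added to the generator's adversarial loss as `adv_loss + lambda_fsa * fsa_loss(phi(real_batch), phi(fake_batch))`, where the weight `lambda_fsa` and the feature extractor `phi` are hypothetical names used here for illustration.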
