Evaluating Text GANs as Language Models

Generative Adversarial Networks (GANs) are a promising approach to text generation that, unlike traditional language models (LMs), does not suffer from the problem of "exposure bias". However, a major hurdle in understanding the potential of GANs for text generation is the lack of a clear evaluation metric. In this work, we propose to approximate the distribution of text generated by a GAN, which permits evaluating GANs with traditional probability-based LM metrics. We apply our approximation procedure to several GAN-based models and show that they currently perform substantially worse than state-of-the-art LMs. Our evaluation procedure promotes a better understanding of the relation between GANs and LMs, and can accelerate progress in GAN-based text generation.
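To make the idea concrete, here is a minimal sketch (not the authors' exact procedure) of why an approximated token distribution enables probability-based evaluation: given any model that yields next-token logits over the vocabulary, perplexity on held-out text follows directly. The `generator_logits` callable is a hypothetical stand-in for the approximated GAN distribution.

```python
import math

def perplexity(token_ids, generator_logits):
    """Perplexity of a held-out token sequence under an approximated model.

    token_ids: list of int token ids for the held-out text.
    generator_logits: hypothetical callable mapping a prefix (list of ids)
        to a list of next-token logits over the vocabulary.
    """
    total_log_prob = 0.0
    prefix = []
    for tok in token_ids:
        logits = generator_logits(prefix)  # one logit per vocabulary item
        # log-softmax: log p(tok | prefix) = logit[tok] - log sum(exp(logits))
        m = max(logits)  # subtract max for numerical stability
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total_log_prob += logits[tok] - log_z
        prefix.append(tok)
    # perplexity = exp(negative mean per-token log-likelihood)
    return math.exp(-total_log_prob / len(token_ids))
```

With such an approximation in hand, a GAN generator can be scored on the same held-out corpora and with the same metric as any standard LM, which is what makes the head-to-head comparison in the paper possible.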
