Director: Generator-Classifiers For Supervised Language Modeling

Current language models achieve low perplexity, yet their generations still suffer from toxic responses, repetitiveness, and contradictions. The standard language modeling setup fails to address these issues. In this paper, we introduce a new architecture, Director, consisting of a unified generator-classifier with both a language modeling head and a classification head for each output token. Training is conducted jointly on standard language modeling data and on data labeled with desirable and undesirable sequences. Experiments in several settings show that the model has training and decoding speed competitive with standard language models while yielding superior results: it avoids undesirable behaviors while maintaining generation quality. It also outperforms existing model-guiding approaches in terms of both accuracy and efficiency. Our code is made publicly available.
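To make the per-token fusion concrete, the following is a minimal PyTorch sketch of how a classification head could be combined with the language modeling head at each decoding step. It is an illustration under stated assumptions, not the paper's exact formulation: the product-of-probabilities mixing rule, the sigmoid classifier over the vocabulary, and the weighting parameter gamma are all assumed names and choices for this example.

    import torch
    import torch.nn.functional as F

    def director_next_token_logprobs(hidden, lm_head, clf_head, gamma=1.0):
        """Sketch of one Director-style decoding step.

        hidden:   decoder hidden state for the current position, shape (d_model,)
        lm_head:  linear layer mapping hidden -> vocabulary logits (next-token prediction)
        clf_head: linear layer mapping hidden -> per-token "desirability" logits
        gamma:    assumed scalar controlling how strongly the classifier steers generation
        """
        # Standard language-model distribution over the next token.
        lm_logprobs = F.log_softmax(lm_head(hidden), dim=-1)
        # Per-candidate-token log-probability of being a desirable continuation.
        clf_logprobs = F.logsigmoid(clf_head(hidden))
        # Fuse the two heads and renormalize into a proper distribution.
        combined = lm_logprobs + gamma * clf_logprobs
        return combined - torch.logsumexp(combined, dim=-1, keepdim=True)

At decoding time, the next token would be selected or sampled from this fused distribution, so candidates the classifier scores as undesirable (e.g., toxic, repetitive, or contradictory continuations) are down-weighted relative to the plain language model.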
