NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however, requires foresight to plan ahead for feasible future paths. Drawing inspiration from the A* search algorithm, we propose NEUROLOGIC A*esque, a decoding algorithm that incorporates heuristic estimates of future cost. We develop lookahead heuristics that are efficient for large-scale language models, making our method a drop-in replacement for common techniques such as beam search and top-k sampling. To enable constrained generation, we build on NEUROLOGIC decoding (Lu et al., 2021), combining its flexibility in incorporating logical constraints with A*esque estimates of future constraint satisfaction. Our approach outperforms competitive baselines on five generation tasks, and achieves new state-of-the-art performance on table-to-text generation, constrained machine translation, and keyword-constrained generation. The improvements are particularly notable on tasks that require complex constraint satisfaction or in few-shot or zero-shot settings. NEUROLOGIC A*esque illustrates the power of decoding for improving and enabling new capabilities of large-scale language models.
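
To make the idea concrete, the sketch below shows one way a lookahead heuristic can be folded into beam search: each candidate continuation is ranked by its log-probability so far plus a greedy-rollout estimate of how many keyword constraints the projected future would satisfy. This is an illustrative approximation, not the paper's implementation; the toy bigram "language model", the `lam` weight, the rollout depth, and all function names are assumptions made for this example (the actual method works with large autoregressive LMs and NEUROLOGIC's predicate-logic constraints).

```python
import math

# Toy bigram "language model": next-token distribution given the previous token.
# Purely illustrative; any autoregressive LM exposing next-token log-probs would do.
TOY_LM = {
    "<s>":     {"the": 0.5, "a": 0.3, "dogs": 0.2},
    "the":     {"dog": 0.6, "park": 0.4},
    "a":       {"dog": 0.4, "frisbee": 0.6},
    "dog":     {"runs": 0.5, "catches": 0.5},
    "dogs":    {"run": 1.0},
    "runs":    {"</s>": 1.0},
    "run":     {"</s>": 1.0},
    "catches": {"a": 0.6, "the": 0.4},
    "frisbee": {"</s>": 1.0},
    "park":    {"</s>": 1.0},
}

def next_token_logprobs(prefix):
    """Log-probabilities of possible continuations of a token prefix."""
    dist = TOY_LM.get(prefix[-1], {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in dist.items()}

def lookahead_bonus(prefix, constraints, depth=3):
    """Greedily roll out up to `depth` tokens and count constraint words that
    appear in the prefix or in the projected future (the lookahead)."""
    rollout = list(prefix)
    for _ in range(depth):
        if rollout[-1] == "</s>":
            break
        logprobs = next_token_logprobs(rollout)
        rollout.append(max(logprobs, key=logprobs.get))
    return sum(1 for c in constraints if c in rollout)

def lookahead_beam_search(constraints, beam_size=2, max_len=8, lam=1.0):
    """Beam search ranked A*-style: past log-probability plus lam times the
    estimated future constraint satisfaction."""
    beams = [(0.0, ["<s>"])]  # (log-probability so far, token sequence)
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == "</s>":  # finished hypotheses keep competing
                candidates.append((logp + lam * lookahead_bonus(seq, constraints), logp, seq))
                continue
            for tok, tok_lp in next_token_logprobs(seq).items():
                new_seq, new_logp = seq + [tok], logp + tok_lp
                priority = new_logp + lam * lookahead_bonus(new_seq, constraints)
                candidates.append((priority, new_logp, new_seq))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = [(logp, seq) for _, logp, seq in candidates[:beam_size]]
        if all(seq[-1] == "</s>" for _, seq in beams):
            break
    return beams[0][1]

if __name__ == "__main__":
    # Plain beam search on this toy LM prefers "the park"; the lookahead bonus
    # steers decoding toward a sequence that covers the constraint word.
    print(lookahead_beam_search(constraints={"frisbee"}))
    # -> ['<s>', 'a', 'frisbee', '</s>']
```

The same ranking rule applies unchanged when `next_token_logprobs` is backed by a large pretrained model; only the rollout cost changes, which is why the paper emphasizes making the lookahead efficient.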

References

[1] Yaser Al-Onaizan, et al. Training Neural Machine Translation to Apply Terminology Constraints, 2019, ACL.

[2] Yann Dauphin, et al. Hierarchical Neural Story Generation, 2018, ACL.

[3] Matt Post, et al. Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation, 2018, NAACL.

[4] Chris Callison-Burch, et al. Comparison of Diverse Decoding Methods from Conditional Language Models, 2019, ACL.

[5] Andreas Maletti, et al. Recurrent Neural Networks as Weighted Language Recognizers, 2017, NAACL.

[6] Kyunghyun Cho, et al. Non-Monotonic Sequential Text Generation, 2019, ICML.

[7] Richard E. Korf, et al. Depth-First Iterative-Deepening: An Optimal Admissible Tree Search, 1985, Artif. Intell.

[8] Verena Rieser, et al. Findings of the E2E NLG Challenge, 2018, INLG.

[9] Jason Yosinski, et al. Plug and Play Language Models: A Simple Approach to Controlled Text Generation, 2020, ICLR.

[10] Hermann Ney, et al. An Efficient A* Search Algorithm for Statistical Machine Translation, 2001, DDMMT@ACL.

[11] C. Lawrence Zitnick, et al. CIDEr: Consensus-based Image Description Evaluation, 2015, CVPR.

[12] Huda Khayrallah, et al. Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting, 2019, NAACL.

[13] Adam Lopez, et al. Efficient CCG Parsing: A* versus Adaptive Supertagging, 2011, ACL.

[14] Ondrej Dusek, et al. Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings, 2016, ACL.

[15] Eduard H. Hovy, et al. Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics, 2003, NAACL.

[16] Chin-Yew Lin, et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL.

[17] Daniel Gildea, et al. Efficient Search for Inversion Transduction Grammar, 2006, EMNLP.

[18] Basura Fernando, et al. SPICE: Semantic Propositional Image Caption Evaluation, 2016, ECCV.

[19] Yejin Choi, et al. Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf Language Models, 2020, ACL.

[20] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.

[21] Ryan Cotterell, et al. Best-First Beam Search, 2020, TACL.

[22] Alon Lavie, et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, 2005, IEEvaluation@ACL.

[23] Oriol Vinyals, et al. Machine Translation Decoding beyond Beam Search, 2021, EMNLP.

[24] Luke S. Zettlemoyer, et al. Global Neural CCG Parsing with Optimality Guarantees, 2016, EMNLP.

[25] David Vandyke, et al. Multi-domain Neural Network Language Generation for Spoken Dialogue Systems, 2016, NAACL.

[26] André F. T. Martins, et al. Marian: Fast Neural Machine Translation in C++, 2018, ACL.

[27] Lei Li, et al. CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling, 2018, AAAI.

[28] Basura Fernando, et al. Guided Open Vocabulary Image Captioning with Constrained Beam Search, 2016, EMNLP.

[29] Dan Klein, et al. Learning Semantic Correspondences with Less Supervision, 2009, ACL.

[30] Dan Klein, et al. A* Parsing: Fast Exact Viterbi Parse Selection, 2003, NAACL.

[31] Wenhu Chen, et al. KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation, 2020, EMNLP.

[32] Lucia Specia, et al. Guiding Neural Machine Translation Decoding with External Knowledge, 2017, WMT.

[33] Qun Liu, et al. Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search, 2017, ACL.

[34] John DeNero, et al. Approximate Factoring for A* Search, 2007, HLT-NAACL.

[35] Mark Hopkins, et al. Cube Pruning as Heuristic Search, 2009, EMNLP.

[36] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.

[37] Nathanael Chambers, et al. A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories, 2016, NAACL.

[38] Yejin Choi, et al. CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning, 2020, EMNLP.

[39] Alexander M. Rush, et al. Challenges in Data-to-Document Generation, 2017, EMNLP.

[40] Yejin Choi, et al. NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints, 2020, NAACL.

[41] Gonzalo Iglesias, et al. Neural Machine Translation Decoding with Terminology Constraints, 2018, NAACL.

[42] Wenhu Chen, et al. Logical Natural Language Generation from Open-Domain Tables, 2020, ACL.

[43] Philipp Koehn, et al. Findings of the 2017 Conference on Machine Translation (WMT17), 2017, WMT.

[44] Nan Jiang, et al. Language Generation via Combinatorial Constraint Satisfaction: A Tree Search Enhanced Monte-Carlo Approach, 2020, EMNLP.

[45] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.

[46] Judea Pearl, et al. Heuristics: Intelligent Search Strategies for Computer Problem Solving, 1984.

[47] Daniel Gildea, et al. Text Alignment for Real-Time Crowd Captioning, 2013, NAACL.

[48] Nils J. Nilsson, et al. A Formal Basis for the Heuristic Determination of Minimum Cost Paths, 1968, IEEE Trans. Syst. Sci. Cybern.

[49] Jason Weston, et al. Neural Text Generation with Unlikelihood Training, 2019, ICLR.

[50] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[51] Eric P. Xing, et al. Toward Controlled Generation of Text, 2017, ICML.

[52] Dan Klein, et al. Pragmatically Informative Text Generation, 2019, NAACL.

[53] Renjie Zheng, et al. Opportunistic Decoding with Timely Correction for Simultaneous Translation, 2020, ACL.

[54] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.