Small Language Models Improve Giants by Rewriting Their Outputs
Aliaksei Severyn | Jonathan Mallinson | Eric Malmi | Arthur Bražinskas | Jakub Adamek | Giorgos Vernikos