Aligning Large Language Models through Synthetic Feedback
Sungdong Kim | Sanghwan Bae | Jamin Shin | Soyoung Kang | Donghyun Kwak | Kang Min Yoo | Minjoon Seo
[1] Zhi Rui Tam, et al. OpenAssistant Conversations - Democratizing Large Language Model Alignment, 2023, ArXiv.
[2] Songfang Huang, et al. RRHF: Rank Responses to Align Language Models with Human Feedback without tears, 2023, ArXiv.
[3] Chunyuan Li, et al. Instruction Tuning with GPT-4, 2023, ArXiv.
[4] Jon Ander Campos, et al. Training Language Models with Language Feedback at Scale, 2023, ArXiv.
[5] Ethan Perez, et al. Pretraining Language Models with Human Preferences, 2023, ArXiv.
[6] P. Abbeel, et al. Chain of Hindsight Aligns Language Models with Feedback, 2023, ArXiv.
[7] Noah A. Smith, et al. Self-Instruct: Aligning Language Models with Self-Generated Instructions, 2022, ArXiv.
[8] Lisa Anne Hendricks, et al. Improving alignment of dialogue agents via targeted human judgements, 2022, ArXiv.
[9] Gerard de Melo, et al. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models, 2022, ArXiv.
[10] Tom B. Brown, et al. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, 2022, ArXiv.
[11] Ryan J. Lowe, et al. Training language models to follow instructions with human feedback, 2022, NeurIPS.
[12] Owain Evans, et al. TruthfulQA: Measuring How Models Mimic Human Falsehoods, 2021, ACL.
[13] Soroush Vosoughi, et al. Aligning Generative Language Models with Human Values, 2022, NAACL-HLT.
[14] Jeff Wu, et al. WebGPT: Browser-assisted question-answering with human feedback, 2021, ArXiv.
[15] Dario Amodei, et al. A General Language Assistant as a Laboratory for Alignment, 2021, ArXiv.
[16] Dawn Song, et al. Measuring Massive Multitask Language Understanding, 2020, ICLR.
[17] Ryan J. Lowe, et al. Learning to summarize from human feedback, 2020, NeurIPS.
[18] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[19] Lysandre Debut, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[20] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.
[21] Tom B. Brown, et al. Fine-Tuning Language Models from Human Preferences, 2019, ArXiv.
[22] Mike Conover, et al. Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM, 2023, Databricks Blog.
[23] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[24] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[25] Sandro Pezzelle, et al. The LAMBADA dataset: Word prediction requiring a broad discourse context, 2016, ACL.