OpenAssistant Conversations - Democratizing Large Language Model Alignment
Zhi Rui Tam | Dimitri von Rütte | Yannic Kilcher | K. Stevens | Sotiris Anagnostidis | Andreas Köpf | Christoph Schuhmann | Andrew Maguire | Nguyen Minh Duc | Abdullah Barhoum | Oliver Stanley | Richárd Nagyfi | ES Shahul | Sameer Suri | David Glushkov | Arnav Dantuluri | Huu Nguyen | A. Mattick
[1] Chunyuan Li, et al. Instruction Tuning with GPT-4, 2023, ArXiv.
[2] Oskar van der Wal, et al. Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling, 2023, ArXiv.
[3] Naman Goyal, et al. LLaMA: Open and Efficient Foundation Language Models, 2023, ArXiv.
[4] Noah A. Smith, et al. Self-Instruct: Aligning Language Model with Self Generated Instructions, 2022, ArXiv.
[5] Tom B. Brown, et al. Discovering Language Model Behaviors with Model-Written Evaluations, 2022, ACL.
[6] Lisa Anne Hendricks, et al. Improving alignment of dialogue agents via targeted human judgements, 2022, ArXiv.
[7] Felix Naumann, et al. The Effects of Data Quality on Machine Learning Performance, 2022, arXiv:2207.14529.
[8] Tom B. Brown, et al. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, 2022, ArXiv.
[9] Ryan J. Lowe, et al. Training language models to follow instructions with human feedback, 2022, NeurIPS.
[10] Po-Sen Huang, et al. Ethical and social risks of harm from Language Models, 2021, ArXiv.
[11] Dario Amodei, et al. A General Language Assistant as a Laboratory for Alignment, 2021, ArXiv.
[12] Jason Weston, et al. Retrieval Augmentation Reduces Hallucination in Conversation, 2021, EMNLP.
[13] Ryan J. Lowe, et al. Learning to summarize from human feedback, 2020, NeurIPS.
[14] Richard Socher, et al. Evaluating the Factual Consistency of Abstractive Text Summarization, 2019, EMNLP.
[15] Lucy Vasserman, et al. Measuring and Mitigating Unintended Bias in Text Classification, 2018, AIES.
[16] Peter Henderson, et al. Ethical Challenges in Data-Driven Dialogue Systems, 2017, AIES.
[17] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[18] Shane Legg, et al. Deep Reinforcement Learning from Human Preferences, 2017, NIPS.
[19] Marco Valtorta, et al. The Effects of Data Quality on Machine Learning Algorithms, 2006, ICIQ.
[20] Jürgen H. P. Hoffmeyer-Zlotnik, et al. How to measure education in cross-national comparison: Hoffmeyer-Zlotnik/Warner-Matrix of Education as a new instrument, 2005.
[21] Herbert H. Clark, et al. Grounding in communication, 1991, Perspectives on socially shared cognition.
[22] T. Tideman, et al. Independence of clones as a criterion for voting rules, 1987.