Mistral 7B
Devendra Singh Chaplot | Diego de Las Casas | Teven Le Scao | Guillaume Lample | Thibaut Lavril | Timothée Lacroix | Marie-Anne Lachaux | Thomas Wang | Lucile Saulnier | Albert Qiaochu Jiang | Alexandre Sablayrolles | Arthur Mensch | Chris Bamford | Florian Bressand | Gianna Lengyel | Lélio Renard Lavaud | Pierre Stock | William El Sayed
[1] Joseph E. Gonzalez,et al. Efficient Memory Management for Large Language Model Serving with PagedAttention , 2023, SOSP.
[2] Manish P Bhatt,et al. Code Llama: Open Foundation Models for Code , 2023, ArXiv.
[3] Eric Michael Smith,et al. Llama 2: Open Foundation and Fine-Tuned Chat Models , 2023, ArXiv.
[4] Michiel de Jong,et al. GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints , 2023, EMNLP.
[5] Weizhu Chen,et al. AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models , 2023, NAACL-HLT.
[6] Naman Goyal,et al. LLaMA: Open and Efficient Foundation Language Models , 2023, ArXiv.
[7] Quoc V. Le,et al. Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them , 2022, ACL.
[8] Daniel Y. Fu,et al. FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness , 2022, NeurIPS.
[9] Mohammad Bavarian,et al. Training Verifiers to Solve Math Word Problems , 2021, ArXiv.
[10] Charles Sutton,et al. Program Synthesis with Large Language Models , 2021, ArXiv.
[11] Wojciech Zaremba,et al. Evaluating Large Language Models Trained on Code , 2021, ArXiv.
[12] Dawn Song,et al. Measuring Mathematical Problem Solving With the MATH Dataset , 2021, NeurIPS Datasets and Benchmarks.
[13] Dawn Song,et al. Measuring Massive Multitask Language Understanding , 2020, ICLR.
[14] Arman Cohan,et al. Longformer: The Long-Document Transformer , 2020, ArXiv.
[15] Yejin Choi,et al. PIQA: Reasoning about Physical Commonsense in Natural Language , 2019, AAAI.
[16] Ming-Wei Chang,et al. Natural Questions: A Benchmark for Question Answering Research , 2019, TACL.
[17] Ming-Wei Chang,et al. BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions , 2019, NAACL.
[18] Ali Farhadi,et al. HellaSwag: Can a Machine Really Finish Your Sentence? , 2019, ACL.
[19] Ilya Sutskever,et al. Generating Long Sequences with Sparse Transformers , 2019, ArXiv.
[20] Eunsol Choi,et al. QuAC: Question Answering in Context , 2018, EMNLP.
[21] Peter Clark,et al. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering , 2018, EMNLP.
[22] Oren Etzioni,et al. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge , 2018, ArXiv.
[23] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[24] Eunsol Choi,et al. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension , 2017, ACL.
[25] Lisa Anne Hendricks,et al. An empirical analysis of compute-optimal large language model training , 2022, NeurIPS.
[26] Yejin Choi,et al. An Adversarial Winograd Schema Challenge at Scale , 2019.
[27] Jonathan Berant,et al. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge , 2019, NAACL.