How well do Large Language Models perform in Arithmetic tasks?