TheoremQA: A Theorem-driven Question Answering dataset
Wenhu Chen | Xinyi Wang | Ming Yin | Xueguang Ma | Yixin Wan | Pan Lu | Tony Xia | Jianyu Xu | Max Ku
[1] Andrew M. Dai,et al. PaLM 2 Technical Report , 2023, ArXiv.
[2] Nghi D. Q. Bui,et al. CodeT5+: Open Code Large Language Models for Code Understanding and Generation , 2023, EMNLP.
[3] Harm de Vries,et al. StarCoder: may the source be with you! , 2023, ArXiv.
[4] Song-Chun Zhu,et al. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models , 2023, ArXiv.
[5] Zhenguo Li,et al. Progressive-Hint Prompting Improves Reasoning in Large Language Models , 2023, ArXiv.
[6] Yong Jae Lee,et al. Visual Instruction Tuning , 2023, ArXiv.
[7] Oskar van der Wal,et al. Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling , 2023, ICML.
[8] Marco Tulio Ribeiro,et al. Sparks of Artificial General Intelligence: Early experiments with GPT-4 , 2023, ArXiv.
[9] Henrique Pondé de Oliveira Pinto,et al. GPT-4 Technical Report , 2023, ArXiv:2303.08774.
[10] Naman Goyal,et al. LLaMA: Open and Efficient Foundation Language Models , 2023, ArXiv.
[11] Noah D. Goodman,et al. Task Ambiguity in Humans and Language Models , 2022, ArXiv.
[12] Tom B. Brown,et al. Constitutional AI: Harmlessness from AI Feedback , 2022, ArXiv.
[13] Jinghui Qin,et al. UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression , 2022, EMNLP.
[14] William W. Cohen,et al. Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks , 2022, ArXiv.
[15] Jamie Callan,et al. PAL: Program-aided Language Models , 2022, ICML.
[16] Guillem Cucurull,et al. Galactica: A Large Language Model for Science , 2022, ArXiv.
[17] Oyvind Tafjord,et al. LILA: A Unified Benchmark for Mathematical Reasoning , 2022, EMNLP.
[18] P. Zhang,et al. GLM-130B: An Open Bilingual Pre-trained Model , 2022, ICLR.
[19] Xinyun Chen,et al. Compositional Semantic Parsing with Large Language Models , 2022, ArXiv.
[20] Song-Chun Zhu,et al. Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning , 2022, ICLR.
[21] Song-Chun Zhu,et al. Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering , 2022, NeurIPS.
[22] J. Dean,et al. Emergent Abilities of Large Language Models , 2022, Trans. Mach. Learn. Res.
[23] S. Gu,et al. Large Language Models are Zero-Shot Reasoners , 2022, NeurIPS.
[24] D. Schuurmans,et al. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models , 2022, ICLR.
[25] Xi Victoria Lin,et al. OPT: Open Pre-trained Transformer Language Models , 2022, ArXiv.
[26] Andrew M. Dai,et al. PaLM: Scaling Language Modeling with Pathways , 2022, J. Mach. Learn. Res.
[27] Lisa Anne Hendricks,et al. Training Compute-Optimal Large Language Models , 2022, ArXiv.
[28] S. Savarese,et al. CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis , 2022, ICLR.
[29] D. Schuurmans,et al. Self-Consistency Improves Chain of Thought Reasoning in Language Models , 2022, ICLR.
[30] Ryan J. Lowe,et al. Training language models to follow instructions with human feedback , 2022, NeurIPS.
[31] Dale Schuurmans,et al. Chain of Thought Prompting Elicits Reasoning in Large Language Models , 2022, NeurIPS.
[32] S. Hoi,et al. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation , 2022, ICML.
[33] Po-Sen Huang,et al. Scaling Language Models: Methods, Analysis & Insights from Training Gopher , 2021, ArXiv.
[34] David Bieber,et al. Show Your Work: Scratchpads for Intermediate Computation with Language Models , 2021, ArXiv.
[35] Mohammad Bavarian,et al. Training Verifiers to Solve Math Word Problems , 2021, ArXiv.
[36] Sameena Shah,et al. FinQA: A Dataset of Numerical Reasoning over Financial Data , 2021, EMNLP.
[37] Charles Sutton,et al. Program Synthesis with Large Language Models , 2021, ArXiv.
[38] Wojciech Zaremba,et al. Evaluating Large Language Models Trained on Code , 2021, ArXiv.
[39] Eric P. Xing,et al. GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning , 2021, Findings of ACL.
[40] Fuli Feng,et al. TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance , 2021, ACL.
[41] Song-Chun Zhu,et al. Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning , 2021, ACL.
[42] Navin Goyal,et al. Are NLP Models really able to Solve Simple Math Word Problems? , 2021, NAACL.
[43] Dawn Song,et al. Measuring Mathematical Problem Solving With the MATH Dataset , 2021, NeurIPS Datasets and Benchmarks.
[44] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[45] Keh-Yih Su,et al. A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers , 2020, ACL.
[46] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[47] Yejin Choi,et al. MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms , 2019, NAACL.
[48] Pushmeet Kohli,et al. Analysing Mathematical Reasoning Abilities of Neural Models , 2019, ICLR.
[49] Shuming Shi,et al. Deep Neural Solver for Math Word Problems , 2017, EMNLP.
[50] Wang Ling,et al. Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems , 2017, ACL.
[51] Ming-Wei Chang,et al. Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems , 2016, EACL.
[52] Dan Roth,et al. Solving General Arithmetic Word Problems , 2016, EMNLP.
[53] Hannaneh Hajishirzi,et al. MAWPS: A Math Word Problem Repository , 2016, NAACL.
[54] Oren Etzioni,et al. Parsing Algebraic Word Problems into Equations , 2015, TACL.
[55] Ming-Wei Chang,et al. DRAW: A Challenging and Diverse Algebra Word Problem Set , 2015.
[56] Oren Etzioni,et al. Solving Geometry Problems: Combining Text and Diagram Interpretation , 2015, EMNLP.
[57] Oren Etzioni,et al. Learning to Solve Arithmetic Word Problems with Verb Categorization , 2014, EMNLP.