Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond
Jun Liu | E. Cambria | Qika Lin | Fangzhi Xu | Jiawei Han | Tianzhe Zhao
[1] Andrew M. Dai,et al. PaLM 2 Technical Report , 2023, ArXiv.
[2] Maosong Sun,et al. C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models , 2023, NeurIPS.
[3] Xiaozhi Wang,et al. ChatLog: Recording and Analyzing ChatGPT Across Time , 2023, ArXiv.
[4] Yuexin Zhang,et al. Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4 , 2023, ArXiv.
[5] Chunyuan Li,et al. Instruction Tuning with GPT-4 , 2023, ArXiv.
[6] Qika Lin,et al. Contrastive Graph Representations for Logical Formulas Embedding , 2023, IEEE Transactions on Knowledge and Data Engineering.
[7] Wayne Xin Zhao,et al. A Survey of Large Language Models , 2023, ArXiv.
[8] Le Sun,et al. ChatGPT Is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models , 2023, LREC.
[9] Fei Yu,et al. Natural Language Reasoning, A Survey , 2023, ACM Computing Surveys.
[10] E. Cambria,et al. Logical Reasoning over Natural Language as Knowledge Representation: A Survey , 2023, ArXiv.
[11] Shima Imani,et al. MathPrompter: Mathematical Reasoning using Large Language Models , 2023, ACL.
[12] Björn Schuller,et al. Will Affective Computing Emerge From Foundation Models and General Artificial Intelligence? A First Evaluation of ChatGPT , 2023, IEEE Intelligent Systems.
[13] Michihiro Yasunaga,et al. Is ChatGPT a General-Purpose Natural Language Processing Task Solver? , 2023, EMNLP.
[14] Dan Su,et al. A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity , 2023, IJCNLP.
[15] Percy Liang,et al. Benchmarking Large Language Models for News Summarization , 2023, ArXiv.
[16] Yao Zhou,et al. Towards High-Order Complementary Recommendation via Logical Reasoning Network , 2022, IEEE International Conference on Data Mining (ICDM).
[17] Shafiq R. Joty,et al. FOLIO: Natural Language Reasoning with First-Order Logic , 2022, ArXiv.
[18] Zijian Huang,et al. LinE: Logical Query Reasoning over Hierarchical Knowledge Graphs , 2022, KDD.
[19] Yizhou Sun,et al. RLogic: Recursive Logical Rule Learning from Knowledge Graphs , 2022, KDD.
[20] Qika Lin,et al. Incorporating Context Graph with Logical Reasoning for Inductive Relation Prediction , 2022, SIGIR.
[21] Michael Witbrock,et al. AbductionRules: Training Transformers to Explain Unexpected Inputs , 2022, FINDINGS.
[22] Ryan J. Lowe,et al. Training language models to follow instructions with human feedback , 2022, NeurIPS.
[23] Liqiang Nie,et al. MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning , 2022, FINDINGS.
[24] Pascale Fung,et al. Survey of Hallucination in Natural Language Generation , 2022, ACM Computing Surveys.
[25] Dale Schuurmans,et al. Chain of Thought Prompting Elicits Reasoning in Large Language Models , 2022, NeurIPS.
[26] G. Strang,et al. A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level , 2021, Proceedings of the National Academy of Sciences of the United States of America.
[27] Oyvind Tafjord,et al. Explaining Answers with Entailment Trees , 2021, EMNLP.
[28] Jian Yin,et al. Multi-Hop Reasoning Question Generation and Its Application , 2021, IEEE Transactions on Knowledge and Data Engineering.
[29] Peter Clark,et al. ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language , 2020, FINDINGS.
[30] M. Goedhart,et al. Logical Reasoning in Formal and Everyday Reasoning Tasks , 2020.
[31] Edward Grefenstette,et al. Learning Reasoning Strategies in End-to-End Differentiable Proving , 2020, ICML.
[32] Yue Zhang,et al. LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning , 2020, IJCAI.
[33] Jonathan Berant,et al. Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge , 2020, ArXiv.
[34] Oyvind Tafjord,et al. Transformers as Soft Reasoners over Language , 2020, IJCAI.
[35] Jiashi Feng,et al. ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning , 2020, ICLR.
[36] Doug Downey,et al. Abductive Commonsense Reasoning , 2019, ICLR.
[37] Joelle Pineau,et al. CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text , 2019, EMNLP.
[38] Jason Weston,et al. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks , 2015, ICLR.
[39] Evan Heit,et al. Relations between inductive reasoning and deductive reasoning , 2010, Journal of Experimental Psychology: Learning, Memory, and Cognition.
[40] Vinod Goel,et al. Anatomy of deductive reasoning , 2007, Trends in Cognitive Sciences.
[41] Thomas Lukasiewicz,et al. A Novel Combination of Answer Set Programming with Description Logics for the Semantic Web , 2007, IEEE Transactions on Knowledge and Data Engineering.
[42] Grigoris Antoniou,et al. DR-Prolog: A System for Defeasible Reasoning with Rules and Ontologies on the Semantic Web , 2007, IEEE Transactions on Knowledge and Data Engineering.
[43] Peter A. Flach,et al. Abductive and inductive reasoning: background and issues , 2000.
[44] Frank Z. Xing,et al. SenticNet 7: A Commonsense-based Neurosymbolic AI Framework for Explainable Sentiment Analysis , 2022, LREC.
[45] Qika Lin,et al. Inductive Relation Prediction with Logical Reasoning Using Contrastive Representations , 2022, EMNLP.
[46] Aidong Zhang,et al. A Survey on Context Learning , 2017, IEEE Transactions on Knowledge and Data Engineering.
[47] D. Walton. Abductive, presumptive and plausible arguments , 2001.
[48] Peter A. Flach,et al. Abduction and induction: essays on their relation and integration , 2000.
[49] P. N. Johnson-Laird,et al. Deductive reasoning , 1999, Annual Review of Psychology.
[50] John R. Josephson,et al. Abductive inference: computation, philosophy, technology , 1994.