Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements
[1] Henrique Pondé de Oliveira Pinto,et al. GPT-4 Technical Report , 2023, 2303.08774.
[2] Naman Goyal,et al. LLaMA: Open and Efficient Foundation Language Models , 2023, ArXiv.
[3] E. Davis. Benchmarks for Automated Commonsense Reasoning: A Survey , 2023, ACM Comput. Surv..
[4] A. Borji. A Categorical Archive of ChatGPT Failures , 2023, ArXiv.
[5] Ronan Le Bras,et al. I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation , 2022, ArXiv.
[6] Swarat Chaudhuri,et al. Natural Language Deduction with Incomplete Information , 2022, EMNLP.
[7] Yuhuai Wu,et al. Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs , 2022, ICLR.
[8] Oyvind Tafjord,et al. Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning , 2022, EMNLP.
[9] Yejin Choi,et al. Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering , 2022, EMNLP.
[10] Danqi Chen,et al. Generating Natural Language Proofs with Verifier-Guided Search , 2022, EMNLP.
[11] Ronan Le Bras,et al. Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations , 2022, EMNLP.
[12] Dongyan Zhao,et al. Things not Written in Text: Exploring Spatial Commonsense from Visual Signals , 2022, ACL.
[13] Hannaneh Hajishirzi,et al. UnifiedQA-v2: Stronger Generalization via Broader Cross-Format Training , 2022, ArXiv.
[14] Dale Schuurmans,et al. Chain of Thought Prompting Elicits Reasoning in Large Language Models , 2022, NeurIPS.
[15] Swarat Chaudhuri,et al. Natural Language Deduction through Search over Statement Compositions , 2022, EMNLP.
[16] Noah A. Smith,et al. WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation , 2022, EMNLP.
[17] Ronan Le Bras,et al. Generated Knowledge Prompting for Commonsense Reasoning , 2021, ACL.
[18] Ronan Le Bras,et al. Symbolic Knowledge Distillation: from General Language Models to Commonsense Models , 2021, NAACL.
[19] Owain Evans,et al. TruthfulQA: Measuring How Models Mimic Human Falsehoods , 2021, ACL.
[20] Oyvind Tafjord,et al. General-Purpose Question-Answering with Macaw , 2021, ArXiv.
[21] Eunsol Choi,et al. CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge , 2021, NeurIPS Datasets and Benchmarks.
[22] Yejin Choi,et al. CommonsenseQA 2.0: Exposing the Limits of AI through Gamification , 2021, NeurIPS Datasets and Benchmarks.
[23] Alessandro Roncone,et al. PROST: Physical Reasoning about Objects through Space and Time , 2021, FINDINGS.
[24] Nanyun Peng,et al. COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences , 2021, FINDINGS.
[25] Eunsol Choi,et al. Can NLI Models Verify QA Systems' Predictions? , 2021, EMNLP.
[26] Yejin Choi,et al. UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark , 2021, AAAI.
[27] Jonathan Berant,et al. Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies , 2021, Transactions of the Association for Computational Linguistics.
[28] Yejin Choi,et al. COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs , 2020, AAAI.
[29] Yejin Choi,et al. Thinking Like a Skeptic: Defeasible Inference in Natural Language , 2020, FINDINGS.
[30] Yue Zhang,et al. SemEval-2020 Task 4: Commonsense Validation and Explanation , 2020, SEMEVAL.
[31] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[32] Hannaneh Hajishirzi,et al. UnifiedQA: Crossing Format Boundaries With a Single QA System , 2020, FINDINGS.
[33] Bill Yuchen Lin,et al. Birds Have Four Legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models , 2020, EMNLP.
[34] Peter Clark,et al. GenericsKB: A Knowledge Base of Generic Statements , 2020, ArXiv.
[35] Hannaneh Hajishirzi,et al. Fact or Fiction: Verifying Scientific Claims , 2020, EMNLP.
[36] Ce Liu,et al. Supervised Contrastive Learning , 2020, NeurIPS.
[37] Yejin Choi,et al. PIQA: Reasoning about Physical Commonsense in Natural Language , 2019, AAAI.
[38] Ashish Sabharwal,et al. QASC: A Dataset for Question Answering via Sentence Composition , 2019, AAAI.
[39] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res..
[40] Ronan Le Bras,et al. WinoGrande , 2019, AAAI.
[41] Rémi Louf,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[42] Peter Clark,et al. QuaRTz: An Open-Domain Dataset of Qualitative Relationship Questions , 2019, EMNLP.
[43] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[44] Ali Farhadi,et al. HellaSwag: Can a Machine Really Finish Your Sentence? , 2019, ACL.
[45] Peter Clark,et al. QuaRel: A Dataset and Models for Answering Questions about Qualitative Relationships , 2018, AAAI.
[46] Yejin Choi,et al. Social IQA: Commonsense Reasoning about Social Interactions , 2019, EMNLP.
[47] Jonathan Berant,et al. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge , 2019, NAACL.
[48] Yejin Choi,et al. SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference , 2018, EMNLP.
[49] Peter Clark,et al. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering , 2018, EMNLP.
[50] Andreas Vlachos,et al. FEVER: a Large-scale Dataset for Fact Extraction and VERification , 2018, NAACL.
[51] Oren Etzioni,et al. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge , 2018, ArXiv.
[52] Nelson F. Liu,et al. Crowdsourcing Multiple Choice Science Questions , 2017, NUT@EMNLP.
[53] Kilian Q. Weinberger,et al. On Calibration of Modern Neural Networks , 2017, ICML.
[54] Sheng Zhang,et al. Ordinal Common-sense Inference , 2016, TACL.
[55] Nathanael Chambers,et al. A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories , 2016, NAACL.
[56] Milos Hauskrecht,et al. Obtaining Well Calibrated Probabilities Using Bayesian Binning , 2015, AAAI.
[57] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[58] Zornitsa Kozareva,et al. SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning , 2011, *SEMEVAL.
[59] Hector J. Levesque,et al. The Winograd Schema Challenge , 2011, AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.
[60] Raymond Reiter,et al. A Logic for Default Reasoning , 1987, Artif. Intell..