Leveraging Large Language Models for Multiple Choice Question Answering
[1] Jane A. Yu, et al. Few-shot Learning with Retrieval Augmented Language Models, 2022, J. Mach. Learn. Res.
[2] O. Winther, et al. Can large language models reason about medical questions?, 2022, Patterns.
[3] Tom B. Brown, et al. Language Models (Mostly) Know What They Know, 2022, ArXiv.
[4] J. Dean, et al. Emergent Abilities of Large Language Models, 2022, Trans. Mach. Learn. Res.
[5] Graham Neubig, et al. Testing the Ability of Language Models to Interpret Figurative Language, 2022, NAACL.
[6] Andrew M. Dai, et al. PaLM: Scaling Language Modeling with Pathways, 2022, J. Mach. Learn. Res.
[7] Lisa Anne Hendricks, et al. Training Compute-Optimal Large Language Models, 2022, ArXiv.
[8] Ankit Pal, et al. MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical Domain Question Answering, 2022, CHIL.
[9] D. Wingate, et al. An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels, 2022, ACL.
[10] Ryan J. Lowe, et al. Training language models to follow instructions with human feedback, 2022, NeurIPS.
[11] Liqiang Nie, et al. MERIt: Meta-Path Guided Contrastive Learning for Logical Reasoning, 2022, Findings of ACL.
[12] J. Dean, et al. ST-MoE: Designing Stable and Transferable Sparse Expert Models, 2022, ArXiv.
[13] Reza Yazdani Aminabadi, et al. Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model, 2022, ArXiv.
[14] Quoc V. Le, et al. GLaM: Efficient Scaling of Language Models with Mixture-of-Experts, 2021, ICML.
[15] Owain Evans, et al. TruthfulQA: Measuring How Models Mimic Human Falsehoods, 2021, ACL.
[16] Jianfeng Gao, et al. Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing, 2020, ACM Trans. Comput. Heal.
[17] H. Yamana, et al. HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method, 2022, LREC.
[18] Po-Sen Huang, et al. Scaling Language Models: Methods, Analysis & Insights from Training Gopher, 2021, ArXiv.
[19] M. Samwald, et al. GPT-3 Models are Poor Few-Shot Learners in the Biomedical Domain, 2021, ArXiv.
[20] Michael S. Bernstein, et al. On the Opportunities and Risks of Foundation Models, 2021, ArXiv.
[21] Douwe Kiela, et al. True Few-Shot Learning with Language Models, 2021, NeurIPS.
[22] Luke Zettlemoyer, et al. Surface Form Competition: Why the Highest Probability Answer Isn't Always Right, 2021, EMNLP.
[23] Yejin Choi, et al. UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark, 2021, AAAI.
[24] D. Klein, et al. Calibrate Before Use: Improving Few-Shot Performance of Language Models, 2021, ICML.
[25] Tom Henighan, et al. Scaling Laws for Transfer, 2021, ArXiv.
[26] Bill Yuchen Lin, et al. RiddleSense: Reasoning about Riddle Questions Featuring Linguistic Creativity and Commonsense Knowledge, 2021, Findings of ACL.
[27] Yu Cheng, et al. InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective, 2020, ICLR.
[28] Dawn Song, et al. Measuring Massive Multitask Language Understanding, 2020, ICLR.
[29] Peng Meng, et al. Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning, 2020, ArXiv.
[30] Mark Chen, et al. Scaling Laws for Autoregressive Generative Modeling, 2020, ArXiv.
[31] Hanmeng Liu, et al. LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning, 2020, IJCAI.
[32] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[33] Hannaneh Hajishirzi, et al. UnifiedQA: Crossing Format Boundaries With a Single QA System, 2020, Findings of EMNLP.
[34] Ronan Le Bras, et al. G-DAug: Generative Data Augmentation for Commonsense Reasoning, 2020, Findings of EMNLP.
[35] Alec Radford, et al. Scaling Laws for Neural Language Models, 2020, ArXiv.
[36] Steven Schockaert, et al. Inducing Relational Knowledge from BERT, 2019, AAAI.
[37] Yejin Choi, et al. PIQA: Reasoning about Physical Commonsense in Natural Language, 2019, AAAI.
[38] J. Weston, et al. Adversarial NLI: A New Benchmark for Natural Language Understanding, 2019, ACL.
[39] Lysandre Debut, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[40] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[41] Doug Downey, et al. Abductive Commonsense Reasoning, 2019, ICLR.
[42] Ronan Le Bras, et al. WinoGrande: An Adversarial Winograd Schema Challenge at Scale, 2019, AAAI.
[43] Yejin Choi, et al. Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning, 2019, EMNLP.
[44] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[45] Yejin Choi, et al. COMET: Commonsense Transformers for Automatic Knowledge Graph Construction, 2019, ACL.
[46] Ali Farhadi, et al. HellaSwag: Can a Machine Really Finish Your Sentence?, 2019, ACL.
[47] Claire Cardie, et al. DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension, 2019, TACL.
[48] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[49] Yejin Choi, et al. Social IQA: Commonsense Reasoning about Social Interactions, 2019, EMNLP.
[50] Jonathan Berant, et al. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge, 2019, NAACL.
[51] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[52] Changsheng Xu, et al. Representation Learning of Knowledge Graphs with Entity Attributes and Multimedia Descriptions, 2018, IEEE BigMM.
[53] Yoav Goldberg, et al. Word Sense Induction with Neural biLM and Symmetric Patterns, 2018, EMNLP.
[54] Yejin Choi, et al. SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference, 2018, EMNLP.
[55] Peter Clark, et al. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering, 2018, EMNLP.
[56] Oren Etzioni, et al. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge, 2018, ArXiv.
[57] Omer Levy, et al. Annotation Artifacts in Natural Language Inference Data, 2018, NAACL.
[58] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[59] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[60] Guokun Lai, et al. RACE: Large-scale ReAding Comprehension Dataset From Examinations, 2017, EMNLP.
[61] Catherine Havasi, et al. ConceptNet 5.5: An Open Multilingual Graph of General Knowledge, 2016, AAAI.
[62] Nathanael Chambers, et al. A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories, 2016, NAACL.
[63] Xiang Zhang, et al. Character-level Convolutional Networks for Text Classification, 2015, NIPS.
[64] Zornitsa Kozareva, et al. SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning, 2011, SemEval.