SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models