Global Explainability of BERT-Based Evaluation Metrics by Disentangling along Linguistic Factors
[1] Kilian Q. Weinberger,et al. BERTScore: Evaluating Text Generation with BERT , 2019, ICLR.
[2] Anna Rumshisky,et al. A Primer in BERTology: What We Know About How BERT Works , 2020, Transactions of the Association for Computational Linguistics.
[3] Yonatan Belinkov,et al. Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance? , 2020, EACL.
[4] Philipp Koehn,et al. Re-evaluating the Role of Bleu in Machine Translation Research , 2006, EACL.
[5] Guillaume Lample,et al. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties , 2018, ACL.
[6] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[7] Nikolaos Aletras,et al. Translation Error Detection as Rationale Extraction , 2021, FINDINGS.
[8] Wei Zhao,et al. The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results , 2021, EVAL4NLP.
[9] Wei Zhao,et al. SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization , 2020, ACL.
[10] Philipp Koehn,et al. Findings of the 2015 Workshop on Statistical Machine Translation , 2015, WMT@EMNLP.
[11] Lucia Specia,et al. BERGAMOT-LATTE Submissions for the WMT20 Quality Estimation Shared Task , 2020, WMT.
[12] Scott Lundberg,et al. A Unified Approach to Interpreting Model Predictions , 2017, NIPS.
[13] Kaizhong Zhang,et al. Simple Fast Algorithms for the Editing Distance Between Trees and Related Problems , 1989, SIAM J. Comput.
[14] Kun Qian,et al. A Survey of the State of Explainable AI for Natural Language Processing , 2020, AACL/IJCNLP.
[15] C.-C. Jay Kuo,et al. SBERT-WK: A Sentence Embedding Method by Dissecting BERT-Based Word Models , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Holger Schwenk,et al. Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond , 2018, Transactions of the Association for Computational Linguistics.
[17] Iryna Gurevych,et al. Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems , 2019, NAACL.
[18] John Hewitt,et al. Designing and Interpreting Probes with Control Tasks , 2019, EMNLP.
[19] Naveen Arivazhagan,et al. Language-agnostic BERT Sentence Embedding , 2020, ArXiv.
[20] Atsushi Fujita,et al. Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers , 2021, ACL.
[21] Carlos Guestrin,et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier , 2016, ArXiv.
[22] Fei Liu,et al. MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance , 2019, EMNLP.
[23] Ray Kurzweil,et al. Multilingual Universal Sentence Encoder for Semantic Retrieval , 2019, ACL.
[24] Tomas Mikolov,et al. Enriching Word Vectors with Subword Information , 2016, TACL.
[25] Philip Bille,et al. A survey on tree edit distance and related problems , 2005, Theor. Comput. Sci.
[26] Philipp Koehn,et al. Findings of the 2017 Conference on Machine Translation (WMT17) , 2017, WMT.
[27] Sampo Pyysalo,et al. Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection , 2020, LREC.
[28] Chin-Yew Lin,et al. ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.
[29] Lucia Specia,et al. SentSim: Crosslingual Semantic Evaluation of Machine Translation , 2021, NAACL.
[30] Ray Kurzweil,et al. Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model , 2019, RepL4NLP@ACL.
[31] Thibault Sellam,et al. BLEURT: Learning Robust Metrics for Text Generation , 2020, ACL.
[32] Karin M. Verspoor,et al. Findings of the 2016 Conference on Machine Translation , 2016, WMT.
[33] Chi-kiu Lo,et al. YiSi - a Unified Semantic MT Quality Evaluation and Estimation Metric for Languages with Different Levels of Available Resources , 2019, WMT.
[34] Iryna Gurevych,et al. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks , 2019, EMNLP.
[35] Eneko Agirre,et al. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation , 2017, SemEval.
[36] Nitika Mathur,et al. Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics , 2020, ACL.
[37] Matt J. Kusner,et al. From Word Embeddings To Document Distances , 2015, ICML.
[38] Danqi Chen,et al. A Fast and Accurate Dependency Parser using Neural Networks , 2014, EMNLP.
[39] Iryna Gurevych,et al. Making Monolingual Sentence Embeddings Multilingual Using Knowledge Distillation , 2020, EMNLP.
[40] Xipeng Qiu,et al. BERT-ATTACK: Adversarial Attack against BERT Using BERT , 2020, EMNLP.
[41] Steffen Eger,et al. BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks , 2021, FINDINGS.
[42] Markus Freitag,et al. BLEU Might Be Guilty but References Are Not Innocent , 2020, EMNLP.
[43] Yonatan Belinkov,et al. Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks , 2016, ICLR.
[44] Wei Zhao,et al. On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation , 2020, ACL.
[45] Dan Klein,et al. Multilingual Alignment of Contextual Word Representations , 2020, ICLR.
[46] Jason Baldridge,et al. PAWS: Paraphrase Adversaries from Word Scrambling , 2019, NAACL.
[47] Dipanjan Das,et al. BERT Rediscovers the Classical NLP Pipeline , 2019, ACL.
[48] Iryna Gurevych,et al. How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation , 2020, CONLL.
[49] Markus Freitag,et al. KoBE: Knowledge-Based Machine Translation Evaluation , 2020, FINDINGS.
[50] Alon Lavie,et al. COMET: A Neural Framework for MT Evaluation , 2020, EMNLP.