ferret: a Framework for Benchmarking Explainers on Transformers
[1] Oskar van der Wal, et al. Inseq: An Interpretability Toolkit for Sequence Generation Models, 2023, ACL.
[2] Himabindu Lakkaraju, et al. OpenXAI: Towards a Transparent Evaluation of Model Explanations, 2022, ArXiv abs/2206.11104.
[3] Robert Schwarzenberg, et al. Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools, 2021, EMNLP.
[4] Soumya Sanyal, et al. Discretized Integrated Gradients for Explaining Language Models, 2021, EMNLP.
[5] A. Chandar, et al. Post-hoc Interpretability for Neural NLP: A Survey, 2021, ACM Comput. Surv.
[6] Elena Baralis, et al. How Divergent Is Your Data?, 2021, Proc. VLDB Endow.
[7] Elena Baralis, et al. Looking for Trouble: Analyzing Classifier Behavior via Pattern Divergence, 2021, SIGMOD Conference.
[8] Cho-Jui Hsieh, et al. On the Sensitivity and Stability of Model Interpretations in NLP, 2021, ACL.
[9] Ana Marasović, et al. Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing, 2021, NeurIPS Datasets and Benchmarks.
[10] Mohit Bansal, et al. Robustness Gym: Unifying the NLP Evaluation Landscape, 2021, NAACL.
[11] Matthew E. Peters, et al. Explaining NLP Models via Minimal Contrastive Editing (MiCE), 2020, Findings of ACL.
[12] Seid Muhie Yimam, et al. HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection, 2020, AAAI.
[13] Chenhao Tan, et al. Evaluating and Characterizing Human Rationales, 2020, EMNLP.
[14] R. Aharonov, et al. A Survey of the State of Explainable AI for Natural Language Processing, 2020, AACL.
[15] Jakob Grue Simonsen, et al. A Diagnostic Study of Explainability Techniques for Text Classification, 2020, EMNLP.
[16] Bilal Alsallakh, et al. Captum: A unified and generic model interpretability library for PyTorch, 2020, ArXiv.
[17] Sebastian Gehrmann, et al. The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models, 2020, EMNLP.
[18] Yoav Goldberg, et al. Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness?, 2020, ACL.
[19] Byron C. Wallace, et al. ERASER: A Benchmark to Evaluate Rationalized NLP Models, 2019, ACL.
[20] X. Xue, et al. Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models, 2019, ICLR.
[21] Peter J. Liu, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[22] Rémi Louf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[23] Sameer Singh, et al. AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models, 2019, EMNLP.
[24] Francesca Toni, et al. Human-grounded Evaluations of Explanation Methods for Text Classification, 2019, EMNLP.
[25] Sameer Singh, et al. Universal Adversarial Triggers for Attacking and Analyzing NLP, 2019, EMNLP.
[26] Quanshi Zhang, et al. Towards a Deep and Unified Understanding of Deep Neural Models in NLP, 2019, ICML.
[27] Elena Baralis, et al. Explaining black box models by means of local rules, 2019, SAC.
[28] Jesse Vig, et al. Visualizing Attention in Transformer-Based Language Representation Models, 2019, ArXiv.
[29] Klaus-Robert Müller, et al. Evaluating Recurrent Neural Network Explanations, 2019, BlackboxNLP@ACL.
[30] Byron C. Wallace, et al. Attention is not Explanation, 2019, NAACL.
[31] Tao Li, et al. Visual Interrogation of Attention-Based Models for Natural Language Inference and Machine Comprehension, 2018, EMNLP.
[32] Been Kim, et al. Sanity Checks for Saliency Maps, 2018, NeurIPS.
[33] D. Erhan, et al. A Benchmark for Interpretability Methods in Deep Neural Networks, 2018, NeurIPS.
[34] Alexander M. Rush, et al. Seq2seq-Vis: A Visual Debugging Tool for Sequence-to-Sequence Models, 2018, IEEE Transactions on Visualization and Computer Graphics.
[35] Shi Feng, et al. Pathologies of Neural Models Make Interpretations Difficult, 2018, EMNLP.
[36] Luke S. Zettlemoyer, et al. AllenNLP: A Deep Semantic Natural Language Processing Platform, 2018, ArXiv.
[37] Scott Lundberg, et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.
[38] Ankur Taly, et al. Axiomatic Attribution for Deep Networks, 2017, ICML.
[39] Daniel Jurafsky, et al. Understanding Neural Networks through Representation Erasure, 2016, ArXiv.
[40] Seth Flaxman, et al. European Union Regulations on Algorithmic Decision-Making and a "Right to Explanation", 2016, AI Mag.
[41] Marco Tulio Ribeiro, et al. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, 2016, NAACL.
[42] Andrew Zisserman, et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, 2013, ICLR.
[43] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[44] Jason Eisner, et al. Modeling Annotators: A Generative Approach to Learning from Annotator Rationales, 2008, EMNLP.
[45] F. Mercorio, et al. Contrastive Explanations of Text Classifiers as a Service, 2022, NAACL.
[46] Dirk Hovy, et al. Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection, 2022, NLPPOWER.
[47] Luis Espinosa Anke, et al. XLM-T: A Multilingual Language Model Toolkit for Twitter, 2021, ArXiv.
[48] Marko Bohanec, et al. Perturbation-Based Explanations of Prediction Models, 2018, Human and Machine Learning.