Explaining Natural Language Processing Classifiers with Occlusion and Language Modeling
