Probing Classifiers: Promises, Shortcomings, and Alternatives

Probing classifiers have emerged as one of the most prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple: a classifier is trained to predict some linguistic property from a model's representations. This framework has been used to examine a wide variety of models and properties. However, recent studies have demonstrated various methodological weaknesses of the approach. This article critically reviews the probing classifiers framework, highlighting its shortcomings, proposed improvements, and alternative approaches.
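The basic recipe can be sketched in a few lines: freeze a model's representations, then fit a simple classifier to predict a property of interest from them. The sketch below uses synthetic representations and a fabricated binary "linguistic property" purely for illustration; in a real probe the representations would come from a pretrained encoder and the labels from linguistic annotations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for frozen model representations: 500 "tokens" x 32 dimensions.
reps = rng.normal(size=(500, 32))

# Stand-in linguistic property, correlated with one linear direction in
# representation space so the probe has something to recover.
labels = (reps @ rng.normal(size=32) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    reps, labels, test_size=0.2, random_state=0
)

# The probe itself: a linear classifier trained on the frozen representations.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = probe.score(X_test, y_test)
print(f"probe accuracy: {accuracy:.2f}")
```

High probe accuracy is often read as evidence that the property is encoded in the representations, though, as the article discusses, that inference is contested: a sufficiently expressive probe can succeed even when the model does not use the property, which is what control tasks and information-theoretic refinements try to address.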
