Probing Classifiers: Promises, Shortcomings, and Advances

Probing classifiers have emerged as one of the prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple: a classifier is trained to predict some linguistic property from a model's representations, and this approach has been used to examine a wide variety of models and properties. However, recent studies have demonstrated various methodological limitations of the approach. This article critically reviews the probing classifiers framework, highlighting its promises, shortcomings, and advances.
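
As a concrete illustration of the basic setup, the sketch below trains a simple probe on frozen encoder representations to predict a word-level linguistic property. It is a minimal sketch only: the choice of encoder (bert-base-uncased), the final-layer representations, and the tiny hand-labeled examples are illustrative assumptions, not the experimental setting of any particular study reviewed here.

# Minimal probing-classifier sketch: freeze a pre-trained encoder, extract
# representations, and fit a simple classifier to predict a linguistic
# property (here, a toy word-class label for one word per sentence).
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()  # the encoder stays frozen; only the probe is trained

# (sentence, index of the word of interest, property label) -- hypothetical toy data
examples = [
    ("The dog runs fast", 2, "VERB"),
    ("A cat sleeps quietly", 2, "VERB"),
    ("The quick fox jumped", 2, "NOUN"),
    ("An old tree fell", 2, "NOUN"),
]

features, labels = [], []
with torch.no_grad():
    for sentence, word_idx, label in examples:
        inputs = tokenizer(sentence, return_tensors="pt")
        hidden = encoder(**inputs).last_hidden_state[0]  # final layer; any layer can be probed
        # +1 skips the [CLS] token; assumes each preceding word is a single wordpiece
        features.append(hidden[word_idx + 1].numpy())
        labels.append(label)

probe = LogisticRegression(max_iter=1000).fit(features, labels)
print("probe accuracy on training examples:", probe.score(features, labels))

High probe accuracy is commonly read as evidence that the probed property is encoded in the representations; the methodological concerns reviewed in this article center on how far that inference can be trusted.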
