Transferring BERT-like Transformers' Knowledge for Authorship Verification

The task of identifying the author of a text has been studied for several decades and has been approached with linguistics, statistics, and, more recently, machine learning. Motivated by the impressive performance gains of BERT-like transformers across a broad range of natural language processing tasks and by the recent availability of the large-scale PAN authorship dataset, we first study the effectiveness of several such transformers for the task of authorship verification. These models consistently achieve very high scores. Next, we empirically show that they rely on topical clues rather than on author writing-style characteristics, exploiting existing biases in the dataset. To address this problem, we provide new splits for PAN-2020 in which the training and test data are sampled from disjoint topics or authors. Finally, we introduce DarkReddit, a dataset with a different input data distribution. We use it to analyze the domain generalization of the models in a low-data regime and to measure how performance varies when the proposed PAN-2020 splits are used for fine-tuning. We show that these splits can enhance the models' ability to transfer knowledge to a new, significantly different dataset.
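
To make the setup concrete, the following is a minimal sketch (not the authors' exact pipeline) of casting authorship verification as sentence-pair classification with a BERT-like encoder, using the Hugging Face transformers library. The checkpoint name, example texts, and hyperparameters are illustrative assumptions; a real run would iterate over the PAN-2020 verification pairs.

```python
# Minimal sketch: authorship verification as a same-author / different-author
# pair-classification task with a BERT-like encoder. All names below are
# illustrative assumptions, not the paper's exact configuration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-cased"  # any BERT-like checkpoint could be substituted
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# A hypothetical verification pair: label 1 = same author, 0 = different authors.
text_a = "First document fragment written by some author ..."
text_b = "Second document fragment whose authorship we want to verify ..."
inputs = tokenizer(text_a, text_b, truncation=True, max_length=512, return_tensors="pt")
labels = torch.tensor([1])

# One fine-tuning step; a full run would loop over batches of PAN-2020 pairs
# drawn from topic- or author-disjoint splits.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
optimizer.step()
```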
