Towards Ethical Content-Based Detection of Online Influence Campaigns

The detection of clandestine efforts to influence users in online communities is a challenging problem and an area of significant active development. We demonstrate that features derived from the text of user comments are useful for identifying suspect activity, but they lead to increased erroneous identifications (false positive classifications) when keywords over-represented in past influence campaigns are present. Drawing on research in native language identification (NLI), we use “named entity masking” (NEM) to create sentence features that are robust to this shortcoming while maintaining comparable classification accuracy. We demonstrate that while NEM consistently reduces false positives when key named entities are mentioned, both masked and unmasked models exhibit increased false positive rates on English sentences written by native Russian speakers, raising ethical considerations that should be addressed in future research.
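
The abstract does not specify how named entity masking is implemented. As a rough illustration only, the sketch below shows one common way to mask named entities, replacing each entity with its type label so a downstream classifier cannot key on specific entities over-represented in past influence campaigns. The use of spaCy and its `en_core_web_sm` model is an assumption for illustration, not necessarily the authors' tooling.

```python
# Minimal sketch of named entity masking (NEM), assuming spaCy with the
# small English model ("en_core_web_sm") is installed.
import spacy

nlp = spacy.load("en_core_web_sm")

def mask_named_entities(text: str) -> str:
    """Replace each detected named entity with a placeholder such as [GPE]."""
    doc = nlp(text)
    masked = text
    # Replace entities from the end of the string so earlier character
    # offsets remain valid as the string length changes.
    for ent in reversed(doc.ents):
        masked = masked[:ent.start_char] + f"[{ent.label_}]" + masked[ent.end_char:]
    return masked

print(mask_named_entities("Hillary Clinton visited Moscow last Tuesday."))
# Possible output: "[PERSON] visited [GPE] [DATE]."
```

In this sketch the masked sentence retains its syntactic and stylistic structure while discarding the identity of the entities mentioned, which is the property the abstract describes as reducing keyword-driven false positives.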
