$A^{4}NT$: Author Attribute Anonymity by Adversarial Training of Neural Machine Translation

Text-based analysis methods allow to reveal privacy relevant author attributes such as gender, age and identify of the text's author. Such methods can compromise the privacy of an anonymous author even when the author tries to remove privacy sensitive content. In this paper, we propose an automatic method, called Adversarial Author Attribute Anonymity Neural Translation ($A^4NT$), to combat such text-based adversaries. We combine sequence-to-sequence language models used in machine translation and generative adversarial networks to obfuscate author attributes. Unlike machine translation techniques which need paired data, our method can be trained on unpaired corpora of text containing different authors. Importantly, we propose and evaluate techniques to impose constraints on our $A^4NT$ to preserve the semantics of the input text. $A^4NT$ learns to make minimal changes to the input text to successfully fool author attribute classifiers, while aiming to maintain the meaning of the input. We show through experiments on two different datasets and three settings that our proposed method is effective in fooling the author attribute classifiers and thereby improving the anonymity of authors.

[1]  L. Sweeney Simple Demographics Often Identify People Uniquely , 2000 .

[2]  Rachel Greenstadt,et al.  Translate Once, Translate Twice, Translate Thrice and Attribute: Identifying Authors and Machine Translation Tools in Translated Text , 2012, 2012 IEEE Sixth International Conference on Semantic Computing.

[3]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[4]  Ariel Stolerman,et al.  Doppelgänger Finder: Taking Stylometry to the Underground , 2014, 2014 IEEE Symposium on Security and Privacy.

[5]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Bernt Schiele,et al.  Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[7]  Minh N. Do,et al.  Semantic Image Inpainting with Perceptual and Contextual Losses , 2016, ArXiv.

[8]  Wei Chen,et al.  Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets , 2017, NAACL.

[9]  Rob Fergus,et al.  Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks , 2015, NIPS.

[10]  Navdeep Jaitly,et al.  Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech , 2017, ArXiv.

[11]  Alan Ritter,et al.  Adversarial Learning for Neural Dialogue Generation , 2017, EMNLP.

[12]  Shlomo Argamon,et al.  Automatically profiling the author of an anonymous text , 2009, CACM.

[13]  Douglas Bagnall,et al.  Author Identification Using Multi-headed Recurrent Neural Networks , 2015, CLEF.

[14]  J. Oberlander,et al.  Proceedings of the COLING/ACL on Main Conference Poster Sessions , 2006 .

[15]  Regina Barzilay,et al.  Style Transfer from Non-Parallel Text by Cross-Alignment , 2017, NIPS.

[16]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[17]  Lantao Yu,et al.  SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.

[18]  Christopher Potts,et al.  A large annotated corpus for learning natural language inference , 2015, EMNLP.

[19]  Shari Laster,et al.  The American Presidency Project , 2014 .

[20]  Jürgen Schmidhuber,et al.  Recurrent Highway Networks , 2016, ICML.

[21]  R. Steinpreis,et al.  The Impact of Gender on the Review of the Curricula Vitae of Job Applicants and Tenure Candidates: A National Empirical Study , 1999 .

[22]  Michael Gamon,et al.  Obfuscating Document Stylometry to Preserve Author Anonymity , 2006, ACL.

[23]  Holger Schwenk,et al.  Supervised Learning of Universal Sentence Representations from Natural Language Inference Data , 2017, EMNLP.

[24]  Arvind Narayanan,et al.  De-anonymizing Programmers via Code Stylometry , 2015, USENIX Security Symposium.

[25]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[26]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[27]  Jason Lee,et al.  Fully Character-Level Neural Machine Translation without Explicit Segmentation , 2016, TACL.

[28]  Rachel Greenstadt,et al.  Detecting Hoaxes, Frauds, and Deception in Writing Style Online , 2012, 2012 IEEE Symposium on Security and Privacy.

[29]  Ralph Grishman,et al.  Paraphrasing for Style , 2012, COLING.

[30]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  I.N. Bozkurt,et al.  Authorship attribution , 2007, 2007 22nd international symposium on computer and information sciences.

[32]  George Kurian,et al.  Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.

[33]  Rachel Greenstadt,et al.  Blogs, Twitter Feeds, and Reddit Comments: Cross-domain Authorship Attribution , 2016, Proc. Priv. Enhancing Technol..

[34]  Pankaj Rohatgi,et al.  Can Pseudonymity Really Guarantee Privacy? , 2000, USENIX Security Symposium.

[35]  Sanja Fidler,et al.  Skip-Thought Vectors , 2015, NIPS.

[36]  Alon Lavie,et al.  Meteor Universal: Language Specific Translation Evaluation for Any Target Language , 2014, WMT@ACL.

[37]  Dawn Xiaodong Song,et al.  On the Feasibility of Internet-Scale Author Identification , 2012, 2012 IEEE Symposium on Security and Privacy.

[38]  Teruo Higashino,et al.  Twitter user profiling based on text and community mining for market analysis , 2013, Knowl. Based Syst..

[39]  Yee Whye Teh,et al.  The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.

[40]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[41]  Ben Poole,et al.  Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.

[42]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[43]  Prasenjit Majumder,et al.  Author Masking through Translation , 2016, CLEF.

[44]  Robert F. Chew,et al.  Predicting age groups of Twitter users based on language and metadata features , 2017, PloS one.

[45]  Hsinchun Chen,et al.  Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace , 2008, TOIS.

[46]  William J. Clinton The American Presidency Project , 2011 .

[47]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[48]  Judith Donath,et al.  Identity and deception in the virtual community , 1998 .

[49]  Navdeep Jaitly,et al.  Sequence-to-Sequence Models Can Directly Translate Foreign Speech , 2017, INTERSPEECH.

[50]  Alexei A. Efros,et al.  Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[51]  Rachel Greenstadt,et al.  Adversarial stylometry: Circumventing authorship recognition to preserve privacy and anonymity , 2012, TSEC.

[52]  Ariel Stolerman,et al.  Use Fewer Instances of the Letter "i": Toward Writing Style Anonymization , 2012, Privacy Enhancing Technologies.

[53]  Efstathios Stamatatos,et al.  A survey of modern authorship attribution methods , 2009, J. Assoc. Inf. Sci. Technol..

[54]  Christopher D. Manning,et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.

[55]  Yoshua Bengio,et al.  A Character-level Decoder without Explicit Segmentation for Neural Machine Translation , 2016, ACL.

[56]  Emerson Murphy-Hill,et al.  Gender differences and bias in open source: pull request acceptance of women versus men , 2017, PeerJ Comput. Sci..

[57]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[58]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[59]  Preslav Nakov,et al.  The Case for Being Average: A Mediocrity Approach to Style Masking and Author Obfuscation - (Best of the Labs Track at CLEF-2017) , 2017, CLEF.

[60]  Davood Rafiei,et al.  Predicting political preference of Twitter users , 2013, ASONAM.

[61]  John G. Breslin,et al.  Character-level and Multi-channel Convolutional Neural Networks for Large-scale Authorship Attribution , 2016, ArXiv.