Overview of the Author Obfuscation Task at PAN 2017: Safety Evaluation Revisited

We report on the second large-scale evaluation of style obfuscation approaches in a shared task on author obfuscation, organized at the PAN 2017 lab on digital text forensics. Author obfuscation means automatically paraphrasing a given text such that state-of-the-art authorship verification approaches misjudge a given pair of documents as having been written by “different authors” when they would have decided otherwise without obfuscation. This year, two new obfuscators are compared with the participants from last year’s task against a total of 44 authorship verification approaches. The best-performing obfuscator significantly impacts the decisions of the authorship verifiers. However, as last year, the paraphrased texts are often no longer really human-readable and their context is partly changed, indicating that there is still a way to go toward “perfect” automatic obfuscation that (1) tricks verification approaches, (2) keeps the meaning of the original, and (3) is, regarding its obfuscation, unsuspicious to the human eye.