Who framed Roger Reindeer? De-censorship of Facebook posts by snippet classification

This paper considers online news censorship, concentrating on the censorship of identities. Identities may be obfuscated for disparate reasons, from military to judicial ones. In the majority of cases, this is done to protect individuals from being identified and persecuted by hostile people. However, since the collaborative web is characterised by a redundancy of information, it is not unusual for the same fact to be reported by multiple sources, which may not apply the same censorship policies. Moreover, the proven tendency of social network users to disclose personal information means that comments on a news item can reveal the data withheld in the item itself. This gives us a means to figure out who the subject of a censored news item is. We propose an adaptation of a text analysis approach to unveil censored identities. The approach is tested on a synthesised scenario that nevertheless resembles a real use case. Leveraging a text analysis based on a context classifier trained over snippets from posts and comments of Facebook pages, we achieve promising results. Despite the quite constrained settings in which we operate, such as considering only snippets of very short length, our system successfully detects the censored name, choosing among 10 different candidate names, in more than 50% of the investigated cases. This outperforms the results of two reference baselines. The findings reported in this paper, besides being supported by a thorough experimental methodology and being interesting in their own right, also pave the way for further investigation into the insidious issues of censorship on the web.
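
To make the idea of a context classifier over snippets more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a bag-of-words multinomial Naive Bayes model, which is swapped in here purely for illustration since the abstract does not name the classifier. The placeholder `<NAME>`, the class `SnippetNaiveBayes`, and the toy snippets and candidate names are all hypothetical. The classifier is trained on short snippets in which the name has been masked, and it then ranks the candidate names for a new masked snippet.

```python
import math
from collections import Counter, defaultdict

MASK = "<NAME>"  # hypothetical placeholder standing in for the censored name


def tokenize(snippet):
    """Lowercase, split on whitespace, and drop the mask token itself."""
    return [t for t in snippet.lower().split() if t != MASK.lower()]


class SnippetNaiveBayes:
    """Illustrative multinomial Naive Bayes over the words surrounding a masked name."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant
        self.word_counts = defaultdict(Counter)  # candidate name -> word frequencies
        self.snippet_counts = Counter()          # candidate name -> number of snippets
        self.vocab = set()

    def fit(self, snippets, names):
        for snippet, name in zip(snippets, names):
            tokens = tokenize(snippet)
            self.word_counts[name].update(tokens)
            self.snippet_counts[name] += 1
            self.vocab.update(tokens)
        return self

    def score(self, snippet, name):
        """Log-probability (up to a constant) that `name` fills the mask."""
        counts = self.word_counts[name]
        total = sum(counts.values()) + self.alpha * len(self.vocab)
        logp = math.log(self.snippet_counts[name] / sum(self.snippet_counts.values()))
        for tok in tokenize(snippet):
            if tok in self.vocab:  # words never seen in training are ignored
                logp += math.log((counts[tok] + self.alpha) / total)
        return logp

    def predict(self, snippet):
        """Return the best-scoring name among the candidates seen in training."""
        return max(self.snippet_counts, key=lambda name: self.score(snippet, name))


# Toy training data: snippets with the name already masked (invented for illustration).
train = [
    ("<NAME> scored twice in the final match", "alice"),
    ("the coach praised <NAME> after the match", "alice"),
    ("<NAME> presented the budget in parliament", "bob"),
    ("parliament voted after <NAME> spoke on the budget", "bob"),
]
model = SnippetNaiveBayes().fit([s for s, _ in train], [n for _, n in train])
guess = model.predict("<NAME> was injured during the match")
```

With 10 equally likely candidate names, picking one at random would be right about 10% of the time, which gives a rough sense of scale for the reported accuracy of more than 50%.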
