Enforcing transparent access to private content in social networks by means of automatic sanitization

Social networks have become an essential meeting point for millions of individuals willing to publish and consume huge quantities of heterogeneous information. Some studies have shown that the data published on these platforms may contain sensitive personal information and that external entities can gather and exploit this knowledge for their own benefit. Although some methods to preserve the privacy of social network users have been proposed, they generally apply rigid access control measures to the protected content and, even worse, they do not help users understand which content is sensitive. Moreover, most of them either require the collaboration of social network operators or fail to provide a practical solution that works with well-known, already deployed social platforms. In this paper, we propose a new scheme that addresses all these issues. The new system is envisaged as an independent piece of software that does not depend on the social network in use and that can be transparently applied to most existing ones. According to a set of privacy requirements intuitively defined by the users of a social network, the proposed scheme is able to: (i) automatically detect sensitive data in users’ publications; (ii) construct sanitized versions of such data; and (iii) provide privacy-preserving transparent access to sensitive content by disclosing more or less information to readers according to their credentials toward the owner of the publications. We also study the applicability of the proposed system in general and illustrate its behavior in two case studies.
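As an illustration only, the three steps above (detect, sanitize, disclose by credential) can be sketched in a few lines of Python. This is a toy sketch and not the method of the paper: the term list, the generalization chains, the reader categories, and the function name `view_for` are all hypothetical, and detection is reduced to a simple dictionary lookup.

```python
import re

# Hypothetical generalization chains (an assumption for this sketch):
# index 0 is the original term, later entries are increasingly general.
GENERALIZATIONS = {
    "hiv": ["HIV", "an infectious disease", "a health condition"],
    "barcelona": ["Barcelona", "a Spanish city", "a European city"],
}

# Hypothetical reader categories, ordered from most to least trusted.
READER_LEVELS = {"family": 0, "friend": 1, "public": 2}

# Step (i): detect sensitive terms (here, a case-insensitive lookup).
SENSITIVE = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in GENERALIZATIONS) + r")\b",
    re.IGNORECASE,
)


def view_for(text: str, reader: str) -> str:
    """Return the version of `text` that a given reader category may see."""
    level = READER_LEVELS[reader]

    def replace(match: re.Match) -> str:
        if level == 0:
            # Fully trusted readers see the original wording untouched.
            return match.group(0)
        chain = GENERALIZATIONS[match.group(0).lower()]
        # Steps (ii) and (iii): less trusted readers get a more general term.
        return chain[min(level, len(chain) - 1)]

    return SENSITIVE.sub(replace, text)


post = "I was diagnosed with HIV in Barcelona."
for reader in READER_LEVELS:
    print(reader, "->", view_for(post, reader))
```

In this sketch each reader category receives the same publication at a different level of detail, which mirrors the idea of disclosing more or less information according to the reader's credentials.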
