Identity vs. Attribute Disclosure Risks for Users with Multiple Social Profiles

Individuals who share data on today's social computing systems face privacy losses from information disclosure that go well beyond the data they directly share. Indeed, prior work has shown that additional information about a user can be inferred from data shared by other users; this type of information disclosure is called attribute disclosure. Such studies, however, were limited to a single social computing system. In reality, users have identities across several social computing systems and reveal different aspects of their lives in each. This considerably enlarges the scope of information disclosure, but also complicates its analysis. Indeed, when considering multiple social computing systems, information disclosure can be of two types: attribute disclosure, or identity disclosure, which is the risk of pinpointing, for a given identity in one social computing system, the identity of the same individual in another social computing system. This raises a key question: how do these two privacy risks relate to each other? In this paper, we perform the first combined study of attribute and identity disclosure risks across multiple social computing systems. We first propose a framework to quantify these risks. Our empirical evaluation on a real-world dataset from Facebook and Twitter then shows that, in some regimes, there is a tradeoff between the two information disclosure risks: users with a lower identity disclosure risk suffer a higher attribute disclosure risk. We investigate in depth the parameters that affect this tradeoff.
