User identification across online social networks in practice: Pitfalls and solutions

To take advantage of the full range of services that online social networks (OSNs) offer, people commonly open several accounts on different OSNs, leaving many different types of profile information on each. These pieces of information can be integrated across sources by identifying the same individual on different social networks. In this article, we address the problem of user identification by treating it as a classification task. Relying on common public attributes available through the official application programming interfaces (APIs) of social networks, we propose several methods for building negative instances that go beyond the usual random selection, and we investigate how effectively each method trains the classifier. Two test sets with different levels of discrimination are set up to evaluate the robustness of the resulting classifiers. The effectiveness of the approach is measured in real conditions by matching profiles gathered from Google+, Facebook and Twitter.
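The contrast between random and more discriminating negative instances can be illustrated with a minimal sketch. It is not the method of the article: the abstract does not specify how negatives are built, so the "hard negative" strategy below (pairing each profile with the most similar-looking non-matching profile), the helper names `name_similarity`, `random_negatives` and `hard_negatives`, and the toy profile names are all assumptions made for illustration only.

```python
import random
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in for a real attribute comparison)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def random_negatives(positives, rng):
    """Pair each source profile with a randomly chosen target that is not its true match."""
    targets = [b for _, b in positives]
    negatives = []
    for a, b in positives:
        candidates = [t for t in targets if t != b]
        negatives.append((a, rng.choice(candidates)))
    return negatives


def hard_negatives(positives):
    """Pair each source profile with the most similar target that is not its true match."""
    targets = [b for _, b in positives]
    negatives = []
    for a, b in positives:
        candidates = [t for t in targets if t != b]
        closest = max(candidates, key=lambda t: name_similarity(a, t))
        negatives.append((a, closest))
    return negatives


# Hypothetical matched pairs (username on one OSN, display name on another).
positives = [
    ("john.smith", "John Smith"),
    ("jon_smith", "Jon Smyth"),
    ("alice.martin", "Alice Martin"),
    ("bob.lee", "Bob Lee"),
]

easy = random_negatives(positives, random.Random(0))
hard = hard_negatives(positives)
```

Negatives chosen this way resemble true matches more closely than random ones, so a classifier trained on them has to learn finer distinctions; whether that helps in practice is exactly the kind of question the article evaluates.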
