SocialLink: exploiting graph embeddings to link DBpedia entities to Twitter profiles

SocialLink is a project designed to match social media profiles on Twitter to corresponding entities in DBpedia. Built to bridge the vibrant Twitter social media world and the Linked Open Data cloud, SocialLink enables knowledge transfer between the two: it helps Semantic Web practitioners harvest the vast amounts of information available on Twitter, and it makes DBpedia data available for social media analysis tasks. In this paper, we extend the original SocialLink approach by exploiting graph-based features derived from both DBpedia and Twitter, represented as graph embeddings learned from vast amounts of unlabeled data. Introducing these new features required us to redesign our deep neural network-based candidate selection algorithm; with the redesigned algorithm, we experimentally demonstrate a significant improvement in the performance of SocialLink.
