Retweets as a Predictor of Relationships among Users on Social Media

Link prediction is the problem of detecting missing links or predicting future link formation in a network. Applying link prediction to social media such as Twitter and Facebook is useful both for developing novel services and for sociological analysis. Most existing research on link prediction uses only the topology of the social network, but social media also provide records of user activities such as posting, replying, and reposting. These records are expected to reflect user interests, so incorporating them should improve link prediction. However, research on link prediction using records of user activities is still in its infancy, and the effectiveness of such records has not been fully explored. In this study, we focus on reposting records as a promising source of information for link prediction and investigate their effectiveness on the popular social media platform Twitter. Our results show that (1) for actively retweeting users, techniques that use reposting records achieve higher prediction accuracy than popular topology-based techniques such as common neighbors and resource allocation, and (2) the accuracy of link prediction techniques that use network topology alone can be improved by incorporating reposting records.
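
To make the two topology-based baselines named above concrete, the following is a minimal sketch of the common neighbors and resource allocation scores for a candidate user pair, alongside a simple count of reposts between the two users. The toy graph, the function names, and in particular `retweet_score` are assumptions made for illustration; the abstract does not specify how the study actually turns reposting records into a prediction score.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbors(adj, x, y):
    """Common neighbors score: number of nodes adjacent to both x and y."""
    return len(adj.get(x, set()) & adj.get(y, set()))


def resource_allocation(adj, x, y):
    """Resource allocation score: sum of 1/degree(z) over common neighbors z."""
    shared = adj.get(x, set()) & adj.get(y, set())
    return sum(1.0 / len(adj[z]) for z in shared)


def retweet_score(retweets, x, y):
    """Hypothetical activity-based score (an assumption, not the paper's method):
    total number of reposts between x and y in either direction."""
    return retweets.get((x, y), 0) + retweets.get((y, x), 0)


# Toy follow graph and repost counts, invented for illustration only.
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")]
retweets = {("a", "b"): 3, ("b", "a"): 1}  # (reposter, original author) -> count

adj = build_adjacency(edges)
print(common_neighbors(adj, "a", "b"))     # shared neighbors c and d
print(resource_allocation(adj, "a", "b"))  # 1/deg(c) + 1/deg(d)
print(retweet_score(retweets, "a", "b"))   # reposts in both directions
```

In a link prediction setting, scores like these are computed for every unlinked pair and the pairs are ranked; a higher score is taken as a stronger indication that a link is missing or will form.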
