Analysis of Facebook Interaction as Basis for Synthetic Expanded Social Graph Generation

Social networks have long been a subject of scientific research, which has frequently been hindered by the unavailability of representative datasets. The advent of online social networks (OSNs), which store data about interactions between billions of people, has greatly alleviated this problem. Since user interaction on OSNs can correspond to real-life relationships, OSN datasets quickly became a highly sought-after resource for social network research. However, enabling open access to such data entails serious security and privacy risks, especially after the introduction of the European General Data Protection Regulation. Some researchers mitigate this problem through anonymization, while others argue for the creation of synthetic datasets. We consider synthetic datasets preferable since they circumvent the security and privacy issues. Existing synthetic dataset generators produce a social graph that records only whether a pair of nodes is connected. Interpersonal relationships, however, are much more complex. Our research therefore considers the possibility of generating a synthetic expanded social graph which, in addition to the existence of a connection between a pair of users, also provides information about the types and intensities of the users’ interactions. As Facebook is the leading OSN today, we performed an extensive analysis of Facebook users’ interaction records with the aim of gaining insight into real-life interaction patterns. In this paper, we present the results of this analysis at the ego-user level and offer a conceptual solution for synthetic expanded social graph generation that uses the results of this analysis as its basis.
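To illustrate how an expanded social graph differs from a plain one, the following is a minimal sketch of a possible data structure, not the representation used in the paper. The class name, method names, and the interaction type labels ("like", "comment") are hypothetical and chosen only for illustration. Each directed pair of users stores a count per interaction type, so the graph answers both whether two users are connected and how intensely they interact.

```python
from collections import defaultdict


class ExpandedSocialGraph:
    """Illustrative sketch: directed edges annotated with interaction counts.

    All names here are assumptions made for this example.
    """

    def __init__(self):
        # (source, target) -> {interaction type -> number of interactions}
        self._edges = defaultdict(lambda: defaultdict(int))

    def add_interaction(self, source, target, kind, count=1):
        """Record `count` interactions of type `kind` from source to target."""
        self._edges[(source, target)][kind] += count

    def connected(self, a, b):
        """The only information a plain social graph provides."""
        return (a, b) in self._edges or (b, a) in self._edges

    def interactions(self, source, target):
        """Interaction types and intensities from source to target."""
        return dict(self._edges.get((source, target), {}))


graph = ExpandedSocialGraph()
graph.add_interaction("ego", "friend", "like", 3)
graph.add_interaction("ego", "friend", "comment")
```

Keeping the edges directed lets the intensity from one user to another differ from the intensity in the opposite direction, while `connected` still treats the relationship as symmetric.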
