Simulation and Augmentation of Social Networks for Building Deep Learning Models

A limitation of Graph Convolutional Networks (GCNs) is the assumption that, at the $l^{th}$ layer of the neural network model, only the $l^{th}$-order neighbourhood nodes of a social network are influential. Furthermore, the GCN has been evaluated on citation and knowledge graphs, but not extensively on friendship-based social graphs. This coupling between a layer and the order of the node neighbourhood may be a more serious drawback for friendship-based graphs. Evaluating the full potential of the GCN on friendship-based social networks requires a larger number of openly available datasets. However, most available social network datasets are incomplete, and the majority do not contain both features and ground truth labels. In this work, we first provide a guideline for simulating dynamic social networks with ground truth labels and features, both coupled with the topology. Second, we introduce an open-source Python-based simulation library. We argue that the topology of the network is driven by a set of latent variables, termed the social DNA (sDNA), and we use the sDNA as the labels for the nodes. Finally, using our simulated datasets, we propose four new variants of the GCN, designed mainly to overcome the dependency between the order of the node neighbourhood and a particular layer of the model. We then evaluate the performance of all the models; our results show that on 27 of the 30 simulated datasets our proposed GCN variants outperform the original model.
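The coupling between layer depth and neighbourhood order can be illustrated with a minimal sketch. It assumes the standard GCN propagation rule with the symmetrically normalised adjacency matrix $\hat{A} = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$, and it omits the learned weights and non-linearities so that only the reach of the propagation is visible. The function names and the toy path graph are hypothetical; this is not the simulation library or any of the GCN variants described above.

```python
import math


def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list of lists."""
    # Start from the identity so that every node has a self-loop.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def propagate(a_hat, h):
    """One propagation step without weights or activation: a_hat @ h."""
    n = len(a_hat)
    return [sum(a_hat[i][k] * h[k] for k in range(n)) for i in range(n)]


def reached_after(edges, n, source, layers):
    """Nodes that receive any signal from `source` after `layers` steps."""
    a_hat = normalized_adjacency(edges, n)
    # One-hot signal placed on the source node.
    h = [1.0 if i == source else 0.0 for i in range(n)]
    for _ in range(layers):
        h = propagate(a_hat, h)
    return {i for i, x in enumerate(h) if x > 0.0}


# Toy path graph 0 - 1 - 2 - 3 - 4 (an illustrative assumption).
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
for layers in range(1, 4):
    print(layers, sorted(reached_after(path, 5, 0, layers)))
```

On this path graph the signal from node 0 reaches nodes {0, 1} after one step, {0, 1, 2} after two, and {0, 1, 2, 3} after three: each additional layer extends the reach by exactly one hop, so a node at distance $l$ cannot influence the source before layer $l$.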
