SAMI: an algorithm for solving the missing node problem using structure and attribute information

An important area of social network research is identifying missing information that is not explicitly represented in the network or is not visible to all. The recently introduced Missing Node Identification problem requires identifying members who are missing from the social network structure. However, previous work did not consider that information about specific users (nodes) within the network could help solve this problem. In this paper, we present two algorithms, SAMI-A and SAMI-N. Both use information specific to the known nodes, such as demographic information and the nodes' historical behavior in the network. We found that both SAMI-A and SAMI-N perform significantly better than other missing node algorithms. However, because each algorithm, and each parameter setting within it, often performs better on specific problem instances, a mechanism is needed to select the best algorithm and the best variation of that algorithm. To address this challenge, we also present OASCA, a novel online selection algorithm. We present results that detail the success of these algorithms.
