Enhancement to community-based multi-relational link prediction using co-occurrence probability feature

Predicting future or missing links is a useful task in social network analysis. Running time and memory consumption are major challenges for link prediction in large multi-relational social networks. This paper addresses these challenges by proposing a parallel method for link prediction. Community information is used for parallelization, since social networks tend to form natural communities and nodes are much more likely to interact within a community than across communities. For the prediction task, the recently proposed local probabilistic graph model is used alongside the standard topological features. This model infers the joint co-occurrence probability of two nodes (i, j) from a Markov Random Field constructed from the nodes in the neighbourhood of (i, j). We extend the supervised MR-HPLP framework from the literature by including the co-occurrence probability (COP) feature in the multi-relational setting and by reducing the dimensionality of the feature vector. The resulting method, MR-HPLP-COP, is evaluated on a challenging benchmark multi-relational data set, where COP on its own is shown to outperform the other unsupervised predictors. Further, MR-HPLP-COP shows significant improvement in both AUROC and AUPR scores over all ten existing predictors on the benchmark data set. In particular, MR-HPLP-COP achieves an AUROC of over 90% on two data sets for which the existing methods give a prediction performance of around 75%.
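The community-based decomposition described above can be illustrated with a minimal sketch. It is not the MR-HPLP-COP implementation: it assumes a community label is already known for each node, ignores edge types, and uses a plain common-neighbour count as a stand-in for the topological and co-occurrence probability features. The function name `score_communities` and its arguments are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def score_communities(edges, community):
    """Score unlinked node pairs community by community.

    edges:     iterable of undirected (u, v) pairs (edge types ignored here)
    community: dict mapping node -> community label (assumed precomputed)
    Returns {label: {(u, v): common-neighbour count}} for intra-community pairs.
    """
    # Build the adjacency sets once for the whole graph.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # Group nodes by community label.
    members = defaultdict(list)
    for node, label in community.items():
        members[label].append(node)

    scores = {}
    # Each iteration only touches one community's candidate pairs, so the
    # iterations are independent and could be handed to separate workers.
    for label, nodes in members.items():
        scores[label] = {
            (u, v): len(adj[u] & adj[v])
            for u, v in combinations(sorted(nodes), 2)
            if v not in adj[u]  # keep only pairs that are not yet linked
        }
    return scores


edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"),
         ("x", "y"), ("y", "z"), ("c", "x")]
community = {"a": 0, "b": 0, "c": 0, "d": 0, "x": 1, "y": 1, "z": 1}
scores = score_communities(edges, community)
```

Restricting candidates to intra-community pairs is what shrinks the search space: cross-community pairs such as (a, z) are never scored in this sketch, which trades some recall for time and memory.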