Exploring Efficiency of Data Mining Techniques for Missing Link in Online Social Network

Predicting missing links in an Online Social Network (OSN) is an interesting problem, both for recovering unobserved relations and for understanding users' behavior. Existing work introduced social features for training predictive models but used only SVM as the prediction technique. We suspect that other prediction techniques may perform better. This study compares the prediction performance of SVM, k-NN, Decision Tree, Neural Networks, Naïve Bayes, Logistic Regression, and Random Forest on two OSN datasets, one high-density and one low-density. We find that Random Forest achieves the best performance as measured by the F1-measure. Moreover, it is the most robust technique across both datasets.
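
As an illustration of the evaluation criterion, the F1-measure for the positive class (a link exists) can be computed from true and predicted labels as the harmonic mean of precision and recall. The sketch below is a minimal example under stated assumptions: the labels, the names `model_a` and `model_b`, and the helper `f1_score` are hypothetical and are not taken from the study's data or code.

```python
def f1_score(y_true, y_pred):
    """F1-measure for the positive class (1 = link exists, 0 = no link)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # links correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted links that do not exist
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # existing links that were missed
    if tp == 0:
        # Precision and recall are zero or undefined; report 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for eight candidate node pairs (not data from the study).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]

# Hypothetical predictions from two unnamed classifiers.
predictions = {
    "model_a": [1, 0, 0, 1, 0, 1, 0, 0],
    "model_b": [1, 1, 1, 1, 1, 1, 1, 0],
}

scores = {name: f1_score(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
print("best:", best)
```

In this toy case `model_b` recovers every true link but predicts many links that do not exist, so its precision drops and its F1 falls below that of `model_a`. This is one reason F1 is a common choice for link prediction, where existing links are usually far rarer than non-links.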
