A comprehensive study on the effects of using data mining techniques to predict tie strength

The use of social networks has grown markedly in recent years, producing very large volumes of data. The data generated by users of social media sites are large, noisy, unstructured, and dynamic, so a flexible framework and method that can be applied across all of these networks would be valuable. The uncertainty and complexity involved in recognizing tie strength among people have led researchers to look for variables that effectively indicate intimacy. Because several such variables exist, their degree of influence is not precisely known, and their relationships are nonlinear and complex, data mining techniques are a practical way to approach this problem. Some unsupervised mining methods have already been applied to detecting the type of tie, and data mining more generally is a useful tool for researchers exploring relationships among users. In this paper, tie strength prediction is modeled as a data mining problem to which different supervised and unsupervised mining methods can be applied. We present a comprehensive study of the effects of using different classification techniques, such as decision trees and Naive Bayes, together with ensemble classification methods, such as Bagging and Boosting, to predict the tie strength of users of a social network. The LinkedIn social network is used as a real case study, and our experimental results are reported on data extracted from it. Several models based on basic techniques and ensemble methods are built, and their efficiency is compared in terms of F-measure, accuracy, and average execution time. Our experimental results show that our profile-behavioral model achieves much better accuracy than profile-data based models.
The problem of tie strength is modeled as a data mining problem. Different supervised and unsupervised mining methods are used. We present a comprehensive study of the effects of using different classification techniques. Several models are built and their efficiency is compared in terms of F-measure and execution time. The profile-behavioral model has better accuracy than profile-data based models.
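
The comparison criteria named above (F-measure, accuracy, and execution time) can be illustrated with a minimal sketch. It is not the paper's implementation: the labels `"strong"` and `"weak"`, the `messages` feature, the threshold rule, and the function names `score_predictions` and `timed_evaluation` are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Dict, List, Sequence


def score_predictions(y_true: Sequence[str], y_pred: Sequence[str],
                      positive: str = "strong") -> Dict[str, float]:
    """Compute accuracy and F-measure for one positive class (hypothetical label)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "f_measure": f_measure}


def timed_evaluation(predict: Callable[[dict], str], features: List[dict],
                     y_true: Sequence[str]) -> Dict[str, float]:
    """Score a predictor and record how long prediction took, in seconds."""
    start = time.perf_counter()
    y_pred = [predict(x) for x in features]
    elapsed = time.perf_counter() - start
    scores = score_predictions(y_true, y_pred)
    scores["seconds"] = elapsed
    return scores


# Hypothetical toy data: one behavioral feature per pair of users.
features = [{"messages": 9}, {"messages": 1}, {"messages": 0},
            {"messages": 2}, {"messages": 7}]
labels = ["strong", "strong", "weak", "weak", "weak"]


# Hypothetical stand-in for a trained classifier: a fixed threshold rule.
def threshold_rule(x: dict) -> str:
    return "strong" if x["messages"] >= 5 else "weak"


result = timed_evaluation(threshold_rule, features, labels)
```

Any trained model that exposes a per-example prediction function could be passed to `timed_evaluation` in place of `threshold_rule`, so that several models are scored on the same data with the same criteria.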
