An Improved Model for Depression Detection in Micro-Blog Social Network

Social networks contain a tremendous amount of node and linkage data, providing unprecedented opportunities for a wide variety of fields. Depression, ranked as the world's fourth largest disease, has become one of the most significant research subjects. A depression classifier was previously proposed to classify users of online social networks as depressed or not; however, that classifier takes only node features into account and neglects the influence of linkages. This paper proposes an improved model that calculates the probability of a user being depressed based on both node and linkage features. The linkage features are measured in two aspects: tie strength and interaction content analysis. Moreover, the propagation rule of depression is considered to improve prediction accuracy. Finally, our experiments on data derived from Sina Micro-blog show that the highest accuracy of the improved model is 95%, an increase of 15% over the classifier that considers node features only. The results demonstrate that adding linkage feature analysis performs much better than node feature analysis alone. They also imply that tie strength and interaction content have different effects on depression probability estimation. Although this model is proposed for depression detection, the basic idea of linkage feature analysis could be applied in a wide range of scenarios.
