Multi-kernel SVM based depression recognition using social media data

Depression has become the world’s fourth major disease. Despite its high incidence, however, the rate of medical treatment for depression is very low because mental health problems are difficult to diagnose. Social media opens a window onto users’ mental status: with the rapid development of the Internet, people have become accustomed to expressing their thoughts and feelings through social media, which therefore provides a new way to identify potentially depressed people. In this paper, we propose a multi-kernel SVM based model to recognize depressed people. Three categories of features (user microblog text, user profile and user behaviors) are extracted from users’ social media to describe their situations. To account for the new characteristics of social media language, we build a special emotional dictionary, consisting of a text emotional dictionary and an emoticon dictionary, and use it to extract microblog text features through word frequency statistics. Considering the heterogeneity between the text features and the other two feature categories, we employ multi-kernel SVM methods to adaptively select the optimal kernel for each feature category and so find users who may suffer from depression. The multi-kernel SVM method reduces the error rate for identifying depressed people to 16.54%, which corresponds to error reduction rates of 38%, 43%, 22%, 21% and 11% relative to Naive Bayes, Decision Trees, KNN, single-kernel SVM and an ensemble method (libD3C), respectively. This indicates that, among the methods compared, the multi-kernel SVM method is the most appropriate way to identify depressed people based on social media data.
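
The two technical ideas in the abstract, dictionary-based word frequency features and a separate kernel per feature category, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy dictionaries, the choice of a linear kernel for text and an RBF kernel for the other features, the fixed weights, and the sample users are hypothetical and are not taken from the paper. A real multi-kernel method would learn the kernel weights rather than fix them.

```python
import math
from collections import Counter

# Hypothetical toy dictionaries; the paper's actual emotional and
# emoticon dictionaries are not reproduced here.
EMOTION_WORDS = {"sad": "negative", "tired": "negative", "happy": "positive"}
EMOTICONS = {":(": "negative", ":)": "positive"}


def text_features(post):
    """Count positive and negative dictionary hits in one post."""
    counts = Counter()
    for token in post.lower().split():
        label = EMOTION_WORDS.get(token) or EMOTICONS.get(token)
        if label:
            counts[label] += 1
    return [counts["positive"], counts["negative"]]


def linear_kernel(x, y):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an arbitrary illustrative value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_kernel(u, v, weights=(0.5, 0.5)):
    """Weighted sum of one kernel per feature group.

    u and v are (text_vector, other_vector) pairs. The weights are
    fixed here for illustration, not learned.
    """
    w_text, w_other = weights
    return w_text * linear_kernel(u[0], v[0]) + w_other * rbf_kernel(u[1], v[1])


# Two made-up users: text features plus a made-up profile/behavior vector.
user_a = (text_features("so sad and tired :("), [0.2, 1.0])
user_b = (text_features("happy day :)"), [0.9, 0.1])
users = (user_a, user_b)

# Gram matrix of the combined kernel over all user pairs.
gram = [[combined_kernel(p, q) for q in users] for p in users]
```

A Gram matrix built this way could then be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.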
