Using Linguistic Features to Estimate Suicide Probability of Chinese Microblog Users

If people at high risk of suicide can be identified through social media such as microblogs, an active intervention system could be implemented to save their lives. Motivated by this, the current study administered the Suicide Probability Scale (SPS) to 1041 users of Sina Weibo, a leading microblog service provider in China. Two NLP (Natural Language Processing) methods, the Chinese edition of the Linguistic Inquiry and Word Count (LIWC) lexicon and Latent Dirichlet Allocation (LDA), were used to extract linguistic features from the Sina Weibo data. We trained prediction models with machine learning algorithms on these two types of features to estimate suicide probability. The experimental results indicate that LDA can find topics related to suicide probability and improves prediction performance. Our study adds value to the prediction of suicide probability of social network users from their behavior.
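To make the lexicon-based feature type concrete, the following minimal sketch computes per-category word frequencies in the style of a LIWC analysis. It is an illustration under stated assumptions, not the study's implementation: the lexicon `TOY_LEXICON`, its category names, the function `category_frequencies`, and the English sample posts are all hypothetical, and real Weibo text would first need Chinese word segmentation before any lookup.

```python
from collections import Counter

# Hypothetical toy lexicon (category -> words). This is NOT the real LIWC
# dictionary; it only mimics the category-to-word-list structure.
TOY_LEXICON = {
    "negemo": {"sad", "hurt", "alone"},
    "death": {"die", "death"},
    "i": {"i", "me", "my"},
}


def category_frequencies(tokens, lexicon=TOY_LEXICON):
    """Return, for each lexicon category, the share of tokens it covers."""
    total = len(tokens)
    counts = Counter()
    for tok in tokens:
        for category, words in lexicon.items():
            if tok in words:
                counts[category] += 1
    # Normalise by the token count so users with many posts stay comparable.
    return {c: (counts[c] / total if total else 0.0) for c in lexicon}


# Assumed input: all posts of one user, already lowercased and tokenised.
posts = ["i feel sad and alone", "my day was fine"]
tokens = " ".join(posts).lower().split()
features = category_frequencies(tokens)
```

The resulting dictionary gives one feature vector per user. In a pipeline like the one described, such vectors could be combined with per-user LDA topic proportions and passed to a regression or classification model trained against SPS scores.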
