Topic Model for Identifying Suicidal Ideation in Chinese Microblog

Suicide is one of the major public health problems worldwide. Traditionally, suicidal ideation is assessed through surveys or interviews, which cannot provide a real-time assessment of a person's mental state. Online social networks, with their large amounts of user-generated data, offer opportunities to gain insight into suicide assessment and prevention. In this paper, we explore the potential to identify and monitor suicidal ideation expressed in microblogs on social networks. First, we identify users who have committed suicide and collect millions of microblogs from social networks. Second, we build a suicide psychological lexicon using psychological standards and word embedding techniques. Third, by leveraging both language styles and online behaviors, we employ a topic model and other machine learning algorithms to identify suicidal ideation. Our approach achieves its best results on topic-500, yielding an F1-measure of 80.0%, precision of 87.1%, recall of 73.9%, and accuracy of 93.2%. Furthermore, a prototype system for monitoring suicidal ideation on several social networks has been deployed.
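
The reported F1-measure can be checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall, which the abstract does not state explicitly; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the topic-500 setting.
reported_precision = 0.871
reported_recall = 0.739

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.800"
```

Under that assumption, the computed value agrees with the 80.0% stated in the abstract.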
