Learning Models for Suicide Prediction from Social Media Posts

We propose a deep learning architecture and evaluate three other machine learning models for automatically detecting individuals who will attempt suicide within (1) 30 days and (2) six months, using the social media post data provided in the CL-Psych-Challenge. In addition, we design and extract three sets of handcrafted features for suicide detection, based on the three-stage theory of suicide and on prior work on emotions and pronoun use among persons exhibiting suicidal ideation. Extensive experiments show that some of the traditional machine learning methods outperform the baseline on subtask 1 (predicting a suicide attempt 30 days in advance), with an F1 score of 0.741 and an F2 score of 0.833. On subtask 2 (predicting a suicide attempt six months in advance), the proposed deep learning method outperforms the baseline, with an F1 score of 0.737 and an F2 score of 0.843.
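For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of the standard F-beta score, of which F1 (beta = 1) and F2 (beta = 2) are special cases. F2 weights recall more heavily than precision, which matters when missing an at-risk individual is costlier than a false alarm. The function name `f_beta` and the confusion counts in the usage lines are hypothetical illustrations, not values taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F-beta score computed from confusion-matrix counts.

    tp, fp, fn are true positives, false positives, and false negatives.
    beta > 1 favors recall; beta < 1 favors precision.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positive predictions or labels at all.
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts (NOT from the paper): 8 hits, 4 false alarms, 1 miss.
f1 = f_beta(tp=8, fp=4, fn=1, beta=1.0)
f2 = f_beta(tp=8, fp=4, fn=1, beta=2.0)
print(f"F1 = {f1:.3f}, F2 = {f2:.3f}")
```

With these illustrative counts recall exceeds precision, so F2 comes out higher than F1, the same ordering seen in the scores reported for both subtasks.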
