Expert, Crowdsourced, and Machine Assessment of Suicide Risk via Online Postings

We report on the creation of a dataset for studying assessment of suicide risk via online postings on Reddit. Evaluation of risk-level annotations by experts yields what is, to our knowledge, the first demonstration of reliability in risk assessment by clinicians based on social media postings. We also introduce and demonstrate the value of a new, detailed rubric for assessing suicide risk, compare crowdsourced with expert performance, and present baseline predictive modeling experiments using the new dataset, which will be made available to researchers through the American Association of Suicidology.
