Mental Distress Detection and Triage in Forum Posts: The LT3 CLPsych 2016 Shared Task System

This paper describes the contribution of LT3 for the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVM), cascaded binary SVMs and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all features score best in terms of F-score, whereas feature filtering with bi-normal separation and classifier ensembling are found to improve recall of alarming posts.
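
The feature filtering step mentioned above can be illustrated with a minimal sketch of bi-normal separation (BNS) scoring. It is not the authors' implementation: the function names `bns_score` and `rank_features`, the toy data, and the clipping constant `eps` are assumptions made for illustration. BNS scores a binary feature as the absolute difference between the inverse standard normal CDF of its true-positive rate and of its false-positive rate, with both rates clipped away from 0 and 1 so the inverse CDF stays finite.

```python
from collections import Counter
from statistics import NormalDist


def bns_score(tp, fp, pos, neg, eps=0.0005):
    """Bi-normal separation for one binary feature.

    tp / fp: number of positive / negative documents containing the feature.
    pos / neg: total number of positive / negative documents.
    eps: clipping bound (an assumed value) that keeps inv_cdf finite.
    """
    tpr = min(max(tp / pos, eps), 1 - eps)
    fpr = min(max(fp / neg, eps), 1 - eps)
    inv = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return abs(inv(tpr) - inv(fpr))


def rank_features(docs, labels, top_k):
    """Return the top_k features by BNS score.

    docs: list of token sets, one per post (hypothetical input format).
    labels: list of booleans, True for the positive class (e.g. alarming).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    tp_counts, fp_counts = Counter(), Counter()
    for tokens, is_pos in zip(docs, labels):
        (tp_counts if is_pos else fp_counts).update(tokens)
    vocab = set(tp_counts) | set(fp_counts)
    scores = {f: bns_score(tp_counts[f], fp_counts[f], pos, neg) for f in vocab}
    return sorted(vocab, key=lambda f: scores[f], reverse=True)[:top_k]


# Toy example with invented posts; real input would be the forum data.
docs = [{"help", "die"}, {"die", "today"}, {"happy", "today"}, {"happy", "fun"}]
labels = [True, True, False, False]
selected = rank_features(docs, labels, top_k=2)
```

Because the score is an absolute value, features that strongly indicate the negative class rank as high as those that indicate the positive class; only the retained features would then be passed to the classifier.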
