Detection and Classification of mental illnesses on social media using RoBERTa

Given the current social distancing regulations across the world, social media has become the primary mode of communication for most people. This has resulted in the isolation of many people suffering from mental illnesses who are unable to receive assistance in person. They have increasingly turned to social media to express themselves and to look for guidance in dealing with their illnesses. Keeping this in mind, we propose a solution to detect and classify mental illness posts on social media thereby enabling users to seek appropriate help. In this work, we detect and classify five prominent kinds of mental illnesses: depression, anxiety, bipolar disorder, ADHD and PTSD by analyzing unstructured user data on social media platforms. In addition, we are sharing a new high-quality dataset to drive research on this topic. We believe that our work is the first multi-class model that uses a Transformer-based architecture such as RoBERTa to analyze people's emotions and psychology. We also demonstrate how we stress-test our model using behavioral testing. With this research, we hope to be able to contribute to the public health system by automating some of the detection and classification process.

[1]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[2]  Michael Strube,et al.  Adapting Deep Learning Methods for Mental Health Prediction on Social Media , 2019, EMNLP.

[3]  Omer Levy,et al.  RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.

[4]  P. Resnik,et al.  CLPsych 2019 Shared Task: Predicting the Degree of Suicide Risk in Reddit Posts , 2019, Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology.

[5]  Kai Zou,et al.  EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks , 2019, EMNLP.

[6]  Eunil Park,et al.  A deep learning model for detecting mental illness from user content on social media , 2020, Scientific Reports.

[7]  Dirk Hovy,et al.  Multitask Learning for Mental Health Conditions with Limited Social Media Data , 2017, EACL.

[8]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[9]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[10]  Sameer Singh,et al.  Beyond Accuracy: Behavioral Testing of NLP Models with CheckList , 2020, ACL.

[11]  Thomas Wolf,et al.  HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.

[12]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[13]  Natalia Gimelshein,et al.  PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.

[14]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[15]  Diana Inkpen,et al.  Deep Learning for Depression Detection of Twitter Users , 2018, CLPsych@NAACL-HTL.

[16]  Mark Dredze,et al.  Shared Task : Depression and PTSD on Twitter , 2015 .

[17]  R. Dobson,et al.  Characterisation of mental health conditions in social media using Informed Deep Learning , 2017, Scientific Reports.