Privacy-Aware Question-Answering System for Online Mental Health Risk Assessment

Social media platforms have enabled individuals suffering from mental illnesses to share their lived experiences and find the online support necessary to cope. However, many users never receive genuine clinical support, which can exacerbate their symptoms. Screening users based on what they post online can help providers administer targeted healthcare and minimize false positives. Pre-trained Language Models (LMs) can assess users' social media data and classify them by mental health risk. We propose a Question-Answering (QA) approach to mental health risk assessment using the UnifiedQA model on two large mental health datasets. To protect user data, we extend UnifiedQA by anonymizing the model training process with differential privacy. Our results demonstrate the effectiveness of modeling risk assessment as a QA task, specifically for mental health use cases. Furthermore, the model's performance decreases by less than 1% with the inclusion of differential privacy. The proposed system's performance points to a promising research direction toward privacy-aware diagnostic systems.
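The two ingredients above can be illustrated with a minimal, dependency-free sketch. The helper names (`to_qa_input`, `clip`, `dp_sgd_step`) and the toy gradient representation are hypothetical, not the authors' code: the first function shows the general UnifiedQA-style convention of packing a question and context into one text input, and the other two show the core of differentially private SGD (per-example gradient clipping plus calibrated Gaussian noise), the standard mechanism behind private model training.

```python
import math
import random


def to_qa_input(question: str, context: str) -> str:
    """Pack a risk-assessment question and a user's post into a single
    UnifiedQA-style input string: question, newline, then context.
    (Hypothetical helper; not the paper's exact preprocessing.)"""
    return f"{question.lower()}\n{context.lower()}"


def clip(grad: list[float], max_norm: float) -> list[float]:
    """Rescale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr=0.1, max_norm=1.0,
                noise_mult=1.0, rng=None):
    """One DP-SGD update: clip each example's gradient, sum the clipped
    gradients, add Gaussian noise scaled to the clipping norm, average,
    and take a descent step."""
    rng = rng or random.Random(0)
    n = len(per_example_grads)
    summed = [0.0] * len(params)
    for g in per_example_grads:
        clipped = clip(g, max_norm)
        summed = [s + c for s, c in zip(summed, clipped)]
    noisy = [s + rng.gauss(0.0, noise_mult * max_norm) for s in summed]
    return [p - lr * (ns / n) for p, ns in zip(params, noisy)]
```

In practice the same clipping-and-noise step is applied to transformer parameter gradients (e.g. via a library such as Opacus); the clipping bound and noise multiplier together determine the privacy budget, which is what trades off against the reported sub-1% accuracy loss.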
