A Novel Co-Training-Based Approach for the Classification of Mental Illnesses Using Social Media Posts

Context: Researchers in many domains have recently turned to social media networks to gain knowledge that supports decision making and automation, for example to aid software development activities, cryptocurrency usage, and network community detection and recommendation. Within eHealth, the use of social media and big data analytics has become a hot topic for identifying patients with mental illnesses such as depression, schizophrenia, eating disorders, anxiety, or addictive behaviors. Problem: Traditional methods for identifying a patient with a mental illness require either sufficient historical data or regular monitoring of the patient's activities. Method: To address this issue, we propose a methodology to classify patients associated with chronic mental illnesses (i.e., Anxiety, Depression, Bipolar, and ADHD (Attention Deficit Hyperactivity Disorder)) based on data extracted from Reddit, a well-known network community platform. The proposed method uses Co-training (a semi-supervised learning technique) and incorporates the discriminative power of widely used classifiers, namely Random Forest (RF), Support Vector Machine (SVM), and Naïve Bayes (NB). We used the Reddit API to download posts and their top five associated comments to construct the feature space. Results: The experimental results indicate that Co-training based classification outperforms the state-of-the-art classifiers by an average margin of 3% when compared against each technique. In the future, the proposed method could be applied to classification problems in other domains by extracting data from social media.
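As a rough illustration of the Co-training idea named in the abstract, the sketch below trains one classifier per "view" of each example and lets each classifier pseudo-label the unlabeled examples it is most confident about, growing a shared labeled set. It is not the authors' implementation. A tiny hand-written Naive Bayes stands in for RF, SVM, and NB, and the choice of views (post text and comment text), the names `TinyNB`, `fit_views`, `co_train`, and `predict`, the parameters `rounds` and `per_round`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


class TinyNB:
    """Minimal multinomial Naive Bayes over whitespace tokens (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, text):
        # Log-space scores with Laplace smoothing, normalized to probabilities.
        total = sum(self.prior.values())
        scores = {}
        for c in self.classes:
            denom = sum(self.counts[c].values()) + len(self.vocab)
            score = math.log(self.prior[c] / total)
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.counts[c][word] + 1) / denom)
            scores[c] = score
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def fit_views(labeled):
    """Train one classifier per view (0 = post text, 1 = comment text)."""
    labels = [y for _, y in labeled]
    return [TinyNB().fit([x[v] for x, _ in labeled], labels) for v in (0, 1)]


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """Grow the labeled set with each view's most confident pseudo-labels."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        for view, model in enumerate(fit_views(labeled)):
            if not pool:
                break
            scored = []
            for i, example in enumerate(pool):
                proba = model.predict_proba(example[view])
                label = max(proba, key=proba.get)
                scored.append((proba[label], i, label))
            picked = sorted(scored, reverse=True)[:per_round]
            for _, i, label in picked:
                labeled.append((pool[i], label))
            # Remove picked examples, highest index first so indices stay valid.
            for i in sorted((i for _, i, _ in picked), reverse=True):
                del pool[i]
    return fit_views(labeled), labeled


def predict(models, example):
    """Combine both views by multiplying their class probabilities."""
    probs = [m.predict_proba(example[v]) for v, m in enumerate(models)]
    return max(models[0].classes, key=lambda c: probs[0][c] * probs[1][c])


# Toy data invented for this sketch: each example is (post text, comment text).
labeled = [
    (("cannot stop worrying panic attacks", "breathing exercises help panic"), "anxiety"),
    (("feel hopeless empty no energy", "therapy helped my low mood"), "depression"),
]
unlabeled = [
    ("panic before every meeting worrying constantly", "try breathing exercises"),
    ("empty and hopeless for weeks", "low mood therapy is worth trying"),
]
models, grown = co_train(labeled, unlabeled, rounds=2, per_round=1)
print(predict(models, ("worrying and panic all night", "breathing helps")))
```

In a fuller pipeline, the per-view classifiers could be replaced by RF, SVM, or NB models trained on a proper feature space, and a confidence threshold could guard against adding noisy pseudo-labels.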
