Transgender Community Sentiment Analysis from Social Media Data: A Natural Language Processing Approach

The transgender community experiences a large disparity in mental health conditions compared with the general population. Interpreting the social media data posted by transgender people may help us better understand the sentiments of these sexual minority groups and apply early interventions. In this study, we manually categorize 300 social media comments posted by transgender people into three sentiment classes: negative, positive, and neutral. Five machine learning algorithms and two deep neural networks are adopted to build sentiment analysis classifiers based on the annotated data. Results show that our annotations are reliable, with a Cohen's Kappa score over 0.8 across all three classes. The LSTM model performs best, with an accuracy over 0.85 and an AUC of 0.876. Our next step will focus on applying advanced natural language processing algorithms to a larger annotated dataset.
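To illustrate the agreement measure mentioned above, the following is a minimal sketch of how Cohen's Kappa can be computed for two annotators over the three sentiment classes. It is not the study's code: the function name `cohens_kappa`, the two rater lists, and their labels are hypothetical, and the sketch computes a single overall kappa rather than one score per class.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two raters labeling the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each class, the product of the two raters'
    # marginal label frequencies, summed over classes.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up annotations for six comments (illustrative only, not the study's data).
rater_1 = ["negative", "negative", "positive", "neutral", "positive", "neutral"]
rater_2 = ["negative", "negative", "positive", "neutral", "neutral", "neutral"]

kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 2))  # 0.75
```

In this toy example the raters agree on five of six comments (observed agreement 0.83) while chance agreement is 0.33, giving a kappa of 0.75. A value above 0.8, as reported in the abstract, is conventionally read as strong agreement.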
