Discovery of Informal Topics from Post Traumatic Stress Disorder Forums

Post Traumatic Stress Disorder (PTSD) is a public health problem afflicting millions of people each year. It is especially prominent among military veterans. Understanding the language, attitudes, and topics associated with PTSD is an important and challenging problem. Based on their expertise, mental health professionals have constructed a formal definition of PTSD. However, even the most assiduous mental health professionals can care for only a small fraction of those suffering from PTSD, which limits their perspective on the disorder. As social networking sites have grown in acceptance, users have begun to express personal thoughts and feelings on them, including those related to PTSD. This wealth of content can be viewed as an enormous collective description of PTSD and its related issues. We automatically extract informal latent topics from thousands of social media posts in which users describe their experiences with PTSD, and we compare these topics to the formal description produced by mental health professionals. We then explore the patterns and associations among these topics. Our evaluation of informal topic discovery shows that meaningful topics can be identified in PTSD-related social media data. When comparing our topics to the criteria in the Diagnostic and Statistical Manual of Mental Disorders (DSM), we found that we were able to automatically reproduce many of the criteria. We also discovered new topics that are not mentioned in the DSM but are prevalent across the collective narrative of thousands of users' experiences with PTSD.
