Psychological Language on Twitter Predicts County-Level Heart Disease Mortality

Hostility and chronic stress are known risk factors for heart disease, but they are costly to assess on a large scale. We used language expressed on Twitter to characterize community-level psychological correlates of age-adjusted mortality from atherosclerotic heart disease (AHD). Language patterns reflecting negative social relationships, disengagement, and negative emotions—especially anger—emerged as risk factors; positive emotions and psychological engagement emerged as protective factors. Most correlations remained significant after controlling for income and education. A cross-sectional regression model based only on Twitter language predicted AHD mortality significantly better than did a model that combined 10 common demographic, socioeconomic, and health risk factors, including smoking, diabetes, hypertension, and obesity. Capturing community psychological characteristics through social media is feasible, and these characteristics are strong markers of cardiovascular mortality at the community level.
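The model comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not name the regression method or the accuracy metric, so the sketch assumes ridge regression and an out-of-sample Pearson correlation from k-fold cross-validation. The function names (`ridge_fit`, `cv_pearson_r`), the feature counts, and all data are hypothetical; the arrays are random placeholders, so the resulting numbers say nothing about the study's findings.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on standardized features (assumed method).

    Returns a function that predicts y for new rows of X.
    """
    mu, sd = X.mean(axis=0), X.std(axis=0)
    sd[sd == 0] = 1.0                      # guard against constant columns
    Xs = (X - mu) / sd
    y_mean = y.mean()
    p = Xs.shape[1]
    # Solve (X'X + alpha I) w = X'(y - mean(y))
    w = np.linalg.solve(Xs.T @ Xs + alpha * np.eye(p), Xs.T @ (y - y_mean))
    return lambda X_new: ((X_new - mu) / sd) @ w + y_mean

def cv_pearson_r(X, y, alpha=1.0, n_folds=10, seed=0):
    """Pearson r between held-out predictions and observed outcomes."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    preds = np.empty(len(y), dtype=float)
    for test_idx in np.array_split(idx, n_folds):
        train_idx = np.setdiff1d(idx, test_idx)
        predict = ridge_fit(X[train_idx], y[train_idx], alpha)
        preds[test_idx] = predict(X[test_idx])   # each county predicted once
    return float(np.corrcoef(preds, y)[0, 1])

# Synthetic placeholder data (NOT the study's data): one row per county.
rng = np.random.default_rng(42)
n_counties = 500
X_lang = rng.normal(size=(n_counties, 50))   # hypothetical language features
X_demo = rng.normal(size=(n_counties, 10))   # hypothetical 10 risk-factor predictors
y = X_lang[:, :5].sum(axis=1) + X_demo[:, 0] + rng.normal(size=n_counties)

r_lang = cv_pearson_r(X_lang, y)   # language-only model
r_demo = cv_pearson_r(X_demo, y)   # demographic/health model
print(f"language-only r = {r_lang:.2f}, risk-factor r = {r_demo:.2f}")
```

Both models are scored on the same held-out counties, so the two correlations are directly comparable. Testing whether the difference is statistically significant would need an additional step that this sketch leaves out.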
