E-Cigarette Surveillance With Social Media Data: Social Bots, Emerging Topics, and Trends

Background: As e-cigarette use rapidly increases in popularity, data from online social systems (Twitter, Instagram, Google Web Search) can be used to capture and describe the social and environmental context in which individuals use, perceive, and are marketed this tobacco product. Social media data may serve as a massive focus group where people organically discuss e-cigarettes unprimed by a researcher, without instrument bias, captured in near real time and at low cost.

Objective: This study documents e-cigarette–related discussions on Twitter, describing themes of conversations and locations where Twitter users often discuss e-cigarettes, to identify priority areas for e-cigarette education campaigns. Additionally, this study demonstrates the importance of distinguishing between social bots and human users when attempting to understand public health–related behaviors and attitudes.

Methods: E-cigarette–related posts on Twitter (N=6,185,153) were collected from December 24, 2016, to April 21, 2017. Techniques drawn from network science were used to characterize discussions of e-cigarettes by identifying which hashtags co-occur (concept clusters) in a Twitter network. Posts and metadata were used to describe where in the United States e-cigarette–related discussions occurred. Machine learning models were used to distinguish Twitter posts reflecting the attitudes and behaviors of genuine human users from those of social bots. Odds ratios were computed from 2×2 contingency tables to detect whether hashtag use varied by source (social bot vs human user), with the Fisher exact test used to determine statistical significance.

Results: Clusters found in the corpus of hashtags from human users included behaviors (eg, #vaping), vaping identity (eg, #vapelife), and vaping community (eg, #vapenation). Additional clusters included products (eg, #eliquids), dual tobacco use (eg, #hookah), and polysubstance use (eg, #marijuana). Clusters found in the corpus of hashtags from social bots included health (eg, #health), smoking cessation (eg, #quitsmoking), and new products (eg, #ismog). Social bots were significantly more likely than human users to post hashtags that referenced smoking cessation and new products. The volume of tweets was highest in the Mid-Atlantic (eg, Pennsylvania, New Jersey, Maryland, and New York), followed by the West Coast and Southwest (eg, California, Arizona, and Nevada).

Conclusions: Social media data may be used to complement and extend the surveillance of health behaviors, including tobacco product use. Public health researchers could harness these data and methods to identify new products or devices. Furthermore, findings from this study demonstrate the importance of distinguishing between Twitter posts from social bots and those from humans when attempting to understand attitudes and behaviors. Social bots may be used to perpetuate the idea that e-cigarettes are helpful in cessation and to promote new products as they enter the marketplace.
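The two statistical steps named in the Methods, counting which hashtags co-occur and testing whether a hashtag's use differs by source with an odds ratio and the Fisher exact test, can be illustrated with a minimal Python sketch. This is not the study's code: the function names are hypothetical, the posts and counts are made-up toy values, and the sketch assumes the two-sided form of the Fisher exact test, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations
from math import comb


def cooccurrence_counts(posts):
    """Count how often each unordered hashtag pair appears in the same post."""
    pairs = Counter()
    for tags in posts:
        # Lowercase and deduplicate so '#Vape' and '#vape' count as one tag.
        unique = sorted({t.lower() for t in tags})
        pairs.update(combinations(unique, 2))
    return pairs


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Toy posts (assumed): each post is a list of its hashtags.
    posts = [["#vape", "#vapelife"], ["#Vape", "#vapelife", "#eliquids"], ["#hookah"]]
    print(cooccurrence_counts(posts).most_common(1))

    # Toy 2x2 table (assumed): rows = bot / human posts,
    # columns = post contains the hashtag / does not.
    a, b, c, d = 8, 2, 1, 5
    print(odds_ratio(a, b, c, d), fisher_exact_two_sided(a, b, c, d))
```

The pair counts could serve as edge weights for a hashtag network, and an odds ratio above 1 in the toy table would indicate that the hashtag is relatively more common in bot posts. With SciPy available, `scipy.stats.fisher_exact` computes the same test.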
