Detecting Binge Drinking and Alcohol-Related Risky Behaviours from Twitter’s Users: An Exploratory Content- and Topology-Based Analysis

Binge Drinking (BD) is a common risky behaviour that people rarely report to healthcare professionals, yet personal communications about alcohol-related behaviours are not uncommon on social media. Following a data-driven approach focused on User-Generated Content, we aimed to detect potential binge drinkers by investigating their language and shared topics. First, we gathered Twitter threads mentioning BD and alcohol-related behaviours, using unequivocal keywords that experts identified from previous evidence on BD. Subsequently, a random sample of the gathered tweets was manually labelled, and two supervised learning classifiers were trained on both linguistic and metadata features to distinguish tweets by genuine individual users from those by media, bot, and commercial accounts. Based on this classification, approximately 55% of the 1 million collected alcohol-related tweets were automatically identified as belonging to non-genuine users. A third classifier was then trained on a manually labelled subset of the tweets previously attributed to genuine accounts, to automatically identify potential binge drinkers based only on linguistic features. On average, users classified as binge drinkers were quite similar to the standard genuine Twitter users in our sample. Nonetheless, the analysis of social media content from genuine users reporting risky behaviours remains a promising source for informed preventive programs.
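As a rough illustration of the first two steps described above (keyword-based collection, then separating genuine from non-genuine accounts), the following is a minimal sketch and not the authors' implementation. The keyword list, the tweet fields (`text`, `has_link`), and the rule in `toy_is_genuine` are all hypothetical placeholders; in the study the keywords were chosen by experts and the account filter was a trained supervised classifier.

```python
import re

# Hypothetical keywords; the expert-chosen list used in the study is not given here.
KEYWORDS = ["binge drinking", "wasted", "hangover"]


def build_matcher(keywords):
    """Compile a case-insensitive, whole-phrase pattern for the keywords."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def select_candidates(tweets, is_genuine, matcher):
    """Keep tweets that mention a keyword and come from a genuine account."""
    return [t for t in tweets if matcher.search(t["text"]) and is_genuine(t)]


def toy_is_genuine(tweet):
    # Placeholder rule standing in for the trained account classifier (assumption).
    return not tweet.get("has_link", False)


tweets = [
    {"text": "Worst hangover of my life", "has_link": False},
    {"text": "Happy hour deals tonight! hangover cure inside", "has_link": True},
    {"text": "Studying all weekend", "has_link": False},
]

candidates = select_candidates(tweets, toy_is_genuine, build_matcher(KEYWORDS))
print([t["text"] for t in candidates])
```

Passing the account filter in as a function keeps the two stages separate, so the placeholder rule could be swapped for a trained model's prediction without changing the keyword step.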
