Analyzing Genetic Testing Discourse on the Web Through the Lens of Twitter, Reddit, and 4chan

Recent progress in genomics has enabled the emergence of a flourishing market for direct-to-consumer (DTC) genetic testing. Companies like 23andMe and AncestryDNA provide affordable health, genealogy, and ancestry reports, and have already tested tens of millions of customers. Consequently, news, experiences, and views on genetic testing are increasingly shared and discussed on social media. At the same time, far-right groups have also taken an interest in genetic testing, using test results to attack minorities and to prove their own genetic “purity.” In this article, we set out to study the genetic testing discourse on a number of mainstream and fringe Web communities. We do so in two steps. First, we conduct an exploratory, large-scale analysis of the genetic testing discourse on Twitter, a mainstream social network. We find that this discourse is fueled by accounts that appear to be interested in digital health and technology. However, we also identify tweets with highly racist connotations. This motivates the second step, in which we explore the connection between genetic testing and racism on platforms with a reputation for toxicity, namely Reddit and 4chan. There, we find that discussions around genetic testing often include highly toxic language expressed through hateful and racist comments. In particular, on 4chan’s Politically Incorrect board (/pol/), content from genetic testing conversations involves several alt-right personalities and openly antisemitic rhetoric, often conveyed through memes.
