Social Media for Opioid Addiction Epidemiology: Automatic Detection of Opioid Addicts from Twitter and Case Studies

Opioid (e.g., heroin and morphine) addiction has become one of the largest and deadliest epidemics in the United States. To combat this epidemic, there is an urgent need for novel tools and methodologies that yield new insights into the behavioral processes of opioid abuse and addiction. The role of social media in biomedical knowledge mining has become increasingly significant in recent years. In this paper, we propose a novel framework named AutoDOA to automatically detect opioid addicts from Twitter, which can potentially sharpen our understanding of the behavioral process of opioid abuse and addiction. In AutoDOA, a structured heterogeneous information network (HIN) is first constructed to model the users and their posted tweets as well as the rich relationships among them. A meta-path-based approach is then used to formulate similarity measures over users, and the different similarities are aggregated using Laplacian scores. Based on the HIN and the combined meta-path, a transductive classification model is built for automatic opioid addict detection, which reduces the cost of acquiring labeled examples for supervised learning. To the best of our knowledge, this is the first work to apply transductive classification in HINs to the drug-addiction domain. Comprehensive experiments on real sample collections from Twitter validate the effectiveness of AutoDOA in opioid addict detection through comparisons with alternative methods. The results and case studies also demonstrate that knowledge mined from daily-life social media data could support better practice in opioid addiction prevention and treatment.
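
The pipeline described above (HIN, meta-path similarities, aggregation, transductive classification) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration, not taken from AutoDOA: the two meta-paths (user-tweet-word-tweet-user and user-account-user), the toy matrices, and the helper names `pathsim` and `propagate` are all hypothetical. The similarity uses the PathSim formula, the aggregation weights are fixed by hand rather than derived from Laplacian scores, and the classifier is a generic graph label propagation scheme that stands in for the paper's transductive model.

```python
from math import sqrt

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def pathsim(adjacency):
    """PathSim over the symmetric meta-path formed by `adjacency` and its transpose."""
    m = matmul(adjacency, transpose(adjacency))  # commuting matrix: path counts
    n = len(m)
    return [[2.0 * m[i][j] / (m[i][i] + m[j][j]) if m[i][i] + m[j][j] else 0.0
             for j in range(n)] for i in range(n)]

def propagate(similarity, labels, alpha=0.8, iterations=200):
    """Spread the few known labels (+1 / -1, 0 = unlabeled) over the similarity graph."""
    n = len(labels)
    # Drop self-similarity, then normalize symmetrically by node degree.
    w = [[similarity[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    s = [[w[i][j] / sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
          for j in range(n)] for i in range(n)]
    scores = [float(y) for y in labels]
    for _ in range(iterations):
        scores = [alpha * sum(s[i][j] * scores[j] for j in range(n))
                  + (1 - alpha) * labels[i] for i in range(n)]
    return scores

# Toy HIN (hypothetical data): 4 users, 4 tweets, 4 words, 2 followed accounts.
user_tweet = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
tweet_word = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
user_account = [[1, 0], [1, 0], [0, 1], [0, 1]]

# One similarity matrix per meta-path.
content_sim = pathsim(matmul(user_tweet, tweet_word))
follow_sim = pathsim(user_account)

# Assumption: equal hand-set weights instead of Laplacian-score weights.
combined = [[0.5 * c + 0.5 * f for c, f in zip(c_row, f_row)]
            for c_row, f_row in zip(content_sim, follow_sim)]

# User 0 is labeled positive, user 3 negative; users 1 and 2 are unlabeled.
labels = [1, 0, 0, -1]
scores = propagate(combined, labels)
predicted = [1 if score > 0 else -1 for score in scores]
print(predicted)
```

In this toy network the unlabeled user 1 shares words and a followed account with the positively labeled user 0, so it receives a positive score, while user 2 is pulled toward the negatively labeled user 3. Only two of the four users carry labels, which is the point of the transductive setting: the similarity structure does most of the work.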
