Automated Detection of Substance Use-Related Social Media Posts Based on Image and Text Analysis

Teens and young adults spend a significant amount of time on social media. According to the National Survey of American Attitudes on Substance Abuse, American teens who spend time on social media sites are at increased risk of smoking, drinking, and illicit drug use. Reducing teens' exposure to substance use-related social media posts may help minimize their risk of future substance use and addiction. In this paper, we present a method for the automated detection of substance use-related social media posts. With this technology, substance use-related content can be automatically filtered out of social media. To detect such posts, we employ state-of-the-art social media analytics that combine neural network-based image and text processing technologies. Our evaluation results demonstrate that image features derived using a Convolutional Neural Network and textual features derived using neural document embedding are effective in identifying substance use-related social media posts.
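The sketch below illustrates one way that image and text features for a post could be combined for classification. It is a minimal example under stated assumptions, not the method evaluated in the paper: the abstract does not say how the two feature sets are fused, so simple concatenation (early fusion) followed by logistic regression is assumed here. The toy vectors stand in for Convolutional Neural Network image features and document embeddings, and the names `fuse`, `train_logreg`, `predict`, and `make_toy_post` are hypothetical.

```python
import math
import random


def fuse(image_vec, text_vec):
    """Early fusion (an assumption): concatenate the two feature vectors."""
    return list(image_vec) + list(text_vec)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit a logistic regression classifier with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (substance use-related) or 0 (other) for one fused vector."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def make_toy_post(label, rng, img_dim=4, txt_dim=3):
    """Synthetic stand-ins for CNN image features and document embeddings."""
    shift = 1.0 if label == 1 else -1.0
    img = [rng.gauss(shift, 0.5) for _ in range(img_dim)]
    txt = [rng.gauss(shift, 0.5) for _ in range(txt_dim)]
    return img, txt


rng = random.Random(0)
labels = [i % 2 for i in range(40)]  # 1 = substance use-related, 0 = other
X = [fuse(*make_toy_post(lab, rng)) for lab in labels]
w, b = train_logreg(X, labels)
accuracy = sum(predict(w, b, x) == lab for x, lab in zip(X, labels)) / len(labels)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

In a real pipeline, the toy vectors would be replaced by features extracted from a pretrained image network and a trained document embedding model, and accuracy would be measured on held-out posts rather than on the training data.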
