Disturbed YouTube for Kids: Characterizing and Detecting Disturbing Content on YouTube

A considerable number of the most-subscribed YouTube channels feature content popular among very young children. Hundreds of toddler-oriented channels on YouTube offer inoffensive, well-produced, and educational videos. Unfortunately, inappropriate (disturbing) content that targets this demographic is also common. YouTube's algorithmic recommendation system regrettably suggests inappropriate content because some of it mimics or is derived from otherwise appropriate content. Given the risk to early childhood development and the increasing trend in toddlers' consumption of YouTube media, this is a worrying problem. While there are many anecdotal reports about the scale of the problem, there is no systematic quantitative measurement. Hence, in this work, we develop a classifier able to detect toddler-oriented inappropriate content on YouTube with 82.8% accuracy, and we leverage it to perform a first-of-its-kind, large-scale, quantitative characterization that reveals some of the risks of YouTube media consumption by young children. Our analysis indicates that YouTube's currently deployed countermeasures are ineffective at detecting disturbing videos in a timely manner. Finally, using our classifier, we assess how prominent the problem is on YouTube, finding that young children are likely to encounter disturbing videos when they randomly browse the platform starting from benign videos.
