Detecting online child grooming conversation

The massive proliferation of social media has opened new opportunities for perpetrators to commit the crime of online child grooming. Because of the scale and pervasiveness of the problem, it can likely be contained effectively and efficiently only by an automatic grooming-conversation detection system. Previously, Pranoto, Gunawan, and Soewito [1] developed a logistic model for this purpose, and the model achieved 95% detection accuracy. The current study addresses the issue using Support Vector Machine (SVM) and k-nearest neighbors (k-NN) classifiers. In addition, the study proposes a low-computational-cost classification method based on the number of grooming characteristics present in a conversation. All proposed methods are evaluated using 150 conversation texts, of which 105 are grooming and 45 are non-grooming. We identify 17 features of grooming characteristics in grooming conversations. The results suggest that the SVM and k-NN classifiers identify grooming conversations with accuracies of 98.6% and 97.8%, respectively. Meanwhile, the proposed simple method has 96.8% accuracy. The empirical study also suggests that two of the seventeen characteristics are insignificant for the classification.
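As a rough illustration of the two lighter-weight ideas mentioned above, the following sketch represents each conversation as a binary vector over the 17 characteristics and classifies it either by counting the characteristics present or by a k-NN majority vote. It is a minimal sketch under stated assumptions, not the study's implementation: the threshold value, the Hamming distance, the helper names (`make_vector`, `count_classifier`, `knn_classifier`), and the toy data are all hypothetical and invented for illustration.

```python
from collections import Counter

NUM_FEATURES = 17  # the abstract reports 17 grooming characteristics


def make_vector(present):
    """Build a binary feature vector from the indices of present characteristics."""
    return [1 if i in present else 0 for i in range(NUM_FEATURES)]


def count_classifier(features, threshold):
    """Flag a conversation as grooming when at least `threshold`
    characteristics are present (threshold is an assumed parameter)."""
    return sum(features) >= threshold


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_classifier(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.
    `train` is a list of (features, label) pairs."""
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, invented data: True = grooming, False = non-grooming.
train = [
    (make_vector({0, 1, 2, 3, 4, 5, 6, 7}), True),
    (make_vector({0, 2, 3, 5, 6, 8, 9}), True),
    (make_vector({1, 2, 4, 5, 7, 9, 10, 11}), True),
    (make_vector({0}), False),
    (make_vector(set()), False),
    (make_vector({3, 12}), False),
]

query = make_vector({0, 1, 2, 5, 6, 9})
print(count_classifier(query, threshold=4))
print(knn_classifier(train, query, k=3))
```

The counting rule needs only one pass over the feature vector, which is why it is cheap compared with a trained classifier; in practice the threshold would have to be chosen from labeled data.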

[1]  Oguz Findik, et al. A comparison of feature selection models utilizing binary particle swarm optimization and genetic algorithm in determining coronary artery disease using support vector machine, 2010, Expert Syst. Appl.

[2]  Peter M. Briggs, et al. An Exploratory Study of Internet-Initiated Sexual Offenses and the Chat Room Sex Offender: Has the Internet Enabled a New Typology of Sex Offender?, 2011, Sexual Abuse: A Journal of Research and Treatment.

[3]  Fergyanto E. Gunawan, et al. Logistic Models for Classifying Online Grooming Conversation, 2015.

[4]  Benfano Soewito, et al. Digital Technology: the Effect of Connected World to Computer Ethic and Family, 2015.

[5]  Trevor Hastie, et al. The Elements of Statistical Learning, 2001.

[6]  Sumarno Sumarno, et al. Feature Extraction of Electroencephalography Signals Using Fast Fourier Transform, 2016.

[7]  Ioannis Mavridis, et al. Utilizing document classification for grooming attack recognition, 2011, 2011 IEEE Symposium on Computers and Communications (ISCC).

[8]  April Kontostathis, et al. Learning to Identify Internet Sexual Predation, 2011, Int. J. Electron. Commer.

[9]  Shi-Jinn Horng, et al. A novel intrusion detection system based on hierarchical clustering and support vector machines, 2011, Expert Syst. Appl.

[10]  L. Olson, et al. Entrapping the Innocent: Toward a Theory of Child Sexual Predators' Luring Communication, 2007.

[11]  Gurpreet Singh Lehal, et al. A Survey of Text Summarization Extractive Techniques, 2010.

[12]  Nello Cristianini, et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, 2000.

[13]  A. Beech, et al. A review of online grooming: Characteristics and concerns, 2013.

[14]  U. Rajendra Acharya, et al. Thermography Based Breast Cancer Detection Using Texture Features and Support Vector Machine, 2012, Journal of Medical Systems.

[15]  Khairullah Khan, et al. A Review of Machine Learning Algorithms for Text-Documents Classification, 2010.

[16]  Suresh Manandhar, et al. Detecting Predatory Behaviour from Online Textual Chats, 2012, MCSS.

[17]  Melissa A. Wollis. Online Predation: A Linguistic Analysis of Online Predator Grooming, 2011.