Log-Based Chat Room Monitoring Using Text Categorization: A Comparative Study

The Internet is now used in many areas of everyday life, such as online searching, distance learning, and chatting. It has also been misused to communicate about crime-related matters, and monitoring such communication would aid crime detection and even crime prevention. Because current monitoring techniques are largely manual, and therefore tedious, costly, and time-consuming, this paper presents a text categorization approach to the automatic monitoring of chat conversations. We present the results of a cross-method comparison of the Naive Bayes, k-nearest neighbor, and Support Vector Machine classifiers. Our objective is to determine the most suitable method for chat conversation data. Our results show that the chat room monitoring task can be automated efficiently using an appropriate text categorization method.
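
To make the idea of a cross-method comparison concrete, the following is a minimal Python sketch, not the setup used in this study. It trains two of the three classifier families named above, a multinomial Naive Bayes and a k-nearest neighbor classifier, on a handful of invented chat-like messages and reports accuracy on a held-out set. The messages, the topic labels, the whitespace tokenizer, add-one smoothing, cosine similarity, and k = 3 are all assumptions made for illustration. The Support Vector Machine is omitted because it requires a numerical optimizer that does not fit a short, dependency-free example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class KNearestNeighbor:
    """k-NN over bag-of-words vectors, ranked by cosine similarity."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        self.vectors = [Counter(tokenize(d)) for d in docs]
        self.labels = list(labels)
        return self

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def predict(self, doc):
        query = Counter(tokenize(doc))
        ranked = sorted(
            zip(self.vectors, self.labels),
            key=lambda pair: self._cosine(query, pair[0]),
            reverse=True,
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


def accuracy(model, examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(model.predict(text) == label for text, label in examples)
    return correct / len(examples)


# Invented toy data: not taken from the study's chat logs.
train_docs = [
    "did you watch the football match last night",
    "our team scored a late goal to win the match",
    "the coach picked a new goalkeeper for the team",
    "i baked bread and made soup for dinner",
    "this pasta recipe needs fresh tomato sauce",
    "we had soup and fresh bread for lunch",
]
train_labels = ["sports", "sports", "sports", "food", "food", "food"]
held_out = [
    ("the team won the football match", "sports"),
    ("fresh bread with tomato soup", "food"),
]

models = {
    "Naive Bayes": NaiveBayes().fit(train_docs, train_labels),
    "k-nearest neighbor": KNearestNeighbor(k=3).fit(train_docs, train_labels),
}
for name, model in models.items():
    print(f"{name}: accuracy = {accuracy(model, held_out):.2f}")
```

Because both classifiers expose the same `fit` and `predict` interface, another method can be added to the `models` dictionary and scored on the same held-out messages without changing the evaluation code. A realistic comparison would use a much larger labeled corpus and a proper train/test protocol.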