A Two-step Approach for Effective Detection of Misbehaving Users in Chats
Notebook for PAN at CLEF 2012

This paper describes the system jointly developed by the Language Technologies Lab from INAOE and the Language and Reasoning Group from UAM for the Sexual Predators Identification task at PAN 2012. The system addresses the problem of identifying sexual predators in a set of suspicious chat conversations. It rests on two hypotheses: (i) the terms used in the process of child exploitation are categorically and psychologically different from the terms used in general chatting; and (ii) predators usually follow the same pattern of conduct when approaching a child. Based on these hypotheses, our participation at PAN 2012 aimed to demonstrate that it is possible to train a classifier to learn the particular terms that turn a chat conversation into a case of online child exploitation, and that it is also possible to learn the behavioral patterns of predators during a chat conversation, allowing us to accurately distinguish victims from predators.