Detecting Cyberbullying Entries on Informal School Websites Based on Category Relevance Maximization

We propose a novel method for detecting cyberbullying entries on the Internet. “Cyberbullying” is defined as humiliating or slandering other people through Internet services such as BBS, Twitter, or e-mail. In Japan, members of Parent-Teacher Associations (PTA) perform manual Web site monitoring, called “net-patrol”, to stop such activities. Unfortunately, reading through the whole Web manually is an uphill task, so we propose to detect cyberbullying entries automatically. The proposed method first uses seed words from three categories to calculate a semantic orientation score with PMI-IR, and then maximizes the relevance of the categories. In the experiment we examined two cases: test data in which 50% of the entries are cyberbullying (laboratory condition) and test data in which 12% are (real-world condition). In both cases the proposed method outperformed the baseline settings.
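The scoring step can be illustrated with a minimal Python sketch. It is not the implementation evaluated here: the hit counts, the category and seed names, the smoothing constant `eps`, and the choice to sum PMI-IR over a category's seed words before taking the maximum are all assumptions made for illustration. PMI-IR itself is estimated, as usual, from search hit counts.

```python
import math

def pmi_ir(hits_joint, hits_phrase, hits_word, total, eps=0.01):
    """PMI estimated from hit counts; eps (assumed) smooths zero co-occurrences."""
    return math.log2(((hits_joint + eps) * total) / (hits_phrase * hits_word))

def category_relevance(phrase, seeds_by_category, hits, joint_hits, total):
    """Return (best_category, score), maximizing summed PMI-IR over categories."""
    scores = {}
    for category, seeds in seeds_by_category.items():
        scores[category] = sum(
            pmi_ir(joint_hits.get((phrase, w), 0), hits[phrase], hits[w], total)
            for w in seeds
        )
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical toy counts; real counts would come from a search engine or corpus.
seeds_by_category = {"A": ["a1"], "B": ["b1"], "C": ["c1"]}
hits = {"phrase": 100, "a1": 200, "b1": 200, "c1": 200}
joint_hits = {("phrase", "a1"): 50, ("phrase", "b1"): 5}
total = 10_000  # assumed total number of indexed documents

best, score = category_relevance("phrase", seeds_by_category, hits, joint_hits, total)
```

With these toy counts the phrase co-occurs most often with the seed word of category `A`, so `A` is the category returned. An entry could then be scored from its phrases and flagged when the score passes a threshold, which is again an assumption rather than a detail given above.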