Classification of Tweets Using Text Classifier to Detect Cyber Bullying

Cyber bullying and internet predation threaten minors, particularly teens who do not have adequate supervision when they use the computer. The enormous amount of information stored in unstructured text cannot be processed directly by computers, which typically handle text as simple sequences of character strings. Specific (pre)processing methods and algorithms are therefore required to extract useful patterns. Text mining is the discovery of valuable, yet hidden, information in text documents, and text classification is one of its important research issues. We propose an effective approach to detecting cyber bullying messages on Twitter through a weighting scheme for feature selection.
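The abstract does not specify the weighting scheme, so the following is only a minimal sketch of the general idea, assuming smoothed TF-IDF as a stand-in weight and a simple "top k terms by summed weight" rule for feature selection. The function names (`tokenize`, `tfidf_weights`, `select_features`) and the tokenization rules are illustrative assumptions, not the authors' method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet, drop URLs and @mentions, and split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def tfidf_weights(docs):
    """Return one {term: weight} dict per document.

    Assumption: smoothed TF-IDF stands in for the paper's unspecified scheme.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many tweets each term occurs at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty tweets
        weights.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return weights


def select_features(weights, k):
    """Keep the k terms with the highest summed weight across all documents."""
    totals = Counter()
    for w in weights:
        totals.update(w)  # Counter.update adds the weights term by term
    return [term for term, _ in totals.most_common(k)]
```

Calling `select_features(tfidf_weights(tweets), k)` on a list of tweet strings yields a reduced vocabulary that could then feed any standard text classifier; terms that occur in every tweet receive the lowest weight under this scheme.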
