Application of linguistic cues in the analysis of language of hate groups

Hate speech and fringe ideologies are social phenomena that thrive online. Members of the political and religious fringe can propagate their ideas via the Internet with less effort than in traditional media. In this article, we use linguistic cues, such as the occurrence of certain parts of speech, to distinguish the language of fringe groups from strictly informative sources. The aim of this research is to provide a preliminary model for identifying deceptive materials online, such as aggressive marketing and hate speech. In this paper, we focus on the political aspect. Our results show that sentence length and the occurrence of adjectives and adverbs help to identify differences between the language of fringe political groups and that of mainstream media.
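
As an illustration of the kind of cues named above, the following is a minimal sketch of how sentence length and the share of adjectives and adverbs might be computed. It is not the authors' implementation. It assumes the text has already been split into sentences and part-of-speech tagged with Penn Treebank tags (adjectives starting with `JJ`, adverbs starting with `RB`); the function name `cue_features`, the feature names, and the sample sentences are hypothetical.

```python
from typing import Dict, List, Tuple

# A sentence is a list of (token, tag) pairs.
# Assumption: tags follow the Penn Treebank convention.
TaggedSentence = List[Tuple[str, str]]


def cue_features(sentences: List[TaggedSentence]) -> Dict[str, float]:
    """Compute mean sentence length and adjective/adverb ratios (hypothetical helper)."""
    n_tokens = sum(len(sentence) for sentence in sentences)
    if not sentences or n_tokens == 0:
        # Avoid division by zero on empty input.
        return {
            "mean_sentence_length": 0.0,
            "adjective_ratio": 0.0,
            "adverb_ratio": 0.0,
        }

    tags = [tag for sentence in sentences for _, tag in sentence]
    adjectives = sum(tag.startswith("JJ") for tag in tags)  # JJ, JJR, JJS
    adverbs = sum(tag.startswith("RB") for tag in tags)     # RB, RBR, RBS

    return {
        "mean_sentence_length": n_tokens / len(sentences),
        "adjective_ratio": adjectives / n_tokens,
        "adverb_ratio": adverbs / n_tokens,
    }


# Hypothetical pre-tagged sample: one evaluative sentence, one informative one.
sample = [
    [("They", "PRP"), ("are", "VBP"), ("utterly", "RB"), ("corrupt", "JJ")],
    [("The", "DT"), ("council", "NN"), ("met", "VBD"),
     ("on", "IN"), ("Tuesday", "NNP"), ("morning", "NN")],
]

features = cue_features(sample)
print(features)
```

Feature vectors of this kind, computed per document, could then be compared across fringe and mainstream sources or passed to a classifier; which comparison or model is appropriate is left open here.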
