Linguistic cues predict fraudulent events in a corporate social network

Max Louwerse (mlouwerse@memphis.edu)
Department of Psychology / Institute for Intelligent Systems
University of Memphis
Memphis, TN 38152 USA

King-Ip Lin (davidlin@memphis.edu)
Department of Computer Science / Institute for Intelligent Systems
University of Memphis
Memphis, TN 38152 USA

Amanda Drescher (adreschr@memphis.edu)
Department of Psychology / Institute for Intelligent Systems
University of Memphis
Memphis, TN 38152 USA

Gun Semin (G.R.Semin@uu.nl)
Faculty of Social Sciences, Utrecht University
3584 CS Utrecht, The Netherlands

Abstract

There is an increase in deception studies investigating which non-linguistic and linguistic cues best predict deception. Even though these studies have shown that participants consistently use specific cues to deception when they are asked to deceive somebody in a particular situation, it is less clear how these findings translate to non-experimental settings, for instance, whether these cues also apply in cases of global deception in social networks. This paper investigated whether fraudulent events can be related to linguistic cues of deception within records of a large corporate social network. Specifically, we investigated the Enron email dataset using a model of interpersonal language use. Results suggest that during times of fraud, emails were composed with higher degrees of abstractness.

Keywords: deception, social cognition, computer-mediated communication, corpus linguistics.

Introduction

Humans lie because it helps them manipulate the impressions people have of them. Apologizing for being late (even though you could have been on time), telling a police officer you really thought the speed limit was 40 (even though you knew it was 35), and thanking the waitress for guiding you to your table (even though you had waited for 20 minutes and she just did her job), all help to establish an interpersonal glue between you and your social environment. We tell many lies, on average one or two a day (DePaulo & Kashy, 1998).

Of course, there are gradations in the acceptability of twisting the truth. Some lies are blatant transgressions with potentially far-reaching consequences, such as cases related to fraud; others are harmless and have very little or no consequence. Most research in the cognitive sciences on deception centers on lies with few consequences. In fact, very little research has been done on cases of deception with far-reaching consequences for the liar or the recipient of the lie.

Liars leave non-linguistic and linguistic footprints in their attempts to hide the truth, both in cases of blatant and not so blatant half-truths (DePaulo et al., 2003). Several experiments have investigated these footprints using a paradigm whereby a participant in a deception condition is asked to tell a lie and/or to tell the truth. For instance, Newman, Pennebaker, Berry and Richards (2003) conducted a study in which they asked pro- (and anti-) abortion participants to produce both pro- and anti-abortion stories. They found that deceptive communication had fewer first-person singular pronouns, fewer third-person pronouns, more negative emotion words (e.g., hate, anger, enemy), fewer exclusive words (e.g., but, except), and more motion verbs (e.g., walk, move, go). Apparently liars wanted to dissociate themselves from their words (fewer first-person pronouns), and made an attempt to create a story that seemed less complex (fewer exclusive words) and more concrete (more action words).

Hancock, Curry, Goorha, and Woodworth (2008) came to a very similar conclusion. They investigated deception in asynchronous computer-mediated communication. Participants were asked to write stories on five different topics. Half of the participants were asked to not tell the truth. Hancock et al. (2008) found that lies consisted of fewer words, more questions, fewer first-person pronouns and more words pertaining to senses (e.g., see, listen) than truthful discussions. Both Newman et al. (2003) and Hancock et al. (2008) found pronoun use, lowered word quantity, emotion words and lower cognitive complexity to be linguistic cues associated with deception. Both the experimental design and the findings of these two studies are prototypical for much of the empirical work on deception.
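The word-category cues reported by Newman et al. (2003) can be illustrated with a minimal sketch that computes, for a piece of text, the share of tokens falling in each category. The word lists, the tokenizer, and the names `CUE_WORDS` and `cue_rates` are assumptions made for this illustration; the lists contain only the examples quoted above plus a few common pronouns, and they are not the dictionaries or the procedure used in the cited studies.

```python
import re
from collections import Counter

# Illustrative word lists only (hypothetical): the examples quoted in the
# text plus a few common pronouns. These are NOT the dictionaries used in
# the cited studies.
CUE_WORDS = {
    "first_person_singular": {"i", "me", "my", "mine", "myself"},
    "third_person": {"he", "she", "they", "him", "her", "them", "his", "their"},
    "negative_emotion": {"hate", "anger", "enemy"},
    "exclusive": {"but", "except"},
    "motion": {"walk", "move", "go"},
}


def cue_rates(text):
    """Return the share of tokens in `text` that fall in each cue category."""
    # Simple lowercase tokenizer: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, words in CUE_WORDS.items():
            if token in words:
                counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {category: counts[category] / total for category in CUE_WORDS}


if __name__ == "__main__":
    sample = "I did not go there, but they walk past my office."
    for category, rate in cue_rates(sample).items():
        print(f"{category}: {rate:.2f}")
```

Rates are normalized by token count so that texts of different lengths can be compared. This matters because word quantity is itself one of the reported cues.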

[1] Chen Qu, et al. On Corporate Crime, 2003.

[2] Jeffrey T. Hancock, et al. On Lying and Being Lied To: A Linguistic Analysis of Deception in Computer-Mediated Communication, 2007.

[3] Terrill L. Frantz, et al. Communication Networks from the Enron Email Corpus “It's Always About the People. Enron is no Different”, 2005, Comput. Math. Organ. Theory.

[4] Gün R. Semin, et al. Agenda 2000: Communication: Language as an implementational device for cognition, 2000.

[5] Gün R. Semin, et al. The linguistic category model, its bases, applications and range, 1991.

[6] David B. Skillicorn, et al. Detecting unusual email communication, 2005, CASCON.

[7] James H. Martin, et al. Speech and language processing: an introduction to natural language processing, 2000.

[8] B. DePaulo, et al. Everyday lies in close and casual relationships, 1998, Journal of personality and social psychology.

[9] Jacob Cohen, et al. Applied multiple regression/correlation analysis for the behavioral sciences, 1979.

[10] James H. Martin, et al. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2000.

[11] G. Miller, et al. Language and Perception, 1976.

[12] K. Fiedler, et al. The cognitive functions of linguistic categories in describing persons: Social cognition and language, 1988.

[13] M. Louwerse, et al. The linguistic and embodied nature of conceptual processing, 2010, Cognition.

[14] Yiming Yang, et al. The Enron Corpus: A New Dataset for Email Classification Research, 2004.

[15] R. H. Baayen, et al. The CELEX Lexical Database (CD-ROM), 1996.

[16] James J. Lindsay, et al. Cues to deception, 2003, Psychological bulletin.

[17] J. Pennebaker, et al. Lying Words: Predicting Deception from Linguistic Styles, 2003, Personality & social psychology bulletin.

[18] G. Semin, et al. The magic spell of language: linguistic categories and their perceptual consequences, 2007, Journal of personality and social psychology.