Mark my words! Linguistic style accommodation in social media

The psycholinguistic theory of communication accommodation accounts for the general observation that participants in conversations tend to converge to one another's communicative behavior: they coordinate along a variety of dimensions, including choice of words, syntax, utterance length, pitch, and gestures. In its almost forty years of existence, this theory has been empirically supported exclusively through small-scale or controlled laboratory studies. Here we address this phenomenon in the context of Twitter conversations. This setting is unlike any other in which accommodation has been observed and therefore poses a challenge to the theory. Its novelty comes not only from its size, but also from the non-real-time nature of conversations, from the 140-character length restriction, from the wide variety of social relation types, and from a design that was initially not geared towards conversation at all. It is not clear a priori whether accommodation is robust enough to occur under the constraints of this new environment. To investigate this, we develop a probabilistic framework that can model accommodation and measure its effects. We apply it to a large Twitter conversational dataset specifically developed for this task. This is the first time the hypothesis of linguistic style accommodation has been examined (and verified) in a large-scale, real-world setting. Furthermore, when investigating concepts such as stylistic influence and symmetry of accommodation, we uncover a complexity in the phenomenon that had not been observed before. We also explore the potential relation between stylistic influence and network features commonly associated with social status.
