Social and Emotional Correlates of Capitalization on Twitter

Social media text is replete with unusual capitalization patterns. We posit that capitalizing a token like THIS performs two expressive functions: it marks a person socially, and it marks certain parts of an utterance as more salient than others. Focusing on gender and sentiment, we use a corpus of tweets to show that capitalization appears more often in negative than in positive contexts, and that it is used more by females than by males. Yet we find that both genders use capitalization in a similar way when expressing sentiment.
