Using Twitter Data for the Study of Language Change in Low-Resource Languages. A Panel Study of Relative Pronouns in Frisian

This paper investigates the usability of Twitter as a resource for the study of language change in progress in low-resource languages. It is a panel study of a vigorous change in progress, the loss of final t in four relative pronouns (dy't, dêr't, wêr't, wa't) in Frisian, a language spoken by approximately 450,000 people in the north-west of the Netherlands. The paper deals with the issues encountered in retrieving and analyzing tweets in low-resource languages, in analyzing low-frequency variables, and in gathering background information on Twitterers. We were able to identify and track 159 individual Twitterers, whose Frisian (and Dutch) tweets posted in the period 2010–2019 were collected. Nevertheless, a solid analysis of the sociolinguistic factors in this language change in progress was hampered by three problems: the unequal age distribution among the Twitterers, the fact that the youngest birth cohorts gave up Twitter almost completely after 2014, and the low frequency of the variables and their unequal spread over Twitterers.
