Decolonising Speech and Language Technology

After generations of exploitation, Indigenous people often respond negatively to the idea that their languages are data ready for the taking. By treating Indigenous knowledge as a commodity, speech and language technologists risk disenfranchising local knowledge authorities, reenacting the causes of language endangerment. Scholars in related fields have responded to calls for decolonisation, and we in the speech and language technology community need to follow suit and explore what this means for our practices that involve Indigenous languages and the communities who own them. This paper reviews colonising discourses in speech and language technology, suggests new ways of working with Indigenous communities, and seeks to open a discussion of a postcolonial approach to computational methods for supporting language vitality.
