Grammatical Constraints on Intra-sentential Code-Switching: From Theories to Working Models

We make one of the first attempts to build working models of intra-sentential code-switching (CS) based on the Equivalence-Constraint (Poplack 1980) and Matrix-Language (Myers-Scotton 1993) theories. We conduct a detailed theoretical analysis and a small-scale empirical study of the two models for Hindi-English CS. Our analyses show that the models are neither sound nor complete. Drawing on the errors the models make, we propose a new model that combines features of both theories.
