Generating Code-switched Text for Lexical Learning

A vast majority of L1 vocabulary acquisition occurs through incidental learning during reading (Nation, 2001; Schmitt et al., 2001). We propose a probabilistic approach to generating code-mixed text as an L2 technique for increasing retention in adult lexical learning through reading. Our model takes as input a bilingual dictionary and an English text, and generates a code-switched text that optimizes a defined “learnability” metric by constructing a factor graph over lexical mentions. Using an artificial language vocabulary, we evaluate a set of algorithms for automatically generating code-switched text by presenting the text to Mechanical Turk subjects and measuring recall in a sentence completion task.
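To make the input/output contract concrete, the following is a minimal sketch and not the model described above. It replaces the factor graph and the learnability metric with a plainly simpler stand-in: a frequency heuristic that switches the dictionary words mentioned most often, on the assumption that repeated exposure aids recall. The function name `code_switch`, the parameters `budget` and `min_mentions`, and the artificial words in the example lexicon are all hypothetical.

```python
import re
from collections import Counter


def code_switch(text, dictionary, budget=2, min_mentions=2):
    """Return (switched_text, chosen_words) using a toy frequency heuristic.

    Assumption: this heuristic is only a stand-in for the paper's
    factor-graph optimization of a learnability metric.
    """
    # Split into word and non-word tokens so the text can be rebuilt exactly.
    tokens = re.findall(r"\w+|\W+", text)
    # Count mentions of each dictionary word (case-insensitive).
    counts = Counter(t.lower() for t in tokens if t.lower() in dictionary)
    # Keep words mentioned often enough, most frequent first, up to the budget.
    candidates = [w for w, c in counts.most_common() if c >= min_mentions]
    chosen = set(candidates[:budget])
    # Replace every mention of a chosen word with its dictionary entry.
    switched = [dictionary[t.lower()] if t.lower() in chosen else t
                for t in tokens]
    return "".join(switched), chosen


# Hypothetical artificial-language lexicon and input text.
lexicon = {"dog": "blick", "cat": "wug", "ran": "dax"}
text = "The dog saw the cat. The cat ran, and the dog ran after the cat."
switched, chosen = code_switch(text, lexicon)
print(switched)
```

With `budget=2`, only two of the three dictionary words are switched, so most of the surrounding context stays in English. A fuller treatment would score individual mentions jointly rather than whole word types.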

[1] M. Kutas, et al. Brain potentials during reading reflect word expectancy and semantic association, 1984, Nature.

[2] Almeida Jacqueline Toribio, et al. Code switching and X-bar theory: the functional head constraint, 1994.

[3] James P. Lantolf, et al. Vygotskian Approaches to Second Language Research, 1994.

[4] Pieter Muysken, et al. One Speaker, Two Languages: Cross-Disciplinary Perspectives on Code-Switching, 1995.

[5] R. Schmidt. Attention and awareness in foreign language learning, 1995.

[6] Rakesh Mohan Bhatt. Code-switching, constraints, and optimal grammars, 1997.

[7] F. Genesee. Bilingual first language acquisition: exploring the limits of the language faculty, 2001, Annual Review of Applied Linguistics.

[8] N. Schmitt, et al. Developing and exploring the behaviour of two new versions of the Vocabulary Levels Test, 2001.

[9] Advaith Siddharthan, et al. Syntactic Simplification and Text Cohesion, 2006.

[10] Ani Nenkova, et al. Syntactic Simplification for Improving Content Selection in Multi-Document Summarization, 2004, COLING.

[11] Ben Hutchinson, et al. Modelling the Substitutability of Discourse Connectives, 2005, ACL.

[12] John M. Lipski. Code-switching or Borrowing? No sé so no puedo decir, you know, 2005.

[13] E. Macaro. Codeswitching in the L2 Classroom: A Communication and Learning Strategy, 2005.

[14] D. R. Hill. Graded readers in English, 2006.

[15] Noémie Elhadad, et al. Mining a Lexicon of Technical Terms and Lay Equivalents, 2007, BioNLP@ACL.

[16] Tom Cobb, et al. Computing the vocabulary demands of L2 reading, 2007.

[17] Almeida Jacqueline Toribio, et al. Code Switching and X-Bar Theory: The Functional Head Constraint, 2008.

[18] Yang Liu, et al. Learning to Predict Code-Switching Points, 2008, EMNLP.

[19] Richard Sproat, et al. Knowing the Unseen: Estimating Vocabulary Size over Unseen Samples, 2009, ACL.

[20] Cristian Danescu-Niculescu-Mizil, et al. For the sake of simplicity: Unsupervised extraction of lexical simplifications from Wikipedia, 2010, NAACL.

[21] D. Beglar. A Rasch-based validation of the Vocabulary Size Test, 2010.

[22] Visvaganthie Moodley. Code-switching and communicative competence in the language classroom, 2011.

[23] Noémie Elhadad, et al. Putting it Simply: a Context-Aware Approach to Lexical Simplification, 2011, ACL.

[24] J. Elman, et al. Once is Enough: N400 Indexes Semantic Integration of Novel Word Meanings from a Single Exposure in Context, 2012, Language Learning and Development: The Official Journal of the Society for Language Development.

[25] Haizhou Li, et al. Recurrent neural network language modeling for code switching conversational speech, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[26] Gabriella Vigliocco, et al. Word surprisal predicts N400 amplitude during reading, 2013, ACL.

[27] David Kauchak, et al. Improving Text Simplification Language Modeling Using Unsimplified Text Data, 2013, ACL.

[28] Anna Papst, et al. Learning Vocabulary In Another Language, 2016.