Multilingual learning with parameter co-occurrence clustering

Multilingualism may be viewed in the broadest sense as the knowledge and use of many distinct, though possibly overlapping, linguistic systems by individual speakers. On this view it is a pervasive phenomenon: virtually all language users, even those commonly called monolingual, can distinguish and employ multiple linguistic systems at various points on the language-dialect-register continuum. In addition to canonical cases such as native bilingualism and code-switching, this definition includes the “multi-dialectism” described in Clopper’s (2004) extensive study of American English speakers’ abilities to distinguish and categorize multiple dialects, to produce multiple dialects natively, and even to imitate them non-natively. Also included is register variation, wherein speakers use systematically different phonological, morphological, and syntactic forms and processes (in effect, distinct systems of communication) depending on social and conversational context; Biber (1995) gives a detailed cross-linguistic survey. On this view, then, a speaker who “knows a language” like English, Cantonese, Palauan, or Guarani in fact knows a collection of mostly overlapping, yet distinct, systems of communication, including registers, dialects, and others’ idiolects, some perhaps acquired to different degrees or only passively (i.e., to be recognized and distinguished, but not produced), along with some specification of their contexts of use. A fundamental question, then, is how language learners in a pervasively multilingual environment, where they receive mixed samples from many distinct linguistic systems, manage to distinguish the component systems and acquire them separately. For example, if a learner is exposed to languages L1 and L2, where L1 epenthesizes onsets and L2

[1]  Tetsuo Kumatoridani. Alternation and co-occurrence in Japanese thanks, 1999.

[2]  Arto Anttila, et al. Phonological Constraints on Constituent Ordering, 2008.

[3]  M. E. J. Newman, et al. Community structure in social and biological networks, 2001, Proceedings of the National Academy of Sciences of the United States of America.

[4]  Regina Barzilay, et al. Unsupervised Multilingual Learning for Morphological Segmentation, 2008, ACL.

[5]  Johannes Müller-Lancé. A Strategy Model of Multilingual Learning, 2003.

[6]  P. Boersma, et al. Empirical Tests of the Gradual Learning Algorithm, 2001, Linguistic Inquiry.

[7]  Jason Alan Riggle, et al. Generation, recognition, and learning in finite state optimality theory, 2004.

[8]  Dana Angluin, et al. Inductive Inference of Formal Languages from Positive Data, 1980, Inf. Control.

[9]  Leonard M. Freeman, et al. A set of measures of centrality based upon betweenness, 1977.

[10]  Arto Anttila, et al. Variation in Finnish phonology and morphology, 1997.

[11]  David J. Young, et al. New developments in systemic linguistics, 1987.

[12]  Mark Newman, et al. Detecting community structure in networks, 2004.

[13]  Mark E. J. Newman, et al. The Structure and Function of Complex Networks, 2003, SIAM Rev.

[14]  Dmitriy Genzel. Inducing a Multilingual Dictionary from a Parallel Multitext in Related Languages, 2005, HLT/EMNLP.

[15]  P. Matthews. Generating a Random Linear Extension of a Partial Order, 1991.

[16]  Anne Violin-Wigent. Optimality Theory: Constraint Interaction in Generative Grammar, 2006.

[17]  E. Mark Gold, et al. Language Identification in the Limit, 1967, Inf. Control.

[18]  Whitney M. Weikum, et al. Visual Language Discrimination in Infancy, 2007, Science.

[19]  U. Brandes. A faster algorithm for betweenness centrality, 2001.

[20]  Douglas Biber, et al. Dimensions of Register Variation, 1995.

[21]  D. Hymes. Foundations in Sociolinguistics: An Ethnographic Approach, 1974.