Subject Realization in Japanese Conversation by Native and Non-native Speakers: Exemplifying a New Paradigm for Learner Corpus Research

In Learner Corpus Research, Gries and Deshors (Corpora 9(1):109–136, 2014) developed a two-step regression procedure (MuPDAR) that determines how and why the choices of non-native speakers differ from those of native speakers, and does so more comprehensively than traditional learner corpus research allows. In this chapter, we extend and test their proposal to determine whether it can also be applied to pragmatic and grammatical phenomena (here, subject realization/omission in Japanese), and whether it can help study categorical differences between learner and native-speaker choices. In doing so, we also show that the more advanced method of mixed-effects modeling can be fruitfully integrated into MuPDAR. Our results show that Japanese native speakers’ choices of subject realization are affected by discourse-functional factors such as the givenness and contrast of referents. Learners handle extreme values of givenness and marked cases of contrast well, but they still struggle (more) with intermediate degrees of givenness and with unmarked/non-contrastive referents. We conclude by discussing the role of MuPDAR in Learner Corpus Research in general, its advantages over traditional corpus analysis in that field, and its advantages over error analysis in particular.
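
The two-step logic of MuPDAR can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are invented toy values, the predictor coding (givenness on a 0 to 1 scale, contrast as 0/1) is an assumption, and the models are plain fixed-effects logistic regressions fitted by gradient descent rather than the mixed-effects models used in the chapter. All function and variable names are hypothetical. Step 1 fits a model on native-speaker choices and applies it to the learner data; step 2 models whether each learner choice matches what the native-speaker model predicts.

```python
import math

def features(givenness, contrast):
    # Center givenness (assumed to lie in [0, 1]) so gradient descent converges faster.
    return (givenness - 0.5, float(contrast))

def predict_prob(weights, x):
    # weights[0] is the intercept; the remaining weights pair up with the entries of x.
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=1.0, epochs=3000):
    # Plain fixed-effects logistic regression via batch gradient descent
    # (a simplification: no random effects for speakers or items).
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = predict_prob(weights, x) - y
            grad[0] += error
            for j, value in enumerate(x):
                grad[j + 1] += error * value
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Hypothetical toy data: ((givenness, contrast), choice), 1 = subject realized, 0 = omitted.
native = [((0.0, 0), 1), ((0.1, 0), 1), ((0.2, 0), 1), ((0.5, 0), 1),
          ((0.5, 0), 0), ((0.8, 0), 0), ((0.9, 0), 0), ((1.0, 0), 0),
          ((0.1, 1), 1), ((0.5, 1), 1), ((0.9, 1), 1), ((1.0, 1), 1)]
learner = [((0.0, 0), 1), ((1.0, 0), 0), ((1.0, 1), 1),
           ((0.3, 0), 0), ((0.7, 0), 1), ((0.4, 0), 0)]

# Step 1: fit on native-speaker data, then predict what a native speaker
# would have chosen in each learner context.
ns_x = [features(*raw) for raw, _ in native]
ns_y = [choice for _, choice in native]
ns_weights = fit_logistic(ns_x, ns_y)

learner_x = [features(*raw) for raw, _ in learner]
learner_y = [choice for _, choice in learner]
predicted = [1 if predict_prob(ns_weights, x) >= 0.5 else 0 for x in learner_x]

# Step 2: code each learner choice as nativelike (1) or deviating (0),
# then regress that outcome on the same predictors.
nativelike = [int(pred == choice) for pred, choice in zip(predicted, learner_y)]
dev_weights = fit_logistic(learner_x, nativelike)

for (raw, choice), pred, ok in zip(learner, predicted, nativelike):
    print(raw, "learner:", choice, "native model:", pred, "nativelike:", bool(ok))
```

The coefficients in `dev_weights` are what a step-2 analysis would inspect to see which contexts go with non-nativelike choices. With real data, one would fit both steps in a dedicated statistics package and include random effects.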

[1]  M. Shibatani Passives and Related Constructions: A Prototype Analysis , 1985 .

[2]  Karin Aijmer,et al.  Modality in advanced Swedish learners’ written interlanguage , 2002 .

[3]  Hilde Hasselgård,et al.  Learner corpora and contrastive interlanguage analysis , 2011 .

[4]  Scott Jarvis,et al.  Approaching language transfer through text classification : explorations in the detection-based approach , 2012 .

[6]  Sandra A. Thompson,et al.  Deconstructing “Zero Anaphora” in Japanese , 1997 .

[7]  Tomoyo Takagi,et al.  Contextual resources for inferring unexpressed referents in Japanese conversation , 2002 .

[8]  Guy Aston,et al.  Corpora and language learners , 2004 .

[9]  Stefan Th. Gries,et al.  Using regressions to explore deviations between corpus data and a standard/target: two suggestions , 2014 .

[10]  Marianne Hundt,et al.  Exploring second-language varieties of English and learner Englishes : bridging a paradigm gap , 2011 .

[11]  Marie-Paule Péry-Woodley,et al.  Contrasting discourses: contrastive analysis and a discourse approach to writing , 1990, Language Teaching.

[12]  Geoffrey Leech,et al.  English Corpus Linguistics: Looking back, Moving forward , 2012 .

[13]  Marianne Hundt,et al.  Overuse of the progressive in ESL and learner Englishes – fact or fiction? , 2011 .

[14]  Stefanie Wulff,et al.  Psycholinguistic and corpus-linguistic evidence for L2 constructions , 2009 .

[15]  Frank E. Harrell,et al.  Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis , 2001 .

[16]  Magali Paquot,et al.  A Taste for Corpora. In Honour of Sylviane Granger , 2011 .

[17]  Stefanie Wulff,et al.  The genitive alternation in Chinese and German ESL learners: Towards a multifactorial notion of context in learner corpus research , 2013 .

[18]  Bengt Altenberg,et al.  Using bilingual corpus evidence in learner corpus research , 2002 .

[19]  Joseph Collentine,et al.  A Corpus‐Based Analysis of the Discourse Functions of Ser/Estar + Adjective in Three Levels of Spanish as FL Learners , 2010 .

[20]  Sylviane Granger,et al.  Computer learner corpus research: current status and future prospects , 2004 .

[21]  K. Sakuma The structure of the Japanese language , 1951 .

[22]  John V. Hinds Ellipsis in Japanese , 1982 .

[23]  JoAnne Neff,et al.  The use of small corpora for tracing the development of academic literacies , 2011 .

[24]  Sylviane Granger,et al.  From CA to CIA and back: An integrated approach to computerized bilingual and learner corpora , 1996 .

[25]  Sylviane Granger,et al.  Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching , 2002 .

[26]  Michael J. Crawley,et al.  The R book , 2022 .

[27]  A. Zuur,et al.  Mixed Effects Models and Extensions in Ecology with R , 2009 .

[28]  Gaëtanelle Gilquin,et al.  Linking up Contrastive and Learner Corpus Research , 2008 .

[29]  Carson T. Schütze The empirical base of linguistics: Grammaticality judgments and linguistic methodology , 1998 .

[30]  J. Faraway Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models , 2005 .

[31]  D. Bates,et al.  Linear Mixed-Effects Models using 'Eigen' and S4 , 2015 .

[32]  Stefan Th. Gries,et al.  Spread of on-going changes in an immigrant language: Turkish in the Netherlands , 2012 .

[33]  Svetla Rogatcheva Perfect problems: A corpus-based comparison of the perfect in Bulgarian and German EFL writing , 2012 .

[34]  Yukio Tono Multiple comparisons of IL, L1 and TL corpora: The case of L2 acquisition of verb subcategorization patterns by Japanese learners of English , 2004 .

[35]  Christelle Cosme Participle clauses in learner English: the role of transfer , 2008 .

[36]  Sylviane Granger,et al.  A Bird’s-eye view of learner corpus research , 2002 .

[37]  Tomasz P. Krzeszowski,et al.  Contrasting languages : the scope of contrastive linguistics , 1990 .

[38]  Elisabeth Dévière,et al.  Analyzing linguistic data: a practical introduction to statistics using R , 2009 .

[39]  R. Baayen,et al.  Mixed-effects modeling with crossed random effects for subjects and items , 2008 .

[40]  Ulla Connor,et al.  Applied corpus linguistics : a multidimensional perspective , 2004 .

[41]  Stefan Th. Gries,et al.  Spanish "lo(s)-le(s)" Clitic Alternations in Psych Verbs: A Multifactorial Corpus-Based Analysis , 2013 .

[42]  N. A. McCawley,et al.  The structure of the Japanese language , 1973 .

[43]  T. Jaeger,et al.  Categorical Data Analysis: Away from ANOVAs (transformation or not) and towards Logit Mixed Models , 2008, Journal of Memory and Language.