Using the Relative Entropy of Linguistic Complexity to Assess L2 Language Proficiency Development

This study applies relative entropy to a large-scale naturalistic corpus to quantify the differences among L2 (second language) learners at different proficiency levels. We chose lemmas, tokens, POS trigrams, and conjunctions to represent lexicon and grammar, and used relative entropy to detect patterns of language proficiency development across L2 groups. The results show that the divergence in information distribution for lexical and grammatical features increases steadily from lower-level to higher-level L2 learners. This result is consistent with the assumption that, in the course of second language acquisition, L2 learners develop towards a more complex and diverse use of language. For comparison, we also applied traditional frequency-based methods to the same L2 corpus and analyzed the resulting data on L2 differences with time-series statistics. The results from these traditional methods rarely show regularity. Compared with the algorithms used in traditional approaches, relative entropy performs much better in detecting L2 proficiency development. In this sense, we have developed an effective and practical algorithm for stably detecting and predicting developments in L2 learners’ language proficiency.
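
As an illustration of the kind of computation described above, the following minimal sketch estimates the relative entropy (Kullback-Leibler divergence) between the feature distributions of two learner groups. It is not the authors' implementation: the function names, the add-alpha smoothing, the base-2 logarithm, and the toy POS tag sequences are all assumptions made for this example.

```python
import math
from collections import Counter


def pos_trigrams(tags):
    """Return the overlapping trigrams of a POS tag sequence (hypothetical helper)."""
    return [tuple(tags[i:i + 3]) for i in range(len(tags) - 2)]


def relative_entropy(p_counts, q_counts, alpha=1.0):
    """Estimate D(P || Q) in bits from two frequency tables.

    Add-alpha smoothing over the joint vocabulary is an assumption of this
    sketch; it keeps Q non-zero wherever P is non-zero. With alpha=0 the
    caller must ensure Q covers every item that occurs in P.
    """
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + alpha * len(vocab)
    q_total = sum(q_counts.values()) + alpha * len(vocab)
    divergence = 0.0
    for item in vocab:
        p = (p_counts.get(item, 0) + alpha) / p_total
        if p == 0:
            continue  # terms with p = 0 contribute nothing to the sum
        q = (q_counts.get(item, 0) + alpha) / q_total
        divergence += p * math.log2(p / q)
    return divergence


# Toy POS sequences standing in for two proficiency groups (invented data).
lower_group = Counter(pos_trigrams(["PRON", "VERB", "NOUN", "PRON", "VERB", "NOUN"]))
higher_group = Counter(pos_trigrams(["PRON", "VERB", "DET", "ADJ", "NOUN", "SCONJ", "PRON", "VERB"]))

print(relative_entropy(higher_group, lower_group))
```

The same function can be applied to lemma, token, or conjunction counts by swapping in a different `Counter`. Relative entropy is asymmetric, so the choice of which group plays the role of P and which plays Q matters and is left open here.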
