The INCOMSLAV Platform: Experimental Website with Integrated Methods for Measuring Linguistic Distances and Asymmetries in Receptive Multilingualism

We report on a web-based resource for conducting intercomprehension experiments with native speakers of Slavic languages and present our methods for measuring linguistic distances and asymmetries in receptive multilingualism. The website serves as a platform for online testing, so a large number of participants with different linguistic backgrounds can be reached. A statistical language model is used to measure information density and to gauge how language users cope with various degrees of (un)intelligibility. The key idea is that intercomprehension should be better when the model adapted for understanding the unknown language exhibits relatively low average distance and surprisal. All obtained intelligibility scores, together with distance and asymmetry measures for the different language pairs and processing directions, are made available as an integrated online resource in the form of a Slavic intercomprehension matrix (SlavMatrix).
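To make the two kinds of measure concrete, the following is a minimal Python sketch, not the platform's own implementation. It assumes a plain character-level Levenshtein distance normalized by the longer string, and a simplified adaptation surprisal estimated from cognate pairs that are already aligned position by position. The function names and the toy strings are hypothetical and carry no linguistic data.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Levenshtein distance divided by the length of the longer string (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def mean_adaptation_surprisal(pairs) -> float:
    """Average of -log2 P(native char | stimulus char) over all aligned characters.

    Assumption: each (stimulus, native) pair has equal length and is aligned
    position by position; real cognate data would need a proper alignment step.
    """
    joint = Counter()
    stimulus_totals = Counter()
    for stimulus, native in pairs:
        if len(stimulus) != len(native):
            raise ValueError("this sketch expects pre-aligned, equal-length pairs")
        for s, n in zip(stimulus, native):
            joint[(s, n)] += 1
            stimulus_totals[s] += 1
    total = sum(joint.values())
    bits = sum(count * -math.log2(count / stimulus_totals[s])
               for (s, _), count in joint.items())
    return bits / total


# Toy illustration of asymmetry: one stimulus character with two possible
# native counterparts is harder to adapt than the reverse direction.
toy_a_to_b = [("aa", "ab")]
toy_b_to_a = [("ab", "aa")]
surprisal_a_to_b = mean_adaptation_surprisal(toy_a_to_b)
surprisal_b_to_a = mean_adaptation_surprisal(toy_b_to_a)
```

The distance is symmetric by construction, whereas the surprisal depends on the processing direction: in the toy data the character `a` has two equally likely counterparts in one direction and an unambiguous mapping in the other. This is the sense in which a surprisal-based measure can register asymmetries that a distance measure cannot.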
