Quantifying and Measuring Morphological Complexity

It is a standard assumption in linguistics that all human languages are equally (and enormously) complex: taken as a whole, no language can be called “simpler” than another. Languages can certainly differ in how their complexity is distributed, so that one might employ a richer inflectional system or allow a more varied inventory of syllable shapes than another, but it is generally supposed that these differences “even out” once entire linguistic systems are considered. Where morphology is atypically simple, for instance, one is expected to find compensatory complexity in the available syntactic distinctions, in subtle lexical differences, or elsewhere. A number of researchers have recently begun to treat this equal complexity hypothesis as an empirical claim to be tested under particular definitions of complexity. Perhaps the most famous recent example is McWhorter’s (2001) controversial claim that “creole grammars are the world’s simplest grammars,” but see also Juola (1998), Shosted (2006), Nichols (2007), and Pellegrino et al. (2007). The hypothesis deserves formal articulation and scrutiny because, if confirmed, it would have some important consequences:

[1] Patrick Juola. Measuring Linguistic Complexity: The Morphological Tier, 1998, J. Quant. Linguistics.

[2] A. Kolmogorov. Three approaches to the quantitative definition of information, 1968.

[3] Ray J. Solomonoff, et al. A Formal Theory of Inductive Inference. Part II, 1964, Inf. Control.

[4] Nelson Goodman. On the Simplicity of Ideas, 1943, J. Symb. Log.

[5] R. Baayen, et al. Putting the bits together: an information theoretical perspective on morphological processing, 2004, Cognition.

[6] John Goldsmith, et al. An algorithm for the unsupervised learning of morphology, 2006, Natural Language Engineering.

[7] David Gil, et al. The World Atlas of Language Structures, 2005.

[8] Sandy Lovie. Shannon, Claude E, 2005.

[9] Sang Joon Kim, et al. A Mathematical Theory of Communication, 2006.

[10] C. E. Shannon, et al. A mathematical theory of communication, 1948, MOCO.

[11] John A. Goldsmith, et al. Unsupervised Learning of the Morphology of a Natural Language, 2001, CL.

[12] Ryan Keith Shosted, et al. Correlating complexity: A typological approach, 2006.

[13] Marcus Hutter, et al. Algorithmic Information Theory, 1977, IBM J. Res. Dev.

[14] J. McWhorter, et al. The world's simplest grammars are creole grammars, 2001.

[15] Balthasar Bickel, et al. Inflectional synthesis of the verb, 2005.

[16] Yu Hu, et al. Topics in unsupervised language learning, 2007.

[17] Patrick Juola. Assessing linguistic complexity, 2008.

[18] Ray J. Solomonoff, et al. A Formal Theory of Inductive Inference. Part I, 1964, Inf. Control.

[19] 李幼升, et al. Ph, 1989.

[20] Jorma Rissanen, et al. Universal coding, information, prediction, and estimation, 1984, IEEE Trans. Inf. Theory.

[21] Balthasar Bickel, et al. Exponence of selected inflectional formatives, 2005.