The Influence of Chunking on Dependency Crossing and Distance

This paper hypothesizes that chunking plays an important role in reducing dependency distance and dependency crossings. Computer simulations, when compared with natural languages, show that chunking reduces the mean dependency distance (MDD) of a linear sequence of nodes (constrained by continuity or projectivity) to that of natural languages. More interestingly, chunking alone also brings about fewer dependency crossings, though it fails to reduce them to the rarity found in human languages. These results suggest that chunking may play a vital role in the minimization of dependency distance, and a contributing role in the rarity of dependency crossings. In addition, the results point to the possibility that the rarity of dependency crossings is not a mere side-effect of dependency distance minimization, but a linguistic phenomenon with its own motivations.

Introduction. – Language used in communication is invariably presented linearly, one unit after another, which is regarded as one of its fundamental properties [1]. However, there is always a syntactic tree structure underlying a one-dimensional linear sentence, a structure underpinning both the production and the comprehension of that sentence [2,3]. Therefore, language processing consists, to a considerable degree, in the transformation between the syntactic tree structure and the one-dimensional linear arrangement. What properties can be found in the tree structure of language? What mechanisms constrain the transformation of tree structure into linear structure? The answers to these questions, which may well require research based on statistical physics and computer simulation, will probably shed much light on how human language operates.

In terms of dependency grammar, the structure of a sentence can be visualized as a hierarchical dependency tree, whose nodes (vertices) are words, linked to one another by directed edges (dependency relations) [2,3]. Such a hierarchical tree must ultimately be arranged into a linear sequence for the purpose of spoken and written communication. So far, research has repeatedly observed two phenomena in the linear realization of hierarchical dependency structures: the minimization of dependency distance (the number of intervening words) between two syntactically related words [4-13], and the rarity of crossing dependency relations [14,15]. Liu [5] compared the dependency distance of 20 natural languages with that of two different random languages, and pointed out that dependency distance minimization seems to be universal in human languages. Ferrer-i-Cancho has analyzed these phenomena theoretically [8,9]. A recent study based on 37 languages has obtained similar findings [11]. Since dependency distance is held to be cognitively related to language processing load [16], the minimization of dependency distance is probably a result of the principle of least effort [17]. In addition, it has been argued that the rarity of crossing dependencies is simply a by-product of the pressure to minimize dependency distance and cognitive cost in language processing, having little to do with the syntax of the language [7-10]. Similarly, some studies find that dependency distance increases significantly if dependency crossings are permitted, and suggest that reducing dependency crossings is probably an important means of restraining dependency distance [4,5]. Dependency distance and crossings are thus closely related, and in human languages both seem to be subject to minimization.
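To make the two measured quantities concrete, the sketch below (not taken from the paper) computes the mean dependency distance of a linearized sentence and counts its crossing dependencies from a list of head positions. It uses the common convention that the distance of an edge is the absolute difference of the word positions; the paper's wording ("number of intervening words") may instead correspond to this value minus one. The example sentence and its head assignments are illustrative assumptions.

```python
# Minimal sketch: MDD and crossing count for one linearized dependency tree.
# Words occupy positions 1..n; heads[i-1] is the position of word i's head,
# with 0 marking the root (which has no incoming dependency).

def dependency_edges(heads):
    """Return (dependent, head) position pairs, skipping the root."""
    return [(dep, head) for dep, head in enumerate(heads, start=1) if head != 0]

def mean_dependency_distance(heads):
    """Average of |pos(head) - pos(dependent)| over all dependency edges."""
    edges = dependency_edges(heads)
    return sum(abs(h - d) for d, h in edges) / len(edges)

def count_crossings(heads):
    """Two edges cross if exactly one endpoint of one edge lies strictly
    between the two endpoints of the other."""
    spans = [tuple(sorted(e)) for e in dependency_edges(heads)]
    crossings = 0
    for i in range(len(spans)):
        a1, a2 = spans[i]
        for j in range(i + 1, len(spans)):
            b1, b2 = spans[j]
            if a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2:
                crossings += 1
    return crossings

# Hypothetical example: "She bought a red car"
# She->bought, bought=root, a->car, red->car, car->bought
heads = [2, 0, 5, 5, 2]
print(mean_dependency_distance(heads))  # (1 + 2 + 1 + 3) / 4 = 1.75
print(count_crossings(heads))           # 0: this linearization is projective
```

Counting crossings pairwise in this way is quadratic in the number of edges, which is adequate for sentence-length inputs such as those compared in the simulations.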
Ferrer-i-Cancho [9,10] has theoretically proven that, for sufficiently short dependency lengths, the probability that two edges cross decreases as their length decreases. However, Liu has found that a projective random language (i.e., one without any crossing dependencies) has a significantly longer mean dependency distance than natural language [4,5]. Therefore,