This paper investigates the probability distributions of dependency distances in six texts extracted from a Chinese dependency treebank. The fitting results show that the investigated distributions are well captured by the right truncated Zeta distribution. To test whether the model is specific to natural language, two samples with randomly generated governors are also investigated. One of them can be described, for example, by the Hyperpoisson distribution; the other satisfies the Zeta distribution. The paper also presents a study of the sequential plots and the mean dependency distance of the six texts under three analyses (one syntactic and two random). Of these three analyses, the syntactic analysis yields the smallest mean dependency distance.
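The quantities named above can be illustrated with a minimal sketch. It assumes that the dependency distance is the absolute difference between the linear positions of a word and its governor, and that the right truncated Zeta distribution has the form P(x) = x^(-a) / sum_{j=1..R} j^(-a) for x = 1, ..., R. The function names, the encoding of governors as a list of 1-based positions with 0 for the root, and the example sentence are hypothetical and are not taken from the treebank.

```python
def dependency_distances(governors):
    """Return the dependency distance of every non-root word.

    governors[i] is the 1-based position of the governor of word i + 1;
    0 marks the root, which has no governor (assumed encoding).
    """
    return [abs(g - (i + 1)) for i, g in enumerate(governors) if g != 0]


def mean_dependency_distance(governors):
    """Mean of the dependency distances over all non-root words."""
    distances = dependency_distances(governors)
    return sum(distances) / len(distances)


def right_truncated_zeta_pmf(x, a, R):
    """P(x) = x^(-a) / sum_{j=1..R} j^(-a), for x = 1..R (assumed form)."""
    norm = sum(j ** -a for j in range(1, R + 1))
    return x ** -a / norm


# Hypothetical five-word sentence: word 2 is the root.
example_governors = [2, 0, 2, 5, 3]
example_distances = dependency_distances(example_governors)
example_mdd = mean_dependency_distance(example_governors)
```

For the hypothetical sentence, the distances are 1, 1, 1 and 2, so the mean dependency distance is 1.25. Fitting the parameter a to observed frequencies is a separate step that this sketch does not cover.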