The paper provides linguistic observations as motivation for a formal study of analysis by reduction. It studies the whole mechanism through a class of restarting automata with meta-instructions that use pebbles together with delete and shift operations (DS-automata). Four types of (in)finite sets defined by these automata are considered linguistically relevant: basic languages on word forms marked with grammatical categories, proper languages on unmarked word forms, categorial languages on grammatical categories, and sets of reductions (reduction languages). Equivalence of proper languages is taken as weak equivalence of DS-automata, and equivalence of reduction languages as strong equivalence of DS-automata.
The complexity of a language is naturally measured by the number of pebbles, the number of deletions, and the number of word order shifts used in a single reduction step. We have obtained unbounded hierarchies (scales) for all four types of classes of finite languages considered here, as well as for Chomsky's classes of infinite languages. These scales make it possible to estimate the relevant complexity of analysis by reduction for natural languages.