A Grammar Based Approach to a Grammar Checking of Free Word Order Languages

This paper presents one of the methods used for grammar checking, as it is being developed within the framework of the EC-funded project LATESLAV (Language Technology for Slavic Languages, PECO 2824). The languages under consideration in the project, Czech and Bulgarian, are both free word order languages; it is therefore not sufficient to use only simple pattern-based methods for error checking. The emphasis is on grammar-based methods, which are much closer to parsing than pattern-based methods are.

It is necessary to stress that we are dealing with surface syntactic analysis. The errors taken into consideration are therefore also surface syntactic errors. Our system for the identification and localization of (surface) syntactic errors consists of two basic modules: the module of lexical analysis and the module of surface syntax checking. In the present paper we describe the second module, which is the more complicated one and forms the core of the whole system. Although it is not crucial for our method, we would like to point out that our approach to the problems of grammar checking is based on dependency syntax.

Let us illustrate the degree of freedom of word order provided by Czech, one of the languages under consideration in the project. If we take a sentence like "Označený (Adj. masc., Nom/Gen Sg.) soubor (N masc., Nom/Gen Sg.) se (Pron.) nepodařilo (V neutr., 3rd pers. Sg.) úspěšně (Adv.) otevřít (V inf.)" (The marked file failed to be opened successfully; word-for-word translation into English: "Marked file itself failed successfully to open"), we may modify the word order, for instance, in the following way: