Contextual Tree Adjoining Grammars

In this paper, we introduce a formalism called contextual tree adjoining grammar (CTAG). CTAGs are a generalization of multi-bracketed contextual rewriting grammars (MBICR), which combine tree adjoining grammars (TAGs) and contextual grammars. The generalization is to add a mechanism similar to obligatory adjoining in TAGs. Here, we present the definition of the model and some results concerning the generative capacity and closure properties of the classes of languages generated by CTAGs.

Introduction

Contextual grammars are a formalization of the linguistic idea that more complex, well-formed strings are obtained by inserting contexts into already well-formed strings. They were first introduced by Marcus in 1969; all models presented here are based on so-called internal contextual grammars, which were introduced by Păun and Nguyen. References and further details about contextual grammars can be found in the monograph (Păun, 1997); a survey is given in (Ehrenfeucht et al., 1997).

Tree adjoining grammars (TAGs) and contextual grammars are linguistically well motivated and have been considered good models for the description of natural languages (cf. (Marcus, 1997)). Although contextual grammars and tree adjoining grammars seem very different at first sight, a closer look reveals many similarities between the two formalisms. Therefore, it seems natural to combine them in order to obtain a generalized class of grammars for the description of natural languages, one which combines the mechanisms of both classes. A first step was the introduction of so-called multi-bracketed contextual grammars (MBIC) and multi-bracketed contextual rewriting grammars (MBICR), cf. (Kappes, 1999). These grammars operate on a tree structure induced by the grammar (the first approach in this direction was introduced in (Martin-Vide & Păun, 1998)).
However, the families of languages generated by MBIC and MBICR-grammars are either strictly included in or incomparable to the family of languages generated by TAGs. This is the case since, in MBIC and MBICR-grammars, each yield of a derived tree is immediately a word in the language generated by the grammar. In other words, there is no mechanism to distinguish between "finished" and "unfinished" trees like obligatory adjoining allows in TAGs. Here, by adding obligatory adjoining to MBICR-grammars, we obtain a generalized class which is also a proper extension of TAGs.

Definition and Example

Let $\Sigma^*$ denote the set of all words over the finite alphabet $\Sigma$ and $\Sigma^+ = \Sigma^* \setminus \{\lambda\}$, where $\lambda$ denotes the empty word. We denote the length of a string $x$ by $|x|$. In this paper, we use the term derived tree for a tree where the internal nodes are labelled by symbols from a nonterminal alphabet $\Delta$ and the leaves are labelled by symbols from a terminal alphabet $\Sigma$. We use