The Theory of Control Applied to the Prague Dependency Treebank (PDT)

One of the most difficult issues in corpus annotation on an underlying syntactic level is the restoration of nodes that are omitted in the surface form of the sentence but present on the “underlying” or “deep” syntactic level. In the present paper we concentrate on the type of nodes that are omitted due to the phenomenon usually called grammatical “control”, and on their respective anaphoric relations. In particular, we extend the notion of control to nominalization and demonstrate how this relation is captured in the Prague Dependency Treebank. The theory of control is present within Chomsky’s framework of Government and Binding (using the terms verb of control, controller and controllee, cf. Chomsky, 1980), but also within many other formal frameworks, e.g. GPSG (Sag and Pollard, 1991) or categorial grammar (Bach, 1979). We analyse this phenomenon within the framework of dependency grammar, theoretically based on the Functional Generative Description (FGD, cf. Sgall, Hajicova and Panevova, 1986). In FGD, on the “underlying” or “tectogrammatical” level, control is a relation of obligatory or optional referential dependency between a controller (the antecedent) and a controllee (the empty subject of the nonfinite complement, i.e. of the controlled clause). The controller is one of the participants in the valency frame of the governing verb: Actor (ACT), Addressee (ADDR), or Patient (PAT). The controlled clause itself also fills a dependency slot in the valency frame of the governing verb, where it is labeled as Patient or Actor. The empty subject of the controlled clause may bear different dependency relations to its head word (the infinitive): Actor or, when the controlled clause is passivized, Addressee or Patient (cf. Koktova, 1992).
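The configuration described above can be illustrated with a minimal sketch. It is not the PDT annotation format: the class `Node`, the attributes `antecedent` and `restored`, the placeholder lemma `#EMPTY`, the root label `PRED`, and the example sentence "John promised Mary to come" are all assumptions introduced here for illustration. The check only encodes the three constraints stated in the text: the controller is an ACT, ADDR, or PAT of the governing verb; the controlled clause is a PAT or ACT of that verb; and the controllee is a restored node in the controlled clause with the functor ACT, ADDR, or PAT.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Functor sets taken from the description above.
CONTROLLER_FUNCTORS = {"ACT", "ADDR", "PAT"}   # participants of the governing verb
CLAUSE_FUNCTORS = {"PAT", "ACT"}               # slot filled by the controlled clause
CONTROLLEE_FUNCTORS = {"ACT", "ADDR", "PAT"}   # role of the empty subject


@dataclass(eq=False)  # eq=False: nodes are compared by identity, not by content
class Node:
    """Hypothetical tectogrammatical node (not the actual PDT schema)."""
    lemma: str
    functor: str
    children: List["Node"] = field(default_factory=list)
    antecedent: Optional["Node"] = None  # referential link: controllee -> controller
    restored: bool = False               # True if the node is absent on the surface


def is_control_relation(verb: Node, clause: Node, controllee: Node) -> bool:
    """Check the constraints on a control configuration as stated in the text."""
    controller = controllee.antecedent
    return (
        controller is not None
        and controller in verb.children
        and controller.functor in CONTROLLER_FUNCTORS
        and clause in verb.children
        and clause.functor in CLAUSE_FUNCTORS
        and controllee in clause.children
        and controllee.restored
        and controllee.functor in CONTROLLEE_FUNCTORS
    )


# Illustrative sentence (assumed here): "John promised Mary to come."
john = Node("John", "ACT")
mary = Node("Mary", "ADDR")
empty_subject = Node("#EMPTY", "ACT", antecedent=john, restored=True)
come = Node("come", "PAT", children=[empty_subject])
promise = Node("promise", "PRED", children=[john, mary, come])

subject_control = is_control_relation(promise, come, empty_subject)

# A restored node without an antecedent does not form a control relation.
unlinked = Node("#EMPTY", "ACT", restored=True)
leave = Node("leave", "PAT", children=[unlinked])
decide = Node("decide", "PRED", children=[Node("Ann", "ACT"), leave])
no_control = is_control_relation(decide, leave, unlinked)
```

In this sketch the referential dependency is stored on the controllee as a pointer to its antecedent, so the same structure could represent control by the Addressee simply by pointing `antecedent` at the ADDR node instead.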