Clausal Proofs and Discontinuity

We consider the task of theorem proving in Lambek calculi and their generalisation to "multimodal residuation calculi". These form an integral part of categorial logic, a logic of signs stemming from categorial grammar, on the basis of which language processing is essentially theorem proving. The demand of this application is not just for efficient processing of some or other specific calculus, but for methods that will be generally applicable to categorial logics. It is proposed that multimodal cases be treated by dealing with the highest common factor of all the connectives as linear (propositional) validity. The prosodic (sublinear) aspects are encoded in labels, in effect the term-structure of quantified linear logic. The correctness condition on proof nets ("long trip condition") can be implemented by SLD resolution in linear logic with unification on labels/terms limited to one-way matching. A suitable unification strategy is obtained for calculi of discontinuity by normalisation of the ground goal term followed by recursive descent and redex pattern matching on the head term.

The associative Lambek calculus (Lambek 1958) and non-associative Lambek calculus (Lambek 1961) were originally proposed as "syntactic calculi" for characterising the well-formedness of, respectively, sequential (semigroup structure) and binary hierarchical (groupoid structure) expressions. They were provided with single-conclusioned Gentzen-style sequent presentations which lack the usual structural rules of weakening (or: thinning, or: monotonicity), contraction, and permutation (or: exchange), and which directly provide Cut-free backward-chaining decision procedures for theoremhood. More recently it has become possible to locate the Lambek calculi within a space of "substructural logics" (logics lacking structural rules; Došen and Schroeder-Heister 1993) of which linear logic (Girard 1987) is a prominent instance.
At the same time, Lambek calculi have been extended in their linguistic application to categorial logics (Morrill 1994d), versions of categorial grammar characterising prosodic and semantic dimensions, for which the task of parsing is essentially theorem proving. In particular we can identify as a generalisation "residuation calculi" in which the Lambek connectives (corresponding to linear logic multiplicatives) are defined in a number of potentially interactive modes (Moortgat and Morrill 1991). In Morrill (1993, 1994d) an improvement of the logic of discontinuity of Moortgat (1988) is developed in this way. Given Cut-elimination, decidability is directly demonstrable from sequent formulations, but in applications to natural language processing our further objective is efficiency.

There are two main approaches in existence: sequent proof normalisation, and proof nets. The former, which builds proofs backwards from the goal sequent, even if somehow broadly generalisable, necessarily faces non-determinism with information from subformulas only made available serially according to the construction of formulas. The latter provides a phase of unfolding in which all the parts of a formula are made available in parallel, and then a non-deterministic phase of linking which builds proofs from the axioms, but requires a certain correctness condition. Roorda (1991) expresses this condition by reference to labelling by lambda terms corresponding to proofs under the Curry-Howard correspondence. Roorda (1991) and Moortgat (1990, 1992) do so by reference to labelling by groupoid terms of the algebras in which we interpret by residuation. We aim to improve the latter method, which as it stands presents the task of correctness checking in terms of intractable problems such as semigroup unification, i.e. it leaves some more specific structuring of the task, indicating an efficient strategy, to be desired.
Moortgat (1990) presents a scheme for gathering groupoid-labelled unfoldings into definite clauses directly executable in Prolog, and Moortgat (1992) proposes multimodal generalisation with unification under theory. In Morrill (1994a) such a compilation is achieved by a more direct structuring of unfolding relating to Horn clause resolution in linear logic, showing how one term in such unification can always be kept ground, and multimodality is exemplified with the logic of discontinuity of Morrill (1993, 1994d). This refinement however shares with the Moortgat proposals transformation into first order clauses, resulting in an inflation of the resolution database at compile time to deal with higher order type inferences. In Morrill (1994b) the situation is improved by compiling into higher order clauses such that hypotheticals are emitted dynamically only as they become germane. The present paper aims to explain and motivate these proposals.

Footnote 1: To appear in the Bulletin of the Interest Group in Propositional and Predicate Logics. I thank Michael Moortgat and Dick Oehrle for comments on this work.

1 Residuation Calculi

1.1 Lambek Calculi

The types (or: formulas) of (product-free) Lambek calculus are freely generated from a set of primitives by binary infix connectives $/$ ("over") and $\backslash$ ("under"). Models can be given in a variety of structures; we deal here with a simple and transparent interpretation in groupoids. With respect to a groupoid algebra $\langle L, + \rangle$ (i.e. a set $L$ closed under a binary operation $+$) for the non-associative Lambek calculus NL, and with respect to a semigroup algebra $\langle L, + \rangle$ (i.e.
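a set $L$ closed under an associative binary operation $+$) for the associative Lambek calculus L, each formula $A$ is "prosodically" interpreted as a subset $D(A)$ of $L$ by residuation as follows (Lambek 1988).

\[
\begin{array}{l}
D(A \backslash B) = \{\, s \mid \forall s' \in D(A),\ s' + s \in D(B) \,\} \\
D(B / A) = \{\, s \mid \forall s' \in D(A),\ s + s' \in D(B) \,\}
\end{array}
\qquad (1)
\]

A sequent, $\Gamma \vdash A$, comprises a succedent formula $A$ and one or more formula occurrences in the antecedent configuration $\Gamma$, which is organised as a binary bracketed sequence for NL, and as a sequence for L. A sequent is valid if and only if, in all interpretations, applying the prosodic construction indicated by the antecedent configuration to objects inhabiting its formulas always yields an object inhabiting the succedent formula. The Gentzen-style sequent presentations for NL in (2) and for L in (3) are sound and complete for this interpretation (Buszkowski 1986, Došen 1992); furthermore they enjoy Cut-elimination: every theorem can be generated without the use of Cut. In the following the parenthetical notation $\Delta(\Gamma)$ represents a configuration $\Delta$ containing a distinguished subconfiguration $\Gamma$.

(2) a.
\[
A \vdash A \ \ \mathrm{id}
\qquad
\frac{\Gamma \vdash A \quad \Delta(A) \vdash B}{\Delta(\Gamma) \vdash B}\ \mathrm{Cut}
\]
b.
\[
\frac{\Gamma \vdash A \quad \Delta(B) \vdash C}{\Delta([\Gamma, A \backslash B]) \vdash C}\ {\backslash}\mathrm{L}
\qquad
\frac{[A, \Gamma] \vdash B}{\Gamma \vdash A \backslash B}\ {\backslash}\mathrm{R}
\]
c.
\[
\frac{\Gamma \vdash A \quad \Delta(B) \vdash C}{\Delta([B / A, \Gamma]) \vdash C}\ {/}\mathrm{L}
\qquad
\frac{[\Gamma, A] \vdash B}{\Gamma \vdash B / A}\ {/}\mathrm{R}
\]

(3) a.
\[
A \vdash A \ \ \mathrm{id}
\qquad
\frac{\Gamma \vdash A \quad \Delta(A) \vdash B}{\Delta(\Gamma) \vdash B}\ \mathrm{Cut}
\]
b.
\[
\frac{\Gamma \vdash A \quad \Delta(B) \vdash C}{\Delta(\Gamma, A \backslash B) \vdash C}\ {\backslash}\mathrm{L}
\qquad
\frac{A, \Gamma \vdash B}{\Gamma \vdash A \backslash B}\ {\backslash}\mathrm{R}
\]
c.
\[
\frac{\Gamma \vdash A \quad \Delta(B) \vdash C}{\Delta(B / A, \Gamma) \vdash C}\ {/}\mathrm{L}
\qquad
\frac{\Gamma, A \vdash B}{\Gamma \vdash B / A}\ {/}\mathrm{R}
\]

By way of example, "lifting" $A \vdash B / (A \backslash B)$ is generated as follows in NL; it is similarly derivable in L.

(4)
\[
\frac{\frac{A \vdash A \quad B \vdash B}{[A, A \backslash B] \vdash B}\ {\backslash}\mathrm{L}}{A \vdash B / (A \backslash B)}\ {/}\mathrm{R}
\]

On the other hand "composition" $A \backslash B, B \backslash C \vdash A \backslash C$, while derivable as follows in L, is NL-underivable in its non-associative form: $[A \backslash B, B \backslash C] \vdash A \backslash C$.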
(5)
\[
\frac{\frac{A \vdash A \quad \frac{B \vdash B \quad C \vdash C}{B, B \backslash C \vdash C}\ {\backslash}\mathrm{L}}{A, A \backslash B, B \backslash C \vdash C}\ {\backslash}\mathrm{L}}{A \backslash B, B \backslash C \vdash A \backslash C}\ {\backslash}\mathrm{R}
\]

1.2 Multimodal Lambek Calculi

In a slightly different formulation of the sequent calculus for L we may configure antecedents with binary bracketing, and then use the NL rules together with an explicit structural rule of associativity (the double bar indicates bidirectionality):

(6)
\[
\begin{array}{c}
\Delta([\Gamma_1, [\Gamma_2, \Gamma_3]]) \vdash A \\
\hline\hline
\Delta([[\Gamma_1, \Gamma_2], \Gamma_3]) \vdash A
\end{array}
\ \mathrm{A}
\]

From here it is a small step to give sequent calculus for "multimodal" Lambek calculi in which we have several families of connectives $\{/_i, \backslash_i\}_{i \in \{1, \ldots, n\}}$, each defined by residuation with respect to their adjunction in a "multigroupoid" $\langle L, \{+_i\}_{i \in \{1, \ldots, n\}} \rangle$ (Moortgat and Morrill 1991):

\[
\begin{array}{l}
D(A \backslash_i B) = \{\, s \mid \forall s' \in D(A),\ s' +_i s \in D(B) \,\} \\
D(B /_i A) = \{\, s \mid \forall s' \in D(A),\ s +_i s' \in D(B) \,\}
\end{array}
\qquad (7)
\]

Sequent calculus can be given by indexing the brackets of NL-presentations to indicate mode of adjunction (and adding structural rules, including interaction postulates between different modes, as appropriate):

(8)
\[
A \vdash A \ \ \mathrm{id}
\qquad
\frac{\Gamma \vdash A \quad \Delta(A) \vdash B}{\Delta(\Gamma) \vdash B}\ \mathrm{Cut}
\]

(9) a.
\[
\frac{\Gamma \vdash A \quad \Delta(B) \vdash C}{\Delta([_i\, \Gamma, A \backslash_i B]) \vdash C}\ {\backslash_i}\mathrm{L}
\qquad
\frac{[_i\, A, \Gamma] \vdash B}{\Gamma \vdash A \backslash_i B}\ {\backslash_i}\mathrm{R}
\]
b.
\[
\frac{\Gamma \vdash A \quad \Delta(B) \vdash C}{\Delta([_i\, B /_i A, \Gamma]) \vdash C}\ {/_i}\mathrm{L}
\qquad
\frac{[_i\, \Gamma, A] \vdash B}{\Gamma \vdash B /_i A}\ {/_i}\mathrm{R}
\]

In particular cases of course we may choose non-composite notations for the connectives and brackets. With two modes interpreted in a "bigroupoid" understood as distinguishing left-headed and right-headed adjunction we have a "headed" calculus (Moortgat and Morrill 1991). With families $\{/, \backslash\}$ and $\{<, >\}$ for adjunctions $+$ (associative) and $(\cdot, \cdot)$ (not assumed to be associative) respectively in a bigroupoid $\langle L, +, (\cdot, \cdot) \rangle$ we have a partially associative calculus L+NL (Oehrle and Zhang 1989, Morrill 1990). This latter forms two-thirds of the discontinuity calculus of Morrill (1993, 1994d) which we shall be considering.

1.3 Labelled Sequent Presentations

"Labelling" (Gabbay 1991) is a means of presenting proof theory which will enable us to factor out the antecedent formulas of a sequent, and its associated prosodic construction, which is made more explicit.
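No essential use of sequent labelling is made here, in that the labelled presentation of a calculus is just a notational variation of the ordered presentation. However, labelling is a step on the path to implementing residuation calculi. We notate a sequent $\Gamma \vdash A$ as $a_1{:}\,A_1, \ldots, a_n{:}\,A_n \vdash \alpha{:}\,A$ where the multiset $\{A_1, \ldots, A_n\}_m$ comprises the formula occurrences in $\Gamma$, $a_1, \ldots, a_n$ are distinct atomic labels, and $\alpha$ is a term over these labels representing explicitly the prosodic construction that was represented implicitly by the structured configuration $\Gamma$. The labelled sequent calculus for NL is as follows:

(10) a.
\[
a{:}\,A \vdash a{:}\,A \ \ \mathrm{id}
\]
b.
\[
\frac{\Gamma \vdash \alpha{:}\,A \quad a{:}\,A, \Delta \vdash \beta(a){:}\,B}{\Gamma, \Delta \vdash \beta(\alpha){:}\,B}\ \mathrm{Cut}
\]
c.
\[
\frac{\Gamma \vdash \alpha{:}\,A \quad b{:}\,B, \Delta \vdash \gamma(b){:}\,C}{\Gamma, d{:}\,A \backslash B, \Delta \vdash \gamma((\alpha + d)){:}\,C}\ {\backslash}\mathrm{L}
\]
d.
\[
\frac{\Gamma, a{:}\,A \vdash (a + \gamma){:}\,B}{\Gamma \vdash \gamma{:}\,A \backslash B}\ {\backslash}\mathrm{R}
\]
e.
\[
\frac{\Gamma \vdash \alpha{:}\,A \quad b{:}\,B, \Delta \vdash \gamma(b){:}\,C}{\Gamma, d{:}\,B / A, \Delta \vdash \gamma((d + \alpha)){:}\,C}\ {/}\mathrm{L}
\]
f.
\[
\frac{\Gamma, a{:}\,A \vdash (\gamma + a){:}\,B}{\Gamma \vdash \gamma{:}\,B / A}\ {/}\mathrm{R}
\]

To obtain L an associativity equation on terms may be added: ((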
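
Section 1.1 states that the Cut-free sequent presentations of NL and L directly provide backward-chaining decision procedures. The following is a minimal Python sketch of such a search, not taken from the paper. The tuple encoding of formulas and the names `under`, `over`, `prove_L` and `prove_NL` are assumptions made for this illustration. Antecedents of L are encoded as tuples of formulas, and bracketed NL configurations as nested pairs. Each rule application removes one connective from the sequent, so the search terminates.

```python
from functools import lru_cache

# Hypothetical encoding (not from the paper): atoms are strings,
# ("\\", A, B) encodes A\B and ("/", B, A) encodes B/A.
def under(a, b):
    return ("\\", a, b)

def over(b, a):
    return ("/", b, a)

def is_atom(x):
    return isinstance(x, str)

def is_compound(x):
    return isinstance(x, tuple) and len(x) == 3

@lru_cache(maxsize=None)
def prove_L(gamma, c):
    """Cut-free backward chaining for L; gamma is a non-empty tuple of formulas."""
    if gamma == (c,):                                   # id
        return True
    if not is_atom(c):
        op, x, y = c
        if op == "\\" and prove_L((x,) + gamma, y):     # \R, c = x\y
            return True
        if op == "/" and prove_L(gamma + (y,), x):      # /R, c = x/y
            return True
    for i, f in enumerate(gamma):
        if is_atom(f):
            continue
        op, x, y = f
        if op == "\\":                                  # \L, f = x\y, argument gamma[j:i]
            for j in range(i):
                if prove_L(gamma[j:i], x) and prove_L(gamma[:j] + (y,) + gamma[i + 1:], c):
                    return True
        else:                                           # /L, f = x/y, argument gamma[i+1:k]
            for k in range(i + 2, len(gamma) + 1):
                if prove_L(gamma[i + 1:k], y) and prove_L(gamma[:i] + (x,) + gamma[k:], c):
                    return True
    return False

def contexts(conf):
    """Yield (pair, plug) for each bracketed pair in conf; plug(x) rebuilds conf with x replacing it."""
    if is_atom(conf) or is_compound(conf):
        return
    left, right = conf
    yield conf, (lambda x: x)
    for sub, plug in contexts(left):
        yield sub, (lambda x, plug=plug: (plug(x), right))
    for sub, plug in contexts(right):
        yield sub, (lambda x, plug=plug: (left, plug(x)))

def prove_NL(conf, c):
    """Cut-free backward chaining for NL; conf is a formula or a pair (left, right)."""
    if conf == c:                                       # id
        return True
    if not is_atom(c):
        op, x, y = c
        if op == "\\" and prove_NL((x, conf), y):       # \R, c = x\y
            return True
        if op == "/" and prove_NL((conf, y), x):        # /R, c = x/y
            return True
    for (l, r), plug in contexts(conf):
        if is_compound(r) and r[0] == "\\":             # \L on [l, a\b]
            _, a, b = r
            if prove_NL(l, a) and prove_NL(plug(b), c):
                return True
        if is_compound(l) and l[0] == "/":              # /L on [b/a, r]
            _, b, a = l
            if prove_NL(r, a) and prove_NL(plug(b), c):
                return True
    return False

A, B, C = "A", "B", "C"
lifting = over(B, under(A, B))                          # B/(A\B)
print(prove_NL(A, lifting), prove_L((A,), lifting))     # True True
print(prove_L((under(A, B), under(B, C)), under(A, C))) # True
print(prove_NL((under(A, B), under(B, C)), under(A, C)))  # False
```

The three printed lines reproduce the observations of Section 1.1: lifting is derivable in both calculi, whereas composition is derivable in L but not in NL. The search is naive in that it tries every choice of active formula and every split of the antecedent, which is the kind of non-determinism that motivates the proof-net approach discussed in the introduction.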

[1]  Joachim Lambek,et al.  On the Calculus of Syntactic Types , 1961 .

[2]  Wolfgang Bibel,et al.  On Matrices with Connections , 1981, JACM.

[3]  Glyn Morrill,et al.  Categorial Formalisation of Relativisation: Pied Piping, Islands, and Extraction Sites , 1992 .

[4]  Esther König Parsing as Natural Deduction , 1989, ACL.

[5]  M. Moortgat Categorial Investigations: Logical and Linguistic Aspects of the Lambek Calculus , 1988 .

[6]  Glyn Morrill,et al.  Clausal proof nets and discontinuity , 1994 .

[7]  J. Lambek The Mathematics of Sentence Structure , 1958 .

[8]  Richard T. Oehrle,et al.  Lambek Calculus and Preposing of Embedded Subjects , 1989 .

[9]  Glyn Morrill,et al.  Heads and Phrases. Type Calculus for Dependency and Constituent Structure , 1991 .

[10]  Wolfgang Bibel,et al.  Methods and calculi for deduction , 1993 .

[11]  Roman Jakobson,et al.  Structure of Language and Its Mathematical Aspects , 1961 .

[12]  Kosta Dosen,et al.  A Brief Survey of Frames for the Lambek Calculus , 1992, Math. Log. Q..

[13]  Lincoln A. Wallen,et al.  Automated proof search in non-classical logics - efficient matrix proof methods for modal and intuitionistic logics , 1990, MIT Press series in artificial intelligence.

[14]  Patrick Lincoln,et al.  Linear logic , 1992, SIGA.

[15]  H.L.W. Hendriks,et al.  Studied flexibility : categories and types in syntax and semantics , 1993 .

[16]  Glyn Morrill,et al.  Higher-order Linear Logic Programming of Categorial Deduction , 1995, EACL.

[17]  Emmon W. Bach,et al.  Categorial Grammars and Natural Language Structures , 1988 .

[18]  Dirk Roorda,et al.  Resource Logics : Proof-Theoretical Investigations , 1991 .

[19]  Wojciech Buszkowski Completeness Results for Lambek Syntactic Calculus , 1986, Math. Log. Q..

[20]  Mark Hepple,et al.  Normal Form Theorem Proving for the Lambek Calculus , 1990, COLING.

[21]  J. Lambek,et al.  Categorial and Categorical Grammars , 1988 .

[22]  Glyn Morrill Discontinuity and pied-piping in categorial grammar , 1993 .