Extracting & Learning a Dependency-Enhanced Type Lexicon for Dutch Submitted by Konstantinos Kogkalidis

This thesis is concerned with type-logical grammars and their practical applicability as tools for reasoning about sentence syntax and semantics. The focus is narrowed to Dutch, a language exhibiting a large degree of word-order variability. To overcome the difficulties arising from this variability, the thesis explores and expands upon a type grammar based on Multiplicative Intuitionistic Linear Logic, agnostic to word order but enriched with decorations that aim to reduce its proof-theoretic complexity. An algorithm for converting dependency-annotated sentences into type sequences is then implemented, populating the type logic with concrete, data-driven lexical types. Two experiments are run on the resulting grammar instantiation. The first pertains to the learnability of the type-assignment process by a neural architecture. A novel application of a self-attentive sequence transduction model is proposed; contrary to established practice, it constructs types inductively by internalizing the type-formation syntax, thus generalizing beyond a pre-specified type vocabulary. The second revolves around a deductive parsing system that resolves structural ambiguities by consulting both word and type information; preliminary results suggest both excellent computational efficiency and strong performance.
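The constructive type-assignment idea can be illustrated with a minimal sketch: instead of treating each full type as an atomic class label, types are viewed as trees over their formation syntax, and serializing a tree in prefix (Polish) notation yields an unambiguous symbol sequence that a transduction model can emit one token at a time, including for types never seen in training. The representation below is a simplified illustration, not the thesis's actual implementation; all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass(frozen=True)
class Atom:
    name: str  # an atomic type, e.g. "np" or "s"

@dataclass(frozen=True)
class Arrow:
    arg: "Type"  # argument type
    res: "Type"  # result type

Type = Union[Atom, Arrow]

def to_prefix(t: Type) -> List[str]:
    """Serialize a type tree into a prefix-notation token sequence."""
    if isinstance(t, Atom):
        return [t.name]
    return ["->"] + to_prefix(t.arg) + to_prefix(t.res)

def from_prefix(tokens: List[str]) -> Type:
    """Reconstruct the type tree from prefix tokens (inverse of to_prefix)."""
    def parse(i: int) -> Tuple[Type, int]:
        tok = tokens[i]
        if tok == "->":
            arg, i = parse(i + 1)
            res, i = parse(i)
            return Arrow(arg, res), i
        return Atom(tok), i + 1
    t, end = parse(0)
    assert end == len(tokens), "trailing tokens after a complete type"
    return t

# A transitive-verb type, np -> (np -> s):
tv = Arrow(Atom("np"), Arrow(Atom("np"), Atom("s")))
tokens = to_prefix(tv)            # ['->', 'np', '->', 'np', 's']
assert from_prefix(tokens) == tv  # round-trips losslessly
```

Because prefix notation needs no parentheses and every sequence decodes deterministically, a decoder over the small vocabulary of atomic types plus connectives can, in principle, produce arbitrarily complex types, which is the generalization property the abstract refers to.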
