The Spanish DELPH-IN grammar

In this article we present a Spanish grammar implemented in the Linguistic Knowledge Builder system and grounded in the theoretical framework of Head-driven Phrase Structure Grammar. The grammar is being developed in an international multilingual context, the DELPH-IN Initiative, and contributes to an open-source repository of software and linguistic resources for various Natural Language Processing applications. We show how we have refined and extended a core grammar, derived from the LinGO Grammar Matrix, to achieve a broad-coverage grammar. The Spanish DELPH-IN grammar is the most comprehensive grammar for Spanish deep processing. It is being deployed in the construction of two treebanks for Spanish: one of 60,000 sentences based on a technical corpus, built within the European project METANET4U (Enhancing the European Linguistic Infrastructure, GA 270893; http://www.meta-net.eu/projects/METANET4U/), and a smaller one of about 15,000 sentences based on a press corpus.
