A Cascaded Syntactic Analyser for Basque

This article presents a robust syntactic analyser for Basque and describes the modules it comprises. Each module is organised into analysis layers, and each layer takes the output of the previous layer as its input, so that the syntactic analysis becomes gradually deeper as it passes through the cascade. The analysis is carried out using the Constraint Grammar (CG) formalism. The article also describes how the parsing formats were standardised using XML.
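The cascade idea can be illustrated with a minimal sketch, in which each layer is a function that receives the analysis produced by the previous layer and returns an enriched copy. This is not the analyser's actual implementation, which relies on Constraint Grammar rules. The names `run_cascade`, `tag_words`, and `mark_chunks`, the toy lexicon, and the token representation are all hypothetical and chosen only for illustration.

```python
from functools import reduce
from typing import Callable, Dict, List

Token = Dict[str, str]
Layer = Callable[[List[Token]], List[Token]]


def run_cascade(tokens: List[Token], layers: List[Layer]) -> List[Token]:
    """Apply the layers in order; each one consumes the previous layer's output."""
    return reduce(lambda analysis, layer: layer(analysis), layers, tokens)


def tag_words(tokens: List[Token]) -> List[Token]:
    """Hypothetical first layer: attach a part-of-speech tag from a toy lexicon."""
    lexicon = {"the": "DET", "dog": "NOUN", "barks": "VERB"}
    return [{**t, "pos": lexicon.get(t["form"], "UNK")} for t in tokens]


def mark_chunks(tokens: List[Token]) -> List[Token]:
    """Hypothetical second layer: relies on the 'pos' tag added by the first layer."""
    return [
        {**t, "chunk": "NP" if t["pos"] in {"DET", "NOUN"} else "O"}
        for t in tokens
    ]


words = [{"form": w} for w in ["the", "dog", "barks"]]
result = run_cascade(words, [tag_words, mark_chunks])
for token in result:
    print(token)
```

Because `mark_chunks` reads the `pos` field that `tag_words` adds, swapping the order of the two layers would raise a `KeyError`, which reflects the dependency between successive layers in a cascade.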
