Reusing Swedish Language Processing Resources in SVENSK

The SVENSK project is developing an integrated toolbox of language processing components and resources for Swedish. SVENSK uses GATE, the General Architecture for Text Engineering from the University of Sheffield, as the platform into which the components are integrated. The goal is for the resources included in SVENSK to be freely available for noncommercial use. A wide range of modules has been incorporated so far: in-house modules, commercially available modules, and modules from academia. The results of integrating the modules in the GATE environment are very encouraging: modules from different sources, written in programming languages from completely different paradigms, can be mixed and made to interact with each other, thus maintaining a high degree of reuse of algorithmic resources. However, using Tcl/Tk and the associated API to process structurally relatively complex data is time-consuming and considerably slows processing in GATE.
