Morphisto - An Open Source Morphological Analyzer for German

This paper presents the development of an open-source morphology tool for German integrated into a grid-based environment. Taking the SFST-based SMOR tools (Schmid et al. [1]) as a starting point, we have implemented a minimal lexicon component that works in tandem with the morphological tool. Tests on a list of 30,000 high-frequency German words show that its recognition rate is comparable to that of other systems, even those with larger lexicons. Additional tools for managing lexical data, together with services built on top of the finite-state transducer, are also integrated as web services in the grid, so that all resources can be shared easily among lexicographers, linguists, and finite-state developers.

[1] Lauri Karttunen, et al. Finite State Morphology, 2003, CSLI Studies in Computational Linguistics.

[2] Ulrich Heid, et al. Using Descriptive Generalisations in the Acquisition of Lexical Data for Word Formation, 2002, LREC.

[3] Helmut Schmid, et al. A Programming Language for Finite State Transducers, 2005, FSMNLP.

[4] Fotis Jannidis, et al. TextGrid and eHumanities, 2006, Second IEEE International Conference on e-Science and Grid Computing (e-Science'06).

[5] Ulrich Heid, et al. SMOR: A German Computational Morphology Covering Derivation, Composition and Inflection, 2004, LREC.