Paper information - AN AFFIX STRIPPING MORPHOLOGICAL ANALYZER FOR TURKISH

This paper presents the design and implementation of a morphological analyzer for Turkish. It proposes a new methodology that analyzes Turkish words by affix stripping, without using any lexicon. The rule-based, agglutinative structure of the language allows Turkish to be modeled with finite state machines (FSMs). In contrast to previous work, the FSMs in this study are formed by applying the morphotactic rules in reverse order. The paper describes the steps of this methodology: classifying the suffixes, generating an FSM for each suffix class, and unifying these machines into a main machine in which they cooperate in the analysis.
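The idea of reading a word from its end and stripping suffixes without consulting a lexicon can be illustrated with a minimal Python sketch. It is not the authors' machine: the three suffix classes, the handful of surface forms, the `MIN_STEM_LEN` guard, and the function name `analyze` are all assumptions chosen for illustration, and vowel harmony and other phonological rules are ignored. The state index plays the role of an FSM state, and the classes are listed in the reverse of the usual stem + plural + possessive + case order.

```python
# Toy suffix classes, listed in the order they are met when reading a word
# from its END (case, then possessive, then plural). All forms here are
# illustrative assumptions, not the paper's suffix inventory.
REVERSE_CLASSES = [
    ("CASE", ("den", "dan", "de", "da")),
    ("POSS", ("im", "ım", "um", "üm")),
    ("PLURAL", ("ler", "lar")),
]

MIN_STEM_LEN = 2  # hypothetical guard so a word is never stripped to nothing


def analyze(word):
    """Return every (stem, suffix_tags) reading found by stripping right to left."""
    results = []

    def step(rest, state, tags):
        # Every state is accepting: with no lexicon, whatever remains
        # is recorded as a candidate stem.
        results.append((rest, list(reversed(tags))))
        # Each class is optional, so try the current class and all later ones.
        for next_state in range(state, len(REVERSE_CLASSES)):
            tag, forms = REVERSE_CLASSES[next_state]
            for form in forms:
                if rest.endswith(form) and len(rest) - len(form) >= MIN_STEM_LEN:
                    step(rest[: -len(form)], next_state + 1, tags + [f"{tag}:{form}"])

    step(word, 0, [])
    return results


if __name__ == "__main__":
    for stem, tags in analyze("evlerimde"):
        print(stem, tags)
```

Because no lexicon filters the candidates, the sketch returns every reading, from the unanalyzed word down to the shortest stem; for `evlerimde` the deepest reading is the stem `ev` with plural, possessive, and case suffixes.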

Eşref Adalı | Gülşen Eryiğit
