Paper information - MarS: A rule-based stemmer for morphologically rich language Marathi

MarS: A rule-based stemmer for morphologically rich language Marathi

Stemming is a technique that reduces morphologically similar terms to a single common term without performing a complete morphological analysis. It is used as a preprocessing step in many Natural Language Processing (NLP) applications, such as Information Retrieval (IR), Machine Translation, Parsing, and Summarization. The present work explores the application of stemming to the task of information retrieval. In IR, stemming is generally used for two main purposes: decreasing index size and increasing system performance. This paper presents a rule-based stemmer for the Marathi language. The proposed stemmer achieves an average accuracy of 79.97% over nine runs when tested on a collection of 4500 unique words from a news corpus. Since this accuracy is satisfactory, the stemmer can be useful in several NLP systems for the Marathi language.
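As a rough illustration of what rule-based stemming and the accuracy measure involve, the following is a minimal sketch in Python. It is not the rule set or evaluation procedure from the paper: the suffix list, the minimum stem length, and the function names `stem` and `accuracy` are all assumptions made for this example. It strips the longest matching suffix from a word and reports the percentage of words whose output matches an expected stem.

```python
# Hypothetical suffix list for illustration only; NOT the rules used by MarS.
SUFFIXES = ["मध्ये", "साठी", "ला", "ने", "चा", "ची", "चे", "त"]

# Assumed guard: keep at least this many code points so short words survive.
MIN_STEM_LEN = 2


def stem(word, suffixes=SUFFIXES, min_stem_len=MIN_STEM_LEN):
    """Strip the longest matching suffix, or return the word unchanged."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def accuracy(pairs):
    """Percentage of (word, expected_stem) pairs that the stemmer gets right."""
    correct = sum(1 for word, expected in pairs if stem(word) == expected)
    return 100.0 * correct / len(pairs)


if __name__ == "__main__":
    # Tiny hand-made sample, not data from the paper.
    sample = [("मुलाने", "मुला"), ("मत", "मत"), ("घरामध्ये", "घर")]
    for word, expected in sample:
        print(word, "->", stem(word), "(expected:", expected + ")")
    print("accuracy: %.2f%%" % accuracy(sample))
```

In this sketch the third sample word is stemmed to the oblique form "घरा" rather than "घर", so it counts as an error. This is the kind of mismatch an accuracy figure captures, and a fuller rule set would need additional rules to handle it.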

Harshali B. Patil | Ajay S. Patil
