Stemmatizer—Stemmer-based Lemmatizer for Gujarati Text

Stemming is an important process in Information Retrieval (IR). The stem returned by a stemmer need not be a valid dictionary word, whereas the lemma returned by a lemmatizer always is, and many IR systems require valid dictionary words. Indian languages are resource-poor; Gujarati in particular has a stemmer but lacks a lemmatizer. In this paper, the authors propose ‘The Stemmatizer’, a stemmer-based lemmatizer for the Gujarati language that uses a hybrid approach and can learn new words. The proposed solution was tested on 2197 words, and the results were found to be satisfactory.
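The distinction between a stem and a lemma, and the idea of building a lemmatizer on top of a stemmer, can be illustrated with a minimal sketch. This is not the authors' implementation: the suffix list, the stem-to-lemma lexicon, and the names `stem`, `StemBasedLemmatizer`, and `learn` are all hypothetical, and English toy words stand in for Gujarati. It assumes only that a stemmer strips suffixes, that a lexicon maps stems to dictionary words, and that "learning" means adding a new entry to that lexicon.

```python
from typing import Dict, Optional

# Hypothetical suffix list for illustration only (not the paper's rules).
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word: str) -> str:
    """Strip the first matching suffix; the result may not be a real word."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


class StemBasedLemmatizer:
    """Maps the stemmer's output to a dictionary word via a lexicon."""

    def __init__(self, stem_to_lemma: Dict[str, str]):
        # Copy so that learning does not mutate the caller's dictionary.
        self.stem_to_lemma = dict(stem_to_lemma)

    def lemmatize(self, word: str) -> Optional[str]:
        # Returns None when the stem is not yet known to the lexicon.
        return self.stem_to_lemma.get(stem(word))

    def learn(self, word: str, lemma: str) -> None:
        # "Learning" here is just recording a new stem-to-lemma pair.
        self.stem_to_lemma[stem(word)] = lemma


# Hypothetical toy lexicon.
lemmatizer = StemBasedLemmatizer({"studi": "study", "study": "study"})
```

In this toy setup, `stem("studies")` yields `studi`, which is not a dictionary word, while `lemmatizer.lemmatize("studies")` yields `study`. An unseen word such as `running` returns `None` until `lemmatizer.learn("running", "run")` records it.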
