An effective stemmer in Devanagari script
In today's world of the internet, web search engines are developing techniques to make surfing faster. Stemming is a technique used by web search engines to remove prefixes and suffixes from a derived word. Because variants of a word reduce to the same stem, stemming provides a way to store similar documents together. This research work aims to develop a Hindi stemmer for the Devanagari script that strips both prefixes and suffixes from derived words, in order to provide better stemming than previous stemmers. The proposed stemmer uses a hybrid approach that combines a lookup algorithm, a suffix-stripping algorithm, and a prefix-removal algorithm.
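The hybrid idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `stem`, the tiny `LOOKUP`, `PREFIXES`, and `SUFFIXES` tables, the `MIN_STEM_LEN` guard, and the order of the three stages (lookup first, then prefix removal, then suffix stripping) are all assumptions made for illustration.

```python
# Hypothetical resources: a real stemmer would use much larger, curated lists.
LOOKUP = {"गया": "जा"}                # known irregular form -> stem (assumed entry)
PREFIXES = ["बे", "अन"]               # assumed sample prefixes
SUFFIXES = ["ियों", "ों", "ें", "ता"]   # assumed sample suffixes
MIN_STEM_LEN = 2                      # assumed guard against stripping too much


def stem(word: str) -> str:
    """Return a stem for a Hindi word written in Devanagari (illustrative only)."""
    # Stage 1: lookup. Known forms bypass affix stripping entirely.
    if word in LOOKUP:
        return LOOKUP[word]

    # Stage 2: prefix removal. Try longer prefixes first, strip at most one.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break

    # Stage 3: suffix stripping. Try longer suffixes first, strip at most one.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break

    return word


if __name__ == "__main__":
    for w in ["गया", "किताबों", "बेघर", "सुंदरता"]:
        print(w, "->", stem(w))
```

Lengths here are counted in Unicode code points, so a consonant and its vowel sign count as two units. Purely rule-based stripping over-stems words that merely begin or end with an affix-like sequence (for example, `बेटा` begins with `बे`), which is one reason a lookup stage is useful alongside the stripping rules.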