A New Approach for Scoring Relevant Documents by Applying a Farsi Stemming Method in Persian Web Search Engines

In this paper, we introduce a new approach for scoring Farsi (also called Persian) documents in a Persian search engine. The approach is based on a new stemming method for the Farsi language that works without any dictionary. Evaluation results show a significant improvement in the performance (precision/recall) of an Information Retrieval (IR) system that uses this stemmer. We have combined our stemming method with a mathematical scoring approach named Fourier Domain Scoring (FDS) to obtain a powerful scoring policy for relevant documents in a Persian search engine.
