Study on Root + Affix Form-Based Mongolian Information Retrieval Unit
To improve the efficiency of Mongolian information retrieval, this study investigates retrieval units of the form root + affixes, combining the characteristics of the Mongolian language with a selected information retrieval model. The candidate models are the TF-IDF model, the vector space model, and the Lemur language model. For each root + affixes form, four steps are carried out: building an index over the corpus, query analysis, retrieval, and evaluation. Recall and precision are then compared across the forms to identify the most suitable retrieval unit. The results show that root + 2 affixes is the most suitable retrieval unit for a Mongolian information retrieval system.
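The four steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the text has already been segmented into (root, affixes) pairs, uses a simple smoothed TF-IDF score in place of the Lemur toolkit, and runs on a hypothetical toy corpus whose roots and affixes are placeholders (`r1`, `a1`, ...). All function names are illustrative. It shows only the mechanics of comparing units for different numbers of retained affixes and says nothing about the reported results.

```python
import math
from collections import Counter

def make_unit(root, affixes, k):
    """Join a root with its first k affixes into one retrieval unit."""
    return "+".join([root] + list(affixes[:k]))

def to_units(segmented, k):
    """Convert a list of (root, affixes) pairs into retrieval units."""
    return [make_unit(root, affixes, k) for root, affixes in segmented]

def build_index(docs, k):
    """Step 1: per-document term frequencies and IDF weights per unit."""
    tfs = {doc_id: Counter(to_units(seg, k)) for doc_id, seg in docs.items()}
    df = Counter(unit for tf in tfs.values() for unit in tf)
    n = len(docs)
    # Assumed smoothing: log(N / df) + 1, so no unit gets zero weight.
    idf = {unit: math.log(n / d) + 1.0 for unit, d in df.items()}
    return tfs, idf

def retrieve(query, tfs, idf, k):
    """Steps 2 and 3: analyse the query into units, then rank documents."""
    q_units = to_units(query, k)
    scores = {}
    for doc_id, tf in tfs.items():
        score = sum(tf[u] * idf.get(u, 0.0) for u in q_units)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

def precision_recall(retrieved, relevant):
    """Step 4: precision and recall of a retrieved set."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical toy data: placeholder roots and affixes, not real Mongolian.
docs = {
    "d1": [("r1", ["a1", "a2"]), ("r2", [])],
    "d2": [("r1", ["a1", "a3"])],
    "d3": [("r1", ["a4"])],
}
query = [("r1", ["a1", "a2"])]
relevant = {"d1"}  # assumed relevance judgement

for k in (0, 1, 2):
    tfs, idf = build_index(docs, k)
    ranked = retrieve(query, tfs, idf, k)
    print(k, ranked, precision_recall(ranked, relevant))
```

Keeping the affix count `k` as a single parameter means the same indexing and evaluation code can be rerun for each candidate unit, which mirrors the comparison described above.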