Adaptive calculation of scores for fresh information retrieval

Business applications need fresh information. Fresh information retrieval requires not only collecting documents quickly but also ranking the results in a suitable order. However, conventional ranking methods are ill-suited to fresh information retrieval because they ignore the temporal value of information. We have therefore proposed FTF-IDF, a novel ranking method for fresh information retrieval. FTF-IDF extends TF-IDF by using FTF (fresh term frequency) in place of TF (term frequency). Unlike TF, FTF decreases as time passes. The rate at which FTF decreases is determined by the damping factor. The damping factor is sensitive to small changes in documents, so we use a threshold to ignore such small changes. In our previous papers, we determined the optimal threshold manually. In this paper, we propose an adaptive calculation method that determines the threshold automatically. The method finds the optimal value by iteratively generating candidate thresholds and testing them. We describe the method and its evaluation.
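The ideas in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration only: FTF is modeled as TF multiplied by the damping factor raised to the document's age, the size of a document change is modeled as a ratio between 0 and 1, and the adaptive search is modeled as a repeatedly narrowed grid of candidate thresholds. The names `fresh_term_frequency`, `ftf_idf`, `last_significant_update`, `adaptive_threshold`, and the `evaluate` callback are hypothetical and do not come from the paper.

```python
import math
from typing import Callable, Sequence, Tuple


def fresh_term_frequency(tf: float, age: float, damping: float) -> float:
    """Assumed form of FTF: TF decayed geometrically with age.

    `damping` is assumed to lie in (0, 1]; a value of 1 gives plain TF.
    """
    return tf * damping ** age


def ftf_idf(tf: float, age: float, damping: float,
            n_docs: int, doc_freq: int) -> float:
    """Assumed FTF-IDF score: FTF times the usual log-scaled IDF."""
    idf = math.log(n_docs / doc_freq)
    return fresh_term_frequency(tf, age, damping) * idf


def last_significant_update(change_log: Sequence[Tuple[float, float]],
                            threshold: float) -> float:
    """Return the timestamp of the latest change at or above the threshold.

    `change_log` holds (timestamp, change_ratio) pairs (a hypothetical
    representation). Changes below `threshold` are ignored; if none
    qualifies, the earliest timestamp is returned.
    """
    significant = [t for t, ratio in change_log if ratio >= threshold]
    if significant:
        return max(significant)
    return min(t for t, _ in change_log)


def adaptive_threshold(evaluate: Callable[[float], float],
                       lo: float = 0.0, hi: float = 1.0,
                       points: int = 5, rounds: int = 3) -> float:
    """Generate candidate thresholds, test them, and narrow the range.

    `evaluate` is a hypothetical quality measure (higher is better) that
    the caller supplies, for example a ranking-quality metric.
    """
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        candidates = [lo + i * step for i in range(points)]
        best = max(candidates, key=evaluate)
        # Narrow the search window to the neighbors of the best candidate.
        lo, hi = max(lo, best - step), min(hi, best + step)
    return best
```

As a usage example, `adaptive_threshold(lambda t: -(t - 0.3) ** 2)` searches the range 0 to 1 for the threshold that maximizes a toy quality function peaking at 0.3. In a real system the callback would score retrieval results produced with the candidate threshold.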
