Tuning before feedback: combining ranking discovery and blind feedback for robust retrieval

Both ranking functions and user queries are very important factors affecting a search engine's performance. Prior research has looked at how to improve ad-hoc retrieval performance for existing queries while tuning the ranking function, or modify and expand user queries using a fixed ranking scheme using blind feedback. However, almost no research has looked at how to combine ranking function tuning and blind feedback together to improve ad-hoc retrieval performance. In this paper, we look at the performance improvement for ad-hoc retrieval from a more integrated point of view by combining the merits of both techniques. In particular, we argue that the ranking function should be tuned first, using user-provided queries, before applying the blind feedback technique. The intuition is that highly-tuned ranking offers more high quality documents at the top of the hit list, thus offers a stronger baseline for blind feedback. We verify this integrated model in a large scale heterogeneous collection and the experimental results show that combining ranking function tuning and blind feedback can improve search performance by almost 30% over the baseline Okapi system.

[1]  Jae Yun Lee,et al.  Optimization of some factors affecting the performance of query expansion , 2004, Inf. Process. Manag..

[2]  Edward A. Fox,et al.  Ranking function optimization for effective Web search by genetic programming: an empirical study , 2004, 37th Annual Hawaii International Conference on System Sciences, 2004. Proceedings of the.

[3]  Donna K. Harman,et al.  Relevance feedback revisited , 1992, SIGIR '92.

[4]  Harris Wu,et al.  The effects of fitness functions on genetic programming-based ranking discovery for Web search: Research Articles , 2004 .

[5]  Ophir Frieder,et al.  Improving relevance feedback in the vector space model , 1997, CIKM '97.

[6]  Stephen E. Robertson,et al.  On Term Selection for Query Expansion , 1991, J. Documentation.

[7]  M. Shamim Khan,et al.  Enhanced Web document retrieval using automatic query expansion , 2004, J. Assoc. Inf. Sci. Technol..

[8]  Chris Buckley,et al.  A probabilistic learning approach for document indexing , 1991, TOIS.

[9]  Norbert Fuhr,et al.  Probabilistic information retrieval as a combination of abstraction, inductive learning, and probabilistic assumptions , 1994, TOIS.

[10]  Claudio Carpineto,et al.  An information-theoretic approach to automatic query expansion , 2001, TOIS.

[11]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[12]  Gerard Salton,et al.  Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag..

[13]  Riccardo Poli,et al.  Foundations of Genetic Programming , 1999, Springer Berlin Heidelberg.

[14]  Weiguo Fan,et al.  A generic ranking function discovery framework by genetic programming for information retrieval , 2004, Inf. Process. Manag..

[15]  Garrison W. Cottrell,et al.  Automatic combination of multiple ranked retrieval systems , 1994, SIGIR '94.

[16]  Garrison W. Cottrell,et al.  Fusion Via a Linear Combination of Scores , 1999, Information Retrieval.

[17]  Proceedings of The Fourth Text REtrieval Conference, TREC 1995, Gaithersburg, Maryland, USA, November 1-3, 1995 , 1995, TREC.

[18]  Peter Nordin,et al.  Genetic programming - An Introduction: On the Automatic Evolution of Computer Programs and Its Applications , 1998 .

[19]  Jong-Hak Lee,et al.  Analyses of multiple evidence combination , 1997, SIGIR '97.

[20]  Fredric C. Gey,et al.  Inferring probability of relevance using the method of logistic regression , 1994, SIGIR '94.

[21]  Gerard Salton,et al.  Improving retrieval performance by relevance feedback , 1997, J. Am. Soc. Inf. Sci..

[22]  Claudio Carpineto,et al.  Improving retrieval feedback with multiple term-ranking function combination , 2002, TOIS.

[23]  Michael D. Gordon Probabilistic and genetic algorithms in document retrieval , 1988, CACM.

[24]  J. J. Rocchio,et al.  Relevance feedback in information retrieval , 1971 .

[25]  Stephen E. Robertson,et al.  Okapi at TREC-3 , 1994, TREC.

[26]  Weiguo Fan,et al.  Effective information retrieval using genetic algorithms based matching functions adaptation , 2000, Proceedings of the 33rd Annual Hawaii International Conference on System Sciences.

[27]  Alistair Moffat,et al.  Exploring the similarity space , 1998, SIGF.

[28]  John R. Koza,et al.  Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.

[29]  Weiguo Fan,et al.  Effective profiling of consumer information retrieval needs: a unified framework and empirical comparison , 2005, Decis. Support Syst..

[30]  Weiguo Fan,et al.  Discovery of context-specific ranking functions for effective information retrieval using genetic programming , 2004, IEEE Transactions on Knowledge and Data Engineering.