Linear feature-based models for information retrieval

The information retrieval community has recently proposed a number of linear, feature-based models. Although each model is presented differently, they all share a common underlying framework. In this paper, we examine the theoretical issues surrounding this framework, including a novel look at the parameter space. We then detail supervised training algorithms that directly maximize the evaluation metric under consideration, such as mean average precision. Our results show that training models in this way can yield significantly better test-set performance than training methods that do not directly maximize the metric. Finally, we show that, given an appropriate choice of features, linear feature-based models can consistently and significantly outperform current state-of-the-art retrieval models.
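To make the idea of directly maximizing the evaluation metric concrete, the following is a minimal sketch, not the training algorithm detailed in the paper. It assumes each document is scored by a weighted sum of its features and that the weights are tuned by a simple coordinate-wise grid search that keeps a change only when mean average precision improves. The function names, the candidate grid, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

# One query = (feature vector per document, binary relevance label per document).
Query = Tuple[List[List[float]], List[int]]


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear scoring function: weighted sum of a document's features."""
    return sum(w * f for w, f in zip(weights, features))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision of the ranking induced by sorting on descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, relevant) in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / hits if hits else 0.0


def mean_average_precision(weights: Sequence[float], queries: List[Query]) -> float:
    """Mean of the per-query average precision under the given weights."""
    aps = [
        average_precision([score(weights, doc) for doc in docs], labels)
        for docs, labels in queries
    ]
    return sum(aps) / len(aps)


def coordinate_search(
    queries: List[Query],
    n_features: int,
    grid: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0),  # assumed candidate values
    sweeps: int = 3,
) -> List[float]:
    """Tune one weight at a time, accepting a value only if training MAP strictly improves."""
    weights = [1.0 / n_features] * n_features
    best = mean_average_precision(weights, queries)
    for _ in range(sweeps):
        for i in range(n_features):
            for candidate in grid:
                trial = list(weights)
                trial[i] = candidate
                value = mean_average_precision(trial, queries)
                if value > best:
                    weights, best = trial, value
    return weights


# Hypothetical toy training set: two queries, three documents each, two features.
toy_queries: List[Query] = [
    ([[0.9, 0.1], [0.2, 0.8], [0.5, 0.4]], [0, 1, 0]),
    ([[0.7, 0.3], [0.1, 0.9], [0.8, 0.6]], [0, 1, 1]),
]
learned = coordinate_search(toy_queries, n_features=2)
```

Because the metric depends on the weights only through the ranking they induce, it is piecewise constant in the weights and offers no useful gradient, which is one reason a search over candidate values is used in this sketch rather than gradient ascent.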
