Polynomial Semantic Indexing

We present a class of nonlinear (polynomial) models that are discriminatively trained to map directly from the word content of a query-document or document-document pair to a ranking score. Polynomial models on word features are computationally challenging, so we propose a low-rank (but diagonal-preserving) representation that keeps memory and computation requirements feasible. We provide an empirical study on retrieval tasks based on Wikipedia documents, where we obtain state-of-the-art performance while providing realistically scalable methods.
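To make the low-rank, diagonal-preserving idea concrete, the following is a minimal sketch of what a degree-2 instance might look like. It assumes a score of the form f(q, d) = qᵀ(UᵀV + I)d, where q and d are D-dimensional word vectors and U, V are N × D matrices with N much smaller than D. The abstract does not state this form; the names `U`, `V`, `low_rank_score`, and `full_score` are illustrative assumptions, not the authors' implementation. The point is that the score can be computed from two N-dimensional projections plus a plain dot product, without ever building the D × D weight matrix.

```python
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def project(M, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [dot(row, x) for row in M]

def low_rank_score(q, d, U, V):
    """Assumed degree-2 score q^T (U^T V + I) d, computed in O(N * D).

    (U q) . (V d) is the low-rank term; q . d is the identity
    (diagonal) term that the low-rank factors alone would lose.
    """
    return dot(project(U, q), project(V, d)) + dot(q, d)

def full_score(q, d, U, V):
    """Same score via the explicit D x D matrix W = U^T V + I (reference only)."""
    D, N = len(q), len(U)
    W = [[sum(U[k][i] * V[k][j] for k in range(N)) + (1.0 if i == j else 0.0)
          for j in range(D)]
         for i in range(D)]
    return dot(q, project(W, d))

# Toy sizes (hypothetical): vocabulary D = 20, rank N = 3.
rng = random.Random(0)
D, N = 20, 3
U = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]
V = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]
q = [rng.random() for _ in range(D)]
d = [rng.random() for _ in range(D)]
```

Under these assumptions the explicit matrix needs D² parameters, while the factored form stores only 2·N·D, which is what makes realistic vocabulary sizes tractable. Higher-degree polynomial terms are not covered by this sketch.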
