Supervised Semantic Indexing for Ranking Documents
Ranking text documents given a query is one of the key tasks in information retrieval. Typical solutions include classical vector space models, which compare weighted word counts (TF-IDF) using cosine similarity and involve no machine learning at all, and Latent Semantic Indexing (LSI), which uses unsupervised learning to find a low-dimensional space of “latent concepts” via a reconstruction objective. The former assumes that words are independent and so cannot capture synonymy or polysemy, whilst the latter is still agnostic to the actual task of interest, namely ranking.
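To make the synonymy limitation concrete, the following is a minimal sketch of TF-IDF ranking with cosine similarity. It is an illustration rather than the paper's method: the function names (`tfidf_vectors`, `cosine`, `rank`), the `log(N/df) + 1` weighting, and the toy corpus are all assumptions introduced here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenised document.

    Assumption: idf = log(N / df) + 1, one of several common weightings.
    """
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    vecs = [{w: tf * idf[w] for w, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Return (document indices sorted by descending score, raw scores)."""
    vecs, idf = tfidf_vectors(docs)
    # Query words that never occur in the corpus carry no weight.
    q = {w: tf * idf[w] for w, tf in Counter(query).items() if w in idf}
    scores = [cosine(q, d) for d in vecs]
    order = sorted(range(len(docs)), key=lambda i: -scores[i])
    return order, scores


# Hypothetical toy corpus: documents 0 and 1 are about the same topic,
# but only document 0 shares a literal word with the query.
docs = [
    ["car", "engine", "repair"],
    ["automobile", "engine", "service"],
    ["banana", "bread", "recipe"],
]
order, scores = rank(["car"], docs)
# scores[1] is exactly 0.0: "automobile" and "car" are treated as unrelated.
```

Because each word is its own dimension, the document about automobiles receives the same zero score as the unrelated recipe. This is the word-independence assumption described above.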