Learning to Rank in Vector Spaces and Social Networks

We survey machine learning techniques for learning ranking functions over entities represented as feature vectors and over nodes in a social network. In the feature-vector scenario, an entity, e.g., a document x, is mapped to a feature vector ψ(x) ∈ ℝ^d in a d-dimensional space, and we search for a weight vector β ∈ ℝ^d. The ranking is then based on the values of β · ψ(x). This case corresponds to information retrieval in the "vector space" model. Training data consists of a partial order of preference among entities. We study probabilistic Bayesian and maximum-margin approaches to solving this problem, including recent efficient near-linear-time approximate algorithms. In the graph node-ranking scenario, we briefly review PageRank, generalize it to arbitrary Markov conductance matrices, and consider the problem of learning conductance parameters from partial orders between nodes. In another class of formulations, the graph does not establish PageRank or prestige-flow relationships between nodes, but instead encourages a certain smoothness between the scores (ranks) of neighboring nodes. Some of these techniques have been used by Web search companies with very large query logs. We review some of the issues that arise when applying the theory to practical systems. Finally, we review connections between the stability of a score/rank-learning algorithm and its power to generalize to unforeseen test data.
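As a rough illustration of the feature-vector scenario, the sketch below learns a weight vector β from pairwise preferences by subgradient descent on a regularized hinge loss, then ranks entities by β · ψ(x). It is a generic maximum-margin sketch, not an algorithm taken from the survey: the function name `train_pairwise_ranker`, the toy feature vectors, and the hyperparameters `reg`, `lr`, and `epochs` are all assumptions made for the example.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def train_pairwise_ranker(features, preferences, reg=0.01, lr=0.1, epochs=200):
    """Learn beta so that beta . psi(u) exceeds beta . psi(v) by a margin
    for every pair (u, v) where u is preferred to v.

    Minimizes (reg / 2) * ||beta||^2 + mean hinge loss over the pairs.
    All hyperparameter defaults are illustrative assumptions.
    """
    d = len(features[0])
    beta = [0.0] * d
    for _ in range(epochs):
        # Gradient of the regularizer.
        grad = [reg * b for b in beta]
        for preferred, other in preferences:
            diff = [p - o for p, o in zip(features[preferred], features[other])]
            # The hinge term is active only when the margin is below 1.
            if dot(beta, diff) < 1.0:
                for j in range(d):
                    grad[j] -= diff[j] / len(preferences)
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Hypothetical feature vectors psi(x) for three entities.
features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
# Partial order used for training: entity 0 over 1, entity 1 over 2.
preferences = [(0, 1), (1, 2)]

beta = train_pairwise_ranker(features, preferences)
scores = [dot(beta, x) for x in features]
# Rank entities by decreasing score beta . psi(x).
ranking = sorted(range(len(features)), key=lambda i: -scores[i])
```

Each training pair contributes a constraint on the difference ψ(u) − ψ(v), so the partial order never has to be completed into a total order before training.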
