Generalization Analysis for Ranking Using Integral Operator

The study on generalization performance of ranking algorithms is one of the fundamental issues in ranking learning theory. Although several generalization bounds have been proposed based on different measures, the convergence rates of the existing bounds are usually at most O ( 1 √ n ) , where n is the size of data set. In this paper, we derive novel generalization bounds for the regularized ranking in reproducing kernel Hilbert space via integral operator of kernel function. We prove that the rates of our bounds are much faster than O ( 1 √ n ) . Specifically, we first introduce a notion of local Rademacher complexity for ranking, called local ranking Rademacher complexity, which is used to measure the complexity of the space of loss functions of the ranking. Then, we use the local ranking Rademacher complexity to obtain a basic generalization bound. Finally, we establish the relationship between the local Rademacher complexity and the eigenvalues of integral operator, and further derive sharp generalization bounds of faster convergence rate.

[1]  Koby Crammer,et al.  Pranking with Ranking , 2001, NIPS.

[2]  Tao Qin,et al.  Query-level stability and generalization in learning to rank , 2008, ICML '08.

[3]  Tie-Yan Liu,et al.  Listwise approach to learning to rank: theory and algorithm , 2008, ICML '08.

[4]  Tie-Yan Liu,et al.  Generalization analysis of listwise learning-to-rank algorithms , 2009, ICML '09.

[5]  V. Koltchinskii,et al.  Empirical margin distributions and bounding the generalization error of combined classifiers , 2002, math/0405343.

[6]  Wojciech Rejchel,et al.  On Ranking and Generalization Bounds , 2012, J. Mach. Learn. Res..

[7]  Yong Liu,et al.  Eigenvalues Ratio for Kernel Selection of Kernel Methods , 2015, AAAI.

[8]  Cynthia Rudin,et al.  Margin-based Ranking and an Equivalence between AdaBoost and RankBoost , 2009, J. Mach. Learn. Res..

[9]  Marius Kloft,et al.  Learning Kernels Using Local Rademacher Complexity , 2013, NIPS.

[10]  Peter L. Bartlett,et al.  Rademacher and Gaussian Complexities: Risk Bounds and Structural Results , 2003, J. Mach. Learn. Res..

[11]  Ingo Steinwart,et al.  Optimal learning rates for least squares SVMs using Gaussian kernels , 2011, NIPS.

[12]  Gregory N. Hullender,et al.  Learning to rank using gradient descent , 2005, ICML.

[13]  Thomas Hofmann,et al.  Learning to Rank with Nonsmooth Cost Functions , 2006, NIPS.

[14]  Gábor Lugosi,et al.  Ranking and Scoring Using Empirical Risk Minimization , 2005, COLT.

[15]  Shivani Agarwal,et al.  Generalization Bounds for Ranking Algorithms via Algorithmic Stability , 2009, J. Mach. Learn. Res..

[16]  Thore Graepel,et al.  Large Margin Rank Boundaries for Ordinal Regression , 2000 .

[17]  Cynthia Rudin,et al.  The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List , 2009, J. Mach. Learn. Res..

[18]  Yoram Singer,et al.  An Efficient Boosting Algorithm for Combining Preferences by , 2013 .

[19]  V. Koltchinskii,et al.  Oracle inequalities in empirical risk minimization and sparse recovery problems , 2011 .

[20]  D. Pollard,et al.  Simulation and the Asymptotics of Optimization Estimators , 1989 .

[21]  V. Koltchinskii Local Rademacher complexities and oracle inequalities in risk minimization , 2006, 0708.0083.

[22]  Tong Zhang,et al.  Statistical Analysis of Bayes Optimal Subset Ranking , 2008, IEEE Transactions on Information Theory.

[23]  Tie-Yan Liu,et al.  Two-Layer Generalization Analysis for Ranking Using Rademacher Average , 2010, NIPS.

[24]  Bernhard Schölkopf,et al.  Generalization Performance of Regularization Networks and Support Vector Machines via Entropy Numbers of Compact Operators , 1998 .

[25]  N. Aronszajn Theory of Reproducing Kernels. , 1950 .

[26]  G. Lugosi,et al.  Ranking and empirical minimization of U-statistics , 2006, math/0603123.

[27]  Don R. Hush,et al.  Optimal Rates for Regularized Least Squares Regression , 2009, COLT.

[28]  Tie-Yan Liu,et al.  Learning to rank: from pairwise approach to listwise approach , 2007, ICML '07.

[29]  Shivani Agarwal,et al.  Stability and Generalization of Bipartite Ranking Algorithms , 2005, COLT.

[30]  Mikhail Belkin,et al.  Data spectroscopy: learning mixture models using eigenspaces of convolution operators , 2008, ICML '08.

[31]  Mehryar Mohri,et al.  Magnitude-preserving ranking algorithms , 2007, ICML '07.

[32]  P. Bartlett,et al.  Local Rademacher complexities , 2005, math/0508275.

[33]  André Elisseeff,et al.  Stability and Generalization , 2002, J. Mach. Learn. Res..