Statistical Analysis of Bayes Optimal Subset Ranking

The ranking problem has become increasingly important in modern applications of statistical methods in automated decision-making systems. In particular, we consider a formulation of the statistical ranking problem which we call subset ranking, and focus on the discounted cumulated gain (DCG) criterion, which measures the quality of items near the top of the rank-list. As with error minimization for binary classification, direct optimization of natural ranking criteria such as DCG leads to nonconvex optimization problems that can be NP-hard. A computationally more tractable approach is therefore needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that we focus on the estimation quality in the top portion of the rank-list. We further investigate the asymptotic statistical behavior of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.
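To make the DCG criterion concrete, the following is a minimal sketch rather than the paper's own formulation. It assumes the common logarithmic discount 1/log2(1 + i) at rank position i and truncation at the top k positions; the names `dcg_at_k` and `rank_by_score` and the toy grades and scores are hypothetical and chosen only for illustration. It shows how a rank-list induced by regression-style scores is evaluated, with most of the weight falling on the top positions.

```python
import math

def dcg_at_k(ranked_grades, k):
    """DCG of a rank-list truncated at position k.

    Assumption: discount 1 / log2(1 + i) for 1-based position i,
    a common choice; other discount sequences are possible.
    """
    return sum(
        grade / math.log2(1 + i)
        for i, grade in enumerate(ranked_grades[:k], start=1)
    )

def rank_by_score(scores, grades):
    """Order the items' grades by decreasing predicted score."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return [grades[j] for j in order]

# Toy subset of four items: true relevance grades and hypothetical
# scores produced by some regression model.
grades = [3, 0, 2, 1]
scores = [0.2, 0.9, 0.8, 0.1]

ranked = rank_by_score(scores, grades)   # grades in predicted order
ideal = sorted(grades, reverse=True)     # best achievable order

print(dcg_at_k(ranked, k=2), dcg_at_k(ideal, k=2))
```

In this toy case the model places the most relevant item third, so its DCG at k = 2 falls well below that of the ideal ordering, even though the ordering of the remaining items is reasonable. This sensitivity to the top of the list is what motivates estimation methods that concentrate on the top portion of the rank-list.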
