Generalization Performance of Regularized Ranking With Multiscale Kernels

Regularized kernel methods for the ranking problem have attracted increasing attention in machine learning. Previous regularized ranking algorithms are usually based on a reproducing kernel Hilbert space with a single kernel. In this paper, we go beyond this framework by investigating the generalization performance of regularized ranking with multiscale kernels. We propose a novel ranking algorithm with multiscale kernels and prove its representer theorem. We establish an upper bound on the generalization error in terms of the complexity of the hypothesis spaces. The bound shows that the multiscale ranking algorithm can achieve satisfactory learning rates under mild conditions. Experiments demonstrate the effectiveness of the proposed method on drug discovery and recommendation tasks.
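As a rough illustration of the idea, the sketch below fits a pairwise least-squares regularized ranker whose kernel is a sum of Gaussian kernels at several bandwidths. It is not the algorithm of the paper: the least-squares pairwise loss, the single shared regularization parameter `lam`, the bandwidths in `scales`, the toy data, and all function names are assumptions chosen for illustration. The kernel expansion over the training points is likewise assumed here, in the spirit of a representer theorem, rather than taken from the paper.

```python
import numpy as np

def multiscale_gaussian_kernel(A, B, scales):
    """Sum of Gaussian kernels with bandwidths `scales` (assumed kernel family).

    A has shape (n, d) and B has shape (m, d); the result has shape (n, m).
    """
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sum(np.exp(-sq_dist / (2.0 * s ** 2)) for s in scales)

def fit_ranker(X, y, scales, lam):
    """Minimize (1/n^2) * sum_{i,j} ((y_i - y_j) - (f(x_i) - f(x_j)))^2
    + lam * alpha^T K alpha over f = sum_i alpha_i K(x_i, .).

    The pairwise sum equals (2/n) * d^T C d with d = y - K alpha and C the
    centering matrix, so a minimizer solves (C K + (lam * n / 2) I) alpha = C y.
    """
    n = len(y)
    K = multiscale_gaussian_kernel(X, X, scales)
    C = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.linalg.solve(C @ K + 0.5 * lam * n * np.eye(n), C @ y)

def predict(X_new, X_train, alpha, scales):
    """Ranking scores for new points; only their order matters."""
    return multiscale_gaussian_kernel(X_new, X_train, scales) @ alpha

def pairwise_accuracy(scores, y):
    """Fraction of pairs with distinct labels that the scores order correctly."""
    dy = y[:, None] - y[None, :]
    ds = scores[:, None] - scores[None, :]
    mask = dy != 0
    return np.mean(np.sign(dy[mask]) == np.sign(ds[mask]))

# Toy data (hypothetical): a non-monotone relevance function on [0, 1].
X = np.linspace(0.0, 1.0, 40).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0]) + X[:, 0]
scales = [0.1, 0.5, 2.0]  # assumed bandwidths, not values from the paper

alpha = fit_ranker(X, y, scales, lam=1e-3)
acc = pairwise_accuracy(predict(X, X, alpha, scales), y)
print(f"training pairwise accuracy: {acc:.3f}")
```

Because the loss depends only on score differences, the fitted function is determined up to an additive constant, which is why the centering matrix appears in the linear system. Mixing narrow and wide bandwidths lets one ranker capture both fine and coarse structure in the target, which is the usual motivation for multiscale kernels.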
