Kernelizing the Proportional Odds Model through the Empirical Kernel Mapping

The classification of patterns into naturally ordered labels is referred to as ordinal regression. This paper uses the kernel trick and the empirical feature space to reformulate the most widely used linear ordinal classification algorithm, the Proportional Odds Model (POM), so that it can produce nonlinear decision regions. On 8 ordinal datasets, the proposed method appears competitive with other state-of-the-art algorithms and significantly improves on the original POM algorithm. Its capability to handle nonlinear decision regions is demonstrated on a non-linearly separable toy dataset.
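As a rough illustration of the empirical kernel mapping mentioned above, the sketch below builds explicit feature vectors from the eigendecomposition of the training kernel matrix, so that a linear model such as the POM could in principle be fitted on them. The RBF kernel, the `gamma` value, the eigenvalue tolerance, and the function names are assumptions made for this example; they are not taken from the paper, and the POM fitting step itself is not shown.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_empirical_map(X_train, gamma=1.0, tol=1e-10):
    # Eigendecompose the symmetric training kernel matrix K = P diag(lam) P^T.
    K = rbf_kernel(X_train, X_train, gamma)
    lam, P = np.linalg.eigh(K)
    # Keep only clearly positive eigenvalues (hypothetical tolerance).
    keep = lam > tol * lam.max()
    # Projection matrix P_r diag(lam_r)^(-1/2), shape (m, r).
    return P[:, keep] / np.sqrt(lam[keep])

def transform(X_new, X_train, W, gamma=1.0):
    # Map each point x to diag(lam_r)^(-1/2) P_r^T [k(x, x_1), ..., k(x, x_m)]^T.
    return rbf_kernel(X_new, X_train, gamma) @ W

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))          # toy inputs, not the paper's data
W = fit_empirical_map(X, gamma=0.5)
Z = transform(X, X, W, gamma=0.5)     # explicit features for the training set
K = rbf_kernel(X, X, gamma=0.5)
```

Inner products of the mapped training points reproduce the kernel matrix (up to the discarded eigenvalues), which is why a linear classifier trained on `Z` behaves like a kernelized one in the input space.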
