Projection-Based Ensemble Learning for Ordinal Regression

Ordinal regression is the classification of patterns into naturally ordered labels. This paper proposes an ensemble methodology specifically adapted to this type of problem. It decomposes the original problem into several classification tasks, each formulated under a different order hypothesis. Each model is trained to distinguish one given class (k) from all the remaining ones, with the remaining classes grouped into those ranked lower than k and those ranked higher than k. The approach can therefore be considered a reformulation of the well-known one-versus-all scheme. The base algorithm for the ensemble can be any threshold (or even probabilistic) method, such as the ones selected in this paper: kernel discriminant analysis, support vector machines and logistic regression (LR), all reformulated to deal with ordinal regression problems. The method is shown to be competitive with other state-of-the-art methodologies (both ordinal and nominal) according to six performance measures on a total of 15 ordinal datasets. Furthermore, an additional set of experiments studies the potential scalability and interpretability of the proposed method when LR is used as the base methodology for the ensemble.
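The relabelling behind each order hypothesis can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the -1/0/+1 encoding are assumptions made for illustration, and the sketch only builds the per-class target vectors. It does not train the base models or show how their outputs are combined, since the abstract does not specify that step.

```python
def order_hypothesis_labels(y, k):
    """Relabel ordinal targets for the model associated with class k.

    Assumed encoding (illustrative only):
      -1 -> rank lower than k
       0 -> class k itself
      +1 -> rank higher than k
    """
    # Boolean arithmetic yields -1, 0 or +1 for each rank.
    return [(rank > k) - (rank < k) for rank in y]


def build_decompositions(y, num_classes):
    """Return one relabelled target vector per class k = 1..num_classes."""
    return {k: order_hypothesis_labels(y, k) for k in range(1, num_classes + 1)}


# Toy ordinal targets with ranks 1..4 (hypothetical data).
y = [1, 2, 2, 3, 4]
tasks = build_decompositions(y, num_classes=4)
for k, labels in tasks.items():
    print(k, labels)
```

For the lowest and highest classes one of the two outer groups is empty, so those tasks reduce to a two-group problem; the intermediate classes yield three ordered groups.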
