Cost-Sensitive AdaBoost Algorithm for Ordinal Regression Based on Extreme Learning Machine

In this paper, the well-known stagewise additive modeling using a multiclass exponential loss (SAMME) boosting algorithm is extended with a cost-sensitive approach to address problems in which the targets have a natural order. The proposed ensemble uses an extreme learning machine (ELM) as the base classifier, with a Gaussian kernel and an additional regularization parameter. The closed form of the derived weighted least squares problem is provided and is used to analytically estimate, at each boosting iteration, the parameters connecting the hidden layer to the output layer. In contrast to state-of-the-art boosting algorithms, in particular those using an ELM as the base classifier, the proposed technique does not require generating a new training dataset at each iteration. The weighted least squares formulation is presented as an unbiased alternative to existing ELM boosting techniques. Moreover, a cost model that weights the patterns according to the order of the targets enables the classifier to tackle ordinal regression problems. The proposed method is validated in an experimental study comparing it with existing ensemble methods and ELM techniques for ordinal regression, showing competitive results.
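
To make the procedure concrete, the following sketch shows one way the pieces could fit together: a SAMME-style boosting loop whose weight update is scaled by an ordinal cost, wrapped around a Gaussian-kernel base learner whose output weights are obtained in closed form from a weighted, regularized least squares problem. All function and parameter names, the placement of the regularizer, and the absolute-rank-difference cost model are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (assumed formulation, not the paper's exact algorithm).
import numpy as np


def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)


class WeightedKernelELM:
    """Kernel ridge regression on one-hot targets with per-pattern weights.

    Solves a weighted, regularized least squares problem in closed form,
        beta = (W K + I / C)^{-1} W T,
    playing the role of the analytic output-weight estimate of the ELM
    (the exact regularization used in the paper may differ).
    """

    def __init__(self, C=1.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y, sample_weight, n_classes):
        self.X_ = X
        T = np.eye(n_classes)[y]                       # one-hot encoded targets
        K = gaussian_kernel(X, X, self.gamma)          # kernel (hidden-layer) matrix
        W = np.diag(sample_weight)                     # boosting pattern weights
        n = X.shape[0]
        self.beta_ = np.linalg.solve(W @ K + np.eye(n) / self.C, W @ T)
        return self

    def predict(self, X):
        scores = gaussian_kernel(X, self.X_, self.gamma) @ self.beta_
        return np.argmax(scores, axis=1)


def cost_sensitive_samme(X, y, n_classes, n_rounds=10, C=1.0, gamma=1.0):
    """SAMME loop whose weight update is scaled by an assumed ordinal cost |y - y_hat|."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)
    learners, alphas = [], []
    for _ in range(n_rounds):
        clf = WeightedKernelELM(C, gamma).fit(X, y, w, n_classes)
        pred = clf.predict(X)
        miss = pred != y
        err = np.clip(np.sum(w * miss) / np.sum(w), 1e-10, 1.0 - 1e-10)
        alpha = np.log((1.0 - err) / err) + np.log(n_classes - 1)   # SAMME step size
        if alpha <= 0:
            break
        cost = np.abs(pred - y)                        # assumed ordinal cost model
        w *= np.exp(alpha * miss * cost)               # heavier penalty for distant ranks
        w /= w.sum()
        learners.append(clf)
        alphas.append(alpha)
    return learners, alphas


def ensemble_predict(learners, alphas, X, n_classes):
    """Weighted vote of the boosted base learners."""
    votes = np.zeros((X.shape[0], n_classes))
    for clf, a in zip(learners, alphas):
        votes[np.arange(X.shape[0]), clf.predict(X)] += a
    return np.argmax(votes, axis=1)
```

In this sketch the analytic solve replaces any iterative retraining of the base learner, which mirrors the claim that no new training dataset has to be generated at each boosting iteration: only the diagonal weight matrix W changes between rounds.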
