Neural network for multi-class classification by boosting composite stumps

We put forward a new model for multi-class classification based on a neural-network structure. The model employs weighted linear regression for feature selection and a boosting algorithm for ensemble learning. Unlike most previous approaches, which build a collection of binary classifiers independently, the method constructs a single strong classifier for all classes at once by minimizing the total error in a forward stagewise manner. A novel weak-learner framework, the composite stump, is proposed to improve convergence speed and to share features across classes. With these optimization techniques, the classification problem is solved by a simple but effective classifier. Experiments show that the new method outperforms previous approaches on a number of data sets.

Highlights
- A novel structure is proposed to improve convergence speed and share features.
- An adaptive neural network model is presented for multi-class classification.
- Linear functions are employed as the activation functions in the model.
- A weighted linear regression with sparsity constraints is used for feature selection.
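The abstract gives no pseudocode, so the following is a minimal sketch of the forward stagewise idea it describes: one multi-class strong classifier grown by repeatedly fitting a "composite stump", here assumed to be a single (feature, threshold) split shared by all K classes, with per-class output values on each side. The function names (fit_composite_stump, boost), the softmax pseudo-residuals, the uniform sample weights, and the learning rate are illustrative assumptions, not the paper's exact formulation; in particular, the weighted linear regression with sparsity constraints used for feature selection is omitted.

```python
import numpy as np

def fit_composite_stump(X, R, w):
    """Fit one composite stump: a single (feature, threshold) split shared
    by all K classes, with a K-vector of output values on each side.
    Assumed form of the feature-sharing weak learner (see lead-in)."""
    d = X.shape[1]
    K = R.shape[1]
    best, best_err = None, np.inf
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:   # thresholds keeping both sides non-empty
            split = X[:, j] <= t
            err, leaves = 0.0, []
            for side in (split, ~split):
                ws = w[side][:, None]       # sample weights in this region
                v = (ws * R[side]).sum(0) / max(ws.sum(), 1e-12)  # per-class leaf value
                err += float((ws * (R[side] - v) ** 2).sum())
                leaves.append(v)
            if err < best_err:
                best_err, best = err, (j, t, leaves[0], leaves[1])
    return best

def boost(X, y, K, rounds=50, lr=0.1):
    """Forward stagewise fitting of a single multi-class classifier F(x) in R^K,
    rather than K independent binary boosters. Softmax pseudo-residuals stand
    in for the paper's total-error objective."""
    n = X.shape[0]
    Y = np.eye(K)[y]                        # one-hot targets
    F = np.zeros((n, K))
    ensemble = []
    for _ in range(rounds):
        P = np.exp(F - F.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)   # current class probabilities
        j, t, vl, vr = fit_composite_stump(X, Y - P, np.ones(n))
        F += lr * np.where((X[:, j] <= t)[:, None], vl, vr)
        ensemble.append((j, t, vl, vr))
    return ensemble                         # predict by argmax_k of summed stump outputs

# Toy usage on synthetic data (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int) + (X[:, 2] > 1).astype(int)  # labels in {0,1,2}
model = boost(X, y, K=3)
```

A production version would replace the exhaustive threshold scan with sorted cumulative sums and use the sample weights dictated by the chosen loss; the sketch only illustrates how one shared split can update all class scores at once, which is the feature-sharing property the abstract attributes to composite stumps.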
