Pairwise Costs in Multiclass Perceptrons

A novel loss function is proposed for training a network of K single-layer perceptrons (KSLPs), into which a pairwise misclassification cost matrix can be incorporated directly. The complexity of the network remains the same, and computing the gradient of the loss function requires no additional calculations. Minimizing the loss requires fewer training epochs. The efficacy of cost-sensitive methods depends on the cost matrix, the overlap of the pattern classes, and the sample sizes. Experiments with real-world pattern recognition (PR) tasks show that the novel loss function usually outperforms three benchmark methods.
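To make concrete the idea of a pairwise cost matrix entering the loss directly, the following is a minimal sketch and not the loss function proposed in the paper, whose exact form the abstract does not give. It assumes K linear units with a softmax output and uses a generic expected-cost loss, L = sum_j C[y][j] * p_j, where C[y][j] is the cost of deciding class j when the true class is y. All function names, the softmax output, and the example cost matrix are illustrative assumptions.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def expected_cost_and_grad(W, x, y, C):
    """Expected misclassification cost for one sample and its gradient.

    W: list of K weight vectors, each of length len(x) + 1 (last entry is the bias).
    x: feature vector; y: index of the true class.
    C: K x K cost matrix (assumed layout: C[true][decided]).
    """
    xb = list(x) + [1.0]  # append a constant input for the bias
    z = [sum(w_i * x_i for w_i, x_i in zip(w, xb)) for w in W]
    p = softmax(z)
    loss = sum(C[y][j] * p[j] for j in range(len(W)))
    # For a softmax output: dL/dz_k = p_k * (C[y][k] - L)
    grad = [[p[k] * (C[y][k] - loss) * x_i for x_i in xb]
            for k in range(len(W))]
    return loss, grad

def sgd_step(W, x, y, C, lr=0.1):
    """One stochastic gradient step on a single sample; returns new weights."""
    _, grad = expected_cost_and_grad(W, x, y, C)
    return [[w - lr * g for w, g in zip(W[k], grad[k])]
            for k in range(len(W))]

if __name__ == "__main__":
    # Hypothetical 3-class cost matrix: confusing class 0 with class 2 is expensive.
    C = [[0.0, 1.0, 5.0],
         [1.0, 0.0, 1.0],
         [2.0, 1.0, 0.0]]
    W = [[0.0, 0.0, 0.0] for _ in range(3)]  # 2 features + bias per unit
    x, y = [1.0, 2.0], 0
    before, _ = expected_cost_and_grad(W, x, y, C)
    after, _ = expected_cost_and_grad(sgd_step(W, x, y, C), x, y, C)
    print(before, after)
```

In this sketch the cost matrix only rescales the per-class error terms, so the network keeps the same K weight vectors and the gradient has the same shape as in the cost-insensitive case. Whether this matches the cost of the paper's own gradient computation cannot be confirmed from the abstract alone.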
