A Quantitative Comparison of Different MLP Activation Functions in Classification

Multilayer perceptrons (MLPs) have proven very successful in many applications, including classification. The activation function is the source of an MLP's power, and its careful selection has a large impact on network performance. This paper gives a quantitative comparison of the four most commonly used activation functions, including the Gaussian RBF network, over ten different real-world datasets. Results show that the sigmoid activation function substantially outperforms the other activation functions. Moreover, by using only the needed number of hidden units in the MLP, we improved its convergence time to be competitive with that of the RBF networks in most cases.
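The abstract does not name the four activation functions it compares, but a minimal sketch of typical candidates in such studies (sigmoid, hyperbolic tangent, and the Gaussian basis used by RBF networks) illustrates the key distinction between global sigmoidal units and local radial units; the function set below is an assumption for illustration, not the paper's exact list.

```python
import math

def sigmoid(x):
    # Global, monotone squashing unit mapping R -> (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Zero-centered sigmoidal unit mapping R -> (-1, 1).
    return math.tanh(x)

def gaussian_rbf(x, center=0.0, width=1.0):
    # Local unit: response peaks at the center and decays with
    # distance, unlike the global sigmoidal units above.
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# Responses at the origin highlight the difference in character:
# the sigmoid sits at its midpoint, tanh at zero, and the Gaussian
# at its maximum.
print(sigmoid(0.0), tanh(0.0), gaussian_rbf(0.0))
```

The locality of the Gaussian unit is one reason RBF networks and sigmoidal MLPs can differ markedly in training speed and generalization, which motivates the comparison reported here.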
