A comparison of activation functions in multilayer neural network for predicting the production and consumption of electricity power

Predicting electric power is an important task that helps power utilities improve their systems' performance in terms of effectiveness, productivity, management and control. Several studies have addressed this task using three main families of models: engineering, statistical and artificial intelligence. In experiments with artificial intelligence models, the multilayer neural network has proven successful on many evaluation datasets. However, the performance of this model depends mainly on the type of activation function. Therefore, this paper presents an experimental study of the performance of the multilayer neural network with respect to different activation functions and different numbers of hidden layers. The experiments compare eleven activation functions on four benchmark electricity datasets. The activation functions under examination are sigmoid, hyperbolic tangent, SoftSign, SoftPlus, ReLU, Leaky ReLU, Gaussian, ELU, SELU, Swish and Adjust-Swish. Experimental results show that the ReLU and Leaky ReLU activation functions outperform their counterparts on all four datasets.
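
For orientation, the eleven activation functions named above can be sketched in plain Python using their commonly cited definitions. This is an illustrative sketch, not the authors' implementation: the function names, the Leaky ReLU slope (0.01), the ELU scale (1.0), and the reading of "Adjust-Swish" as the E-swish form β·x·sigmoid(x) with β = 1.5 are all assumptions, since the abstract does not state them.

```python
import math

def sigmoid(x):
    # Logistic function, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh(x):
    return math.tanh(x)

def softsign(x):
    return x / (1.0 + abs(x))

def softplus(x):
    # log(1 + e^x), rearranged so exp never receives a large positive argument.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # alpha is an assumed slope for negative inputs.
    return x if x > 0 else alpha * x

def gaussian(x):
    return math.exp(-x * x)

def elu(x, alpha=1.0):
    # alpha is an assumed scale for the negative branch.
    return x if x > 0 else alpha * math.expm1(x)

# Constants from the self-normalizing networks formulation of SELU.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * math.expm1(x))

def swish(x):
    return x * sigmoid(x)

def e_swish(x, beta=1.5):
    # Assumed form of "Adjust-Swish": Swish scaled by a constant beta.
    return beta * x * sigmoid(x)

# Hypothetical registry, convenient for looping over all eleven functions.
ACTIVATIONS = {
    "sigmoid": sigmoid, "tanh": tanh, "softsign": softsign,
    "softplus": softplus, "relu": relu, "leaky_relu": leaky_relu,
    "gaussian": gaussian, "elu": elu, "selu": selu,
    "swish": swish, "e_swish": e_swish,
}

if __name__ == "__main__":
    for name, fn in ACTIVATIONS.items():
        print(name, [round(fn(v), 4) for v in (-2.0, 0.0, 2.0)])
```

Running the script prints each function evaluated at -2, 0 and 2, which makes the differences in negative-input behavior (zero for ReLU, a small slope for Leaky ReLU, saturation for ELU and SELU) easy to see side by side.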
