A Novel Activation Function in Convolutional Neural Network for Image Classification in Deep Learning

In deep learning, several design choices drive optimal results, and one of them is selecting the right activation function. An activation function should possess favourable statistical characteristics. In this paper, a novel deep learning activation function is proposed. The sigmoid activation function is generally used in the output layer for binary classification problems, and the recent Swish activation applies the sigmoid function in hidden layers. Motivated by this, a new activation function, relu(x) + x * sigmoid(x), is proposed to combine the benefits of ReLU and sigmoid in a Swish-like form. The proposed function exhibits the desired statistical characteristics of unboundedness, monotonicity, zero-centredness, and a non-vanishing gradient. The experimental outcomes are also significant.
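As a minimal sketch of the proposed activation (assuming a PyTorch implementation; the paper does not specify one, and the class name ReluSwish is an illustrative label, not taken from the paper), the function relu(x) + x * sigmoid(x) could be written as:

    import torch
    import torch.nn as nn

    class ReluSwish(nn.Module):
        # Illustrative module implementing relu(x) + x * sigmoid(x):
        # the ReLU term keeps positive inputs unbounded, while the
        # x * sigmoid(x) term contributes the smooth Swish component.
        def forward(self, x):
            return torch.relu(x) + x * torch.sigmoid(x)

    # Example usage on a small range of inputs
    x = torch.linspace(-3.0, 3.0, steps=7)
    print(ReluSwish()(x))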
