Soft-Clipping Swish: A Novel Activation Function for Deep Learning

This study aims to improve network performance by developing a novel activation function. Over time, many activation functions have been proposed to address the shortcomings of earlier ones; more than 50 have appeared in the literature, including popular choices such as the sigmoid, the Rectified Linear Unit (ReLU), Swish, and Mish. The idea behind our proposal is simple and builds on the widely used Swish function, a composite function that combines the sigmoid and ReLU functions. Starting from Swish, we discard the negative region in the way ReLU does, but differ from ReLU by keeping the nonlinear curve of Swish in the positive region. The inspiration comes from an existing function called Soft Clipping. We evaluated the proposal on several Computer Vision classification datasets, namely MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100, using two popular architectures, LeNet-5 and ResNet20 version 1, and the results show its high potential.
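For concreteness, the snippet below is a minimal NumPy sketch of the idea outlined above, under the assumption that the proposed activation keeps the Swish curve, x * sigmoid(x), on the positive side and returns zero on the negative side, as ReLU does. The exact definition (including any soft-clipping parameters) is given later in the paper, so the function name and formula here are illustrative only.

```python
import numpy as np

def swish(x):
    # Standard Swish: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def soft_clipping_swish(x):
    # Sketch of the proposed activation (assumed form):
    # zero for negative inputs, as in ReLU,
    # and the Swish curve for positive inputs.
    return np.where(x > 0.0, swish(x), 0.0)

# Quick check on a few sample inputs:
# negative inputs map to 0, positive inputs follow Swish.
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(soft_clipping_swish(x))
```

In a framework such as Keras or PyTorch, this sketch would typically be wrapped as a custom activation layer so it can be dropped into LeNet-5 or ResNet20 in place of ReLU.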