Gish: a novel activation function for image classification
Citation

Kaytan, M., Aydilek, İ.B., Yeroğlu, C. (2023). Gish: a novel activation function for image classification. Neural Computing and Applications, 35(34), 24259–24281. https://doi.org/10.1007/s00521-023-09035-5

Abstract
In Convolutional Neural Networks (CNNs), the selection of an appropriate activation function is of critical importance. The Rectified Linear Unit (ReLU) is widely used in many CNN models, but recent studies indicate that non-monotonic activation functions are gradually becoming the new standard for improving CNN performance. Non-monotonic activation functions such as Swish, Mish, Logish, and Smish have produced successful results in various deep learning models, yet only a few of them are widely adopted in practice. Inspired by these, this study proposes a new activation function named Gish, whose mathematical model is $y = x \cdot \ln\left(2 - e^{-e^{x}}\right)$ and whose favorable properties allow it to outperform other activation functions. The factor $x$ contributes a strong regulating effect on negative outputs, while the logarithm reduces the numerical range of the expression $\left(2 - e^{-e^{x}}\right)$. To evaluate the performance of Gish, experiments were conducted on different network models and datasets. Gish achieved 98.7% accuracy with the EfficientNetB4 model on the MNIST dataset, 86.5% with the EfficientNetB5 model on the CIFAR-10 dataset, and 90.8% with the EfficientNetB6 model on the SVHN dataset. These results exceed those of Swish, Mish, Logish, and Smish, confirming the effectiveness of Gish.
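As a concrete illustration of the formula above, the following is a minimal PyTorch sketch of Gish as a drop-in activation module. The class name `Gish` and the PyTorch framing are our assumptions for illustration; the authors' reference implementation may differ.

```python
import torch
import torch.nn as nn


class Gish(nn.Module):
    """Sketch of the Gish activation from the abstract: y = x * ln(2 - exp(-exp(x)))."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # exp(-exp(x)) lies in (0, 1), so the log argument stays in (1, 2)
        # and ln(.) stays in (0, ln 2), reflecting the range-reducing role
        # the abstract attributes to the logarithm.
        return x * torch.log(2.0 - torch.exp(-torch.exp(x)))


# Usage example: apply Gish to a small tensor.
if __name__ == "__main__":
    act = Gish()
    t = torch.linspace(-3.0, 3.0, steps=7)
    print(act(t))
```

Note that the expression is numerically well behaved under IEEE floating point: for large positive x, exp(x) overflows to infinity, exp(-inf) evaluates to 0, and the output degrades gracefully to x · ln 2.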