Research Article
Mathematical Analysis and Performance Evaluation of the GELU Activation Function in Deep Learning
Table 1. Test loss and test accuracy for different activation functions on the CIFAR-10 dataset.
| Activations | Test loss | Test accuracy (%) |
| --- | --- | --- |
| ELU | 0.4232 | 86.22 |
| Hardshrink | 1.1266 | 60.81 |
| Hardsigmoid | 1.4296 | 54.00 |
| Hardtanh | 0.5573 | 82.01 |
| Hardswish | *0.3921* | *88.77* |
| LeakyReLU | 0.4036 | 87.93 |
| LogSigmoid | 0.5755 | 81.42 |
| PReLU | 0.5552 | 86.33 |
| ReLU | 0.4478 | 87.19 |
| ReLU6 | 0.4145 | 88.70 |
| RReLU | 0.4308 | 85.91 |
| SELU | 0.4983 | 83.37 |
| CELU | 0.4260 | 86.21 |
| Sigmoid | 3.2102 | 33.90 |
| Softplus | 0.5762 | 80.82 |
| Softshrink | 0.5626 | 81.93 |
| Softsign | 0.6819 | 78.33 |
| Tanh | 0.5318 | 82.91 |
| Tanhshrink | 0.5776 | 80.78 |
| GELU | **0.3685** | **89.52** |
Bold indicates the best performance; italic indicates the second-best.
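As a point of reference for reproducing this comparison, the sketch below shows one way to swap the tabulated activations into an otherwise identical network using PyTorch, whose `torch.nn` module names match the labels in Table 1. The small CNN architecture and the dummy forward pass are illustrative assumptions, not the experimental setup used to produce the reported results.

```python
# Minimal sketch: comparing activation functions in otherwise identical
# CNNs sized for CIFAR-10 inputs. The architecture is an illustrative
# assumption, not the network behind Table 1.
import torch
import torch.nn as nn

# A subset of the activations in Table 1; all are standard torch.nn modules.
ACTIVATIONS = {
    "ELU": nn.ELU,
    "Hardswish": nn.Hardswish,
    "LeakyReLU": nn.LeakyReLU,
    "ReLU": nn.ReLU,
    "GELU": nn.GELU,  # GELU(x) = x * Phi(x), Phi the standard normal CDF
}

def make_cnn(act_cls):
    """Small CNN for 32x32 RGB inputs; only the activation differs per run."""
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, padding=1), act_cls(),
        nn.MaxPool2d(2),                      # 32x32 -> 16x16
        nn.Conv2d(32, 64, kernel_size=3, padding=1), act_cls(),
        nn.MaxPool2d(2),                      # 16x16 -> 8x8
        nn.Flatten(),
        nn.Linear(64 * 8 * 8, 10),            # 10 CIFAR-10 classes
    )

# Shape check with a dummy batch; a real run would train and evaluate each
# model identically on CIFAR-10 and record test loss/accuracy as in Table 1.
for name, act_cls in ACTIVATIONS.items():
    logits = make_cnn(act_cls)(torch.randn(8, 3, 32, 32))
    print(f"{name}: logits shape {tuple(logits.shape)}")
```

Holding the architecture, optimizer, schedule, and seed fixed across runs isolates the activation function as the only varying factor, which is what makes a side-by-side comparison like Table 1 meaningful.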