Research Article
Mathematical Analysis and Performance Evaluation of the GELU Activation Function in Deep Learning
Table 2
Test loss and test accuracy for selected activation functions on CIFAR-100 and STL-10 datasets.
| Dataset | Activation | Test loss | Test accuracy (%) |
| --- | --- | --- | --- |
| CIFAR-100 | ELU | 1.5609 | 57.26 |
| | Hardswish | **1.3122** | *64.12* |
| | LeakyReLU | 1.4248 | 61.71 |
| | ReLU | 1.4223 | 61.84 |
| | ReLU6 | 1.4185 | 61.58 |
| | RReLU | 1.4509 | 59.81 |
| | SELU | 1.8315 | 51.09 |
| | GELU | *1.3351* | **64.71** |
| STL-10 | ELU | 1.5533 | 41.78 |
| | Hardswish | 1.2457 | 54.40 |
| | LeakyReLU | **1.1650** | *56.26* |
| | ReLU | 1.2105 | 54.86 |
| | ReLU6 | 1.5044 | 47.01 |
| | RReLU | 1.2814 | 51.25 |
| | SELU | 1.5221 | 41.18 |
| | GELU | *1.1853* | **58.48** |

Bold indicates the best performance; italic indicates the second-best.
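For context when reading Table 2, a sketch of the activation under evaluation: the standard GELU weights each input $x$ by the standard Gaussian CDF $\Phi(x)$, and most frameworks also expose the widely used tanh approximation shown below. This is the common definition; it is assumed here that the paper's experiments use this standard form.

\[
\operatorname{GELU}(x) = x\,\Phi(x)
= \frac{x}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right]
\approx \frac{x}{2}\left[1 + \tanh\!\left(\sqrt{\tfrac{2}{\pi}}\,\bigl(x + 0.044715\,x^{3}\bigr)\right)\right].
\]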