Research Article
Deep Neural Networks with Multistate Activation Functions
Figure 4
Restricting running averages to somewhere near 0. The hollow points stand for running averages of the conventional SGD, while the solid points stand for running averages of mean-normalised SGD.