Research Article

Deep Neural Networks with Multistate Activation Functions

Figure 4

Restricting running averages to somewhere near 0. The hollow points stand for running averages of the conventional SGD, while the solid points stand for running averages of mean-normalised SGD.