Research Article

Gradient-Sensitive Optimization for Convolutional Neural Networks

Table 3

Parameter values reached by each algorithm over the iteration process in the Styblinski–Tang function test.

| Iterations | AdaGrad | Adam | diffGrad | RMSprop | GS-Adam | GS-RMSprop |
|---|---|---|---|---|---|---|
| 0 | [0, 0] | [0, 0] | [0, 0] | [0, 0] | [0, 0] | [0, 0] |
| 100 | [−1.2098, 1.2098] | [−0.896, 0.896] | [−0.5919, 0.5919] | [−0.5941, 0.5941] | [−2.6916, 2.6916] | [−2.7698, 2.7698] |
| 200 | [−1.7251, 1.7251] | [−1.3934, 1.3934] | [−0.9353, 0.9353] | [−1.114, 1.114] | [−2.8835, 2.8835] | [−2.9034, 2.9034] |
| 300 | [−2.0724, 2.0724] | [−1.7744, 1.7744] | [−1.1952, 1.1952] | [−1.6207, 1.6207] | [−2.8977, 2.8977] | [−2.9035, 2.9035] |
| 400 | [−2.3214, 2.3214] | [−2.092, 2.092] | [−1.4119, 1.4119] | [−2.116, 2.116] | [−2.9011, 2.9011] | [−2.9035, 2.9035] |
| 500 | [−2.5013, 2.5013] | [−2.3606, 2.3606] | [−1.6018, 1.6018] | [−2.5901, 2.5901] | [−2.9023, 2.9023] | [−2.9035, 2.9035] |
| 600 | [−2.6297, 2.6297] | [−2.5784, 2.5784] | [−1.773, 1.773] | [−2.9033, 2.9033] | [−2.9029, 2.9029] | [−2.9035, 2.9035] |
| 700 | [−2.7197, 2.7197] | [−2.7372, 2.7372] | [−1.9306, 1.9306] | [−2.9035, 2.9035] | [−2.9031, 2.9031] | [−2.9035, 2.9035] |
| 800 | [−2.7815, 2.7815] | [−2.8339, 2.8339] | [−2.0769, 2.0769] | [−2.9035, 2.9035] | [−2.9033, 2.9033] | [−2.9035, 2.9035] |
| 900 | [−2.8233, 2.8233] | [−2.8805, 2.8805] | [−2.2129, 2.2129] | [−2.9035, 2.9035] | [−2.9034, 2.9034] | [−2.9035, 2.9035] |
| 1000 | [−2.8511, 2.8511] | [−2.8977, 2.8977] | [−2.339, 2.339] | [−2.9035, 2.9035] | [−2.9034, 2.9034] | [−2.9035, 2.9035] |
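For context, the 2-D Styblinski–Tang benchmark is f(x) = ½ Σᵢ (xᵢ⁴ − 16xᵢ² + 5xᵢ), and its minimizer has coordinates of magnitude ≈ 2.903534, which matches the limiting values in the table. The following is a minimal sketch of this test, not the paper's implementation: the learning rate, step count, and the plain Adam update used here are assumptions, so the exact trajectory will differ from the tabulated runs.

```python
import math

def styblinski_tang(x):
    """f(x) = 0.5 * sum(x_i^4 - 16*x_i^2 + 5*x_i); minimum at x_i ~ -2.903534."""
    return 0.5 * sum(xi**4 - 16 * xi**2 + 5 * xi for xi in x)

def st_grad(x):
    """Analytic gradient of the Styblinski-Tang function."""
    return [0.5 * (4 * xi**3 - 32 * xi + 5) for xi in x]

def adam(x0, lr=0.1, steps=1000, beta1=0.9, beta2=0.999, eps=1e-8):
    """Standard Adam with bias correction; hyperparameters here are assumed, not the paper's."""
    x = list(x0)
    m = [0.0] * len(x)  # first-moment estimate
    v = [0.0] * len(x)  # second-moment estimate
    for t in range(1, steps + 1):
        g = st_grad(x)
        for i in range(len(x)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)  # bias-corrected moments
            v_hat = v[i] / (1 - beta2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

x_final = adam([0.0, 0.0])
print(x_final, styblinski_tang(x_final))
```

Starting from [0, 0] as in the table's first row, the iterate settles at coordinates of magnitude ≈ 2.9035, the same limiting magnitude the table reports for the converged methods.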