| NN1 | Parameter | NN2 | Parameter | NN3 | Parameter |
| --- | --- | --- | --- | --- | --- |
| Conv | Conv (64, 3, 3) | Conv | Conv (128, 3, 3) | Conv | Conv (64, 3, 3) |
| Conv | Conv (32, 3, 3) | Conv | Conv (32, 3, 3) | Conv | Conv (32, 3, 3) |
| Active | ReLU | Active | BN | Fc | Linear (64, 3), reward prediction |
| Transform | Flatten | Active | ReLU | Fc | Linear (64, 3), reward prediction |
| Fc | Linear (288, 256) | Transform | Flatten | LSTM | Hidden state 256, state value |
| Fc | Linear (256, 256) | Fc | Linear (288, 256) | Fc | Linear (256, 1), state value |
| Actor | Linear (256, 4), a | | | LSTM | Hidden state 256, action value |
| Critic | Linear (256, 1), r | | | Fc | Linear (256, 1), action value |
| Critic | Linear (256, 1), r | | | | |
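
The 288 input features of the `Linear (288, 256)` layers in NN1 and NN2 can be sanity-checked with a little shape arithmetic. The sketch below rests on assumptions the table does not state: that `Conv (C, 3, 3)` means C filters with a 3×3 kernel, that the convolutions use stride 1 and no padding, and that the input is a 7×7 grid. The helper names `conv_out` and `flattened_features` are hypothetical. Under those assumptions, the flattened output of the two-layer convolution stack comes to 32 × 3 × 3 = 288.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial output size of one convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size, channels):
    """Apply one 3x3 convolution per entry in `channels`, then flatten.

    Assumes stride 1 and no padding (not stated in the table).
    """
    size = input_size
    for _ in channels:
        size = conv_out(size)
    # Flatten: last layer's filter count times the remaining spatial area.
    return channels[-1] * size * size


# NN1 convolution stack from the table: 64 filters, then 32 filters.
# The 7x7 input size is an assumption chosen to check the arithmetic.
nn1_features = flattened_features(input_size=7, channels=[64, 32])
print(nn1_features)  # 288
```

NN2's stack (128 filters, then 32) gives the same count, since only the last layer's filter count and the remaining spatial size enter the flattened dimension.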