Research Article

Reinforcement Learning Guided by Double Replay Memory

Figure 2

Absolute value of the TD-error (a) and the sample weight (b) in CartPole.