Research Article
Dynamical Motor Control Learned with Deep Deterministic Policy Gradient
Figure 4
Training and validation of the reaching movement generation. (a) The trajectories of random reaches after training. The start and target points were drawn from a disk-shaped work space. The trajectories were color-coded by the scale of error that measured the distance between the end state and the target state. (b) The cumulative reward versus the number of episodes. (c) The center-out reaching trajectories generated by the trained model.
(a) |
(b) |
(c) |