Dynamical Motor Control Learned with Deep Deterministic Policy Gradient

<div>Training and validation of reaching-movement generation. (a) Trajectories of random reaches after training. The start and target points were drawn from a disk-shaped workspace. Trajectories are color-coded by error magnitude, measured as the distance between the end state and the target state. (b) Cumulative reward versus the number of training episodes. (c) Center-out reaching trajectories generated by the trained model.</div>

Computational Intelligence and Neuroscience


Figure 4
