Emergence of Prediction by Reinforcement Learning Using a Recurrent Neural Network

<table>Learning curve. Change in the average reward according to the number of episodes is shown. Broken line shows the ideal average reward under the assumption that the agent always catches the object at the optimal timing and place.</table>

Journal of Robotics

fig3

Figure 3

Figure 3: Emergence of Prediction by Reinforcement Learning Using a Recurrent Neural Network