Research Article

Emergence of Prediction by Reinforcement Learning Using a Recurrent Neural Network

Table 1

The agent’s ideal and actual performance after learning for three cases of invisibility area.

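| Measure | Invisibility area: Random | Invisibility area: Nothing | Invisibility area: Maximum | Ideal |
| --- | --- | --- | --- | --- |
| Average reward | 0.685 | 0.685 | 0.681 | 0.742 |
| Percentage with which the agent gets the reward (%) | 99.0 | 98.4 | 99.9 | 100 |
| Relative distance between the agent and object when the agent chooses catch action | 0.270 | 0.260 | 0.296 | 0.144 |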