Emergence of Prediction by Reinforcement Learning Using a Recurrent Neural Network

<table>Object catching task and a recurrent neural network. An agent moves up or down and catches a moving object. The initial direction of motion and the velocity of the object are chosen randomly for every episode. The invisibility area is also chosen randomly within the range <math id="M28" xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi><mo>&gt;</mo><mn>3.0</mn></math>. The <math id="M29" xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi><mo>,</mo><mi>y</mi></math> coordinates of the object and the <i>y</i> coordinate of the agent are input to an Elman-type recurrent neural network. Each input signal represents local information, as shown in Figure <a href="../fig2/">2</a>.</table>
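
The Elman-type structure named in the caption can be illustrated with a minimal sketch in which the hidden layer's previous activation is fed back as context at each time step. Everything specific here is an assumption made for illustration and is not taken from the article: the class name `ElmanNetwork`, the layer sizes, the `tanh` activation, the weight initialization range, and the two outputs. The sketch also feeds the three raw coordinates directly, which is a simplification; the localized input coding shown in Figure 2 is not reproduced.

```python
import math
import random


class ElmanNetwork:
    """Hypothetical minimal Elman-type network (hidden state fed back as context)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Assumed initialization: small uniform random weights.
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = init(n_hidden, n_in)       # input -> hidden
        self.w_ctx = init(n_hidden, n_hidden)  # context (previous hidden) -> hidden
        self.w_out = init(n_out, n_hidden)     # hidden -> output
        self.n_hidden = n_hidden
        self.reset()

    def reset(self):
        """Clear the context units, e.g. at the start of an episode."""
        self.context = [0.0] * self.n_hidden

    def step(self, inputs):
        """Advance one time step and return the (linear) output values."""
        hidden = [
            math.tanh(
                sum(w * x for w, x in zip(in_row, inputs))
                + sum(w * c for w, c in zip(ctx_row, self.context))
            )
            for in_row, ctx_row in zip(self.w_in, self.w_ctx)
        ]
        self.context = hidden  # Elman feedback: hidden activation becomes next context
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]


# Usage: inputs are (object x, object y, agent y); the values are arbitrary examples.
net = ElmanNetwork(n_in=3, n_hidden=8, n_out=2)
first = net.step([1.0, 2.0, 3.0])
second = net.step([1.0, 2.0, 3.0])  # same input, different context
```

Because the context units carry the previous hidden activation, the same input can produce different outputs on consecutive steps. This is the property that would let such a network keep tracking the object while it is inside the invisibility area.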

Journal of Robotics

Figure 1: Emergence of Prediction by Reinforcement Learning Using a Recurrent Neural Network