AIBPO: Combine the Intrinsic Reward and Auxiliary Task for 3D Strategy Game

<div>The IIML model: the agent acts according to its policy and receives external reward information fed back by the external environment; meanwhile, the internal environment generates internal reward information regardless of whether the agent receives an external reward in the current state.</div>
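The reward flow described in the caption can be illustrated with a minimal sketch. The caption does not say how the internal reward is computed or how the two signals are combined, so everything specific here is an assumption for illustration only: the count-based novelty bonus, the additive combination, the weight `beta`, and the names `CountBasedInternalReward` and `total_reward` are all hypothetical.

```python
import math
from collections import defaultdict


class CountBasedInternalReward:
    """Hypothetical internal environment: returns a novelty bonus 1/sqrt(N(s)).

    This is an assumed stand-in, not the internal reward used by AIBPO.
    """

    def __init__(self):
        self.visit_counts = defaultdict(int)

    def __call__(self, state):
        # Count the visit first, so the first visit yields a bonus of 1.0.
        self.visit_counts[state] += 1
        return 1.0 / math.sqrt(self.visit_counts[state])


def total_reward(external_reward, state, internal_fn, beta=0.1):
    """Combine external and internal reward (assumed additive, weight beta).

    The internal term is computed on every step, including steps where
    the external reward is zero.
    """
    return external_reward + beta * internal_fn(state)


internal = CountBasedInternalReward()
# Step with no external reward: the agent still receives an internal signal.
r_sparse = total_reward(0.0, "s0", internal, beta=0.5)
# Step with an external reward in the same state: both signals are present.
r_dense = total_reward(1.0, "s0", internal, beta=0.5)
```

In this sketch the internal bonus shrinks as a state is revisited, so the learning signal stays non-zero in states where the external environment returns nothing.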

Complexity


Figure 1: AIBPO: Combine the Intrinsic Reward and Auxiliary Task for 3D Strategy Game