Research Article
A Novel Reinforcement Learning Architecture for Continuous State and Action Spaces
Table 1
Comparison of the best policies for the dribbling problem.
| | ā SARSA | ()-learning |
| --- | --- | --- |
| Algorithm type | Actor-Critic | ()-learning |
| Function approximation | RBFs | CMACs |
| States | Continuous | Continuous |
| Actions | Continuous | Discrete |
| Total learning time | 10 minutes | 24 hours 30 minutes |
| Average distance | 25.45 meters | 29.21 meters |
| Maximum distance | 36.23 meters | 39.0 meters |