Research Article
Kernel Recursive Least-Squares Temporal Difference Algorithms with Sparsification and Regularization
Table 1
Main simulation results on both chains at the final episode.
| Algorithm | Nonnoise chain: RMSE | Nonnoise chain: dictionary size | Nonnoise chain: subiterations | Noise chain: RMSE | Noise chain: dictionary size | Noise chain: subiterations |
| --- | --- | --- | --- | --- | --- | --- |
| RLSTD | 0.47 ± 0.03 | 20 | — | 0.50 ± 0.04 | 20 | — |
| SKRLSTD | 0.47 ± 0.05 | 15.36 ± 0.78 | — | 0.49 ± 0.06 | 15.32 ± 0.71 | — |
| OKRLSTD- | 0.45 ± 0.05 | 15.30 ± 0.81 | — | 0.47 ± 0.04 | 15.32 ± 0.84 | — |
| OKRLSTD- | 0.49 ± 0.08 | 11.52 ± 1.16 | 1.81 ± 1.82 | 0.53 ± 0.10 | 12.42 ± 1.13 | 2.60 ± 2.56 |
| OKRLSTD- | 2.21 ± 0.05 | 15.25 ± 0.87 | — | 32.92 ± 68.67 | 15.24 ± 0.77 | — |
| OKRLSTD- | 0.44 ± 0.05 | 15.40 ± 0.76 | 5.08 ± 3.24 | 0.47 ± 0.05 | 15.28 ± 0.88 | 4.90 ± 3.26 |