Research Article
Minimizing the Cost of Spatiotemporal Searches Based on Reinforcement Learning with Probabilistic States
Table 3
Accumulative search cost of QDP with different discount rates.
| Start moment | 8:00 | 10:00 | 12:00 | 14:00 | 16:00 | 18:00 | 20:00 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| α = 1, γ = 0.97 | 34.38 | 35.05 | 36.51 | 34.86 | 35.57 | 32.44 | 36.61 |
| α = 1, γ = 0.99 | 33.25 | 33.42 | 35.01 | 33.27 | 34.31 | 30.46 | 35.11 |
| α = 1, γ = 1.00 | 32.81 | 32.90 | 34.85 | 32.72 | 34.06 | 30.15 | 34.97 |
| α = 1, γ = 1.01 | 33.59 | 32.95 | 34.91 | 33.08 | 33.72 | 30.18 | 35.14 |
| α = 1, γ = 1.03 | 33.15 | 33.19 | 35.79 | 33.24 | 34.10 | 30.36 | 35.13 |
| α = 1, γ = 1.05 | 33.68 | 33.64 | 36.17 | 33.92 | 34.96 | 31.21 | 35.79 |