Research Article
Reinforcement Learning for Routing in Cognitive Radio Ad Hoc Networks
Table 5
Simulation parameters and values for investigating the exploration approaches.
| Category | Parameter | Value |
| --- | --- | --- |
| SU | Traditional ε-greedy exploration probability | {0.07, 0.14} |
| | Traditional softmax exploration temperature | {0.04, 0.05} |
| | Initial dynamic softmax temperature | 0.05 |
| | Dynamic softmax adjustment factor | 0.01 |
| | Dynamic softmax temperature range | [0.01, 0.1] |
| | Dynamic softmax Q-value threshold | 0.1 |
| PU | Standard deviation of PUL | {0.2, 0.8} |
| Channel | Mean PER | 0 |
| | Standard deviation of PER | 0 |
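To make the roles of the exploration probability and the temperature in Table 5 concrete, the following is a minimal sketch of the two traditional exploration approaches in their standard textbook form, not the authors' simulation code. It assumes that a higher Q-value denotes a better next hop; if Q-values represent a cost, the comparison and the sign of the exponent would be reversed. The names `TABLE5`, `epsilon_greedy`, `softmax_probabilities`, and `softmax_select` are hypothetical. The dynamic softmax variant is left out because its adjustment rule is not given in the table.

```python
import math
import random

# Values copied from Table 5; the dictionary keys are hypothetical names.
TABLE5 = {
    "epsilon_values": (0.07, 0.14),      # traditional epsilon-greedy probabilities
    "temperature_values": (0.04, 0.05),  # traditional softmax temperatures
}


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon explore uniformly at random, else exploit.

    Assumption: a higher Q-value is better.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probabilities(q_values, temperature):
    """Boltzmann distribution over actions; a lower temperature is greedier."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_select(q_values, temperature, rng):
    """Sample one action index according to the Boltzmann probabilities."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.2, 0.5, 0.3]  # illustrative Q-values, not taken from the paper
    for eps in TABLE5["epsilon_values"]:
        print("epsilon", eps, "->", epsilon_greedy(q, eps, rng))
    for tau in TABLE5["temperature_values"]:
        print("temperature", tau, "->", softmax_probabilities(q, tau))
```

Under these assumptions, the smaller temperature in Table 5 concentrates more probability on the action with the highest Q-value, whereas the larger exploration probability makes ε-greedy choose a random action more often.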