Review Article

Application of Reinforcement Learning in Cognitive Radio Networks: Models and Algorithms

Table 11: RL model for a power control scheme [38].

Action: One of two actions, namely transmitting the SU's packets to the SU destination node using single-hop transmission or using multihop relaying.

Reward: The reward represents the revenue that the SU node obtains from the other SUs for relaying their packets. A higher reward indicates a higher transmission rate and transmission power at the SU node.
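
The two-action model in Table 11 can be illustrated with a minimal sketch. It is not the scheme of [38]: the stateless update rule, the learning rate, the action names, and the reward values are all assumptions chosen only to show how a revenue-based reward could steer the choice between single-hop transmission and multihop relaying.

```python
# Illustrative sketch only: a stateless Q-learning agent choosing between the
# two actions in Table 11. All names and numbers here are assumptions and are
# not taken from [38].

ACTIONS = ("single_hop", "multihop_relay")


def update_q(q_values, action, reward, learning_rate=0.1):
    """Return a new Q-table after moving Q(action) toward the observed reward."""
    updated = dict(q_values)
    updated[action] += learning_rate * (reward - updated[action])
    return updated


def greedy_action(q_values):
    """Pick the action with the largest Q-value (ties go to the first action)."""
    return max(ACTIONS, key=lambda a: q_values[a])


q = {a: 0.0 for a in ACTIONS}
# Arbitrary reward values (stand-ins for revenue), used only to exercise
# the update rule.
for _ in range(5):
    q = update_q(q, "single_hop", reward=0.2)
    q = update_q(q, "multihop_relay", reward=1.0)

print(greedy_action(q))
```

In this sketch each Q-value is an exponentially weighted average of the rewards observed for that action, so the action with the consistently higher reward is eventually preferred. A practical scheme would also need exploration and, most likely, a state representation, both of which are omitted here.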