Review Article

Application of Reinforcement Learning in Cognitive Radio Networks: Models and Algorithms

Table 4

RL model for joint dynamic channel selection and channel sensing [11].

State: each state represents an available channel.

Action: each action senses a channel for the sensing duration, transmits a data packet, and then switches the current operating channel to another channel, namely the one with the lowest best-known average single-hop transmission delay.

Reward: the difference between the delay of a successful single-hop transmission and the maximum allowable single-hop transmission delay.
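
The model in Table 4 can be illustrated with a minimal tabular Q-learning sketch in Python. It is not the algorithm of [11]; it only shows how the state, action, and reward entries might fit together. All names (`make_q_table`, `reward`, `q_update`, `select_action`), the learning rate `alpha`, the discount factor `gamma`, and the exploration rate `epsilon` are hypothetical. The sign of the reward is also an assumption: it is taken here as the maximum allowable delay minus the observed delay, so that faster transmissions earn larger rewards.

```python
import random


def make_q_table(channels):
    """One Q-value per (current channel, next channel) pair, initialised to zero."""
    return {s: {a: 0.0 for a in channels} for s in channels}


def reward(observed_delay, max_delay):
    """Assumed sign convention: positive when the hop beats the delay bound."""
    return max_delay - observed_delay


def select_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice of the next operating channel."""
    if rng.random() < epsilon:
        return rng.choice(list(q[state]))      # explore a random channel
    return max(q[state], key=q[state].get)     # exploit the best-known channel


def q_update(q, state, action, r, alpha=0.1, gamma=0.9):
    """Standard Q-learning update; the next state is the channel switched to."""
    next_state = action
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (r + gamma * best_next - q[state][action])
    return q[state][action]


if __name__ == "__main__":
    q = make_q_table([1, 2, 3])
    state = 1
    for _ in range(100):
        action = select_action(q, state)
        # Hypothetical delays: channel 2 is the fastest in this toy setup.
        delay = {1: 4.0, 2: 1.0, 3: 3.0}[action]
        q_update(q, state, action, reward(delay, max_delay=5.0))
        state = action
    print(q)
```

In this sketch the sensing and transmission steps are collapsed into a single observed delay per action; a fuller model would account for the sensing duration separately.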