Review Article

Application of Reinforcement Learning in Cognitive Radio Networks: Models and Algorithms

Table 15: Performance enhancements achieved by the RL-based schemes in CR networks.

The performance enhancements (P1)–(P11) are defined as follows:

- (P1) Higher throughput/goodput
- (P2) Lower end-to-end delay or link delay
- (P3) Lower level of interference to PUs
- (P4) Lower number of sensing channels
- (P5) Higher overall spectrum utilization
- (P6) Lower number of channel switches
- (P7) Lower energy consumption
- (P8) Lower probability of false alarm
- (P9) Higher probability of PU detection
- (P10) Higher number of channels sensed idle
- (P11) Higher accumulated rewards

| Application scheme | References | RL model | Performance enhancements |
|---|---|---|---|
| (A1) Dynamic channel selection | Bkassiny et al. [34] | Partially observable | ×× |
| | Tang et al. [2] | Traditional | ××× |
| | Yao and Feng [19] | Traditional | × |
| | Chen et al. [24] | Model with … | ×× |
| | Jiang et al. [30, 31] | Model with … | ×× |
| | Liu et al. [39] | Collaborative | × |
| | Yau et al. [8, 9] | Collaborative | ×× |
| | Bernardo et al. [27] | Internal self-learning | ×× |
| (A2) Channel sensing | Di Felice et al. [11, 21] | Set of Q-functions | ××× |
| | Li et al. [10] | Model with … | ×× |
| | Lo and Akyildiz [3] | Traditional | ×× |
| | Chowdhury et al. [25] | Collaborative | ×××× |
| | Lundén et al. [20] | Collaborative | × |
| (A3) Security enhancement | Wang et al. [14] | Competitive | × |
| | Vucevic et al. [13] | Actor-critic | × |
| (A4) Energy efficiency enhancement | Zheng and Li [15] | Traditional | × |
| (A5) Auction mechanism | Jayaweera et al. [36] | Auction | ×× |
| | Fu and van der Schaar [37] | Auction | × |
| (A6) Medium access control | Li et al. [17] | Model with … | × |
| (A7) Routing | Peng et al. [4] | Traditional | ×× |
| | Xia et al. [33] | Dual Q-functions | × |
| (A8) Power control | Xiao et al. [38] | Auction | × |
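For orientation, the sketch below illustrates the "traditional" single-agent RL model that several of the dynamic channel selection schemes in the table build on (e.g., the rows marked Traditional under A1): a secondary user applies Q-learning with epsilon-greedy exploration to learn which channel is most often free of PU activity. The environment, channel occupancy probabilities, and parameter values are illustrative assumptions for a minimal runnable example, not details taken from any of the cited works.

```python
import random

# Hypothetical toy environment (assumed, not from the cited schemes): a
# secondary user picks one of NUM_CHANNELS channels per time slot; each
# channel is occupied by a PU with a fixed unknown probability. Reward is
# 1 for transmitting on an idle channel, 0 otherwise.
PU_BUSY_PROB = [0.9, 0.5, 0.2, 0.7]   # assumed per-channel PU activity
NUM_CHANNELS = len(PU_BUSY_PROB)

ALPHA = 0.1    # learning rate (illustrative value)
GAMMA = 0.0    # single-state formulation; GAMMA = 0 gives the myopic variant
EPSILON = 0.1  # exploration rate (illustrative value)

# One Q-value per channel, since the task is modeled with a single state.
Q = [0.0] * NUM_CHANNELS

def select_channel() -> int:
    """Epsilon-greedy action selection over the available channels."""
    if random.random() < EPSILON:
        return random.randrange(NUM_CHANNELS)
    return max(range(NUM_CHANNELS), key=lambda c: Q[c])

def step(channel: int) -> float:
    """Simulate one slot: reward 1.0 if the chosen channel is idle."""
    pu_active = random.random() < PU_BUSY_PROB[channel]
    return 0.0 if pu_active else 1.0

for t in range(10_000):
    c = select_channel()
    r = step(c)
    # Q-learning update; with a single state, the bootstrap term is max(Q).
    Q[c] += ALPHA * (r + GAMMA * max(Q) - Q[c])

print("Learned Q-values:", [round(q, 3) for q in Q])
print("Preferred channel:", Q.index(max(Q)))
```

After enough slots, the learned Q-values approximate each channel's idle probability, so the greedy policy settles on the channel with the least PU activity; the collaborative and partially observable models in the table extend this basic loop with neighbor cooperation and belief tracking, respectively.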