Reinforcement Learning for Routing in Cognitive Radio Ad Hoc Networks
Table 2
Reward representation for the WCRQ-routing model embedded at an SU node.
Cost
The cost depends on two quantities: the number of retransmissions for a packet sent from an SU node to an SU neighbor node at a given time, and the number of packets in the queue of that SU neighbor node. A weight factor adjusts the tradeoff between PUs’ and SUs’ network performance.
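As an illustration only, the sketch below shows one way such a weighted cost could be computed. The linear blend, the restriction of the weight to the range [0, 1], the choice of which term the weight multiplies, and the names `weighted_cost`, `retransmissions`, `queue_length`, and `weight` are all assumptions made for this example. The exact expression used by WCRQ-routing is not reproduced here.

```python
def weighted_cost(retransmissions: int, queue_length: int, weight: float) -> float:
    """Hypothetical cost for forwarding a packet to one SU neighbor node.

    ASSUMPTION: the two quantities are combined as a linear blend,
    weight * retransmissions + (1 - weight) * queue_length.
    This form is illustrative and is not taken from the source.
    """
    if retransmissions < 0 or queue_length < 0:
        raise ValueError("counts must be non-negative")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    # A larger weight emphasizes the retransmission count;
    # a smaller weight emphasizes the neighbor's queue length.
    return weight * retransmissions + (1.0 - weight) * queue_length


# Example: 2 retransmissions, 10 queued packets at the neighbor, weight 0.25.
example_cost = weighted_cost(retransmissions=2, queue_length=10, weight=0.25)
```

Under this assumed form, a weight of 1 makes the cost depend only on retransmissions, and a weight of 0 makes it depend only on the neighbor's queue length.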