Research Article
Resource Scheduling in URLLC and eMBB Coexistence Based on Dynamic Selection Numerology
Algorithm 1
DQN-based resource allocation algorithm.
1 Initialize the replay memory with a fixed capacity
2 Initialize the action-value function Q with random weights, and initialize the target Q function
3 For each episode, starting from episode = 1, do
4   repeat
5     With the exploration probability, select a random action and update the numerology value; otherwise, select the greedy action under Q
6     Execute the action, observe the reward and the new state
7     Store the transition in the replay memory
8     Randomly sample data from the replay memory
9     Update the action-value function Q with limit
10    Every C steps, reset the target Q function to Q
11  until S is a terminal state
12 End For
13 Return the optimal strategy
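The steps of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The toy environment (`toy_step`), the state and action sets (`N_STATES`, `NUMEROLOGIES`), the reward rule, and all hyperparameter values are hypothetical. For brevity the deep network is replaced by a linear Q-function over one-hot state features, which reduces to one weight per (action, state) pair. The replay memory, epsilon-greedy selection, target function, and periodic reset every C steps follow the structure of the algorithm.

```python
import random
from collections import deque

NUMEROLOGIES = [0, 1, 2]   # hypothetical action set: candidate numerology indices
N_STATES = 4               # hypothetical toy state space (e.g. traffic-load levels)

def q_values(weights, s):
    # Linear Q over one-hot state features: the dot product is just weights[a][s].
    return [w[s] for w in weights]

def toy_step(s, a, rng):
    # Hypothetical environment: reward 1 when the chosen numerology matches s % 3.
    reward = 1.0 if a == s % len(NUMEROLOGIES) else 0.0
    next_s = rng.randrange(N_STATES)
    done = rng.random() < 0.1          # the new state is terminal with probability 0.1
    return reward, next_s, done

def train(episodes=300, capacity=500, batch=16, eps=0.2,
          gamma=0.5, lr=0.1, C=20, seed=0):
    rng = random.Random(seed)
    memory = deque(maxlen=capacity)                                # step 1
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(N_STATES)]
               for _ in NUMEROLOGIES]                              # step 2
    target = [row[:] for row in weights]                           # target copy of Q
    steps = 0
    for _ in range(episodes):                                      # step 3
        s, done = rng.randrange(N_STATES), False
        while not done:                                            # steps 4 and 11
            if rng.random() < eps:                                 # step 5: explore
                a = rng.randrange(len(NUMEROLOGIES))
            else:                                                  # step 5: exploit
                q = q_values(weights, s)
                a = q.index(max(q))
            r, s2, done = toy_step(s, a, rng)                      # step 6
            memory.append((s, a, r, s2, done))                     # step 7
            sample = rng.sample(list(memory), min(batch, len(memory)))  # step 8
            for bs, ba, br, bs2, bdone in sample:                  # step 9
                y = br if bdone else br + gamma * max(q_values(target, bs2))
                # Gradient step on the squared TD error; with one-hot
                # features only the weight for (ba, bs) changes.
                weights[ba][bs] += lr * (y - weights[ba][bs])
            steps += 1
            if steps % C == 0:                                     # step 10
                target = [row[:] for row in weights]
            s = s2
    # Step 13: greedy strategy, i.e. the best numerology index per state.
    policy = []
    for state in range(N_STATES):
        q = q_values(weights, state)
        policy.append(q.index(max(q)))
    return policy

if __name__ == "__main__":
    print(train())
```

In this toy setup the learned greedy strategy should map each state s to the numerology index s % 3, since that is the only rewarded action. A practical scheduler would replace `toy_step` with the actual URLLC/eMBB system model and the weight table with a neural network.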