Research Article
Edge Caching for D2D Enabled Hierarchical Wireless Networks with Deep Reinforcement Learning
Algorithm 2: Double DQN-based content caching algorithm.
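Initialization: the experience replay memory, the main Q network with random weights, the target Q network with weights copied from the main network, and the period for replacing the target Q network.
Iteration:
for each episode
    Initialize the state and the step counter i.
    for each step of the episode (counted by i)
        Randomly generate a number for the exploration test.
        if the number falls below the exploration probability
            randomly select an action
        else
            select the action with the largest Q value under the main network
        Take the selected action.
        Obtain the reward and the next state.
        Store the transition (state, action, reward, next state) into the replay memory.
        Randomly sample a mini-batch of transitions from the replay memory.
        Update the main network weights with the Double DQN loss on the mini-batch.
        if i is a multiple of the replacement period
            Update the target network weights with the main network weights.
    end for
end for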
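The loop in Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two Q networks are replaced by lookup tables so that the example needs only the standard library, and the toy environment (five contents, a two-slot cache, Zipf-like request weights, a hit-based reward), the names `step`, `train`, and `q_values`, and all hyperparameter values are assumptions made for illustration.

```python
import random
from collections import deque

# Hypothetical toy setting (not from the paper).
N_CONTENTS, CACHE_SIZE = 5, 2
N_ACTIONS = CACHE_SIZE + 1          # replace slot 0..CACHE_SIZE-1, or keep the cache
POPULARITY = [1.0 / (k + 1) for k in range(N_CONTENTS)]  # assumed Zipf-like weights


def draw_request(rng):
    """Sample the index of the next requested content."""
    return rng.choices(range(N_CONTENTS), weights=POPULARITY)[0]


def step(state, action, rng):
    """Apply a caching action; reward is 1 if the next request hits the cache."""
    cache, request = state
    cache = list(cache)
    if action < CACHE_SIZE and request not in cache:
        cache[action] = request      # overwrite the chosen slot
    next_request = draw_request(rng)
    reward = 1.0 if next_request in cache else 0.0
    return reward, (tuple(sorted(cache)), next_request)


def q_values(table, state):
    """Q values of a state; unseen states default to zeros."""
    return table.get(state, [0.0] * N_ACTIONS)


def train(episodes=200, steps=50, eps=0.1, gamma=0.9, lr=0.1,
          batch_size=16, replace_period=20, seed=0):
    rng = random.Random(seed)
    memory = deque(maxlen=500)       # experience replay memory
    main, target = {}, {}            # tabular stand-ins for the two Q networks
    for _ in range(episodes):
        state = (tuple(range(CACHE_SIZE)), draw_request(rng))
        for i in range(1, steps + 1):
            # Epsilon-greedy action selection on the main table.
            if rng.random() < eps:
                action = rng.randrange(N_ACTIONS)
            else:
                q = q_values(main, state)
                action = q.index(max(q))
            reward, next_state = step(state, action, rng)
            memory.append((state, action, reward, next_state))
            batch = rng.sample(list(memory), min(batch_size, len(memory)))
            for s, a, r, s2 in batch:
                # Double DQN target: the main table picks the action,
                # the target table evaluates it.
                q_next = q_values(main, s2)
                best = q_next.index(max(q_next))
                y = r + gamma * q_values(target, s2)[best]
                q_s = main.setdefault(s, [0.0] * N_ACTIONS)
                q_s[a] += lr * (y - q_s[a])
            # Periodically copy the main table into the target table.
            if i % replace_period == 0:
                target = {s: q[:] for s, q in main.items()}
            state = next_state
    return main
```

Calling `train()` returns the learned main table, which maps each (cache contents, current request) state to one Q value per action. Decoupling action selection (main table) from action evaluation (target table) is what distinguishes the Double DQN target from the plain DQN target, which uses the target network for both.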