Research Article

Edge Caching for D2D Enabled Hierarchical Wireless Networks with Deep Reinforcement Learning

Algorithm 2: Double DQN-based content caching algorithm.
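Initialization: Experience replay memory $D$, main network $Q$ with random weights $\theta$, target network $\hat{Q}$ with weights $\theta^{-} = \theta$, and the period $K$ of replacing the target Q network.
Iteration:
1: for each episode
2:   Initialize state $s$
3:   $i \leftarrow 0$
4:   for each step of episode
5:     $i \leftarrow i + 1$
6:     Randomly generate $p \in [0, 1]$
8:     if $p < \varepsilon$
9:       randomly select an action $a$
10:     else
11:       $a \leftarrow \arg\max_{a'} Q(s, a'; \theta)$
12:     Take action $a$
13:     Obtain reward $r$ and next state $s'$.
14:     Store $(s, a, r, s')$ into $D$.
15:     Randomly sample a mini-batch of transitions $(s_j, a_j, r_j, s'_j)$ from $D$.
16:     Update $\theta$ with a gradient descent step on the loss $L(\theta)$.
17:     if $i = K$
18:       Update $\theta^{-} \leftarrow \theta$
19:       $i \leftarrow 0$
20:     $s \leftarrow s'$
21:   end for
22: end for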
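
The loop in Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the two Q networks are replaced by small lookup tables (`q_main`, `q_target`), the environment is a hypothetical single-slot cache with a Zipf-like request profile, and all names and hyperparameter values (`train`, `double_dqn_target`, `period`, `lr`, and so on) are chosen for illustration only. The point it shows is the Double DQN target: the main network selects the next action and the target network evaluates it. The sketch replaces the target table whenever the step counter is a multiple of `period`.

```python
import random

def double_dqn_target(q_main, q_target, reward, next_state, gamma):
    """Double DQN target: the main table picks the action, the target table scores it."""
    row = q_main[next_state]
    best = max(range(len(row)), key=row.__getitem__)
    return reward + gamma * q_target[next_state][best]

def train(n_contents=4, episodes=50, steps=100, gamma=0.9, lr=0.1,
          epsilon=0.1, batch_size=16, period=20, seed=0):
    rng = random.Random(seed)
    # Hypothetical Zipf-like popularity: content 0 is requested most often.
    weights = [1.0 / (k + 1) for k in range(n_contents)]
    memory = []  # experience replay memory
    # Tabular stand-ins for the main and target networks (assumption).
    q_main = [[rng.uniform(-0.01, 0.01) for _ in range(n_contents)]
              for _ in range(n_contents)]
    q_target = [row[:] for row in q_main]  # target starts as a copy of main
    for _ in range(episodes):
        state = rng.randrange(n_contents)  # initialize the state
        i = 0
        for _ in range(steps):
            i += 1
            # Epsilon-greedy action selection on the main table.
            if rng.random() < epsilon:
                action = rng.randrange(n_contents)
            else:
                row = q_main[state]
                action = max(range(n_contents), key=row.__getitem__)
            # Toy environment: cache `action`, then observe one request.
            request = rng.choices(range(n_contents), weights=weights)[0]
            reward = 1.0 if request == action else 0.0
            next_state = action
            memory.append((state, action, reward, next_state))
            # Sample a mini-batch and take one step on the squared TD error.
            batch = rng.sample(memory, min(batch_size, len(memory)))
            for s, a, r, s2 in batch:
                y = double_dqn_target(q_main, q_target, r, s2, gamma)
                q_main[s][a] += lr * (y - q_main[s][a])
            # Periodically replace the target table with the main table.
            if i % period == 0:
                q_target = [row[:] for row in q_main]
            state = next_state
    return q_main

if __name__ == "__main__":
    for row in train():
        print([round(v, 2) for v in row])
```

Decoupling action selection from action evaluation is what separates this target from the plain DQN target, which takes the maximum directly over the target network's values and tends to overestimate them.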