Research Article
Service Migration Policy Optimization considering User Mobility for E-Healthcare Applications
Algorithm 3: Policy_improvement(env, V).
Initialize policy
for all s in S:
    Initialize qs
    for a in range(N_ACTIONS):
        n_s = env.P[s, a]  # next state reached from s under action a
        r = env.R[s, a]  # immediate reward for taking a in s
        qs[a] = r + gamma * V[n_s[0], n_s[1]]
    p = (np.abs(qs - np.max(qs)) < 1e-6)  # greedy strategy: mark every action that attains the maximum
    p = np.array(p, dtype=np.float32) / np.sum(p)  # convert to float type and normalize
    policy[s[0], s[1]] = p  # state s is indexed by its two coordinates
return policy
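The listing above can be exercised end to end with a small stand-in environment. The following is a minimal sketch, not the authors' implementation: the `ToyGridEnv` class, its goal-based reward, the four-action move set, and the discount factor of 0.9 are all assumptions introduced for illustration. Only the `env.P[s, a]` / `env.R[s, a]` lookups and the greedy normalization step follow Algorithm 3.

```python
import numpy as np

N_ACTIONS = 4  # assumed action set: up, down, left, right
GAMMA = 0.9    # assumed discount factor (not given in the listing)


class ToyGridEnv:
    """Hypothetical deterministic grid world, used only to run the sketch."""

    MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    def __init__(self, height, width, goal):
        self.shape = (height, width)
        self.P = {}  # (state, action) -> next state
        self.R = {}  # (state, action) -> reward
        for i in range(height):
            for j in range(width):
                for a, (di, dj) in enumerate(self.MOVES):
                    # Moves that would leave the grid are clamped to the border.
                    ni = min(max(i + di, 0), height - 1)
                    nj = min(max(j + dj, 0), width - 1)
                    self.P[(i, j), a] = (ni, nj)
                    # Assumed reward: 0 when the move lands on the goal, -1 otherwise.
                    self.R[(i, j), a] = 0.0 if (ni, nj) == goal else -1.0


def policy_improvement(env, V, gamma=GAMMA):
    """Return a greedy policy with respect to the state values V."""
    height, width = env.shape
    policy = np.zeros((height, width, N_ACTIONS), dtype=np.float32)
    for i in range(height):
        for j in range(width):
            s = (i, j)
            qs = np.zeros(N_ACTIONS)
            for a in range(N_ACTIONS):
                n_s = env.P[s, a]
                r = env.R[s, a]
                qs[a] = r + gamma * V[n_s[0], n_s[1]]
            # Greedy step: ties share the probability mass equally.
            p = np.abs(qs - np.max(qs)) < 1e-6
            policy[i, j] = p.astype(np.float32) / np.sum(p)
    return policy


env = ToyGridEnv(height=1, width=3, goal=(0, 2))
V = np.zeros(env.shape)
policy = policy_improvement(env, V)
```

With all state values set to zero, the state next to the goal puts its full probability on the action that reaches the goal, while a state whose actions all score the same receives a uniform distribution. Splitting ties equally rather than picking the first maximum keeps the policy stochastic, which matches the normalization step in the listing.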