Research Article

Service Migration Policy Optimization considering User Mobility for E-Healthcare Applications

Algorithm 4

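import time
import numpy as np

# Env, S, N_ACTIONS, and gamma are assumed to be defined elsewhere.
def SMVI(max_iter=100, max_step=100, tol=1e-6):
    env = Env()  # environment initialization; MDP model: state s, action a, reward r
    V = np.zeros(env.shape)  # initialize value function V (env.shape, the state-grid shape, is an assumed attribute)
    mean_values, run_times = [], []
    st = time.time()  # start time for the run-time log
    for it in range(max_iter):
        new_V = V.copy()
        update_steps = 0
        for s in S:  # s = (i, j) indexes the two-dimensional state grid
            qs = np.zeros(N_ACTIONS)  # action values Q(s, a) for the current state
            for a in range(N_ACTIONS):
                n_s = env.P[s, a]  # next state reached from s under action a
                r = env.R[s, a]  # immediate reward
                qs[a] = r + gamma * V[n_s[0], n_s[1]]
                update_steps += 1
            new_V[s[0], s[1]] = np.max(qs)  # update value function based on Bellman’s equation
        mean_values.append(np.mean(V))  # store the mean value
        run_times.append(time.time() - st)  # store the run time
        if np.sum(np.abs(V - new_V)) < tol:  # stop once the total change falls below tol
            break
        V = new_V
    return V, mean_values, run_times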
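
To illustrate how the value iteration in Algorithm 4 behaves, the following minimal sketch runs the same Bellman update on a toy problem. Everything about the environment is an assumption made for illustration and is not the Env used in the paper: the class GridEnv, the 3 x 3 grid, the absorbing goal cell, the reward of 1 for entering the goal, the four move actions, and the discount factor GAMMA = 0.9 are all hypothetical. Only the update rule, the logging of mean values and run times, and the stopping test follow the listing above.

```python
import time
import numpy as np

N_ACTIONS = 4  # up, down, left, right (hypothetical action set)
GAMMA = 0.9    # discount factor (assumed value, not taken from the paper)


class GridEnv:
    """Hypothetical deterministic grid MDP; a stand-in for the paper's Env."""

    MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    def __init__(self, n_rows=3, n_cols=3, goal=(2, 2)):
        self.shape = (n_rows, n_cols)
        self.S = [(i, j) for i in range(n_rows) for j in range(n_cols)]
        self.P = {}  # (state, action) -> next state
        self.R = {}  # (state, action) -> immediate reward
        for s in self.S:
            for a, (di, dj) in enumerate(self.MOVES):
                if s == goal:
                    n_s = s  # the goal cell is absorbing
                else:
                    # a move that would leave the grid keeps the agent in place
                    n_s = (min(max(s[0] + di, 0), n_rows - 1),
                           min(max(s[1] + dj, 0), n_cols - 1))
                self.P[s, a] = n_s
                # reward 1 for entering the goal, 0 otherwise
                self.R[s, a] = 1.0 if (n_s == goal and s != goal) else 0.0


def value_iteration(env, max_iter=100, tol=1e-6):
    """Synchronous value iteration with the stopping test of Algorithm 4."""
    V = np.zeros(env.shape)
    mean_values, run_times = [], []
    st = time.time()
    for _ in range(max_iter):
        new_V = V.copy()
        for s in env.S:
            # Q(s, a) = R(s, a) + gamma * V(next state) for every action
            qs = np.array([env.R[s, a] + GAMMA * V[env.P[s, a]]
                           for a in range(N_ACTIONS)])
            new_V[s] = qs.max()  # Bellman optimality update
        delta = np.sum(np.abs(new_V - V))  # total change in this sweep
        V = new_V
        mean_values.append(float(np.mean(V)))
        run_times.append(time.time() - st)
        if delta < tol:
            break
    return V, mean_values, run_times


env = GridEnv()
V, mean_values, run_times = value_iteration(env)
print(np.round(V, 3))
print("sweeps:", len(mean_values))
```

In this sketch the transitions are stored in dictionaries keyed by (state, action), which mirrors the env.P [s, a] and env.R [s, a] lookups of the listing; an array-based layout would serve equally well. Because the toy transitions are deterministic, the values stop changing after a few sweeps and the loop exits well before max_iter.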