Mobile Information Systems

Research Article

Three-Tier Computing Platform Optimization: A Deep Reinforcement Learning Approach

Optimal resource allocation for the user-collaborative edge device

(1)	Require: Data from the user/IoT nodes
	Data from the high-performance processor, data from the Task edge device, position of the nodes, choice of computing platform, computing and communication resources
(2)	Network initialization
	Initialize the parameters of the actor and critic networks
(3)	For episode = 1 to the maximum number of episodes do
(4)	Reinitialize the environment of the proposed user-collaborative edge device model
(5)	Reset the state
(6)	reset = 0
(7)	for step = 1 to the maximum number of steps do
(8)	Choose an action in the simulation environment in line with the current policy
(9)	Obtain the reward and the next state
(10)	Cache the transition (state, action, reward, next state) in the replay buffer as the experience used for training the actor and critic networks
(11)	Randomly sample a minibatch of tuples from the replay buffer to train the primary actor and critic networks
(12)	Update the critic network parameters as follows:
(13)	Update the actor-network parameters as follows:
(14)	Update the parameters of the two target networks at the set step interval, using an update coefficient of 0.001
(15)	end for
(16)	end For
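
Steps (10), (11), and (14) of the algorithm can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `ReplayBuffer`, `cache`, `sample`, and `soft_update` are hypothetical, parameters are represented as plain lists of floats rather than network weights, and the target update rule (target = coefficient × primary + (1 − coefficient) × target) is an assumption based on the usual actor-critic soft update, since the listing only states the coefficient value of 0.001.

```python
import random
from collections import deque

TAU = 0.001  # update coefficient given in step (14)


class ReplayBuffer:
    """Hypothetical fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest experience once it is full.
        self.storage = deque(maxlen=capacity)

    def cache(self, state, action, reward, next_state):
        # Step (10): store one transition as an experience.
        self.storage.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Step (11): draw a minibatch uniformly at random, without replacement.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def soft_update(target_params, primary_params, tau=TAU):
    """Step (14), under the assumed rule: target <- tau * primary + (1 - tau) * target."""
    return [tau * p + (1.0 - tau) * t
            for t, p in zip(target_params, primary_params)]
```

With a coefficient of 0.001, each update moves a target parameter only 0.1% of the way toward its primary counterpart, so the target networks change slowly relative to the primary networks.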