Research Article

Reinforcement Learning for Security-Aware Workflow Application Scheduling in Mobile Edge Computing

Algorithm 1

Deep Q network-based security-aware workflow scheduling scheme.
BEGIN
(1)Initialize the replay memory with a fixed capacity, and set the size of the minibatch of state transition experiences used for training;
(2)for each episode do
(3) Reset the system state;
(4) for each time slot do
(5)  At the beginning of each time slot, observe the current state of the system;
(6)  Based on the current state, select a random action with probability ε, and select the action with the largest Q value with probability 1 − ε;
(7)  Calculate the immediate reward and observe the system state in the next time slot;
(8)  Store the resulting state transition experience (current state, action, reward, next state) in the replay memory;
(9)  Add the immediate reward of this step to the accumulated reward;
(10)  Randomly sample a minibatch of state transition experiences from the replay memory to train the Q network;
(11)  Calculate the mean-squared error between the currently evaluated Q value and the target Q value over the sampled experiences, which serves as the loss for training the Q network;
(12) end for
(13)end for
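
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ToyEnv` environment, the linear Q function standing in for the Q network, and all hyperparameter names and values (`capacity`, `batch_size`, `epsilon`, `gamma`, `lr`) are hypothetical, and the target value is computed from the same parameters rather than from a separate target network.

```python
import random
from collections import deque

N_POS, N_ACTIONS = 5, 2  # toy problem sizes (assumed, not from the paper)


class ToyEnv:
    """Hypothetical stand-in for the edge system: walk from position 0 to 4."""

    def reset(self):
        self.pos = 0
        return self.state()

    def state(self):
        # One-hot encoding of the current position.
        return [1.0 if i == self.pos else 0.0 for i in range(N_POS)]

    def step(self, action):
        # Action 1 moves right, action 0 moves left; positions are clamped.
        self.pos = max(0, min(N_POS - 1, self.pos + (1 if action == 1 else -1)))
        done = self.pos == N_POS - 1
        return self.state(), (1.0 if done else -0.01), done


def q_values(weights, state):
    # Linear Q function (assumed simplification of the Q network).
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def train(episodes=200, slots=20, capacity=500, batch_size=16,
          epsilon=0.1, gamma=0.9, lr=0.1, seed=0):
    rng = random.Random(seed)
    env = ToyEnv()
    weights = [[0.0] * N_POS for _ in range(N_ACTIONS)]
    memory = deque(maxlen=capacity)                      # step (1)
    returns = []
    for _ in range(episodes):                            # step (2)
        state = env.reset()                              # step (3)
        total = 0.0
        for _ in range(slots):                           # steps (4)-(5)
            if rng.random() < epsilon:                   # step (6): explore
                action = rng.randrange(N_ACTIONS)
            else:                                        # step (6): exploit
                q = q_values(weights, state)
                action = q.index(max(q))
            next_state, reward, done = env.step(action)  # step (7)
            memory.append((state, action, reward, next_state, done))  # step (8)
            total += reward                              # step (9)
            batch = rng.sample(list(memory), min(batch_size, len(memory)))  # step (10)
            # Step (11): semi-gradient step on the mean-squared error
            # between the evaluated Q value and the target value.
            grad = [[0.0] * N_POS for _ in range(N_ACTIONS)]
            for s, a, r, s2, d in batch:
                target = r if d else r + gamma * max(q_values(weights, s2))
                error = target - q_values(weights, s)[a]
                for i, x in enumerate(s):
                    grad[a][i] += error * x / len(batch)
            for a in range(N_ACTIONS):
                for i in range(N_POS):
                    weights[a][i] += lr * grad[a][i]
            state = next_state
            if done:
                break
        returns.append(total)
    return weights, returns
```

Calling `train()` returns the learned weights and the accumulated reward of each episode; a fixed `seed` makes repeated runs reproducible.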