Research Article

A Dynamic Hidden Forwarding Path Planning Method Based on Improved Q-Learning in SDN Environments

Algorithm 1

Planning the optimal attack path based on the improved Q-learning algorithm.
Require:
Host weights
Attack success rate of each vulnerability
Vulnerabilities
Host names
Ensure: Optimal policy (attack path)
  function IQL(host weights, attack success rates, vulnerabilities, host names)
    obtain the number of vulnerabilities
    obtain the number of hosts
    getstate: obtain the state set
    getNumber: initialize the discount factor
    initialize the value matrix
    build the reward matrix
    for each iteration step do
      obtain the optimal policy
      if the termination condition holds then break
      end if
    end for
    return the optimal policy
  end function
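
The control flow of Algorithm 1 (initialize a value matrix, build a reward matrix, iterate until a stopping condition holds, then read off the policy) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It uses the standard tabular Q-learning update rather than the improved variant, and it replaces sampled episodes with deterministic sweeps over every state-action pair so that the result is reproducible. The function name `plan_path`, the four-host example, the rule "reward = weight of the target host times attack success rate", and the stopping test "largest update below `tol`" are all assumptions made for illustration.

```python
def plan_path(reward, start, goal, gamma=0.9, alpha=0.5,
              max_iters=1000, tol=1e-9):
    """Return (path, q) for a deterministic graph given as a reward matrix.

    reward[s][a] is the reward for moving from host s to host a,
    or None when no such move exists (hypothetical encoding).
    """
    n = len(reward)
    q = [[0.0] * n for _ in range(n)]  # value matrix, initialised to zero

    for _ in range(max_iters):
        delta = 0.0  # largest update seen in this sweep
        for s in range(n):
            if s == goal:
                continue  # the goal is treated as terminal
            for a in range(n):
                if reward[s][a] is None:
                    continue
                # Best value reachable from the next state (0 if it has no moves).
                future = max(
                    (q[a][b] for b in range(n) if reward[a][b] is not None),
                    default=0.0,
                )
                # Standard Q-learning update toward the one-step target.
                change = alpha * (reward[s][a] + gamma * future - q[s][a])
                q[s][a] += change
                delta = max(delta, abs(change))
        if delta < tol:  # assumed stopping condition
            break

    # Greedy policy extraction: follow the highest-valued move from the start.
    path, state = [start], start
    while state != goal and len(path) <= n:
        actions = [a for a in range(n) if reward[state][a] is not None]
        if not actions:
            break
        state = max(actions, key=lambda a: q[state][a])
        path.append(state)
    return path, q


# Hypothetical example: host 0 is the entry point, host 3 is the target.
weights = [0.0, 2.0, 1.0, 10.0]
success = {(0, 1): 0.9, (0, 2): 0.5, (1, 3): 0.3, (2, 3): 0.8}

n_hosts = len(weights)
reward = [[None] * n_hosts for _ in range(n_hosts)]
for (src, dst), p in success.items():
    reward[src][dst] = weights[dst] * p  # assumed reward definition

path, q = plan_path(reward, start=0, goal=3)
print(path)  # [0, 2, 3]
```

In this example the first hop to host 1 has the larger immediate reward (1.8 versus 0.5), but the discounted value of going through host 2 is higher (about 7.7 versus 4.5), so the extracted path is 0, 2, 3. This is the kind of look-ahead that the value matrix is meant to capture.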