Research Article

Reinforcement Learning-Based Service-Oriented Dynamic Multipath Routing in SDN

Table 1

Comparison of RL signals (state, action, and reward) between the proposed method and related methods.

| RL signals | State | Action | Reward |
|---|---|---|---|
| DROM [43] | The traffic matrix (TM) of the network | The weights of links in the network | The network operation and maintenance strategy |
| TIDE [44] | The traffic matrix (TM) of the network | The weights of links in the network | The QoS strategy |
| Stampa et al. [45] | The traffic matrix (TM) of the network | The weights of links in the network | The mean network delay |
| Pham et al. [46] | The traffic matrix (TM) of the network | The weights of links in the network | The mean of QoS metrics/the mean of qualified flows |
| DQSP [47] | The frequency of packet-in message, the occupancy rate of the flow table, and the channel occupancy rate | The weight of the node assigned as the next hop | The node packet loss rate, node forwarding delay, and flow table status |
| QR-SDN [48] | The currently selected path for each flow | Determine the path of flow(s) | The sum of latencies along the current paths of the flows |
| IHSF [49] | The path reliability, delay, bandwidth utilization, and the number of disturbed flows in case of link’s failure | Determine the path of flow(s) | The path’s reliability level, the minimum number of disturbed flows, maximum bandwidth utilization, and minimum delay |
| Proposed RED-STAR | The service type, current bandwidth utilization, packet loss rate, and the latency of each link | Determine the path of flow(s) | The QoS requirements of services and link utilization balancing |
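To make the last row of Table 1 concrete, the following is a minimal Python sketch of how such a state and reward could be encoded. It is illustrative only and is not the RED-STAR implementation: the service type names, the `LinkState` and `QosTarget` classes, the pass/fail QoS term, the standard-deviation balance term, and the `balance_weight` parameter are all assumptions introduced here.

```python
from dataclasses import dataclass
from statistics import pstdev

# Hypothetical service classes; the source does not list them here.
SERVICE_TYPES = ("video", "voice", "bulk")


@dataclass(frozen=True)
class LinkState:
    utilization: float  # fraction of link bandwidth in use, 0..1
    loss_rate: float    # packet loss rate, 0..1
    latency_ms: float   # link latency in milliseconds


@dataclass(frozen=True)
class QosTarget:
    max_latency_ms: float
    max_loss_rate: float


def encode_state(service_type, links):
    """State: one-hot service type followed by per-link measurements."""
    one_hot = [1.0 if s == service_type else 0.0 for s in SERVICE_TYPES]
    features = []
    for name in sorted(links):  # fixed link order keeps the vector stable
        link = links[name]
        features += [link.utilization, link.loss_rate, link.latency_ms]
    return one_hot + features


def path_metrics(path, links):
    """End-to-end latency (sum) and loss rate (1 - product of delivery rates)."""
    latency = sum(links[name].latency_ms for name in path)
    delivered = 1.0
    for name in path:
        delivered *= 1.0 - links[name].loss_rate
    return latency, 1.0 - delivered


def reward(path, links, target, balance_weight=1.0):
    """Assumed reward: +1/-1 for meeting the QoS target, minus an imbalance penalty."""
    latency, loss = path_metrics(path, links)
    qos_ok = latency <= target.max_latency_ms and loss <= target.max_loss_rate
    qos_term = 1.0 if qos_ok else -1.0
    # Population standard deviation of utilization as a simple imbalance measure.
    imbalance = pstdev([link.utilization for link in links.values()])
    return qos_term - balance_weight * imbalance
```

Here the action is the chosen `path` (a list of link names), so the agent would evaluate `reward(path, links, target)` once per routing decision. Any other QoS scoring or balance measure could be substituted without changing the state layout.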