Abstract

As an essential building block for smart grid, the industrial internet of things (IIoT) plays a significant role in providing powerful sensing capability and ubiquitous connectivity for differentiated power services. The rapid development of smart grid imposes higher data monitoring and transmission requirements in terms of delay and energy efficiency. However, the severe electromagnetic interference (EMI) caused by massive electrical equipment degrades the transmission performance of IIoT. The traditional single-hop transmission mode therefore evolves towards a multihop cooperation mode to satisfy differentiated quality of service (QoS) requirements. In this paper, we propose an upper confidence bound- (UCB-) based joint route and power selection optimization algorithm to support this evolution, which adopts a software-defined networking- (SDN-) enabled IIoT network framework to simplify network configuration and management. Compared with the existing UCB-based route selection (UCB-RS) and shortest route selection (SRS) algorithms, simulation results demonstrate that the proposed algorithm achieves superior performance in total delay, energy efficiency, and utility.

1. Introduction

The industrial internet of things (IIoT) is an essential building block for smart grid, providing powerful sensing capability and ubiquitous connectivity. With the development of smart grid, a large number of IIoT devices need to be deployed to collect information such as voltage, current, power, temperature, and humidity and to transmit the information back for real-time analysis. IIoT has strict requirements on transmission delay, energy efficiency, and network coverage [1]. However, electrical equipment in smart grid emits electromagnetic interference (EMI), which degrades the transmission performance of IIoT. Therefore, the traditional single-hop transmission mode needs to evolve towards a multihop cooperation mode to satisfy the quality of service (QoS) requirements [2]. IIoT uses a large number of devices deployed along different routes to form a mesh network for multihop transmission. In multihop transmission, dynamic route selection can avoid poor routes with long distances and low link quality, which enables IIoT to reduce transmission delay and enhance energy efficiency.

Route selection needs to be optimized according to the dynamic network environment. However, the traditional network architecture with tight coupling between control and data planes cannot adapt to complex IIoT application scenarios. Software-defined networking (SDN) provides a solution by separating the control plane from the data plane [3]. SDN can manage and control IIoT networks through a standard and open programmable interface, which supports more efficient and flexible route selection solutions [4]. However, research on route selection optimization for SDN-enabled IIoT in smart grid still faces several challenges, which are summarized below.

First, because of the highly time-varying channel states and complex EMI, the global state information (GSI) is incomplete [5], so traditional GSI-based route selection optimization cannot be applied. Second, battery-powered IIoT devices have strict requirements on energy efficiency. Improving energy efficiency via dynamic power selection not only makes route selection more complicated but may also lead to larger transmission delay. Therefore, how to meet differentiated QoS requirements through joint route and power selection optimization is also a challenge. Finally, electrical equipment such as inverters and insulation switches emits EMI [6, 7], which greatly reduces QoS performance and brings severe challenges for joint route and power selection optimization.

Route selection in IoT has long been a research hotspot. In [8], Desuo et al. employed an improved Dijkstra algorithm to find the shortest path between two consecutive points for IoT networks. In [9], He et al. proposed an energy-aware route selection algorithm for simultaneous information and power transfer to decrease energy consumption. However, these works do not consider the SDN architecture and only consider a single QoS metric. In [10], Saha et al. proposed a traffic-aware QoS route selection scheme by exploiting the flow-based nature of SDN and obtained the optimal route based on Yen’s K-shortest path algorithm. In [11], Li et al. proposed an SDN-enabled IoT adaptive transmission architecture for different delay flow situations. However, these works assume that perfect GSI is available, which is not applicable to smart grid with incomplete information.

Upper confidence bound (UCB), a reinforcement learning algorithm, has emerged as a powerful solution to problems without perfect GSI [12]. In [13], Sun et al. designed an energy-aware mobility management (EMM) scheme based on UCB to optimize energy consumption. In [14], Maghsudi and Stanczak proposed two joint power and channel selection strategies based on UCB to maximize energy efficiency. However, these works only consider energy consumption optimization and ignore delay and other QoS requirements. In [15], Zhao et al. proposed a delay minimization algorithm based on UCB but neglected the joint optimization of energy consumption and delay. In [16], Bae et al. proposed a downlink network routing algorithm based on UCB to jointly optimize throughput and delay but ignored the influence of complex EMI and service priority. Moreover, none of the abovementioned works considers the impact of complex EMI and service priority of smart grid on the joint optimization of route and power selection.

To address the abovementioned challenges, we propose a UCB-based joint route and power selection optimization algorithm. First, considering the influence of EMI, we construct an SDN-based multihop IIoT framework and formulate the joint route and power selection optimization problem. The objective is to maximize the overall network utility function under the threshold constraints of signal-to-interference-plus-noise ratio (SINR) and energy efficiency. Second, we model the joint optimization problem as a multiarmed bandit (MAB) problem, where the options of route and power are combined to form an arm. Finally, we utilize UCB to learn the optimal route and power combination based on local and historical information. The main contributions of this work are summarized as follows:
(i) We propose an SDN-enabled multihop IIoT framework for smart grid, which greatly simplifies network management by separating the control and data planes. In addition, the control plane supports the configuration of intelligent route and power selection algorithms.
(ii) The route and power options are combined to form a set of arms in MAB. The proposed algorithm dynamically learns the optimal combination by interacting with the environment.
(iii) By dynamically adjusting the values of the weight parameters, the proposed algorithm can satisfy differentiated QoS requirements of smart grid by tuning the tradeoff between delay, energy efficiency, and service priority.

The rest of this paper is organized as follows: Section 2 describes the system model and problem formulation. The proposed joint route and power selection algorithm is introduced in Section 3. Section 4 provides simulation results. Finally, the conclusion is provided in Section 5.

2. System Model

In this section, the system model and the problem formulation are introduced.

2.1. Network Model of SDN-Enabled IIoT for Smart Grid

The SDN-enabled IIoT for smart grid is shown in Figure 1. It consists of two planes, i.e., the data plane and the control plane [17]. The data plane mainly contains IIoT devices, which provide data forwarding services. The control plane mainly contains the SDN controller, which is located in the gateway. The SDN controller can obtain the IIoT network topology, learn the optimal route and power selection strategy, and send the strategy to the IIoT source device (SD) [18].

The SDN-enabled IIoT network topology is represented by a directed graph [19], where denotes IIoT devices. The set is defined as . and are the SD and destination device (DD). , , is the relay device. denotes physical links, and the set is defined as , where is the set of devices connected with . There exist routes between and , and the set is represented as . Each route consists of devices, which are SD , DD , and relay devices. The set of devices in is denoted as in the order from SD to DD, where , .
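As an illustration of how the route set between the SD and the DD might be obtained from the directed graph, the following minimal sketch enumerates every loop-free route with a depth-first search. The function name `enumerate_routes`, the adjacency-list representation, and the four-device topology are assumptions made for this example and do not come from the paper.

```python
from typing import Dict, List


def enumerate_routes(adj: Dict[str, List[str]], src: str, dst: str) -> List[List[str]]:
    """Return every loop-free route from src to dst in a directed graph.

    adj maps each device to the devices it has a physical link towards.
    Each route is listed in order from the source device to the destination.
    """
    routes: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if node == dst:
            routes.append(path.copy())
            return
        for nxt in adj.get(node, []):
            if nxt not in path:  # skip devices already on this route
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(src, [src])
    return routes


# Hypothetical topology: SD "v0", DD "v3", and two relay devices.
topology = {"v0": ["v1", "v2"], "v1": ["v3"], "v2": ["v1", "v3"], "v3": []}
routes = enumerate_routes(topology, "v0", "v3")
```

In this toy topology the search yields three routes. In practice the SDN controller would build the adjacency list from the topology it collects from the data plane.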

In this paper, the set of time slots is represented as . The slot length depends on the transmission delay from SD to DD [20]. At the beginning of the -th slot, generates a data packet of size , , which needs to be transmitted to . Each data packet can only be transmitted in one route [21]. The transmission is unsuccessful if the delay exceeds .

2.2. Delay Model

We assume that the data packets are transmitted by wireless channels. We denote as the transmission power, which contains levels. The set of transmission power levels is given by

The achievable transmission rate from to is given by

where is the transmission bandwidth of route . is the SINR [22] between and and is given by

where is the channel gain. is the noise power. is the EMI power.
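A plausible form of these two expressions, assuming a Shannon-type rate and treating EMI as an additive term alongside the noise, is sketched below. The notation is assumed for illustration only: $B_j$ is the bandwidth of route $j$, $P_k$ is the $k$-th power level, $g_{i,i+1}^{j}(t)$ is the channel gain of the hop between the $i$-th and $(i+1)$-th devices on route $j$, $\sigma^2$ is the noise power, and $e_{i,i+1}^{j}(t)$ is the EMI power.

```latex
R_{i,i+1}^{j,k}(t) = B_j \log_2\!\left(1 + \gamma_{i,i+1}^{j,k}(t)\right),
\qquad
\gamma_{i,i+1}^{j,k}(t) = \frac{P_k \, g_{i,i+1}^{j}(t)}{\sigma^2 + e_{i,i+1}^{j}(t)}
```

Under this form, stronger EMI lowers the SINR and hence the achievable rate of the hop, which is the effect that the route and power selection has to compensate for.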

We denote the power selection variable as . represents that selects ; otherwise, . The transmission delay from to and the total forwarding delay on the route are given by

We denote the route selection variable . represents that selects ; otherwise, [23]. The total forwarding delay is given by

2.3. Energy Efficiency Model

The energy consumption for data packet transmission from to and the total energy consumption on route are given by

We define as the energy efficiency of data packet transmission on route with power in the -th time slot, which is given by

Therefore, the total energy efficiency is given by
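The delay and energy efficiency models can be illustrated with a short numerical sketch. It assumes that the per-hop delay is the packet size divided by a Shannon-type rate, that the per-hop energy is the transmission power multiplied by the per-hop delay, and that energy efficiency is the number of delivered bits per joule. The function name `route_metrics`, its parameters, and these exact forms are assumptions for illustration rather than the paper's definitions.

```python
import math
from typing import Sequence, Tuple


def route_metrics(
    packet_bits: float,
    bandwidth_hz: float,
    tx_power_w: float,
    hop_gains: Sequence[float],
    noise_w: float,
    emi_w: float,
) -> Tuple[float, float]:
    """Return (total forwarding delay in s, energy efficiency in bit/J) for one route."""
    total_delay = 0.0
    total_energy = 0.0
    for gain in hop_gains:
        # Assumed SINR: received power over noise plus EMI power.
        sinr = tx_power_w * gain / (noise_w + emi_w)
        # Assumed Shannon-type achievable rate of this hop.
        rate = bandwidth_hz * math.log2(1.0 + sinr)
        hop_delay = packet_bits / rate
        total_delay += hop_delay
        total_energy += tx_power_w * hop_delay
    return total_delay, packet_bits / total_energy


# Hypothetical two-hop route whose SINR equals 1 on every hop.
delay_s, ee_bit_per_joule = route_metrics(
    packet_bits=2000.0,
    bandwidth_hz=1000.0,
    tx_power_w=1.0,
    hop_gains=[1.0, 1.0],
    noise_w=0.5,
    emi_w=0.5,
)
```

Under these assumptions, raising the transmission power shortens the delay through a higher rate but may lower the energy efficiency, which is the tradeoff that the joint selection has to balance.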

2.4. Problem Formulation

Since the data packets have different QoS requirements, the service priority needs to be taken into account. We use to represent the priorities of different data packets. We define the overall network utility function related to the total forwarding delay, service priority, and total energy efficiency as

where is the weight used to balance the order of magnitude.

Therefore, the objective is to maximize by optimizing the route and power selection strategies. The optimization problem is formulated as

where and represent the thresholds of SINR and energy efficiency, respectively. is the route selection constraint; i.e., each data packet can only select one route. is the power selection constraint; i.e., each data packet can only select one power level. is the transmission power constraint. is the SINR constraint. is the energy efficiency constraint.
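One way to write down a problem with the five constraints described above is sketched here. All symbols are assumed for illustration: $U(t)$ is the overall network utility, $x_j(t)$ and $y_k(t)$ are the binary route and power selection variables over $J$ routes and $K$ power levels, $\mathcal{P}$ is the set of power levels, $\gamma_{\min}$ and $\eta_{\min}$ are the SINR and energy efficiency thresholds, and the labels $C_1$ to $C_5$ are hypothetical. The internal form of $U(t)$ is deliberately left abstract.

```latex
\begin{aligned}
\mathbf{P1}:\quad & \max_{\{x_j(t)\},\,\{y_k(t)\}} \; U(t) \\
\text{s.t.}\quad
& C_1:\ \sum_{j=1}^{J} x_j(t) = 1, \quad x_j(t) \in \{0, 1\}, \\
& C_2:\ \sum_{k=1}^{K} y_k(t) = 1, \quad y_k(t) \in \{0, 1\}, \\
& C_3:\ P_k \in \mathcal{P}, \\
& C_4:\ \gamma_{i,i+1}^{j,k}(t) \ge \gamma_{\min}, \\
& C_5:\ \eta^{j,k}(t) \ge \eta_{\min}.
\end{aligned}
```

Because $C_1$ and $C_2$ each pick exactly one option, every feasible decision is a single (route, power level) pair, which is what makes a bandit formulation with one arm per pair natural.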

3. UCB-Based Route and Power Selection Optimization for SDN-Enabled Industrial IoT in Smart Grid

It is impractical to obtain the perfect GSI due to the dynamic network topology and complex EMI, and IIoT devices should optimize route and power selection based on the local-side information. MAB is an effective solution to solve decision-making problems with incomplete information [24]. In each slot, the decision maker pulls an arm. Then, the pulled arm generates a reward. The decision maker’s goal is to maximize the cumulative reward.

We transform P1 into an MAB problem. The decision maker, arm, and reward are modeled as follows:
(i) Decision maker: the decision maker generates the decision. In this paper, we define the SDN controller as the decision maker.
(ii) Arm: we define as the set of arms which satisfy , where represents the number of elements in . The arm represents the route and power .
(iii) Reward: we define a reward function to represent the reward obtained by selecting , which is given by

If and , the reward is . Otherwise, the reward is zero.
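The thresholded reward can be sketched as follows, assuming that the two conditions are the SINR and energy efficiency thresholds of the optimization problem and that the SINR condition must hold on every hop of the selected route. The function name `arm_reward` and its parameters are hypothetical.

```python
from typing import Sequence


def arm_reward(
    utility: float,
    hop_sinrs: Sequence[float],
    energy_eff: float,
    sinr_min: float,
    ee_min: float,
) -> float:
    """Return the utility if both thresholds are met, and zero otherwise.

    Assumption: the SINR threshold applies to the weakest hop on the route.
    """
    feasible = min(hop_sinrs) >= sinr_min and energy_eff >= ee_min
    return utility if feasible else 0.0
```

Returning zero for infeasible choices lets the learner steer away from arms that violate the constraints without handling the constraints explicitly.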

We propose a UCB-based joint route and power selection algorithm for SDN-enabled IIoT in smart grid to address the MAB problem. UCB is a low-complexity learning-based algorithm that balances exploitation and exploration [25]. The proposed algorithm enables the SDN controller to take actions based on local state information such as delay. Afterwards, the obtained reward and the updated state information are perceived by the SDN controller for the next selection. The implementation of the proposed algorithm is shown in Figure 2.

Algorithm 1
Input: , , , .
Phase 1:
 Set , , , and , .
for to do
  Select arms sequentially, and obtain the initial values.
end for
for to do
  Phase 2:
  Calculate the preference of the SD towards arm as (12).
  Select based on (13).
  Phase 3:
  Observe delay and energy efficiency performances.
  Calculate based on (11).
  Update and based on (15) and (16).
end for

The proposed algorithm consists of three phases, which are summarized in Algorithm 1.
(i) Phase 1: , , , and are initialized as zero. When , the controller sequentially selects each arm and obtains the initial value.
(ii) Phase 2: based on (12), the preference of the SD towards arm in the -th slot is given by

where is the average reward of up to the -th time slot. is the number of times has been selected. is the weight of exploration. The second term encourages the controller to explore arms that have been selected only a few times to improve their estimates and to focus on exploitation once the arms have been estimated sufficiently. After obtaining , the selected arm is given by

 represents that the SD selects and , which is given by

(iii) Phase 3: the controller observes the delay and energy efficiency performances as well as the service priority. Then, is calculated based on (11). Accordingly, and are updated as

Finally, the algorithm terminates when .
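The three phases can be illustrated with a minimal UCB sketch in which each arm is a (route, power level) pair. The index used here is the standard UCB1 form, the empirical mean plus sqrt(w ln t / n); the exact exploration term of the proposed algorithm may differ. The function name `ucb_select`, the `observe_reward` callback, the exploration weight, and the demo rewards are all assumptions for illustration.

```python
import math
from itertools import product
from typing import Callable, Dict, List, Tuple

Arm = Tuple[int, int]  # (route index, power-level index)


def ucb_select(
    num_routes: int,
    num_powers: int,
    horizon: int,
    observe_reward: Callable[[Arm], float],
    explore_weight: float = 2.0,
) -> Tuple[Dict[Arm, float], Dict[Arm, int], List[Arm]]:
    """Run a UCB1-style learner over all route and power combinations."""
    arms = list(product(range(num_routes), range(num_powers)))
    mean = {a: 0.0 for a in arms}   # average reward per arm
    count = {a: 0 for a in arms}    # number of selections per arm
    history: List[Arm] = []

    for t in range(1, horizon + 1):
        if t <= len(arms):
            # Phase 1: select each arm once to obtain initial values.
            arm = arms[t - 1]
        else:
            # Phase 2: pick the arm with the largest UCB index.
            arm = max(
                arms,
                key=lambda a: mean[a]
                + math.sqrt(explore_weight * math.log(t) / count[a]),
            )
        # Phase 3: observe the reward and update the running statistics.
        r = observe_reward(arm)
        count[arm] += 1
        mean[arm] += (r - mean[arm]) / count[arm]
        history.append(arm)
    return mean, count, history


# Hypothetical fixed rewards for 2 routes x 2 power levels.
demo_rewards = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.9}
mean, count, history = ucb_select(2, 2, 500, lambda a: demo_rewards[a])
```

With these fixed demo rewards the learner concentrates its selections on the best pair after a short exploration period. In the setting of the paper, `observe_reward` would be replaced by the reward that the SDN controller computes from the observed delay, energy efficiency, and service priority.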

4. Simulation Results

In this section, we first introduce the simulation parameter setting and then present the simulation analysis.

4.1. Simulation Parameter Setting

In this section, we evaluate the proposed algorithm through simulations. The considered IIoT route topology is shown in Figure 3, which includes 9 IIoT devices and 6 routes. and are the SD and DD, respectively. The distances of adjacent devices on each route are shown in Table 1. In the case of large-scale fading, the channel gain is calculated according to [26], where is the distance between and . The EMI varies from 28 dBm to 30 dBm. The service priority is set as . The setting of simulation parameters is summarized in Table 2 [27, 28]. We consider two existing algorithms for comparison. The first one is the UCB-based route selection algorithm named UCB-RS [29]. The other one is the shortest route selection algorithm named SRS [30]. Both UCB-RS and SRS neglect the optimization of power selection.

4.2. Simulation Analysis

Figure 4 shows the average utility versus time slot. Compared with UCB-RS and SRS, the simulation result demonstrates that the proposed algorithm improves the performance of utility by and , respectively. The reason is that the proposed algorithm jointly optimizes the route and power selection. In contrast, UCB-RS neglects the power selection. SRS always selects the shortest route, which cannot overcome the adverse impact caused by the dynamic change of channel state, thereby performing the worst.

Figure 5 shows the average delay versus time slot. The simulation result shows that the proposed algorithm outperforms UCB-RS and SRS by and in delay performance, respectively. Neither UCB-RS nor SRS takes the optimization of power selection into consideration, which results in worse delay performance.

Figure 6 shows the average energy efficiency versus time slot. Compared with UCB-RS and SRS, the proposed algorithm improves the performance of energy efficiency by and , respectively. The proposed algorithm can select suitable power to optimize energy efficiency.

Figure 7 shows the ratio of optimal route selection versus time slot. SRS performs the worst. The reason is that the proposed algorithm and UCB-RS can dynamically adjust the route selection strategy, whereas SRS always selects the shortest route and therefore cannot escape the adverse impact of EMI.

Figures 8-10 show the average energy efficiency, average delay, and average utility versus . With the increase of , the energy efficiency and delay of the proposed algorithm decrease, while the utility increases first and then decreases. When , the utility reaches the maximum value. The performance of UCB-RS fluctuates. The reason is that the proposed algorithm can learn to optimize power selection to meet a more stringent SINR constraint, whereas UCB-RS neglects power selection, which makes it difficult to adapt to different SINR constraints.

Figure 11 shows the impact of on the delay and energy efficiency. As increases, the proposed algorithm places more emphasis on energy efficiency than on delay. The proposed algorithm can thus dynamically balance the tradeoff between energy efficiency and delay. Moreover, the simulation results provide a reference for setting the weight .

5. Conclusions

In this paper, we proposed a UCB-based joint route and power selection optimization algorithm for SDN-enabled IIoT. The proposed algorithm can effectively optimize route and power selection strategies based only on local information and historical observations. It provides a low-complexity route and power selection strategy while maximizing the overall network utility. Simulation results show that the proposed algorithm achieves superior performance in delay, energy efficiency, and utility. Compared with the existing UCB-RS and SRS algorithms, the proposed algorithm reduces the delay by and , improves the utility by and , and improves the energy efficiency by and . In the future, we will use deep reinforcement learning to optimize the multidimensional resource allocation in SDN-enabled IIoT.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by the Science and Technology Project of the State Grid Shandong Power Supply Company under grant no. 5206021900VV.