Abstract

Proactive network solutions (PNS) enable precise management and orchestration (MANO) in the applied artificial intelligence (AI) era. PNS are proposed to advance future mobile edge communications by predicting network faults for reliable slicing configurations. Furthermore, federated learning (FL) systems have been applied to preserve critical mobile data privacy in Internet of Things (IoT) services. Therefore, FL-based IoT communications need a precise PNS to prevent network failures, maximize model inference, and improve end-to-end (E2E) quality of service (QoS). This paper proposes a software-defined network slicing (NS) scheme for IoT communications based on network failure prediction and resource allocation using a deep-Q-network (DQN) approach. The proposed proactive reliable subscribed network slicing is based on software-defined DQN-based proactive dynamic resource allocation (SDQN-PDRA) for adaptive communication configurations. The experiments showed that the proposed approach significantly enhanced stability, reliability, convergence time, and other communication QoS outcomes.

1. Introduction

In next-generation (NG) communication technology, mobile edge computing (MEC), migrated from the remote cloud known as mobile cloud computing (MCC), is intended to empower fronthaul computing resources and enhance NG infrastructure as a service (IaaS) to support novel heterogeneous applications, including the Internet of Things (IoT), heterogeneous IoT (HIoT), the Internet of Healthcare Things (IoHT), and the Internet of Vehicles (IoV), especially for time-critical communications [1–3]. MEC plays an essential role in enabling local services for 5G, which aims to provide agile response services for user devices with ultra-dense new radio (NR) services for massive numbers of user terminals. With the enlargement of edge network infrastructure, intelligent resource management and orchestration (MANO) towards autonomous network configuration has become a critical research area [4]. Additionally, self-organizing networks (SON) must be enhanced, because autonomous networks can be empowered by adopting artificial intelligence (AI) algorithms. Deep learning (DL) models were introduced to handle and improve SON perspectives, especially in distributed network areas.

DL approaches have contributed effectively to handling large-scale, complex network datasets for classification, recommendation, and prediction problems [5]. For these reasons, DL can be applied for efficient MANO of heterogeneous network resources. Especially in IIoT applications, various IoT devices generate large-scale network datasets. For example, in federated learning (FL) based IoV paradigms, each vehicle has its own data cloud to store and compute privacy-constrained information and shares its training model and model parameters with the distributed cloud server for aggregation. The training model is shared between MEC servers in the vehicle edge networks (VEN) [6–9]. Because of the high-speed movement of the vehicles, the exchange of information between MEC servers must be performed with high stability and low latency [10, 11].

Moreover, to cope with the massive data generated under high-speed mobility, a local cloud system must be installed for local training. To achieve reliable networking for end-to-end (E2E) communications, data integrity from the sensors is essential, and in-network communications are required to ensure communication reliability. In FL systems, E2E round-trip communication between vehicle and cloud (V2C) requires ultra-reliable low-latency communication (uRLLC) [12]. Software-defined routing (SDR), based on the software-defined network (SDN) architecture, plays a significant role as a global routing approach that handles multipath forwarding across heterogeneous edge servers. An intelligent routing approach is essential in the VEN solution, while DL can be deployed to classify the different levels of link statuses or to identify reliable edge servers [12]. Adopting DL for intelligent SON with the SDN architecture leverages the NG communication system towards efficient big data network solutions that empower E2E QoS and QoE based on experiential networked intelligence (ENI) [13].

DL models play essential roles in big data network solutions based on the ENI architecture. Moreover, an intelligent softwarized network can be established with an open interface for AI infrastructure. This paper selected a DL algorithm, namely, the recurrent neural network (RNN), to perform network failure prediction in distributed edge servers for proactive network solutions. Generally, to ensure network and FL system assurance, experiential networking is crucial for evaluating future configurations for incoming requests [14, 15]. In a distributed network, DL models empower QoS based on the observation of QoE data points. The RNN takes the top position for prediction purposes over the convolutional neural network (CNN) [14]; meanwhile, CNN is popular and works well with image-sensing data gathered from sensor devices. In the VEN, loading fluctuates suddenly over time, whereas RNN models are primarily based on the long short-term memory (LSTM) structure with multiple gates.

Moreover, extended reinforcement learning (RL), called deep reinforcement learning (DRL), utilizes DL to recommend the optimal Q-value (state and action pair) for optimal action-space selection [16, 17]. DQN approaches have been comprehensively investigated for mobility network resource allocation, offloading, and E2E network MANO [17–19]. DRL provides the ability to adapt to natural network environments by training agents to handle network issues [19–22]. This paper proposes SDN-based subscribed network slicing with DQN-driven network loading adjustment, triggered when the loading metrics of the network devices exceed the defined threshold interval. The SDN controller assures E2E communication reliability for model transfer between clients and aggregation servers through its global view, with proactive fault detection and resource allocation [23–27].

The main contributions of the paper are encapsulated as follows.

(i) We deploy an ENI-based architecture for network condition prediction and resource allocation with an integrated DQN model. The collected network statuses then enter a classification phase to distinguish the distinct network conditions (e.g., MEC loading, traffic loading, path delay), which are essential for SDR rulemaking.
(ii) We deploy DQN for autonomous resource management and to minimize network load fluctuations. The SDN controller employs the DQN model for optimal Q-value selection, implementing DQN in the SDN architecture to explore the most appropriate action for allocating resources to maintain optimal network loading metrics.
(iii) We provide E2E evaluation metrics of our proposed SDQN-PDRA against various approaches in three communication aspects: the model convergence reliability of FL in IIoT, network stability, and QoS analytics. The intelligent network computation and configuration are based on proactive network solutions (PNS), and the SDN controller handles the loading predictions and adjustments proactively.

The remainder of the manuscript is organized as follows. Section 2 presents the related work and the IIoT communication system and its issues. Our solution is described in Section 3. The experiment and numerical evaluation results with detailed interpretation are given in Section 4. Finally, Section 5 presents the conclusion and future work.

2. IIoT Communication System and Issues

Each wireless sensor network (WSN) is attached to a personal edge server with a sufficient local cloud system to perform data storage, local model training, and other significant computations (see Figure 1). Whenever the local edge has insufficient resources to handle heavy traffic, there will be high network delay with congestion caused by network failure [27, 28]. The FL-based system uses model transfer instead of raw data sharing [29]. The FL-based network architecture reduces the amount of traffic over the network, since the raw data captured from the sensor network is stored in the local cloud, which performs the local training [30].

The FL-based communication can be described in three layers. First, the data gathering and local training layer is required to satisfy KPI obligations in terms of data integrity and dataset cleanliness; the sensors can capture overdetected information that will not be utilized for model evaluation, since the PNS requires reliable model inference. Moreover, some of the computation and decisions must be made inside the personal edge server to support in-network processing conducted over in-band communication via Ethernet or wireless links.

However, the synchronization processes between local and global entities will be made frequently, while the vast number of local models requires aggregation transfer to edge servers. At the same time, heavy computation is not suitable for execution inside a local server, especially when data for model decisions are missing. Moreover, due to the high speed and density of IoT networking, joining radio networks entails high failure ratios that must be handled for uRLLC. Additionally, optimal remote radio head (RRH) recommendations will diminish the failure ratio of joining the radio access network (RAN). The aggregation server shares its model with the server in charge of continued computing in the handover process whenever an alternative is mandatory. In the sharing process, heavy traffic will be generated, and new route installation will be obligated.

Moreover, rapid network convergence becomes a key challenge when each router holds a large-scale network database that takes long computing periods. Therefore, PNS (advance computation) and route installation in time-critical IoT reduce delays for massive routing decisions. Additionally, rapid routing will utilize the SDN architecture, since route computation and installation are conducted in the control plane (CP), while the data plane (DP) performs data forwarding based on the installed routes [31–34]. Thus, the computation can be wholly separated and performed in advance in the CP.

2.1. System Model

The system model is based on the ENI architecture, converging three primary contributors: DQN, caching, and the SDN controller. The network conditions in terms of delay, congestion window, and resource limitation are considered the network loading parameters.

In federated IoT, a local server is attached to the IoT system to store the sensed data from the various intelligent sensor devices. Local training is conducted by splitting the local dataset of each client $k$ into minibatches of size $B$, which are included in the set $\mathcal{B}$. The locally trained and updated models are sent to the global servers for aggregation, which can be modeled as follows:

$$w_{t+1}^{k} = w_{t}^{k} - \eta \nabla\, \mathrm{MSE}\bigl(w_{t}^{k}; b\bigr), \quad b \in \mathcal{B}, \tag{1}$$

where $w_{t+1}^{k}$ is the model parameter update from local IoT client $k$, $b$ denotes a local data minibatch of the total client set $K$, and MSE is the mean squared error representing the loss function of the deep neural network (DNN). Each local client transmits its updated model over the wireless network to the aggregation server. The global server collects the up-to-date models from the various aggregation servers for model accumulation and then sends the averaged global model back to the local clients. The global model can be modeled as

$$w_{t+1} = \sum_{k=1}^{K} \frac{n_{k}}{n}\, w_{t+1}^{k}, \tag{2}$$

where $w_{t+1}^{k}$ is the updated model of client $k$ at each time step $t$, $n_{k}$ is the local sample count with $n = \sum_{k} n_{k}$, and $w_{t+1}$ is the global update summation at time $t+1$. Increasing the number of round-trip time (RTT) communications from local clients to the server will boost the global training accuracy; however, a larger number of RTT model communications will reduce the E2E transmission QoS. Generally, the RTT of FL communication over 5G technology suffers from poor service identity, so intelligent handling of the service level agreement (SLA) requires additional resource allocation with high-level priority.
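To make equations (1) and (2) concrete, the following is a minimal FedAvg-style sketch in plain NumPy on a toy regression task; the two-client setup, learning rate, step counts, and round count are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np

def local_update(w, grad_fn, lr=0.1, steps=5):
    """One client's local training: a few gradient steps on its local MSE loss."""
    for _ in range(steps):
        w = w - lr * grad_fn(w)
    return w

def fedavg(client_weights, client_sizes):
    """Global aggregation (cf. eq. (2)): average client models weighted by n_k / n."""
    n = sum(client_sizes)
    return sum((nk / n) * wk for wk, nk in zip(client_weights, client_sizes))

# Toy example: two clients with unequal local datasets jointly fitting y = 2x.
rng = np.random.default_rng(0)
clients = [rng.normal(size=50), rng.normal(size=80)]
w_global = 0.0
for _ in range(10):  # communication rounds (client-server RTTs)
    updates, sizes = [], []
    for x in clients:
        y = 2.0 * x
        grad = lambda w, x=x, y=y: np.mean(2.0 * (w * x - y) * x)  # d/dw of MSE
        updates.append(local_update(w_global, grad))
        sizes.append(len(x))
    w_global = fedavg(updates, sizes)
print(round(float(w_global), 3))  # converges towards the true slope 2.0
```

As in the text, more communication rounds drive the global model closer to the clients' data, at the cost of more RTT traffic.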

2.2. Communication Overhead

The overall overhead, which reduces the network QoS, can be expressed based on a queuing system. In each network node and aggregation server, the serving overhead at a particular queue interface can be modeled as an M/M/1 queuing system with limited server capacity. The serving ratio between the arriving task rate $\lambda$ and the serving resource rate $\mu$ is $\rho = \lambda / \mu$. For $\rho < 1$, the steady-state probability of $n$ user traffic in the system can be measured as follows:

$$P_{n} = (1 - \rho)\,\rho^{n}. \tag{3}$$

The mean waiting time in a single individual edge server can be measured as

$$W = \frac{1}{\mu - \lambda}. \tag{4}$$

Then, the mean number of user traffic in a single individual edge server can be modeled as

$$L = \lambda W = \frac{\rho}{1 - \rho}. \tag{5}$$

The communication overhead is considered in terms of the computation and communication delays. Furthermore, we denote the communication rate between serving nodes (the $j$ to $k$ interface) in wired-based networks with bandwidth $B_{j,k}$, transmission power $p_{j}$, noise power between the $j$ and $k$ interfaces $\sigma_{j,k}^{2}$, and communication channel gain $g_{j,k}$. Thus, the transmission rate from the aggregation server to another global server can be expressed as

$$r_{j,k} = B_{j,k} \log_{2}\left(1 + \frac{p_{j}\, g_{j,k}}{\sigma_{j,k}^{2}}\right). \tag{6}$$

With the sets of serving nodes $j \in J$ and interfaces $k \in K$, the total transmission rate between nodes can then be expressed as

$$r = \sum_{j \in J} \sum_{k \in K} B_{j,k} \log_{2}\left(1 + \frac{p_{j}\, g_{j,k}}{\sigma_{j,k}^{2}}\right). \tag{7}$$
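As a numeric illustration of equations (3)–(7), the snippet below evaluates the M/M/1 queue metrics and a Shannon-type link rate; the arrival rate, service rate, bandwidth, power, gain, and noise values are all hypothetical.

```python
import math

def mm1_metrics(lam, mu):
    """M/M/1 utilization rho, mean number in system L, and mean waiting time W."""
    rho = lam / mu                 # serving ratio; rho < 1 required for stability
    assert rho < 1, "queue is unstable"
    L = rho / (1 - rho)            # mean number of user traffic in the system
    W = 1 / (mu - lam)             # mean waiting time (Little's law: L = lam * W)
    return rho, L, W

def link_rate(bandwidth_hz, power_w, gain, noise_w):
    """Shannon-type transmission rate between the j and k interfaces (bit/s)."""
    return bandwidth_hz * math.log2(1 + power_w * gain / noise_w)

rho, L, W = mm1_metrics(lam=80.0, mu=100.0)   # tasks/s (assumed values)
r = link_rate(20e6, 0.5, 1e-7, 1e-10)         # 20 MHz, 0.5 W, assumed gain/noise
print(f"rho={rho:.2f}, L={L:.1f}, W={W*1e3:.1f} ms, rate={r/1e6:.1f} Mbit/s")
```

With these assumed values, the queue holds four tasks on average with a 50 ms wait, which is the kind of loading metric the controller later tries to keep below threshold.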

3. Our Solution

The SDN-based DQN allocates resources in advance for feasible flow handling for every predicted loading metric; the SDN-based DQN approach performs the allocation processes and optimal path selections (see Figure 2). In the optimal state, the network loading metric stays under the defined threshold. The forwarding of information from the local to the global servers follows the prediction and resource adjustment schemes. The SDN controller establishes the routing policy according to comparisons of real-time observation metrics; the path with the minimum metric is considered the feasible routing path. However, when every possible routing path has a loading metric above the defined threshold (nonoptimal state), the SDN controller adjusts resources by querying the optimal action until the observed condition reaches an optimal state, as sketched below. The SDN controller then selects the optimal gateway to ensure the transfer of updated model parameters from local devices and the download of the aggregated model from the aggregation server (see Algorithm 1).
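The decision flow above can be sketched as follows; the threshold value, the path-to-metric mapping, and the `dqn_adjust` hook are illustrative assumptions standing in for the controller's real interfaces.

```python
# Sketch of the controller's routing decision: pick the minimum-loading path;
# if every path exceeds the threshold, let the DQN agent adjust resources.
THRESHOLD = 100  # optimal-state loading threshold (assumed value)

def select_path(path_metrics, dqn_adjust, max_steps=50):
    """Return a feasible path, invoking DQN adjustment while all are overloaded."""
    best = min(path_metrics, key=path_metrics.get)
    steps = 0
    while path_metrics[best] > THRESHOLD and steps < max_steps:
        path_metrics = dqn_adjust(path_metrics)   # one DQN action per step
        best = min(path_metrics, key=path_metrics.get)
        steps += 1
    return best

# Toy adjustment mimicking the agent shedding load from the busiest path.
def toy_adjust(metrics):
    worst = max(metrics, key=metrics.get)
    metrics = dict(metrics)
    metrics[worst] = max(0, metrics[worst] - 30)
    return metrics

print(select_path({"p1": 180, "p2": 140, "p3": 220}, toy_adjust))
```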

The ENI infrastructure is not utilized in steady situations; instead, the SDR coordinates the model sharing over the current route. This method helps communication stability while the DP communications share similar loading metrics. However, the caching metric is not utilized for route configuration under frequently fluctuating communication statuses, since the feasible optimal aggregation server must be redefined. Furthermore, IIoT applications are unsuitable for a restriction scheme that throttles sending resources during heavy network loading, as it increases the waiting time at the source devices.

3.1. Joined Environment for Resource Adjustment

To evaluate autonomous resource allocation, the environment randomly generates loading metrics between 0 and 255 to represent varied network stability. A variety of entities constitute the IoT environment in terms of data plane conditions, called the state space $S$, and the SDN controller takes an action at each time-space based on the optimal policy $\pi^{*}$, which provides the maximum Q-value. The detailed description follows.

3.1.1. State Space

At each time-space $t$, the SDN controller maintains the following information, which can be affected by the network conditions:
(i) $B_{j,k}$ is the bandwidth of the wired link between the $j$th and $k$th interfaces.
(ii) $N_{t}$ is the number of assigned tasks.
(iii) $U_{t}$ is the number of user requests at time $t$.
(iv) $Q_{t}$ is the average queue length at time $t$.
(v) $r_{j,k}(t)$ is the transmission rate of the wired link between serving entities at time $t$.
(vi) $\phi_{j,k}(t)$ denotes the loading metrics at the interfaces between the $j$ and $k$ network devices at time $t$.

The above information constitutes the entities of the system state space $S$, that is, $s_{t} \in S$. Since the communication bandwidth at the $j$ and $k$ interfaces of a wired link shares the same metric in the system state information, the state can be expressed as

$$s_{t} = \bigl\{ B_{j,k},\, N_{t},\, U_{t},\, Q_{t},\, r_{j,k}(t),\, \phi_{j,k}(t) \bigr\}. \tag{8}$$

3.1.2. Action Space

The agent takes a significant role in deciding the optimal action for flow requests according to the network states. In this action space, the agent considers the optimal action to meet the defined state information. We denote $A$ as our global action space, and $a_{j,k}(t)$ as the action taken at the interfaces between the $j$ and $k$ network devices at time $t$. While $A$ denotes the global action space, $a_{t} \in A$ is performed at time slot $t$; $a_{t}$ then varies corresponding to the state space information.

3.1.3. Reward Calculation

In our system, the agent selects an action $a_{t}$ at each time $t$ (randomly during exploration), and the reward is offered based on the network state information. The feedback from the environment trains the agent to determine the optimal action, and the optimal action produces the optimal state. After $a_{t}$ is taken, the reward $r_{t}$ is immediately provided:

$$r_{t} = R(s_{t}, a_{t}), \tag{9}$$

$$R(s_{t}, a_{t}) = \begin{cases} r^{+}, & \phi_{j,k}(t) \le \phi_{\mathrm{th}}, \\ r^{-}, & \phi_{j,k}(t) > \phi_{\mathrm{th}}, \end{cases} \tag{10}$$

$$R = \sum_{t=0}^{T} \gamma^{t}\, r_{t}, \tag{11}$$

where $\phi_{\mathrm{th}}$ is the defined loading threshold and $r^{+}$ and $r^{-}$ are the rewards for good and bad states, respectively.

Based on equations (9)–(11), the reward metric represents the good and bad states at each time-space. Further, the reward accumulation represents the optimal network condition over the entire communication period.
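A minimal sketch of the threshold-based reward in (9)–(11) follows; the unit good/bad rewards, the threshold value, and the discount factor are assumptions for illustration.

```python
THRESHOLD = 100   # optimal-state loading threshold (assumed)
GAMMA = 0.9       # discount parameter gamma (assumed)

def reward(loading_metric):
    """Immediate reward: +1 for an optimal (good) state, -1 for a bad state."""
    return 1.0 if loading_metric <= THRESHOLD else -1.0

def discounted_return(loading_trace):
    """Reward accumulation over a whole communication period (cf. eq. (11))."""
    return sum((GAMMA ** t) * reward(m) for t, m in enumerate(loading_trace))

print(discounted_return([90, 120, 80, 250, 60]))  # mixed good/bad states
```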

3.1.4. Optimal Policy

The reward metric corresponds to the system environment status based on the optimal action selection. The agent explores the optimal action for reducing the network loading metrics; based on its experiences, the agent can select the optimal action in similar network states. The agent chooses an action according to the policy $\pi$, and the optimal policy $\pi^{*}$ returns the maximum sequence of corresponding rewards. To maximize the Q-value, the action requires the optimal policy $\pi^{*}$, and our Q-value can be obtained by utilizing the Bellman optimality equation

$$Q^{*}(s_{t}, a_{t}) = \mathbb{E}\Bigl[ r_{t} + \gamma \max_{a_{t+1}} Q^{*}(s_{t+1}, a_{t+1}) \,\Big|\, s_{t}, a_{t} \Bigr]. \tag{12}$$

The optimal Q-value at state $s_{t}$ and action $a_{t}$ under the policy $\pi^{*}$ corresponds to the discount parameter $\gamma \in [0, 1]$, whose weight reflects how strongly the agent values future rewards. Moreover, for the main network parameters $\theta$ and target network parameters $\theta^{-}$, the optimal policy that yields the maximum Q-value for the paired state and action is denoted as

$$\pi^{*}(s_{t}) = \arg\max_{a_{t}} Q(s_{t}, a_{t}; \theta). \tag{13}$$

From (12), the agent explores the action policy in the observed state to find the maximum Q-value. The optimal Q-value is then the sum of the reward and the discount parameter multiplied by the maximum Q-value of the next state $s_{t+1}$ and action $a_{t+1}$, as depicted in (14). However, when $\gamma$ is close to 1, the system suffers additional computation delay, since the agent has more chances to discover the optimal Q-value. Consequently, our target network value is written as

$$y_{t} = r_{t} + \gamma \max_{a_{t+1}} Q\bigl(s_{t+1}, a_{t+1}; \theta^{-}\bigr). \tag{14}$$

Then, the loss between the target and actual network parameters over minibatches can be expressed as

$$L(\theta) = \mathbb{E}\Bigl[\bigl(y_{t} - Q(s_{t}, a_{t}; \theta)\bigr)^{2}\Bigr]. \tag{15}$$
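The target and loss computations in (14) and (15) can be illustrated compactly as below; the tiny linear Q-network and the random batch are placeholders standing in for the paper's DNN and replay samples.

```python
import numpy as np

GAMMA = 0.9  # discount parameter (assumed)

def q_values(theta, states):
    """Placeholder Q-network: linear in the state, one column per action."""
    return states @ theta

def dqn_loss(theta, theta_target, batch):
    """Mean squared TD error between target y (eq. (14)) and Q(s, a; theta)."""
    s, a, r, s_next = batch
    y = r + GAMMA * q_values(theta_target, s_next).max(axis=1)   # eq. (14)
    q_sa = q_values(theta, s)[np.arange(len(a)), a]
    return np.mean((y - q_sa) ** 2)                              # eq. (15)

rng = np.random.default_rng(1)
theta = rng.normal(size=(6, 4))          # 6 state features, 4 actions
theta_target = theta.copy()              # target network starts synchronized
batch = (rng.normal(size=(32, 6)),       # states
         rng.integers(0, 4, size=32),    # actions
         rng.normal(size=32),            # rewards
         rng.normal(size=(32, 6)))       # next states
print(round(float(dqn_loss(theta, theta_target, batch)), 3))
```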

Algorithm 1: Model training and aggregation between client and aggregation server. $x$ denotes the general features and $y$ denotes the target features in the global dataset required for client and server.
(1) Initialize the synchronous model parameters $w_{0}$, learning rate $\eta$, and number of epochs $E$ for the aggregation server
(2) Ensure the optimal sharing path between client and server, and the routing path (see Algorithm 2)
(3) [Aggregation Server]
(4) for each epoch in range($E$) do
(5)  Select lowDNN() for clients
(6)  Aggregate the model for the next epoch by using the FedAvg algorithm [35]
(7) end for
(8) [Client Server]
(9) for each round $t$ in ($T$) do
(10)  Input $x$, $y$, and $w_{t}$ parameters
(11)  Define class lowDNN(self, $x$, $y$):
(12)   for each client $k$ in $K$ do
(13)    $w_{t+1}^{k} \leftarrow w_{t}^{k} - \eta \nabla\, \mathrm{MSE}\bigl(w_{t}^{k}; x, y\bigr)$
(14)   end for
(15) end for

The SDQN-PDRA is based on the resource adjustment scheme to reduce the loading metrics in each loading period (see Algorithm 2). The SDN controller adjusts the observed loading metrics based on the DQN method, and in this scenario we suppose each serving server can retrieve global resources from the root or global MEC server. As described above, the path with the minimum real-time loading metric is selected as the feasible routing path; whenever every possible routing path exceeds the defined threshold (nonoptimal state), the SDN controller adjusts resources by querying the optimal action until the network condition reaches an optimal state.

Algorithm 2: DQN-based proactive dynamic resource adjustment.
(1) Initialize the main parameters $\theta$, target parameters $\theta^{-}$, and replay buffer $\mathcal{B}$, respectively
(2) Define the number of episodes $E$
(3) for each step $t$ in the episodes do
(4)  State observation $s_{t}$
(5)  DQN agent selects action $a_{t}$ based on the optimal policy $\pi^{*}$
(6)  Execute the action, explore the next state $s_{t+1}$, and obtain the reward $r_{t}$
(7)  At each time slot $t$, the SDN controller executes the action $a_{t}$
(8)  if the replay buffer $\mathcal{B}$ is not full then
(9)   cache $(s_{t}, a_{t}, r_{t}, s_{t+1})$ into the replay buffer
(10) else
(11)  Replace the oldest queued element with the current transition (FIFO process)
(12) end if
(13) Transition to the next network state $s_{t+1}$
(14) Sample a random minibatch of transitions from the replay buffer $\mathcal{B}$
(15) Compute the target network value:
(16)  $y_{t} = r_{t} + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \theta^{-})$
(17) Compute and minimize the loss:
(18)  $L(\theta) = \bigl(y_{t} - Q(s_{t}, a_{t}; \theta)\bigr)^{2}$
(19) Update the target network $\theta^{-}$ based on the updated $\theta$:
(20)  $\theta^{-} \leftarrow \theta$
(21) end for
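Lines (8)–(12) of Algorithm 2 amount to bounded FIFO caching of transitions; in Python this maps naturally onto `collections.deque` with `maxlen`, as sketched below (buffer capacity and minibatch size are assumed).

```python
import random
from collections import deque

BUFFER_SIZE = 10_000   # replay buffer capacity (assumed)
BATCH_SIZE = 32        # minibatch size for training (assumed)

# deque(maxlen=...) evicts the oldest transition automatically (FIFO),
# matching the replace-on-full behavior in Algorithm 2.
replay_buffer = deque(maxlen=BUFFER_SIZE)

def store(s, a, r, s_next):
    replay_buffer.append((s, a, r, s_next))

def sample_minibatch():
    """Random minibatch of transitions for the target/loss computation."""
    return random.sample(replay_buffer, min(BATCH_SIZE, len(replay_buffer)))

for t in range(100):                       # dummy interaction loop
    store(s=t, a=t % 4, r=1.0, s_next=t + 1)
print(len(replay_buffer), sample_minibatch()[0])
```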

4. Numerical Evaluation

This section describes the system evaluation setup, including the experimental parameters, hyperparameters, and components used to conduct the E2E simulation. It then presents the numerical evaluation of the prediction model, the model convergence accuracy between client and server under various network conditions, and the efficiency of the SDQN-based resource adjustment with respect to communication QoS.

4.1. Simulation Environments

During the simulations, the captured delay was utilized to represent real-world network loading metrics. In addition, the open-source EMNIST dataset [35] was loaded from federated EMNIST to evaluate converged network reliability. The EMNIST dataset was sliced to match the number of clients for testing using the Google platform. Each client has its own slice of the dataset (an individual dataset) and training model, and the aggregation server utilizes the FedAvg function offered by TensorFlow Federated [35]. Moreover, the E2E evaluations were based on the simulated metrics captured from NS3 [36] simulations (see Table 1), with the hyperparameters and experiment components listed in Tables 2 and 3, respectively.
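A minimal sketch of this setup with TensorFlow Federated is shown below, slicing federated EMNIST across simulated clients and running FedAvg; the client count, batch size, round count, and model are illustrative, and the API names follow the classic tff.learning interface, which may differ across TFF versions.

```python
import tensorflow as tf
import tensorflow_federated as tff

# Federated EMNIST ships pre-partitioned by writer, one slice per client.
emnist_train, _ = tff.simulation.datasets.emnist.load_data()

def preprocess(ds):
    def fmt(e):
        return (tf.reshape(e['pixels'], [-1, 784]),
                tf.reshape(e['label'], [-1, 1]))
    return ds.batch(20).map(fmt)

client_ids = emnist_train.client_ids[:10]   # 10 simulated clients (assumed)
train_data = [preprocess(emnist_train.create_tf_dataset_for_client(c))
              for c in client_ids]

def model_fn():
    keras_model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax')])
    return tff.learning.from_keras_model(
        keras_model,
        input_spec=train_data[0].element_spec,
        loss=tf.keras.losses.SparseCategoricalCrossentropy())

fed_avg = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(0.02))
state = fed_avg.initialize()
for round_num in range(5):                  # communication rounds
    state, metrics = fed_avg.next(state, train_data)
    print(round_num, metrics)
```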

4.2. Results and Discussion

The state observation over 1000 episodes in a real-world VEN environment was conducted, and we applied our DQN approach to adjust the state space metric to meet the determined optimal-state threshold (see Figure 3). The VEN states in natural communication consist of average bad and good network state counts of 397.724 and 32.276, respectively (see Figure 3(a)). In some cases, the natural VEN environment has 0 optimal and 430 bad states (100% of observed states in bad condition). With the defined discount factor and learning rate, our proposed resource adjustment reduced the communication overhead in the VEN and reached an average optimal state count of 413.1537688 and a bad state count of 16.84623116 (see Figure 3(b)). With this effective result, in some episodes the proposed scheme reached 100% optimal state handling (430 optimal states) with 0% bad network state counts (0 bad states). Based on these notable metrics, our scheme achieved the optimal state up to 96.08227% of the time.

Four conditional simulations of the federated model experiments were conducted to emphasize FL model reliability under actual network routing situations (see Figure 4). The graph presents the remarkable outperformance in convergence accuracy of the optimal network selected path (ONSP) over the three other candidate routing paths; simple congestion of network selected path (SCNSP), congestion of network selected path (CNSP), and heavy congestion of network selected path (HCNSP) were simulated to reflect model reliability in network environments. The mean training loss over 99 communication rounds of ONSP, SCNSP, CNSP, and HCNSP is 0.484335, 0.743309, 1.129355, and 1.354101, respectively (see Figure 4(a)), and the minimum loss of ONSP, SCNSP, CNSP, and HCNSP is 0.052342, 0.309779, 0.694963, and 0.94137, respectively. Based on the mean loss comparisons, ONSP lessens the loss metric relative to SCNSP, CNSP, and HCNSP by 0.258974, 0.64502, and 0.869765, respectively. Model aggregation relies on the network situation: a congested environment leads to losses in model sharing between aggregation servers, which can cause low accuracy in terms of global model reliability. The E2E model reliability corresponds to the global model accuracy comparison between ONSP, SCNSP, CNSP, and HCNSP (see Figure 4(b)). The ONSP approach reached a maximum accuracy of 0.998873, while SCNSP, CNSP, and HCNSP reached 0.941435, 0.925075, and 0.825702, respectively. By numerical comparison, ONSP improved on the other candidate routing paths by 0.058974138, 0.14566234, and 0.270408005, respectively. Because ONSP delivers the optimal scheduling approach, the network loading metrics are lessened through the proactive network configurations. Furthermore, the proposed ONSP approach also enhances the possibility of saving computation power in the CP.

In terms of E2E communication QoS metric evaluation, we compared our proposed integrated software-defined DQN for proactive resource allocation (SDQN-PDRA) with other approaches, including software-defined RNN dynamic routing (SDRDR), software-defined dynamic routing (SDDR), and software-defined experience routing (SDER). Our proposed SDQN-PDRA approach showed remarkably better results than SDRDR, SDDR, and SDER in terms of packet drop counts, packet drop ratio, packet delivery ratio, and communication delay (see Figure 5).

The natural network environment has limited network loading awareness for improving the routing experience. Thus, local and external data sharing can be handled by static and dynamic routing protocols. However, static and dynamic routing protocols are weak at selecting the optimal path cost-efficiently. Our proposed SDQN-PDRA provides high accuracy in network loading prediction for the lowest-cost routing and can reduce the loading state to meet the defined optimal condition threshold. Our proposed SDQN-PDRA obtained the lowest packet drop counts compared with SDRDR, SDDR, and SDER, with mean values of 69, 120.2666667, 339.0666667, and 737, respectively (see Figure 5(a)).

Our proposed SDQN-PDRA also obtained the lowest packet drop ratio compared with SDRDR, SDDR, and SDER: 0.01925501%, 0.027260413%, 0.097242413%, and 0.176594407%, respectively (see Figure 5(b)). Therefore, the proposed SDQN-PDRA lessened the E2E communication loss between client and server in network environments, as shown in the graphs. For the E2E communication reliability corresponding to the communication drop ratio, see Figure 5(c). Our proposed SDQN-PDRA achieved the highest E2E communication reliability: the average reliability of SDQN-PDRA, SDRDR, SDDR, and SDER was 99.98074499%, 99.97273959%, 99.90275759%, and 99.82340559%, respectively. Based on these reliability metrics, our proposed SDQN-PDRA is 0.008005403%, 0.077987403%, and 0.157339397% higher than the communication reliability of SDRDR, SDDR, and SDER, respectively.

A selected routing path with a high loading metric will suffer computation overhead, which can postpone serving requests and increase the waiting time of arriving traffic. Moreover, the network buffer can be exhausted whenever the serving rate falls below the requested task rate. To cope with these issues, real-time loading resource reduction plays an essential role in improving the communication experience. The E2E communication delay of the proposed SDQN-PDRA approach versus the SDRDR, SDDR, and SDER approaches is presented in Figure 5(d). The proposed SDQN-PDRA effectively reduced the network loading metric and selected the optimal path for route installation, so the communication delay was considerably reduced for E2E data sharing. Based on the graphs, the proposed SDQN-PDRA reached the minimum average delay of 8.294905267 milliseconds, while SDRDR, SDDR, and SDER consumed higher delays of 19.62816013, 71.227377, and 163.8931339 milliseconds, respectively. Thus, our proposed SDQN-PDRA responded 11.33325487, 62.93247173, and 155.5982286 milliseconds faster than SDRDR, SDDR, and SDER, respectively.

5. Conclusion

This paper proposed an intelligent SDR integrating SDN-based RNN traffic loading prediction and DQN-based network loading adjustment (SDQN-PDRA) for reliable FL-based IIoT. It is worth noting that IoV communication must be handled as a real-time service in distributed edge routing. Moreover, high-speed mobile sensors face many challenges in data processing, ultra-high-mobility communication, and frequent edge cloud handovers. FL-based IoT will form a large-scale distributed cloud requiring intelligent routing that effectively delivers uRLLC in the routing and network convergence processes. Network assurance makes an essential contribution to reliable FL in IoT systems, since network reliability influences the reliability of FL convergence models in terms of accuracy and decision making. Our proposed SDQN-PDRA approach prevents routing failures and adjusts the network condition to meet the optimal states. The proposed SDQN-PDRA provided remarkable contributions to IoT systems regarding stability conditions and E2E communication QoS, including reliability, latency, and communication throughput. For future work, we will explore the influence of computation cost on communication overhead in routing and expand the routing environment to reflect real-world IoT communications in the 5G system. Furthermore, SDN-based multidimensional deep-Q-network approaches will be investigated to improve the autonomous routing policy.

Data Availability

The data and findings used to support this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was funded by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2020R1I1A3066543) and BK21 FOUR (Fostering Outstanding Universities for Research) (no. 5199990914048). In addition, this work was supported by the Soonchunhyang University Research Fund and by the Bio and Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korean government (MSIT) (no. NRF-2019M3E5D1A02069073).