Abstract

To address the degradation of diagnostic accuracy and stability caused by the random selection of initial parameters in the wavelet neural network (WNN) fault diagnosis model, this paper proposes a network fault diagnosis model based on an improved gray wolf optimizer (IGWO) and the wavelet neural network. First, the convergence factor and the weight update strategy of the GWO algorithm are redesigned: a nonlinear convergence factor balances the global and local search capabilities of the algorithm, and the weights are dynamically adjusted according to the fitness of the head wolf α to strengthen its leadership. The initial weights and biases of the WNN are then optimized with the IGWO algorithm. During the backpropagation of the WNN error, a momentum factor is introduced to prevent the model from falling into local optima. Experimental results show that the IGWO algorithm clearly outperforms GWO in both convergence speed and convergence accuracy. Furthermore, the average diagnostic accuracy of the IGWO-WNN model on the KDD-CUP99 dataset reaches 99.22%, which is 1.15% higher than that of the WNN model, and the stability of the diagnostic results is significantly improved.

1. Introduction

With the development of big data and the Internet of Things (IoT) [1], network failure incidents caused by network intrusions have become increasingly frequent, posing a major challenge to network security. Network fault diagnosis technology has therefore received widespread attention, and various fault diagnosis algorithms have been proposed.

Parmar et al. [2] used SVM for fault diagnosis and achieved excellent results, with an accuracy of almost 96%. However, this method is difficult to train on large-scale fault samples, and its diagnostic accuracy still leaves room for improvement. Mazini et al. [3] proposed a network fault diagnosis model based on AdaBoost, which first uses the artificial bee colony algorithm for feature selection and then AdaBoost for fault classification; it achieves better diagnostic accuracy than traditional fault diagnosis methods. However, it is very sensitive to anomalous samples and performs poorly on unbalanced data. Zhang et al. [4] proposed a network fault diagnosis method based on fault trees, which represent fault indications and the logical structure of faults; diagnostic rules are then extracted from the fault tree and matched to fault characteristics one by one. This method can clearly represent the internal logical structure of a fault, but reasonable rules cannot be explicitly extracted for all faults, and its generalization is poor.

In recent years, fault diagnosis methods based on deep learning have become increasingly popular, and many fault diagnosis methods based on artificial neural networks have been proposed [5-7]. These methods outperform traditional methods in both false positive rate and accuracy, which has made artificial neural networks a research hotspot in the field of fault diagnosis.

The wavelet neural network (WNN) is a feedforward neural network [8, 9] with strong self-learning ability, generalization ability, nonlinear mapping ability, and good time-frequency localization. WNNs have achieved good results on complex nonlinear fault diagnosis problems. However, the initial weights and biases of the traditional WNN model are selected at random, which strongly affects the learning performance of the model. If appropriate initial weights and biases are not selected, the accuracy and stability of fault diagnosis may suffer.

Metaheuristic algorithms are fast and robust global optimization methods that are often used to optimize the initial parameters of a model, owing to their outstanding ability to solve complex optimization problems. Well-known examples include the genetic algorithm (GA) [10], particle swarm optimization (PSO) [11], the artificial bee colony algorithm (ABC) [12], ant colony optimization (ACO) [13], the arithmetic optimization algorithm (AOA) [14], and the golden eagle optimizer (GEO) [15].

Among them, the gray wolf optimization (GWO) algorithm [16] is a recently developed metaheuristic inspired by the hunting behavior of gray wolves. In GWO, the search direction is determined by the leading wolves (alpha, beta, and delta) of the pack, which helps the algorithm converge quickly.

In recent years, the GWO algorithm has been widely used in feature selection, parameter optimization, power dispatching, and other fields owing to its strong convergence performance, few parameters, and easy implementation. Although its widespread use attests to its search performance, GWO still stagnates prematurely in some cases and falls into local optima. Many scholars have therefore tried to improve its search performance. Yu et al. [17] proposed OGWO, which combines opposition-based learning and nonlinear parameter control with GWO; it helps the algorithm escape local optima without increasing computational complexity, but it easily skips over the true solution on multipeak problems. Fan et al. [18] proposed a gray wolf optimization algorithm based on the beetle antennae strategy (BGWO), which gives the pack leader "hearing" to reduce unnecessary searches; this strengthens the global search ability of GWO but is less satisfactory on single-peak problems. Bansal et al. [19] combined opposition-based learning (OBL) with the exploration equation to handle situations where the target is far from the current position. Dhargupta et al. [20] used Spearman's correlation coefficient to determine the locations of the omega wolves and combined OBL with GWO, applying OBL to only a few wolves; this avoids unnecessary global searches and enables fast convergence. Gupta et al. [21] proposed mGWO, which determines each wolf's search mechanism from its personal best history through crossover and greedy selection; these strategies enhance both global and local search capabilities but require a lot of memory.

Therefore, to address the random selection of the initial parameters of the wavelet neural network and the resulting low diagnostic accuracy and poor stability, this paper introduces the improved GWO algorithm into the WNN model and proposes a new IGWO-WNN network fault diagnosis model. The main contributions of this article are as follows:

(1) In the backpropagation of the wavelet neural network error, a momentum factor is introduced; that is, the correction of the model parameters considers not only the current error but also the previous training error, which helps the network training escape local optima.

(2) This paper proposes a new variant of the gray wolf algorithm, which balances the global and local search capabilities by improving the convergence factor and introduces a new wolf pack update strategy that strengthens the leadership of the head wolf α during the update, so as to keep the search from falling into local optima and to improve the convergence speed and accuracy of the algorithm.

(3) This paper combines the momentum factor with gray wolf parameter optimization, which avoids the randomness of the initial parameter selection of the wavelet neural network and effectively improves the accuracy and stability of the network fault diagnosis results.

(4) The proposed IGWO-WNN fault diagnosis model generalizes well: it not only achieves good results on network faults but is also suitable for diagnosing mechanical faults based on thermal imaging [22, 23], vibration signals [24, 25], and acoustic signals [26, 27].

2. WNN with Momentum Factor

Suppose there are P groups of training data, N input layer units, and M output layer units. The p-th sample input is $x_p = (x_{p1}, x_{p2}, \ldots, x_{pN})$, the output is $y_p = (y_{p1}, y_{p2}, \ldots, y_{pM})$, and the expected output is $\hat{y}_p = (\hat{y}_{p1}, \hat{y}_{p2}, \ldots, \hat{y}_{pM})$. The output of the k-th hidden layer unit is

$$h_{pk} = \psi\!\left(\frac{\sum_{i=1}^{N} w_{ik} x_{pi} - b_k}{a_k}\right),$$

where $w_{ik}$ is the connection weight between the input layer and the k-th hidden layer unit, $\psi$ is the mother wavelet, $a_k$ is the scale factor of the wavelet, and $b_k$ is the displacement factor of the wavelet. The output of the output layer is

$$y_{pj} = \sum_{k=1}^{K} w_{kj} h_{pk},$$

where $w_{kj}$ is the connection weight between the hidden layer and the output layer. The standard error function is

$$E = \frac{1}{2} \sum_{p=1}^{P} \sum_{j=1}^{M} \left(\hat{y}_{pj} - y_{pj}\right)^{2}.$$
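For concreteness, the following sketch implements this forward pass in Python. It is a minimal illustration under our own naming; the Morlet mother wavelet is an assumption, since the paper does not state which wavelet it uses.

```python
import numpy as np

def morlet(t):
    # Morlet mother wavelet (assumed here; the paper does not name its wavelet).
    return np.cos(1.75 * t) * np.exp(-t ** 2 / 2)

def wnn_forward(x, W_in, a, b, W_out):
    """Forward pass of an N-K-M wavelet neural network.

    x     : (N,)   input sample x_p
    W_in  : (K, N) weights w_ik between input and hidden layer
    a     : (K,)   wavelet scale factors a_k
    b     : (K,)   wavelet displacement factors b_k
    W_out : (M, K) weights w_kj between hidden and output layer
    """
    net = W_in @ x                  # weighted sum entering each hidden unit
    h = morlet((net - b) / a)       # scaled and shifted wavelet activation
    y = W_out @ h                   # linear output layer
    return y, h
```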

Different from the traditional algorithm, we introduce a momentum factor into the iterative training of the neural network. For a generic parameter $\theta$ (a weight, scale factor, or displacement factor), the correction at step $t+1$ is

$$\Delta\theta(t+1) = -\eta \frac{\partial E}{\partial \theta} + \mu \,\Delta\theta(t),$$

where $\eta$ is the learning rate and $\mu \in [0, 1)$ is the momentum factor.

When $\mu = 0$, this correction formula is consistent with the traditional algorithm; that is, only the decline of the current output error is considered. In this case, if the algorithm trains to a local optimum, the correction approaches zero, the correction of the network training parameters almost stops, and the algorithm cannot escape the local optimum. When $\mu \neq 0$, even if the current correction is close to zero near a local optimum, the correction from the previous step keeps the training moving and lets the output of the neural network escape the local minimum.
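A minimal sketch of this momentum correction for a single parameter array, with hypothetical names eta (learning rate) and mu (momentum factor):

```python
def momentum_step(param, grad, velocity, eta=0.01, mu=0.9):
    """One parameter correction with a momentum factor.

    With mu == 0 this reduces to plain gradient descent on the current
    error; with mu > 0 the previous correction keeps the update moving
    even when the current gradient is nearly zero at a local optimum.
    """
    velocity = mu * velocity - eta * grad   # blend previous and current corrections
    return param + velocity, velocity
```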

Although the momentum factor alleviates the problem of network training falling into local optima to a certain extent, it can only lift the algorithm out of a 'small pit' and fails when a 'large pit' is encountered. To solve this problem, we use the global optimization capability of the gray wolf optimization algorithm to optimize the initial parameters of the neural network, thereby avoiding the randomness of the initial parameter selection and giving the model the ability to escape local optima.

3. Improved GWO

Although the standard GWO algorithm optimizes the neural network parameters well, its linearly decreasing convergence factor does not balance the global and local search of the wolves in real network fault diagnosis problems. Moreover, the weights used to combine the positions of the leading gray wolves during the update always remain fixed, which can cause the algorithm to fall into local optima and degrade both parameter optimization and network fault diagnosis. Therefore, a nonlinear convergence factor and a dynamic weighting strategy are used to improve the gray wolf algorithm, balance the local and global search abilities, keep the training from falling into local optima, and further improve the parameter optimization.

3.1. GWO Algorithm

The GWO algorithm searches for the optimal neural network parameters by simulating the hierarchical pyramid and predatory behavior of a gray wolf pack. The pack has a strict pyramidal social hierarchy: α is the leader, β and δ assist α in managing the pack, and ω assists α, β, and δ in foraging.

The main steps of the GWO algorithm are as follows:

(1) First, the gray wolves surround the prey. The distance between the prey and a wolf, and the resulting position update, are calculated as

$$D = \left| C \cdot X_p(m) - X(m) \right|, \qquad X(m+1) = X_p(m) - A \cdot D,$$

where $X_p(m)$ is the position of the prey after the m-th iteration, $X(m)$ is the position of the m-th generation wolf, and $D$ is the distance between the prey and the wolf. The synergy coefficients $C$ and $A$ are computed as

$$C = 2 r_1, \qquad A = 2 s \cdot r_2 - s,$$

where $r_1$ and $r_2$ are random numbers in the interval [0, 1] and $s$ is a convergence factor that decreases linearly from 2 to 0 as the number of iterations increases.

(2) The location of the prey is estimated, and the position of each ω wolf is updated according to the best search units α, β, and δ.

(3) The wolves keep updating their positions until the optimal parameters are reached.
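The sketch below condenses these steps into one standard GWO iteration in Python, with the conventional equal 1/3 averaging over the three leader-guided candidates; the helper names are ours, and minimization of the fitness is assumed.

```python
import numpy as np

def gwo_step(wolves, fitness, s):
    """One standard GWO position update (minimization).

    wolves  : (n, d) array of current wolf positions
    fitness : callable mapping a position to a scalar (smaller is better)
    s       : convergence factor, decreasing from 2 to 0 over the run
    """
    order = np.argsort([fitness(w) for w in wolves])
    leaders = wolves[order[:3]]                  # alpha, beta, delta

    new_wolves = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        candidates = []
        for leader in leaders:
            r1, r2 = np.random.rand(x.size), np.random.rand(x.size)
            A = 2 * s * r1 - s   # |A| > 1 drives global search, |A| < 1 local search
            C = 2 * r2
            D = np.abs(C * leader - x)
            candidates.append(leader - A * D)
        new_wolves[i] = np.mean(candidates, axis=0)  # equal 1/3 weights in standard GWO
    return new_wolves
```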

3.2. Nonlinear Convergence

The convergence factor balances the global and local search capabilities of the algorithm. When |A| > 1, the wolves expand the search range to find better prey and the algorithm performs a global search; when |A| < 1, the wolves shrink the search range to close in on the prey and the algorithm performs a local search. In the standard GWO algorithm, the convergence factor decreases linearly from 2 to 0 as the iterations proceed, leaving insufficient time for global search. To better exploit the global search capability of the GWO algorithm in network fault diagnosis, a larger share of the iterations should be spent on global search to keep the algorithm from falling into local optima.

In this study, the convergence factor is improved and a nonlinear convergence factor strategy is proposed to improve the global search ability of the algorithm. The convergence factor is computed as

$$s' = 2\left(1 - \left(\frac{m}{m_{\max}}\right)^{2}\right),$$

where $m$ is the current iteration number and $m_{\max}$ is the maximum number of iterations.

As Figure 1 shows, the original convergence factor s decreases along a straight line at a constant rate throughout the iterations, so the global search capability of the algorithm is not emphasized. The nonlinear convergence factor s' instead decreases along an arc: its attenuation is slow at the beginning of the iterations, which allows a large number of global searches and improves the parameter optimization.
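The difference is easy to verify numerically. The snippet below compares the two schedules, using the quadratic form reconstructed above for s' (the paper itself presents the curve only in Figure 1):

```python
def s_linear(m, m_max):
    return 2 * (1 - m / m_max)          # standard GWO: straight-line decay

def s_nonlinear(m, m_max):
    return 2 * (1 - (m / m_max) ** 2)   # arc-shaped decay: slow early, fast late

m_max = 500
# After 100 of 500 iterations, the nonlinear factor is still close to 2,
# so |A| > 1 occurs more often and the global-search phase lasts longer.
print(s_linear(100, m_max), s_nonlinear(100, m_max))  # 1.6 vs 1.92
```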

3.3. Dynamic Weight Strategy

In the standard GWO algorithm, the updated gray wolf position is the arithmetic average of $X_1$, $X_2$, and $X_3$, so all three carry a weight of 1/3. In reality, however, the position of $X_1$ (guided by α) is more likely to be better than that of $X_2$ (guided by β), and the position of $X_2$ is more likely to be better than that of $X_3$ (guided by δ). If the three weights are identical, the priority relationship (α > β > δ) is not reflected, which reduces the optimization speed and accuracy of the algorithm. Therefore, this paper proposes a dynamic weight update strategy to enhance the leadership of α. A schematic of the weight updates is shown in Figures 2(a) and 2(b).

First, we calculate the distances between $X_1$ (point A), $X_2$ (point B), $X_3$ (point C), and the current wolf position $X(m)$ (point M) to adjust the proportion of the head wolf. The greater the distance between a point and M, the greater its weight, so that the spatial distances between the updated position (point N) and the three points A, B, and C are more balanced. The first weights are

$$w_i = \frac{d_i}{d_\alpha + d_\beta + d_\delta}, \quad i \in \{\alpha, \beta, \delta\},$$

where $d_\alpha$, $d_\beta$, and $d_\delta$ are the spatial distances between the points A, B, and C and the point M, and $w_\alpha$, $w_\beta$, and $w_\delta$ are the weights of $X_1$, $X_2$, and $X_3$. A second weight calculation is then performed according to the fitness values $f_\alpha$, $f_\beta$, and $f_\delta$ of α, β, and δ (smaller is better):

$$w'_i = \frac{w_i / f_i}{w_\alpha / f_\alpha + w_\beta / f_\beta + w_\delta / f_\delta},$$

where $w'_\alpha$, $w'_\beta$, and $w'_\delta$ are the weights of the second update.

The gray wolf position is then updated as

$$X(m+1) = w'_\alpha X_1 + w'_\beta X_2 + w'_\delta X_3.$$

Swapping point A and point C in Figure 2(a) yields Figure 2(b). In both cases the three coordinates are unchanged, so the traditional algorithm produces the same updated position, which deviates from the desired result. The improved weight update strategy appropriately strengthens the leadership of wolf α, ensuring that the updated position stays close to point A. This speeds up the convergence of the algorithm and yields higher convergence accuracy.
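A sketch of the two-stage weighting under the reconstruction above. The distance-proportional first stage and inverse-fitness second stage reflect our reading of the strategy, not verbatim formulas from the paper:

```python
import numpy as np

def igwo_update(x, candidates, fits, eps=1e-12):
    """Weighted IGWO position update (sketch).

    x          : (d,) current wolf position (point M)
    candidates : list of X1, X2, X3 positions guided by alpha, beta, delta
    fits       : fitness values of alpha, beta, delta (smaller is better)
    """
    d = np.array([np.linalg.norm(c - x) for c in candidates])
    w = d / (d.sum() + eps)             # stage 1: farther point gets more weight
    w2 = w / (np.asarray(fits) + eps)   # stage 2: better (smaller) fitness gets more weight
    w2 = w2 / w2.sum()
    return sum(wi * ci for wi, ci in zip(w2, candidates))
```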

4. IGWO-WNN Network Fault Diagnosis Model

4.1. Model Structure

We built the IGWO-WNN network fault diagnosis model, using principal component analysis (PCA) to reduce data redundancy and improve the efficiency of network fault diagnosis. The IGWO algorithm is then used to optimize the initial parameters of the WNN fault diagnosis model. Finally, a wavelet neural network with a momentum factor performs the network fault diagnosis. The model effectively solves the problems that the initial parameters of the traditional WNN model are random and that training easily falls into local optima, and it effectively improves the accuracy and stability of network fault diagnosis. The modeling process of the IGWO-WNN is shown in Figure 3. The specific steps are as follows:

Step 1. Data Preprocessing
The preprocessing of the original dataset is divided into two parts.

(1) Feature Mapping of the Datasets. To better train the network fault diagnosis model, the network fault data are numerically encoded and normalized.

(2) Data Dimensionality Reduction. The high dimensionality of the network fault data leads to long model training times. To retain as much fault information as possible, a PCA dimensionality reduction algorithm is used to reduce the data dimensionality.

Step 2. Building a WNN Model
An N-K-M three-layer wavelet neural network training model is built, where N is the number of features of the network fault data after PCA dimensionality reduction, the number of hidden layer neurons K is determined by formula (11), and M is the number of network fault data categories. Since we only classify whether a fault is present, M equals one.

Step 3. IGWO Parameter Optimization

(1) Assign a set of initial values to the parameters to be optimized. The number of parameters Q to be optimized satisfies

$$Q = N \times K + K \times M + 2K,$$

where $N$, $K$, and $M$ are the numbers of neurons in the input, hidden, and output layers of the neural network (each hidden unit also contributes a scale factor and a displacement factor).

(2) Determine the fitness function: the mean square error (MSE) between the output of the wavelet neural network and the actual sample category is used as the fitness value of each individual wolf:

$$\mathrm{MSE} = \frac{1}{P} \sum_{p=1}^{P} \left(\mathrm{obse}_p - \mathrm{pred}_p\right)^{2},$$

where $\mathrm{obse}_p$ is the actual sample category value and $\mathrm{pred}_p$ is the value computed by the wavelet neural network with the parameters encoded in the wolf's position. The smaller the MSE, the better the individual wolf.

(3) The wolf pack hierarchy α, β, δ, and ω is determined according to the fitness value of each individual.

(4) Each ω wolf updates its position according to α, β, and δ and compares the fitness of the new position with those of α, β, and δ; if the new position is better, it is retained together with its fitness value.

(5) Iteration continues until the termination condition is reached; the head wolf α is then output as the optimized weights, scale factors, and displacement factors of the WNN.
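As a sketch of step (2), a wolf's position can be encoded as a flat vector of length Q and scored with the MSE fitness. This reuses the wnn_forward sketch from Section 2; the names and memory layout here are ours.

```python
import numpy as np

def unpack(position, N, K, M):
    # Split a flat position of length Q = N*K + K*M + 2K into WNN parameters.
    i = 0
    W_in = position[i:i + N * K].reshape(K, N); i += N * K
    W_out = position[i:i + K * M].reshape(M, K); i += K * M
    a = position[i:i + K]; i += K
    b = position[i:i + K]
    return W_in, W_out, a, b

def fitness(position, X, y, N, K, M):
    """MSE between WNN outputs and actual categories (smaller is better)."""
    W_in, W_out, a, b = unpack(position, N, K, M)
    preds = np.array([wnn_forward(x, W_in, a, b, W_out)[0] for x in X])
    return np.mean((y - preds.squeeze()) ** 2)
```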

Step 4. Neural Network Training and Network Fault Diagnosis

(1) Parameter Initialization. The optimal parameters obtained by IGWO are used as the connection weights, wavelet scale factors $a_k$, and displacement factors $b_k$.

(2) Input the network fault data, compute the estimated values and the error through the wavelet neural network, and adjust the parameters backward according to the error between the estimated and actual values.

(3) Iterate and update the parameters until the termination condition is reached, then output and save the trained model.

(4) Input the network data to be diagnosed and obtain the network fault diagnosis results.

4.2. Evaluation Criteria of Network Fault Diagnosis Model

The evaluation criteria of network fault diagnosis model are defined in Table 1. TP (true positive) is the total number of samples correctly classified as normal, TN (true negative) is the total number of samples correctly classified as faults, FP (false positive) is the total number of samples incorrectly classified as normal, and FN (false negative) is the total number of samples incorrectly classified as faults.
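These definitions translate directly into code. A small helper (names ours), under the paper's convention that a "positive" is a normal sample:

```python
def diagnosis_metrics(TP, TN, FP, FN):
    """Confusion-matrix metrics for the two-class fault diagnosis task."""
    accuracy  = (TP + TN) / (TP + TN + FP + FN)  # overall accuracy
    precision = TP / (TP + FP)                   # share of predicted-normal samples that are normal
    tpr       = TP / (TP + FN)                   # true positive rate: normal samples detected
    fpr       = FP / (FP + TN)                   # false positive rate: faults classified as normal
    return accuracy, precision, tpr, fpr
```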

5. Experiments and Results

MATLAB R2018b was used as the experimental simulation environment, with the KDD-CUP99 network data as the experimental dataset, to verify the performance of the IGWO-WNN network fault diagnosis model. First, PCA is used to reduce the dimensionality of the experimental data, the redundancy of the fault data, and the training time. Then, two single-peak functions and two multipeak functions are used to check the optimization performance of IGWO. Finally, the accuracy and stability of the proposed model are verified by comparing its diagnostic performance on the same network fault dataset with models trained by WNN and SVM. The experimental design is as follows.

5.1. Network Fault Data Preprocessing

The KDD-CUP99 dataset is used to distinguish abnormal connections (faults) from normal connections. The data cover four fault categories, such as DoS and probing. The specific classification is listed in Table 2.

Since the experiment only performed fault diagnosis, the normal samples and the fault samples were processed numerically, and Figure 4 shows the distribution of the datasets.

The original dataset has high feature dimensionality and large redundancy, which makes model training time-consuming. Therefore, the PCA algorithm is used to reduce the dimensionality of the original data before model training; more information about PCA is available in [28-30]. The feature dimensionality is reduced as much as possible while affecting the validity of the data as little as possible, improving training efficiency. The cumulative contribution rates of the principal components after dimensionality reduction are shown in Figure 5.

In general, the principal components are considered to preserve the original data information sufficiently if their cumulative contribution rate exceeds 95%. As shown in Figure 5, the cumulative contribution rate of the first five features after dimensionality reduction exceeds 95%, so the first five principal components were selected as the fault feature data.
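A sketch of this selection rule using scikit-learn's PCA; the library choice is an assumption for illustration, since the paper's experiments ran in MATLAB:

```python
import numpy as np
from sklearn.decomposition import PCA

def reduce_dims(X, threshold=0.95):
    """Keep the fewest principal components whose cumulative
    contribution rate reaches the threshold (95% above)."""
    cum = np.cumsum(PCA().fit(X).explained_variance_ratio_)
    n = int(np.searchsorted(cum, threshold)) + 1
    return PCA(n_components=n).fit_transform(X), n
```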

5.2. Performance Analysis of IGWO-WNN Network Fault Diagnosis Model
5.2.1. IGWO Performance Analysis

In this study, the GWO algorithm, the VW-GWO [31] algorithm, and the GWO-FW [32] algorithm are selected as comparison algorithms. Four test functions, including two single-peak functions and two multipeak functions, are used for performance testing to verify the superiority of the IGWO algorithm proposed in this paper in terms of optimization performance. The specific mathematical expressions of the test functions are shown in Table 3.

Figures 6(a)-6(d) show the three-dimensional surfaces of the four test functions and the convergence curves of the four algorithms (IGWO, GWO, VW-GWO, and GWO-FW) on each function. In the experiments, the population size is set to 50 and the maximum number of iterations to 500. The cutoff accuracy of each algorithm is shown in Table 4.

On the single-peak test functions, the convergence curves of the GWO-FW and VW-GWO algorithms almost overlap with that of the traditional GWO algorithm, and neither their accuracy nor their convergence speed improves noticeably. Although the IGWO algorithm proposed in this paper converges slowly in the early stage, its final convergence accuracy is better than that of the other three algorithms.

The results on the multipeak functions show that IGWO converges distinctly faster than the traditional GWO algorithm, though slightly slower than GWO-FW and VW-GWO. In particular, on one of the multipeak test functions, only our method accurately finds the minimum of the function, while the other three methods cannot.

The experiments show that the proposed IGWO algorithm achieves the highest convergence accuracy on both single-peak and multipeak problems. Although the weight updating strategies adopted by the GWO-FW and VW-GWO algorithms strengthen the leadership of wolf α, the proportion assigned to α is often set too high early in the iterations. The algorithms therefore converge quickly at first, but this weakens the rationality of the leadership distribution, so convergence slows and accuracy drops slightly in the late iterations. The proposed IGWO algorithm instead adopts a nonlinear convergence factor, which increases the share of global search during the iterations and enhances the global optimization ability. Moreover, its weight updating strategy jointly considers each wolf's fitness value and spatial position and updates the weights dynamically as the iterations proceed. The weights calculated this way are closer to reality, and the influence of the α, β, and δ wolves is kept in appropriate proportion.

In summary, the proposed IGWO algorithm has good optimization performance and, compared with the traditional GWO algorithm, offers a clear improvement in convergence speed and accuracy.

5.2.2. Diagnosis Accuracy Analysis of IGWO-WNN Fault Diagnosis Model

In this experiment, BP, SVM, and ELM were selected as comparison algorithms for the IGWO-WNN model. The advantages of the IGWO-WNN fault diagnosis model in classification accuracy were verified by analyzing its precision, TPR, FPR, and other indicators. The diagnostic results are shown in Table 5 and Figure 7.

From Table 5, it can be seen that the WNN and its improved network fault diagnosis models have clear advantages in accuracy, precision, and the other indicators. Compared with the BP, SVM, and ELM models, the WNN model improves the overall accuracy by 3.22%, 4.46%, and 1.08%, respectively, and reduces the FPR by 6.4%, 8.48%, and 1.8%, respectively. The experimental results show that the probability of false positives and false negatives is lower with the WNN and its improved model than with the other models, giving them a clear advantage in diagnosing fault samples and confirming the rationality of using the WNN for network fault diagnosis. Moreover, the overall accuracy of the proposed IGWO-WNN model is 0.46% higher than that of the WNN model and 0.38% higher than that of the GWO-WNN model. The comparison with the WNN model shows that optimizing the initial parameters of the neural network with the IGWO algorithm effectively avoids the randomness of the initial parameters and improves diagnostic accuracy. The comparison with the GWO-WNN model shows that the gray wolf optimizer with the dynamic weighting and nonlinear convergence factor strategies optimizes the fault model better.

From Figure 7, it can be seen that the TPR differs little across the models; the differences appear in precision, FPR, overall accuracy, and the other indicators. This means the models diagnose normal samples almost equally well, and the differences lie in the diagnosis of fault samples. This is because the experiment aggregates a large quantity of fault-state data into a single fault class, so the data correlation in the fault state is weaker than in the normal state; consequently, the detection rate of normal-state samples is high in the experiment. Figure 7 also shows that the indicators of the proposed fault diagnosis model are higher than those of the other models. This is because the IGWO-WNN model lets the head wolf dynamically command the pack during parameter optimization and increases the proportion of global search to obtain better initial parameters. At the same time, the momentum factor introduced into the backpropagation of the training error avoids the traditional WNN model's tendency to fall into local optima.

Finally, the IGWO-WNN network fault diagnosis model was compared with the WNN, BP, and other classification models. The results show that the proposed fault diagnosis model classifies best: its accuracy on both network fault samples and normal samples is the highest, a large improvement over the WNN, BP, and other models.

5.2.3. Stability Analysis of IGWO-WNN Fault Diagnosis Model

We compared the mean and variance of the classification accuracy and FPR of the IGWO-WNN model and the standard WNN model to verify the advantage of the proposed model in diagnostic stability. To avoid randomness in the experimental results, each model was run 30 times. The trends of the diagnostic results over the 30 experiments are shown in Table 6 and Figure 8.

As Figure 8 shows, the overall diagnostic accuracy and the diagnostic error rate on fault samples vary little for the IGWO-WNN model and remain fairly flat, whereas the traditional WNN model shows sharp spikes with large variations. Across the 30 experiments, the WNN model fluctuated heavily many times. From the results of the two models, the variance of each indicator of the WNN model is roughly two orders of magnitude higher than that of the IGWO-WNN model. This is because the WNN model initializes its parameters at random; different initial parameters cause the training to converge to different local optima, so the diagnostic results vary significantly. The IGWO-WNN model optimizes the initial parameters with the IGWO algorithm, which avoids this randomness to a large extent and lets the wavelet neural network training converge toward the global optimum. Therefore, the IGWO-WNN model not only achieves higher overall diagnostic accuracy than the WNN model but also fluctuates far less over repeated experiments, making its stability much better.

In summary, our proposed IGWO-WNN fault diagnosis model is significantly better than the original WNN model in terms of accuracy and diagnostic stability. Numerous experiments have confirmed the effectiveness and superiority of our model.

6. Conclusion

In this paper, a network fault diagnosis model based on IGWO-WNN is proposed. The GWO algorithm is improved with a nonlinear convergence factor and a dynamic weight update strategy, and the initial parameters of the WNN are optimized with the IGWO algorithm. The optimization results of IGWO and GWO on the test functions show that the nonlinear convergence factor balances global and local search well, and that strengthening the weight of wolf α accelerates convergence and yields better accuracy. Training the fault diagnosis model on the KDD-CUP99 dataset shows that using IGWO to find the initial parameters of the WNN effectively avoids the randomness of the initial parameter selection and greatly improves the accuracy and stability of the results.

The IGWO algorithm proposed in this paper can optimize not only the initial parameters of the wavelet neural network but also the parameters of models such as SVM and CNN, and it can be applied to feature selection, scheduling control, and other fields. Likewise, the IGWO-WNN model can be used not only for network fault diagnosis but also for diagnosing, and even predicting, faults in wireless sensors, bearings, and other equipment.

Although the IGWO-WNN model achieves good accuracy and stability in fault diagnosis, there is still room for improvement. Future work should improve the handling of unbalanced data to raise the diagnostic accuracy of low-probability faults. In addition, the fault diagnosis model should be extended from binary classification to specific multiclass models to diagnose network faults at a finer granularity.

Data Availability

This paper uses a dataset of network intrusion detection. The dataset download link is as follows: https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html.

Conflicts of Interest

The authors declare that they do not have any commercial or associative interest that represents conflicts of interest in connection with the work submitted.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61931004 and 61705109) and the Jiangsu Innovation and Entrepreneurship Group Talents Plan.