Special Issue: Security and Privacy Challenges for Intelligent Internet of Things Devices
An Adaptive Industrial Control Equipment Safety Fault Diagnosis Method in Industrial Internet of Things
With the rapid development of intelligent manufacturing and the Industrial Internet of Things, many industrial control systems place high requirements on their own security. Failures of industrial control equipment cause abnormal operation and waste of resources, so detecting and identifying potential equipment anomalies and failures in time, and implementing effective fault-tolerance strategies, is highly valuable. In the Industrial Internet of Things environment, the instructions and parameters of industrial control equipment often change with actual requirements, yet it is impractical to customize a learning method for each parameter value. To address this problem, this paper proposes a fault diagnosis model based on ensemble learning, together with a voting-weight updating method based on dynamic programming to assist decision-making. The method builds on the Bagging strategy and combines it with the dynamic-programming weight adjustment to predict fault types. Finally, using different loads as dynamic conditions, experiments verify the diagnostic capability of the Bagging-based integrated fault diagnosis model in a dynamically changing industrial control system environment. The ensemble-learning-based fault diagnosis model effectively improves the adaptive ability of the framework and makes fault diagnosis truly intelligent, while the dynamic-programming-based voting weight adjustment further improves the reliability of voting.
Traditional safety fault diagnosis for industrial control systems mainly focuses on mechanical failures of industrial control equipment. However, with the rapid development of intelligent manufacturing and the Industrial Internet of Things, the links between industrial control equipment have become closer and the operating environment more complicated. This increases the probability of equipment failure and, in turn, the cost of equipment management and control. In the Industrial Internet of Things environment, the instructions and parameters of industrial control equipment often change with actual needs. For example, when more industrial tasks must be completed in a short time, the current, voltage, speed, load, and other parameters of the equipment must be adjusted so that the operation can be completed successfully. As the environmental parameters change, the safety fault signals returned by the sensors change with the operating parameters. These dynamic factors pose huge challenges to equipment safety fault diagnosis in the industrial control environment, and since the parameters change constantly, it is not practical to customize a learning method for each parameter value.
How to intelligently detect the changed safety fault sequences under the dynamic changes of actual parameters is the focus of this paper.
The main contributions of this paper are as follows: (i) A safety fault diagnosis model based on ensemble learning is proposed. The model achieves high diagnostic accuracy even when the equipment parameters change. (ii) A method for adjusting voting weights based on dynamic programming is proposed. The weight of each learner is evaluated from its current performance and updated after each evaluation.
2. Related Work
In recent years, a large amount of research has been conducted on intelligent fault diagnosis and prediction in the Industrial Internet of Things. Fault diagnosis and prediction are among the most important functions in complex, safety-critical engineering systems; fault diagnosis in particular has been studied in depth for the past forty years. This capability allows early developing failures to be detected and isolated and their propagation to be predicted, enabling preventive maintenance and countering the possibility of catastrophic failure events. Chen et al. proposed a distributed fast fault diagnosis approach for multimachine power systems based on deterministic learning (DL) theory. Mousavi et al. introduced an efficient strategy for fault detection and isolation (FDI) of an industrial gas turbine based on ensemble learning methods. Soleimani et al. studied the impact of early-stage insulation deterioration on the temperature inside a transformer using a finite-element electromagnetic-thermofluid method and proposed an online sensor-based predictive fault diagnosis approach based on the observations. Kumar et al. built on the ideas of inference-based ambiguity management in decentralized control and developed a framework for inference-based decentralized diagnosis. Guan et al. proposed a control and protection framework based on a multiagent system (MAS), in which the situation awareness of zone agents plays an important role.
Ensemble learning is a very popular research direction in the field of machine learning (Dietterich, 2002). Its core idea is to build multiple models and then merge their decision results to obtain better performance than any single model. Common ensemble learning methods include Bagging (Breiman, 1996), Boosting (Freund, 1996), Random Forest (RF) (Breiman, 2001), XGBoost (Chen, 2016), and GBDT (Friedman, 2001). Ensemble learning shares many similarities with information fusion, which is mainly divided into three categories: data-layer fusion, feature-layer fusion, and decision-layer fusion; ensemble learning subsumes decision fusion. Ensemble learning involves two important steps: constructing the base models and selecting the fusion method. A good fusion method can effectively improve the performance of the ensemble.
Next, we summarize base-model selection methods and decision fusion methods.
Ensemble learning can be viewed as a multimodel system. Two main factors affect the ensemble's performance: the performance of each single model and the diversity among the models. The greater the difference between the models, the more pronounced the benefit of integration. The main methods for improving model diversity are as follows: (1) Train the same type of model on different training sets [12, 13]. Bagging and Random Forest both generate different base models in this way, by resampling the original training set and training the same base model on each resampled set. Random Forest, for example, uses different training sets to train decision trees, whereas Bagging can use any base model, such as a decision tree or a neural network [14, 15]. (2) Train the model on different feature subsets. Random Forest randomly selects a different feature subset at each node when building a decision tree, which further increases the difference between trees and yields better ensemble performance [17, 18]. (3) Use completely different types of models as base models [19, 20]. (4) Use different model parameters to construct the base models.
Most existing ensemble learning methods construct their base models through these four approaches.
Decision fusion methods can mainly be classified into two categories: one is the fusion method based on label output and the other is the fusion method based on probability output. The label-based methods mainly include voting method (Lam, 1997) , Borda counting method (Emerson, 2013) , and the behavioral knowledge space method (BKS) (Huang, 1993) . The probability-based methods mainly include the decision template method (Kuncheva, 2001) , the Dempster–Shafer (DS) evidence fusion method (Sentz, 2002) , and Bayesian fusion method (Stathaki, 2011) .
3. Safety Fault Detection Model of Industrial Control Equipment Based on Ensemble Learning
In order to adapt to changes in the industrial control system environment, this paper proposes an ensemble-learning-based safety fault diagnosis model for industrial control equipment. The model is a composite model composed of multiple individual safety fault classifiers: the individual classifiers vote on the safety fault detection results, and the combined classifier performs the diagnosis based on these votes. Figure 1 shows the specific structure of the composite-model safety fault classifier. Compared with individual safety fault classifiers [28, 29], composite-model classifiers tend to produce more accurate results.
For the safety fault detection of industrial control equipment, let the set of safety fault types be C = {c_1, c_2, …, c_N}, the true safety fault classification function be f, and the error rate of each individual safety fault classifier be ε. Then, for each individual safety fault classifier h_i, there is

P(h_i(x) ≠ f(x)) = ε.  (1)
Here, m is the number of individual safety fault classifiers (taken to be odd), c_j ∈ C is a safety fault type, and x is the input safety fault sequence. When more than one-half of the individual safety fault classifiers classify the safety fault sequence correctly, the combined safety fault classifier H classifies it correctly:

H(x) = arg max_{c_j ∈ C} Σ_{i=1}^{m} I(h_i(x) = c_j),  (2)

where I(·) is the indicator function.
Assume that the error rates of the individual safety fault classifiers are mutually independent. Then, by the Hoeffding inequality, the detection error rate of the combined safety fault classifier satisfies

P(H(x) ≠ f(x)) = Σ_{k=0}^{⌊m/2⌋} (m choose k) (1 − ε)^k ε^{m−k} ≤ exp(−m(1 − 2ε)²/2).  (3)
It follows from formula (3) that as the number m of individual safety fault classifiers increases, the error rate of the combined safety fault classifier decreases exponentially.
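As a quick numerical check on formula (3), the exact majority-vote error (the binomial sum) can be compared with its Hoeffding upper bound. The sketch below assumes mutually independent classifiers, as in the derivation, with an illustrative individual error rate ε = 0.2:

```python
import math

def ensemble_error(m, eps):
    """Exact probability that a majority vote of m independent classifiers,
    each with error rate eps, is wrong: at most floor(m/2) are correct."""
    return sum(math.comb(m, k) * (1 - eps) ** k * eps ** (m - k)
               for k in range(0, m // 2 + 1))

def hoeffding_bound(m, eps):
    """Hoeffding upper bound on the combined error rate, formula (3)."""
    return math.exp(-0.5 * m * (1 - 2 * eps) ** 2)

for m in (3, 9, 25):
    print(m, round(ensemble_error(m, 0.2), 5), round(hoeffding_bound(m, 0.2), 5))
```

For ε = 0.2 the exact error drops from 0.104 at m = 3 to below 0.02 at m = 9, always staying under the Hoeffding bound, matching the claim that more voters drive the combined error down exponentially.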
According to whether there are strong dependencies between the individual learners, current ensemble learning methods can be divided into two categories. In the first, the individual learners have strong dependencies, so they must be generated serially; the main representative of this approach is Boosting. In the second, the dependencies between individual learners are weak, so the learners can be generated simultaneously in parallel; the main representatives are Bagging and Random Forest.
Since safety fault diagnosis in the industrial control environment has strict real-time requirements, the serial generation approach is unsuitable: each round's training-set selection depends on the learning results of the previous rounds, incurring a large time overhead that makes it difficult to meet real-time requirements. Therefore, this paper adopts a parallelized ensemble learning model to adaptively diagnose industrial control equipment safety faults.
3.1. Integrated Model of Safety Fault Diagnosis Based on Bagging
In the safety fault diagnosis of industrial control systems, an integrated diagnosis model with strong generalization ability requires the individual safety fault learners to be as independent of each other as possible [30–32]. However, owing to the shared industrial control environment, full independence is almost impossible in practice. How to create relatively large differences between the individual safety fault learners has therefore become the key research question.
As a representative of parallelized ensemble learning, the Bagging algorithm has broad application prospects in industry. Assume that the samples of the safety fault data set in the current industrial control environment are all safety fault sequences under the parameters θ_i, where i ∈ {1, 2, …, n}. The safety fault data under a new parameter θ_t need to be detected, where θ_t ∉ {θ_1, θ_2, …, θ_n}.
The basic process of the fault diagnosis method based on the Bagging algorithm is as follows.
First, the training set composed of device-state data is divided and sampled to obtain several different subtraining sets. Then, the safety fault data set is expanded using an augmentation method based on periodic overlapping sampling. Next, multiple individual fault learners are trained from these subtraining sets by deep convolutional model training. Finally, the fault data under the changed parameters are input, and the fault category is output through voting decisions.
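The pipeline above can be sketched as follows. The paper's CNN learners are replaced here by a trivial majority-label stub, and the data, labels, and helper names are all hypothetical, so this illustrates the Bagging structure rather than the actual diagnosis model:

```python
import random
from collections import Counter

def bootstrap(dataset, m, rng):
    """Draw m samples with replacement: the sampling step of Bagging."""
    return [rng.choice(dataset) for _ in range(m)]

def train_majority_stub(samples):
    """Stand-in for a trained CNN fault learner: always predicts the most
    common label seen in its bootstrap sample (illustration only)."""
    majority = Counter(label for _, label in samples).most_common(1)[0][0]
    return lambda x: majority

def bagging_predict(learners, x):
    """Relative-majority vote over the individual learners' outputs."""
    votes = Counter(learner(x) for learner in learners)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Hypothetical fault sequences: (signal id, fault label)
data = [(i, "inner_ring" if i % 2 else "outer_ring") for i in range(100)]
learners = [train_majority_stub(bootstrap(data, len(data), rng)) for _ in range(9)]
print(bagging_predict(learners, (7, None)))
```

The three stages mirror the text: bootstrap sampling builds diverse subtraining sets, one learner is trained per set, and a vote produces the fault category.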
The basic flow of the safety fault diagnosis method based on the Bagging algorithm is shown in Figure 2.
First, the safety fault data set is expanded using the augmentation method based on periodic overlapping sampling. Assume the augmented data set contains m safety fault samples. A random sample is selected from the safety fault data set and added to the sampling set; after selection, the sample is returned to the data set, so it may be selected again. Repeating this process m times yields a sampling set containing m samples. In this way, some samples of the initial fault data set appear multiple times in the sampling set, while others never appear. With this sampling method, the probability that a given sample is never selected is

(1 − 1/m)^m.  (4)
As Figure 3 shows, when the number of samples m in the safety fault data set is sufficiently large, the probability that a sample is never drawn tends to a stable value, since (1 − 1/m)^m → 1/e ≈ 0.368 as m → ∞. It follows from formula (4) that, as the safety fault data set grows, about 63.2% of the samples appear in each sampling set. Repeating the above process k times yields k sampling sets, each containing m samples.
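The 63.2% figure can be checked numerically: drawing m times with replacement from m samples and counting the never-drawn fraction reproduces (1 − 1/m)^m → 1/e. A minimal simulation (the sample size and seed are arbitrary):

```python
import random

def unsampled_fraction(m, rng):
    """Fraction of m samples never drawn in m draws with replacement."""
    drawn = {rng.randrange(m) for _ in range(m)}
    return 1 - len(drawn) / m

rng = random.Random(42)
m = 100_000
empirical = unsampled_fraction(m, rng)
theoretical = (1 - 1 / m) ** m   # tends to 1/e ≈ 0.368 as m grows
print(round(empirical, 3), round(theoretical, 3))
```

Both values come out near 0.368, i.e., roughly 63.2% of the original samples land in each bootstrap sampling set.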
Using the convolutional neural network-based safety fault diagnosis method as an individual learning algorithm, the sample set under each parameter is trained to obtain an individual learner. Then, these individual safety fault learners are combined to predict the safety fault category through a voting strategy. If there are multiple categories with the highest number of votes after voting, a failure type is randomly selected as the algorithm output.
The complexity of the Bagging-based safety fault detection method is analyzed below. Assume that the computational time complexity of each individual safety fault learner is O(T). Because Bagging supports parallel computing, let s denote the context-switching time between individual safety fault learners and v the voting time. The time complexity of the Bagging algorithm is then O(T + s + v). In practice, s and v are negligible compared with the computation time of the individual learners, so in terms of time complexity the Bagging-based integrated method differs from a single learner only at a constant level while delivering higher performance.
3.2. Combination Strategy of Safety Fault Diagnosis Based on Voting Method
Common basic combination strategies include voting, averaging, and learning-based methods. These methods are relatively simple but very powerful. For classification problems such as industrial control equipment safety fault diagnosis, the voting-based combination strategy has been widely used with good results. The voting method follows the majority-rule principle: each individual safety fault classifier selects a predicted label from the set {c_1, c_2, …, c_N}. The voting method can be further subdivided into the absolute majority voting method, the relative majority voting method, and so on.
In the absolute majority voting method, if a safety fault category c_j receives more than half of the votes of the individual safety fault classifiers, that category is used as the output; otherwise, the prediction is rejected:

H(x) = c_j, if Σ_{i=1}^{m} I(h_i(x) = c_j) > (1/2) Σ_{k=1}^{N} Σ_{i=1}^{m} I(h_i(x) = c_k); otherwise, reject.  (5)

Here, I(·) is the indicator (counting) function and the sums range over all prediction results.
The relative majority voting method chooses the type with the most votes as the final prediction. If several safety fault types tie for the highest number of votes, one of them is selected at random as the final output:

H(x) = c_{arg max_j Σ_{i=1}^{m} I(h_i(x) = c_j)}.  (6)
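Both voting rules, and the combination used later in the Bagging model (absolute majority first, falling back to relative majority), can be sketched directly from formulas (5) and (6); the function names are ours:

```python
import random
from collections import Counter

def absolute_majority(votes):
    """Formula (5): output the label with more than half the votes,
    otherwise reject the prediction (None)."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None

def relative_majority(votes, rng=random):
    """Formula (6): output the most-voted label; ties broken at random."""
    counts = Counter(votes)
    top = max(counts.values())
    return rng.choice([label for label, c in counts.items() if c == top])

def combined_vote(votes):
    """Absolute majority first; fall back to relative majority
    when no label exceeds half the votes."""
    result = absolute_majority(votes)
    return result if result is not None else relative_majority(votes)
```

For example, `combined_vote(["a", "b", "b"])` returns `"b"` via absolute majority, while `combined_vote(["a", "a", "b", "b", "c"])` falls back to a random tie-break between `"a"` and `"b"`.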
4. Method for Adjusting Voting Weight Based on Dynamic Programming
Each of the k sampling sets obtained above contains m safety fault samples. These sets are drawn from the initial fault data set, which contains fault samples collected under n different parameters θ_1, θ_2, …, θ_n. Determining the initial weights of the dynamic-programming-based voting weight adjustment method and the weight state transition equation is the key to whether the Bagging-based ensemble method can adaptively predict the safety fault sequence under an unknown parameter.
In theory, each individual classifier should have the same initial weight, because all sampling sets are drawn from the same initial fault data set. Due to the randomness of sampling, however, the proportions of the parameters corresponding to the safety fault sequences in the k sampling sets cannot be exactly the same. For the fault sequence to be detected under parameter θ_t, the closer θ_i is to θ_t, the more similar the fluctuation and amplitude of the fault sequence under θ_i can be expected to be to those of the sequence to be detected. In other words, an individual learner trained mainly on data near θ_t should generalize better than one trained on parameters far from θ_t.
Based on the analysis, the initial voting weight determination process of the individual learner is as follows.
First, according to the distance between θ_i and θ_t, set the parameter-distance influence coefficients α_i: the closer θ_i is to θ_t, the larger α_i is, where Σ_{i=1}^{n} α_i = 1.
Second, calculate the proportion p_{ij} of the data under parameter θ_i in sampling set j, where Σ_{i=1}^{n} p_{ij} = 1.
Third, solve for the initial weight parameters:

w_j = Σ_{i=1}^{n} α_i · p_{ij}, j = 1, 2, …, k.  (7)
Here, k is the number of individual failure learners. At this point, the determination of the initial parameters of the voting weight is completed. Next, the state transition equation of the voting weight parameters is determined.
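Before turning to the transition equation, the three initialization steps above can be sketched as follows. The paper does not fix a concrete form for the influence coefficients α_i beyond "closer parameters get larger coefficients", so normalized inverse distances are assumed here; the loads and proportions are hypothetical:

```python
def initial_weights(thetas, theta_t, proportions):
    """Initial voting weight for each of the k learners.

    thetas      : parameter value theta_i for each of the n training loads
    theta_t     : parameter of the fault sequence to be detected
    proportions : proportions[j][i] = share of data under theta_i in sampling set j

    The coefficients alpha_i are taken as normalized inverse distances
    |theta_i - theta_t| -- one plausible choice satisfying the paper's
    requirement that closer parameters receive larger coefficients.
    """
    inv = [1.0 / (abs(th - theta_t) + 1e-9) for th in thetas]
    alphas = [v / sum(inv) for v in inv]                  # sum(alpha_i) = 1
    raw = [sum(a * p for a, p in zip(alphas, props)) for props in proportions]
    return [w / sum(raw) for w in raw]                    # normalize over learners

thetas = [0.0, 1.0, 2.0]   # hypothetical training loads (hp)
theta_t = 3.0              # load of the sequence to be detected
props = [[0.4, 0.3, 0.3], [0.2, 0.3, 0.5], [0.3, 0.4, 0.3]]
print(initial_weights(thetas, theta_t, props))
```

The learner holding the most data under the load closest to θ_t (the second sampling set, with half its data at 2 hp) receives the largest initial weight, as the analysis intends.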
The voting weight of each individual safety fault learner is transferred according to its previous safety fault detection results. If the previous detection succeeded, the individual learner is considered to perform well under the parameter θ_t, and its weight should be appropriately increased. If the previous detection failed, its voting weight should be appropriately reduced.
The proportion by which a weight is increased or decreased also changes dynamically, determined by the scale of the test data and the detection accuracy. When the data size is small, the weight adjustment is relatively small regardless of whether the individual learner's detection succeeds or fails, because the statistical characteristics are not yet reliable and the result may be due to chance. Conversely, when the data size is large, the overall performance of the individual learners is well reflected. If an individual learner has performed well in previous detections, a success on the current test sample yields a larger weight increase, while a failure yields only a small reduction. If the learner has performed poorly, a success yields only a small increase, while a failure yields a relatively large reduction. The specific voting weight transfer equation is as follows:
w_j' = w_j + u(N, acc_j) if the detection is correct; w_j' = w_j − d(N, err_j) if the detection is wrong.  (8)

Here, w_j' is the weight to be calculated, w_j is the previous voting weight, u is a weight-increment function of the number of tested fault samples and the detection accuracy (applied when the detection is correct), d is a weight-reduction function of the number of tested fault samples and the detection error rate (applied when the detection is wrong), N is the number of fault samples currently tested, h_j(x_i) denotes the detection result of the j-th individual safety fault learner on the i-th test set, D is the fault data sample space, and f is the true safety fault classification function.
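A minimal sketch of one weight-transition step follows. The paper only constrains the qualitative shape of the increment and reduction functions (larger adjustments with more tested samples; rewards scaled by accuracy, penalties by error rate), so the concrete scaling below, an exponential saturation in the number of tested samples with an arbitrary 0.1 step size, is an illustrative assumption:

```python
import math

def update_weight(w_prev, correct, n_samples, accuracy):
    """One voting-weight transition step.

    w_prev    : previous voting weight of the learner
    correct   : whether the current detection was correct
    n_samples : number of fault samples tested so far
    accuracy  : the learner's running detection accuracy

    scale grows toward 1 as more samples are tested, so small data
    yields small adjustments, as the text requires; the exact form
    of the increment/reduction functions is an assumption.
    """
    scale = 1 - math.exp(-n_samples / 1000)
    if correct:
        return w_prev + 0.1 * scale * accuracy            # u(N, acc)
    return max(0.0, w_prev - 0.1 * scale * (1 - accuracy))  # d(N, err)
```

With a large test set, a high-accuracy learner gains more on a success and loses little on a failure, while a small test set barely moves any weight, matching the qualitative behaviour described above.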
In the integrated safety fault diagnosis model based on Bagging, a combination of absolute majority voting and relative majority voting is used to vote. When the absolute majority voting method can directly determine the failure type, the type selected by voting is output; when the absolute majority voting method cannot determine the failure type, the relative majority voting method is used for selection.
5. Experimental Results and Analysis
This paper uses the CWRU bearing fault data set to verify the feasibility of the proposed fault diagnosis method. The data set comes from the Bearing Data Center of Case Western Reserve University (https://engineering.case.edu/bearingdatacenter). In this experiment, the data set is divided by sampling frequency and location into normal data, 12K drive-end bearing fault data, 48K drive-end bearing fault data, and 12K fan-end bearing fault data. For each type, fault samples are collected under four different loads (0 hp, 1 hp, 2 hp, and 3 hp). This experiment uses the 12K drive-end bearing fault data. The fault locations are the inner ring, the outer ring, and the rolling elements, and the fault diameters are 0.007, 0.014, 0.021, and 0.028 inch. For outer-ring failures there are no samples with a fault diameter of 0.028 inch, so together with the normal data there are 12 states in total. To verify the diagnostic capability of the Bagging-based integrated fault diagnosis model in a dynamic industrial control system environment, this experiment uses different loads as the dynamic conditions. The specific experimental strategy is as follows.
The safety fault data set used in the experiment contains safety fault samples under loads of 0 hp, 1 hp, 2 hp, and 3 hp. The experiment trains the ensemble on the safety fault data under three loads and uses the trained model to classify the data set under the remaining load. The safety fault data sets under 0 hp, 1 hp, 2 hp, and 3 hp are denoted A, B, C, and D, respectively. Four groups of experiments are used for verification; the specific groups are shown in Table 1.
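The leave-one-load-out grouping of Table 1 can be expressed as a simple enumeration (data set names as in the text):

```python
loads = {"A": 0, "B": 1, "C": 2, "D": 3}   # safety fault data sets by load (hp)

def experiment_groups(names):
    """Leave-one-load-out: train on three data sets, diagnose the fourth."""
    return [([n for n in names if n != test], test) for test in names]

for train, test in experiment_groups(list(loads)):
    print("train on", "+".join(train), "-> diagnose", test)
```

This yields the four groups used below: {A, B, C} → D, {A, B, D} → C, {A, C, D} → B, and {B, C, D} → A.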
Each data set contains 18,000 samples of each fault type. In the following, the first group of experiments is taken as an example for analysis. Since the Bagging-based integrated safety fault diagnosis model classifies by voting, the number of base learners is set to an odd number to facilitate voting; in this experiment it is set to 9. The samples of data sets A, B, and C are first mixed into a data set called S, which is divided into 9 parts such that no part contains duplicate samples. The sample distribution of each part after equalization is shown in Table 2.
For each part, the training set is selected by bootstrap (self-sampling), and the samples not drawn are used as the validation set. The bootstrap sample distribution corresponding to each safety fault diagnosis model is shown in Table 3.
As the table shows, this yields the bootstrap sample distribution for each individual learner. The traditional convolutional neural network-based safety fault diagnosis model is trained on each training set to obtain the individual safety fault learners, which are then combined by ensemble learning. The experimental data are shown in Table 4.
Using safety fault data sets A, B, and C to diagnose data set D, the average diagnostic accuracy of the individual learners is 91.23% and the average accuracy of the ensemble is 94.93%. Using data sets A, B, and D to diagnose data set C, the average individual accuracy is 91.53% and the ensemble accuracy is 95.79%. Using data sets A, C, and D to diagnose data set B, the average individual accuracy is 91.95% and the ensemble accuracy is 95.86%. Using data sets B, C, and D to diagnose data set A, the average individual accuracy is 91.48% and the ensemble accuracy is 95.41%. The experimental results show that the ensemble-learning-based safety fault diagnosis method achieves higher diagnostic accuracy under dynamically changing equipment parameters.
From the above experimental results, we can clearly see the advantages of the ensemble-learning-based safety fault detection algorithm for industrial control equipment: (i) To handle the dynamic factors of the industrial control system, we propose an ensemble-learning-based fault diagnosis model for industrial control equipment, which effectively improves the adaptive ability of the model and makes the fault diagnosis framework truly intelligent. (ii) We use the dynamic-programming-based voting weight adjustment method to further improve the reliability of voting.
It has been verified that the proposed ensemble-learning-based safety fault detection model for industrial control equipment can efficiently handle the dynamic factors of the industrial control system.
To address the dynamic changes of actual parameters in intelligent manufacturing and Industrial Internet of Things environments, this paper focuses on the safety fault diagnosis of industrial control equipment. It studies how to complete safety fault diagnosis adaptively with existing models and proposes a safety fault diagnosis method based on an ensemble learning model, which effectively improves the adaptive ability of the model and makes the fault diagnosis framework truly intelligent. Furthermore, the method's dynamic-programming-based voting weight adjustment effectively improves the reliability of voting.
The raw/processed data required to reproduce these findings cannot be shared at this time as the data also form part of an ongoing study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This paper is partially supported by the National Key R&D Program of China (No. 2020YFB1805503), Jiangsu Province Modern Education Technology Research Project (84365), National Vocational Education Teacher Enterprise Practice Base “Integration of Industry and Education” Special Project (Study on Evaluation Standard of Artificial Intelligence Vocational Skilled Level).
T. Chen, D. J. Hill, and C. Wang, "Distributed fast fault diagnosis for multimachine power systems via deterministic learning," IEEE Transactions on Industrial Electronics, vol. 67, no. 5, pp. 4152–4162, 2019.
M. Mousavi, M. Moradi, A. Chaibakhsh et al., "Ensemble-based fault detection and isolation of an industrial gas turbine," in Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2351–2358, IEEE, Toronto, Canada, October 2020.
T. G. Dietterich, "Ensemble learning," The Handbook of Brain Theory and Neural Networks, vol. 2, pp. 110–125, 2002.
Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm," in Proceedings of the 13th International Conference on Machine Learning (ICML), pp. 148–156, 1996.
T. Chen and C. Guestrin, "XGBoost: a scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794, San Francisco, CA, USA, August 2016.
J. H. Friedman, "Greedy function approximation: a gradient boosting machine," Annals of Statistics, vol. 29, no. 5, pp. 1189–1232, 2001.
Y. S. Huang and C. Y. Suen, "The behavior-knowledge space method for combination of multiple classifiers," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p. 347, IEEE, July 1993.
K. Sentz and S. Ferson, "Combination of evidence in Dempster-Shafer theory," Technical Report, Sandia National Laboratories, Albuquerque, NM, USA, 2002.
T. Stathaki, Image Fusion: Algorithms and Applications, Elsevier, New York, NY, USA, 2011.
S. Zhao, S. Li, L. Qi, and L. Xu, "Computational intelligence-enabled cybersecurity for the Internet of Things," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 4, no. 5, pp. 666–674, 2020.