#### Abstract

With the rapid development of the Internet of things and information technology, wireless sensor network technology is widely used in industrial monitoring. However, because of the architecture of wireless sensor networks, their software and hardware characteristics, and complex external environmental factors, their monitoring data often contain serious abnormalities, which affect the judgment and response of users. This paper therefore optimizes the fault detection algorithm based on abnormal data analysis in wireless sensor networks at two levels and verifies the resulting algorithm. At the first level, to address insufficient spatial cooperation in abnormal data detection, this paper first establishes a stable neighbor screening model that filters and analyzes the reliability of the cooperating data nodes and then establishes a detection data stability evaluation model based on the spatiotemporal correlation of the data nodes, thereby realizing abnormal data detection. At the second level, to address abnormal event detection, this paper proposes a spatial clustering optimization algorithm that clusters the detection data stream within a time window and analyzes the clustered data, so as to detect abnormal network events, retain the characteristics of the events, and further classify the abnormal data events. The feasibility and superiority of the improved algorithm are verified through simulation. Experiments show that the fault detection rate based on abnormal data analysis is as high as 97%, which is 5% higher than that of the traditional algorithm. At the same time, the false detection rate is low and is kept below 1%. The efficiency of the algorithm is about 10% higher than that of the traditional algorithm.

#### 1. Introduction

As a product of the convergence of information technology and Internet of things technology, wireless sensor network technology is widely used in scenarios such as environmental monitoring, ecological monitoring, and urban traffic [1, 2]. A conventional wireless sensor network deploys a large number of microsensors in the monitoring area, and these form a multidetection, self-organized network. The sensor nodes cooperate to sense, collect, and process information about the monitored objects and environment, and the data is finally transmitted and processed over the wireless network [3, 4]. Wireless sensor networks have three main operating characteristics. First, they are distributed and self-organizing: nodes can monitor and analyze each other through suitable algorithms, and they configure and manage themselves automatically without affecting the operation of the network [5–7]. Second, the sensor data is large in scale and high in density, and the redundant information between nodes enables cooperative work [8]. Third, node energy is limited. Energy is consumed mainly by communication, and consumption increases with the communication distance, so many wireless sensor networks transmit data in multihop mode [9–11].

Wireless sensor networks often produce abnormal detection data, in which readings are much higher or much lower than normal data [12]. Data anomalies have two main sources. The first is external emergencies: an emergency causes an environmental anomaly, which in turn produces abnormal monitoring data, for example temperature and humidity readings that differ significantly from daily or other data. The second is software or hardware failure in the network nodes. A hardware failure is typically an obvious failure of the node's sensor or damage to its hardware system, so that the sensor cannot transmit and analyze data in time and the node automatically drops out of the network. A software failure occurs within the data node itself: data can still be transmitted and analyzed routinely, but the collected data is often abnormal, so the reading is wrong at the source [13–15]. Detecting abnormal data is therefore an important task in wireless sensor networks. When abnormal data is collected in a certain area, the data must be processed and analyzed promptly and countermeasures taken immediately. When a software failure occurs in a data node, human intervention is also required. There are three main approaches to optimizing data anomaly detection and fault handling. The first is to design a better wireless network communication protocol to improve reliability. The second is to design more powerful sensor nodes that further reduce the node failure rate, so as to ensure the stability and reliability of the network. The third is to optimize the design of the application system and improve performance and stability at the platform level, so as to further ensure the reliability and superiority of the system [16–18].

In view of this research status and the existing problems, this paper optimizes the fault detection algorithm based on abnormal data analysis in wireless sensor networks at two levels and simulates the resulting algorithm. First, to address insufficient spatial cooperation in abnormal data detection, this paper establishes a stable neighbor screening model that filters and analyzes the reliability of the cooperating data nodes and then establishes a detection data stability evaluation model based on the spatiotemporal correlation of the data nodes, so as to realize abnormal data detection. Second, to address abnormal event detection, this paper proposes a spatial clustering optimization algorithm that, based on the time correlation of data nodes, clusters the detection data stream within a time window and analyzes the clustered data. This detects events, preserves their characteristics, and classifies abnormal data events in order to determine whether an abnormal value is an event reading. The feasibility and superiority of the improved algorithm are verified through simulation. The experiments show that the fault detection rate based on abnormal data analysis is as high as 97%, which is 5% higher than that of the traditional algorithm. At the same time, the false detection rate is low and is kept below 1%.

The rest of this paper is organized as follows. Section 2 reviews the current research status of fault detection algorithms based on data anomalies in wireless sensor networks. Section 3 optimizes the fault detection algorithm based on abnormal data analysis at two levels and describes the resulting algorithm. Section 4 presents the validation experiments and their analysis. Section 5 concludes the paper.

#### 2. Correlation Analysis: The Current Research Status of Fault Detection Algorithm Based on Data Anomaly in Wireless Sensor Networks

Wireless sensor network technology and the detection of anomalies in network data are active research topics that many researchers and institutions have studied. Current algorithms for abnormal data detection in wireless sensor networks fall into five main categories: algorithms based on statistical models, algorithms based on comparison with adjacent nodes, algorithms based on data clustering analysis, algorithms based on classification, and algorithms based on spectrum decomposition [19, 20]. Many researchers have optimized and extended these five kinds of detection algorithms. Research institutions in Europe first summarized the types and technical characteristics of wireless sensor network technology and, by induction, the corresponding detection technologies [21]. Japanese research institutions use a Gaussian statistical model with threshold judgment to identify whether data nodes in a wireless sensor network are abnormal and to judge whether the cause of the abnormality lies in the node or in the external environment. However, such methods largely ignore the time correlation between nodes, so the accuracy of the monitored nodes is low. Research institutions in the United States propose monitoring abnormal data with a clustering algorithm, but the parameters of the algorithm are complex and the amount of calculation is relatively large [22]. Research institutions in Japan have also proposed distance-based detection of abnormal data, which uses the similarity of data nodes across the whole network to identify local abnormal nodes and then reconfirms each abnormal node through its neighbor nodes. This method is relatively flexible, but it requires too much calculation when the wireless sensor network is large [23]. To reduce the amount of calculation of the above algorithms, researchers have proposed a classification-based algorithm for abnormal data monitoring. It is an adaptive distributed abnormal data detection algorithm that has high recognition accuracy and is well suited to advanced algorithms such as neural networks [24]. For data anomalies caused by external environmental factors, researchers have proposed a technique based on principal component analysis, which selects the principal components of the data for mathematical modeling and carries out anomaly verification based on adjacent data. The algorithm is essentially a dimensionality reduction approach that uses the principal components of the information to search for and analyze normal data [25]. For recovery after abnormal data detection, the main algorithms include data recovery based on a reliable transmission mechanism, data recovery based on the correlation of perceived data, and data recovery based on data spatiotemporal correlation theory [26–28].

#### 3. Fault Detection Algorithm for Abnormal Data Analysis in Wireless Sensor Networks

This section studies the fault detection algorithm based on abnormal data analysis in wireless sensor networks. Detection is carried out at two levels: abnormal data detection and abnormal event detection. Figure 1 shows the schematic block diagram. It presents the two detection levels together with the detection principle and detection algorithm, the principle of the abnormal data recovery algorithm, and some simulation process diagrams.

##### 3.1. Optimization Analysis of Abnormal Data Detection Algorithm

Conventional anomaly data detection is either centralized or distributed. The method used in this paper is centralized anomaly data detection, which is based on the spatiotemporal correlation of perceived data. The abnormal data detection algorithm proposed in this paper rests on the following definitions: the probability of data failure over all data nodes in the wireless sensor network, a single node in the network, and the sensing data corresponding to that node. The algorithm is evaluated by the detection accuracy rate and the false alarm rate (far), whose calculation formulas are given in formulas (1) and (2).
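As an illustration of the two evaluation quantities, the sketch below computes a detection accuracy rate and a false alarm rate from ground-truth and predicted labels. The definitions used here (detected faulty readings over all faulty readings, and normal readings flagged as faulty over all normal readings) are the common textbook ones and are an assumption, since they may differ in detail from the paper's own formulas. The function and variable names are hypothetical.

```python
from typing import Sequence, Tuple


def detection_metrics(is_faulty: Sequence[bool],
                      flagged: Sequence[bool]) -> Tuple[float, float]:
    """Return (detection accuracy rate, false alarm rate).

    Assumed definitions, not taken from the paper:
      detection accuracy rate = faulty readings flagged / all faulty readings
      false alarm rate        = normal readings flagged / all normal readings
    """
    if len(is_faulty) != len(flagged):
        raise ValueError("label sequences must have the same length")
    true_pos = sum(1 for f, p in zip(is_faulty, flagged) if f and p)
    false_pos = sum(1 for f, p in zip(is_faulty, flagged) if not f and p)
    n_faulty = sum(1 for f in is_faulty if f)
    n_normal = len(is_faulty) - n_faulty
    # Guard against empty classes so the sketch never divides by zero.
    accuracy = true_pos / n_faulty if n_faulty else 0.0
    far = false_pos / n_normal if n_normal else 0.0
    return accuracy, far
```

For example, with two faulty readings of which one is flagged, and three normal readings of which one is flagged, the function returns a detection accuracy rate of 0.5 and a false alarm rate of one third.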

Based on these assumptions, a mathematical model of the wireless sensor network is established. The difference cooperation model is given in formula (3) and contains a difference threshold, whose value depends on the application scenario. The model is used to capture the spatiotemporal correlation characteristics of the data.

From the difference cooperation model, an offset evaluation model is further derived. This model is built on the idea of spatiotemporal cooperation and is given in formula (4). It contains a cooperation coefficient between historical data and neighbor data, whose value lies in the range 0 to 1. When the coefficient is 0, the judgment depends entirely on the neighbor data; when it is 1, the judgment depends entirely on the node's own historical data.
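The role of the cooperation coefficient can be sketched as a weighted blend of a history-based offset and a neighbor-based offset. The linear form below is an assumption chosen only because it reproduces the two limiting cases described above; the paper's formula (4) may take a different form. The function name and arguments are hypothetical.

```python
def combined_offset(history_offset: float,
                    neighbor_offset: float,
                    coeff: float) -> float:
    """Blend temporal and spatial offsets with a cooperation coefficient.

    Assumed linear form: coeff = 0 uses only the neighbor offset,
    coeff = 1 uses only the historical offset.
    """
    if not 0.0 <= coeff <= 1.0:
        raise ValueError("cooperation coefficient must lie in [0, 1]")
    return coeff * history_offset + (1.0 - coeff) * neighbor_offset
```

With a historical offset of 2.0 and a neighbor offset of 4.0, a coefficient of 0 yields 4.0, a coefficient of 1 yields 2.0, and a coefficient of 0.5 yields 3.0.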

Whether a node is abnormal is judged from the value of a decision function: one value marks the data state as abnormal, and the other marks it as normal. Based on this decision function, the current node data and the neighbor data are compared. The comparison function is defined in formula (5). When the difference between the two compared values is less than the selected threshold, the two results are considered similar and the comparison function returns 0; otherwise, it returns 1.

From this analysis, the evaluation rule for a data node is given in formula (6). The formula uses the number of adjacent nodes to determine the operating state of a specific node. When the condition in formula (6) is satisfied, the node is determined to be normal.
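The comparison function and the neighbor-count rule can be sketched as follows. The comparison follows the description above (0 when the difference is below the threshold, 1 otherwise). The decision rule is an assumption: a node is treated as normal when fewer than half of its neighbors disagree with it, which is one common choice and not necessarily the condition in formula (6). All names are hypothetical.

```python
from typing import Sequence


def compare(reading: float, neighbor_reading: float, threshold: float) -> int:
    """Return 0 if the two readings are similar, 1 otherwise."""
    return 0 if abs(reading - neighbor_reading) < threshold else 1


def is_normal(reading: float,
              neighbor_readings: Sequence[float],
              threshold: float) -> bool:
    """Assumed majority rule: normal if fewer than half the neighbors disagree."""
    if not neighbor_readings:
        raise ValueError("at least one neighbor reading is required")
    disagreements = sum(compare(reading, r, threshold) for r in neighbor_readings)
    return disagreements < len(neighbor_readings) / 2
```

Under this rule, a reading of 20.0 with neighbors at 20.1, 19.9, and 35.0 and a threshold of 0.5 is judged normal, because only one of three neighbors disagrees.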

To further determine whether the node data is abnormal, the offset of the data node must be analyzed. The data offset of a node changes over time, and many factors drive this change, chiefly node failure and changes in the external environment. The offset calculation steps are shown in Figure 2 and can be summarized as follows:

*Step 1. *Based on the principle of time correlation, the historical data collaboration window is used to sample the historical data of the node data at the current time and calculate the offset between the historical data and the sampling data at the current time.

*Step 2. *Establish a reliable nearest neighbor node data set based on the principle of spatial correlation, and obtain the offset between the data currently sampled by the node and the data of its reliable neighbor nodes.

*Step 3. *Comprehensively evaluate the offset of nodes by means of time-space cooperation.

*Step 4. *Repeat the above steps for all nodes to obtain the final offset of the data node.
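The four steps above can be sketched as a single pass over all nodes. This is a minimal illustration under several assumptions that are not stated in the paper: the offset is taken as the absolute difference from a mean, the set of reliable nodes is supplied as an input rather than derived by the screening model, the temporal and spatial offsets are blended linearly, and a node with no reliable neighbor falls back to its temporal offset. All names are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence, Set


def node_offsets(history: Dict[str, Sequence[float]],
                 current: Dict[str, float],
                 neighbors: Dict[str, List[str]],
                 reliable: Set[str],
                 coeff: float) -> Dict[str, float]:
    """Compute a spatiotemporal offset for every node (Steps 1 to 4)."""
    offsets: Dict[str, float] = {}
    for node, value in current.items():          # Step 4: repeat for all nodes
        # Step 1: offset against the node's own historical window.
        t_off = abs(value - mean(history[node]))
        # Step 2: offset against the current readings of reliable neighbors.
        trusted = [current[n] for n in neighbors[node] if n in reliable]
        if trusted:
            s_off = abs(value - mean(trusted))
            # Step 3: combine both offsets (assumed linear blend).
            offsets[node] = coeff * t_off + (1.0 - coeff) * s_off
        else:
            # Assumed fallback when no reliable neighbor is available.
            offsets[node] = t_off
    return offsets
```

In a three-node example where nodes `a` and `b` read about 20 to 21 and node `c` jumps from 20 to 30, node `c` receives a much larger offset than the other two, which is the behavior the offset evaluation is meant to expose.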

From this analysis, the flow of the abnormal data detection algorithm proposed in this paper can be obtained. The algorithm has two stages: reliable neighbor node selection and abnormal node data judgment. The algorithm flow chart is shown in Figure 3, and the details of the flow are as follows.

At the reliable neighbor selection stage, the emphasis lies in generating the neighbor node data set and running the state prediction algorithm. Abnormal data detection mainly consists of calculating and analyzing the data node offset. Before the offset is calculated, the initial fluctuation threshold must be determined, and the offset is then evaluated and analyzed continuously. In the abnormal data detection part, the choice of threshold deserves particular attention. When the threshold is too small, a data node cannot obtain reliable adjacent nodes, and the dynamic settings in the algorithm fail, which also causes detection errors in the abnormal data detection algorithm.

Based on the above analysis, the corresponding fault detection algorithm based on abnormal data in wireless sensor networks can be obtained.

##### 3.2. Optimization Analysis of Abnormal Event Detection Algorithm

For abnormal event detection, this paper adopts an optimized clustering algorithm. Its core idea is to divide the node data into subspaces, group the data belonging to the same subspace into one class, and keep the data of different subspaces independent. The clustering process is shown in Figure 4, and the details of the subspace data clustering are as follows:

*Step 1. *Build a model of the multidimensional data, and solve for the coefficient matrix with the Lagrange multiplier method.

*Step 2. *Use the coefficient matrix to build the similarity matrix, and from it establish an undirected weighted graph.

*Step 3. *Cluster the undirected weighted graph with the normalized segmentation algorithm.
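The three clustering steps can be sketched with NumPy as shown below. This is not the paper's implementation. For brevity the coefficient matrix is obtained from a ridge-regularized least-squares self-representation with a closed-form solution, swapped in for the Lagrange multiplier based solver the paper mentions, and the graph is split into only two groups using the spectral relaxation of the normalized cut. The regularization weight `lam`, the function names, and the two-cluster restriction are all assumptions.

```python
import numpy as np


def self_representation(X: np.ndarray, lam: float = 0.1) -> np.ndarray:
    """Step 1 (simplified): express each sample by the other samples.

    X has shape (d, n) with one sample per column. Solves
    min_C ||X - X C||_F^2 + lam * ||C||_F^2 in closed form, then zeroes the
    diagonal so that no sample is linked to itself.
    """
    gram = X.T @ X
    n = gram.shape[0]
    C = np.linalg.solve(gram + lam * np.eye(n), gram)
    np.fill_diagonal(C, 0.0)
    return C


def similarity_matrix(C: np.ndarray) -> np.ndarray:
    """Step 2: symmetric non-negative weights of the undirected graph."""
    A = np.abs(C)
    return A + A.T


def normalized_cut_bipartition(W: np.ndarray) -> np.ndarray:
    """Step 3 (two clusters): split by the sign of the second generalized
    eigenvector of the graph Laplacian, the usual normalized-cut relaxation."""
    degree = W.sum(axis=1)
    if np.any(degree <= 0):
        raise ValueError("every node needs at least one positive edge weight")
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    lap = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)          # eigenvalues in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]      # map back to the generalized problem
    return (fiedler > 0).astype(int)
```

Calling `normalized_cut_bipartition(similarity_matrix(self_representation(X)))` returns one 0/1 label per column of `X`. Which group receives label 0 is arbitrary, and more than two clusters would need a recursive split or a k-way spectral method.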

Following this clustering process, the currently collected data and the historical data of each data node of the wireless sensor network are clustered. During clustering, a time window is used whose length is set so that it contains the data currently collected by the node and the historical data collected during the preceding n periods. Before each group of data is examined, the sensing data of the newest group is added and the oldest group of data is deleted, so that the time window moves one time step to the right. The block diagram of this time-based processing flow is shown in Figure 5, which shows the complete clustering process for wireless sensor networks. Each clustering cycle is completed by adding the new data and deleting the old data.
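The sliding of the time window can be sketched with a fixed-length queue. The window is assumed here to hold the current group of readings plus the n previous groups, so its capacity is n + 1. The class name and method are hypothetical.

```python
from collections import deque
from typing import Deque, List


class SlidingWindow:
    """Keeps the current group of readings plus the n most recent past groups."""

    def __init__(self, n_history: int) -> None:
        if n_history < 0:
            raise ValueError("n_history must be non-negative")
        # A bounded deque drops the oldest group automatically when full.
        self._groups: Deque[List[float]] = deque(maxlen=n_history + 1)

    def push(self, readings: List[float]) -> List[List[float]]:
        """Add the newest group and return the window contents, oldest first."""
        self._groups.append(list(readings))
        return list(self._groups)
```

With `n_history = 2`, pushing four groups in sequence leaves only the last three in the window, which corresponds to the window moving one step to the right on each new sample.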

Based on the above schematic diagram, the corresponding abnormal event detection process is summarized as follows:

*Step 1. *The abnormal event detection algorithm starts monitoring as soon as the data node is working. At this time, the collected node data is usually in the normal state. As the time window slides, new data collected when a failure or error occurs belongs to the fault data set.

*Step 2. *During each clustering operation, calculate the distance between the two data sets. When the distance is lower than the threshold set by the system, there is no obvious difference between the two data sets and the data set is in the normal state; otherwise, the data set is abnormal.

*Step 3. *When, over successive clustering calculations, the data set alternates between the abnormal state and the normal state, the data set is determined to be a soft fault. When the data set instead remains in a stable abnormal state for a long time and then returns to normal, it is classified as an event exception caused by a network event.

*Step 4. *Iterate and judge all node data sets based on the above steps.
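The decision logic of these steps can be sketched as two small functions. Several details are assumptions made for illustration: the distance between two data sets is taken as the Euclidean distance between their centroids, a "long stable abnormal state" is taken to mean a run of at least `min_event_len` consecutive abnormal windows, and a node that is still in such a run is labeled `pending`. None of these names or thresholds come from the paper.

```python
import math
from typing import List, Sequence


def centroid_distance(a: Sequence[Sequence[float]],
                      b: Sequence[Sequence[float]]) -> float:
    """Euclidean distance between the centroids of two data sets (assumed metric)."""
    ca = [sum(col) / len(a) for col in zip(*a)]
    cb = [sum(col) / len(b) for col in zip(*b)]
    return math.dist(ca, cb)


def window_is_abnormal(reference: Sequence[Sequence[float]],
                       current: Sequence[Sequence[float]],
                       threshold: float) -> bool:
    """Step 2: normal if the distance is below the threshold, abnormal otherwise."""
    return centroid_distance(reference, current) >= threshold


def classify_node(flags: Sequence[bool], min_event_len: int) -> str:
    """Step 3: classify a time-ordered sequence of per-window abnormal flags."""
    if not any(flags):
        return "normal"
    longest = run = 0
    for abnormal in flags:
        run = run + 1 if abnormal else 0
        longest = max(longest, run)
    if longest < min_event_len:
        return "soft_fault"     # short, intermittent abnormal windows
    # A long stable abnormal run counts as an event once the node recovers.
    return "event" if not flags[-1] else "pending"
```

Step 4 then amounts to calling `classify_node` once per node with that node's flag history.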

The flow of the improved abnormal event detection algorithm is shown in Figure 6. The figure shows the advantages of the improved algorithm over the traditional algorithm.

In summary, the two-level anomaly detection proposed in this paper improves on the principle of the traditional algorithm and subdivides the types of anomaly, so as to improve the detection accuracy for different faults in wireless sensor networks.

#### 4. Experiment and Simulation

Abnormal data refers to abnormal environmental conditions, under a given index, detected by the wireless sensor network. The wireless sensor system analyzes the state from the data collected by the relevant sensors and provides technicians with a basis for judgment. In this paper, the algorithm is verified and compared by simulation. The experimental conditions are as follows.

The experimental data set is a test data set published online. The sensor network layout is shown in Figure 7. Each sensor samples the environment every 1 min, and the data includes temperature, humidity, light, voltage, and current values. In practice, the original data must be preprocessed before analysis, and the preprocessing is based on the time window.

The experiments in this paper fall into two categories. The first category verifies the performance of the optimized abnormal data detection algorithm. It comprises a verification of the algorithm's performance in the precalibration stage, a verification under the offset reading fault type, and a verification under the random reading fault type. The second category verifies the optimized event anomaly detection algorithm.

The performance test before calibration verifies the detection accuracy and the error detection rate. In the experiment, the anomaly threshold is set to 0.5, and simulations of the optimized algorithm and of the traditional algorithm are carried out with this threshold. The experimental results are shown in Figure 7. The detection accuracy of the optimized algorithm proposed in this paper is significantly higher than that of the traditional algorithm, reaching about 96%. The error detection rate of the proposed algorithm stays below 0.1% as the number of data samples increases, while that of the traditional algorithm rises approximately linearly.

For the verification under the offset reading fault type, the verification environment is similar to that of the precalibration stage, and the experimental results are shown in Figure 8. Compared with the traditional algorithm, the optimized algorithm proposed in this paper has higher detection accuracy and a lower error detection rate when dealing with offset reading faults. The data in the figure show that the detection accuracy of the proposed algorithm stays above 97.3%, and that the maximum error between the traditional algorithm and the proposed algorithm grows as the amount of data increases.

For the verification under the random reading fault type, the verification environment is similar to that of the precalibration stage, and the experimental results are shown in Figure 9. Compared with the traditional algorithm, the optimized algorithm proposed in this paper has higher detection accuracy and a lower error detection rate when dealing with random reading faults.

To verify the superiority of the improved clustering algorithm for abnormal events, the simulation environment is set up as follows. It is assumed that 300 sensors are evenly distributed in a square area and are used to detect the temperature and humidity of the environment. The distance between sensors is set to 300, and the sensor node distribution is shown in Figure 10. Each sensor in the figure executes the optimized clustering algorithm. The abnormal events constructed in this paper are fire events within the sensor range, and the extent of the fire event is also outlined in Figure 10.

Under these simulation conditions, each data node is processed by the clustering algorithm, so as to continuously detect whether the node itself has abnormal values. The detection results are shown in Figure 11, in which circles represent normal node data and crosses represent abnormal node data. The abnormal event detection accuracy and error detection rate are shown in Figure 12. Compared with the traditional algorithm, the optimized algorithm proposed in this paper has higher detection accuracy and a lower error detection rate when detecting abnormal events.

To further verify the performance of the abnormal data recovery algorithm, this paper also carries out a data recovery experiment on a typical abnormal reading. The experiment compares the collaborative estimation value with the preset value of the data: the smaller the error, the better the data recovery. The collaborative estimation value of the proposed algorithm is about 19.2111, which differs from the preset value by 0.02413, far less than the difference of 0.21511 obtained with the traditional algorithm. Therefore, the recovery algorithm proposed in this paper recovers the data better.

These experiments show that, compared with the traditional algorithm, the proposed algorithm has clear advantages in detection accuracy and error detection rate, and that the data recovery algorithm also has clear advantages.

#### 5. Conclusion

This paper analyzes the research status and shortcomings of fault detection algorithms and simulation technology based on data anomaly analysis in wireless sensor networks. To address abnormal data detection and abnormal event detection, it optimizes the fault detection algorithm based on abnormal data analysis at two levels and verifies the resulting algorithm. At the first level, to address insufficient spatial cooperation in abnormal data detection, this paper establishes a stable neighbor screening model that filters and analyzes the reliability of the cooperating data nodes and then establishes a detection data stability evaluation model based on the spatiotemporal correlation of the data nodes, thereby realizing abnormal data detection. At the second level, to address abnormal event detection, this paper proposes a spatial clustering optimization algorithm that clusters the detection data stream within a time window and analyzes the clustered data, so as to detect abnormal network events, retain the characteristics of the events, and further classify the abnormal data events. The feasibility and superiority of the improved algorithm were verified through simulation. Experiments show that the fault detection rate based on abnormal data analysis is as high as 97%, which is 5% higher than that of the traditional algorithm. At the same time, the false detection rate is low and is kept below 1%. The efficiency of the algorithm is about 10% higher than that of the traditional algorithm. On the whole, the proposed algorithm has clear advantages over the traditional detection algorithm. In the actual tests, however, the algorithm was found to have drawbacks such as high computational cost and high complexity when the amount of abnormal data is large. Subsequent research will therefore focus on making the algorithm lightweight, to reduce its cost when the abnormal data is complex.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.