Abstract

The current detection schemes of malicious nodes mainly focus on how to detect and locate malicious nodes in a single path; however, for the reliability of data transmission, many sensor data are transmitted by multipath in wireless sensor networks. In order to detect and locate malicious nodes in multiple paths, in this paper, we present a homomorphic fingerprinting-based detection and location of malicious nodes (HFDLMN) scheme in wireless sensor networks. In the HFDLMN scheme, using homomorphic fingerprint and coding technology, the original data is divided into n packets and sent to the base station along n paths, respectively; the base station determines whether there are malicious nodes in each path by verifying the validity of the packets; if there are malicious nodes in one or more paths, the location algorithm of the malicious node is implemented to locate the specific malicious nodes in the path; if all the packets are valid, the original data is recovered. The HFDLMN scheme does not need any complex evaluation model to evaluate and calculate the trust value of the node, nor any monitoring nodes. Theoretical analysis results show that the HFDLMN scheme is secure and effective. The simulation results demonstrate promising outcomes with respect to key parameters such as the detection probability of the malicious path and the locating probability of the malicious node.

1. Introduction

With the rapid development of the Internet of Things, wireless sensor networks (WSNs) are widely used not only in transportation, agriculture, home furnishing, military, environmental monitoring, and other fields [1] but also in smart city environments [2], smart grids [3], and smart healthcare systems [4]. Underwater Sensor Networks (USNs) have recently become widespread as well and are being deployed in applications ranging from harbor security to the monitoring of underwater pipelines and fish farms [5]. Since WSNs are constructed from a large number of sensor nodes in a wireless, multihop way, and the sensor nodes are restricted in computation, storage, and communication, the nodes are easily captured by attackers and turned into malicious nodes. The existence of malicious nodes is a great threat to the network: by manipulating these malicious nodes, attackers can launch a variety of internal and external attacks [6], for example, eavesdropping on confidential information passing through the malicious nodes, injecting a large amount of false data into the sensor network, disrupting the normal data aggregation process by tampering with the data, and launching various DoS attacks [7, 8]. Malicious nodes in multipath transmission are more harmful because they can send false or polluted data to nodes in multiple paths at the same time, so the polluted data easily continues to spread. This consumes a large amount of the valuable resources of intermediate forwarding nodes and ultimately shortens the lifetime of the entire wireless sensor network. Therefore, it is very important to detect, locate, and isolate the malicious nodes in multipath transmission.

Detection of malicious nodes has always been a hot topic in wireless sensor networks, and many scholars have proposed effective detection schemes. The current detection schemes mainly focus on how to detect and locate malicious nodes in a single path; however, for the reliability of data transmission, many sensor data are transmitted by multipath in wireless sensor networks [9–13]. In order to detect and locate malicious nodes in multiple paths, in this paper, we present a homomorphic fingerprinting-based detection and location of malicious nodes (HFDLMN) scheme in wireless sensor networks. In the HFDLMN scheme, if the source node wants to send sensor data to the base station (BS), it divides the sensor data into n fragments and then encodes the n fragments into n new fragments; then, using homomorphic fingerprint technology, n packets are generated and sent to the base station along n paths, respectively. After receiving n packets with the same data number from n paths, the base station determines whether there are malicious nodes in each path by verifying the validity of the packets. If there are malicious nodes in one or more paths, the location algorithm is executed to locate the specific malicious nodes in the path; if all the packets are valid, the original data is recovered.

The main contributions of this paper are as follows. (1) A homomorphic fingerprinting-based detection and location of malicious nodes (HFDLMN) scheme in wireless sensor networks is presented; using homomorphic fingerprint and coding technology, the HFDLMN scheme can detect and locate malicious nodes in multiple paths. (2) In order to detect and locate malicious nodes, the HFDLMN scheme does not need any complex evaluation model to evaluate and calculate the trust value of the node, nor any monitoring nodes. (3) The HFDLMN scheme can resist malicious nodes that try to interfere with the base station's detection of malicious nodes. (4) Theoretical analysis results show that the HFDLMN scheme is secure and effective, and the simulation results show that it can detect and locate malicious nodes with high probability by sending a small number of packets.

The rest of the paper is organized as follows. Section 2 introduces the related work. Preliminaries and the system model are described in Section 3. The HFDLMN scheme is described in Section 4. Proof and analysis of related theorems are described in Section 5. Security analysis is described in Section 6. The performance evaluation is implemented in Section 7. Section 8 concludes this paper.

2. Related Work

At present, scholars have done much work on detecting malicious nodes in wireless sensor networks and have proposed some effective detection schemes. These schemes can be divided into multihop acknowledgment-based detection schemes, trust evaluation-based detection schemes, and statistics classification-based detection schemes.

Balakrishnan et al. [14] proposed a two-hop acknowledgment detection scheme (TWOACK) based on checkpoint nodes. In the TWOACK scheme, each node in the forwarding path is a checkpoint node. If a node i receives a packet, it sends an acknowledgment packet to the node j that is two hops away from it. If node j does not receive the acknowledgment packet, it suspects the link between i and j to be a malicious link and sends a warning to the source node. However, the TWOACK scheme greatly increases message conflicts and collisions in the network. To solve this problem, Xiao et al. [15] proposed a multihop acknowledgment-based detection scheme (CHEMAS). In the CHEMAS scheme, some nodes in the path from the source node to the base station are randomly selected as checkpoint nodes. After a checkpoint node receives a packet, it sends an acknowledgment packet to its upstream node. If an intermediate forwarding node in the path does not receive the specified number of acknowledgment packets, it suspects that its next-hop node is a malicious node and sends a warning to the source node. Although CHEMAS can greatly reduce message conflicts and collisions, if two or more malicious nodes are selected as checkpoint nodes, the collusion of these malicious nodes makes the CHEMAS scheme invalid. To solve this problem, Liu et al. [16] proposed a new scheme based on a multihop acknowledgment mechanism (PHACK). In the PHACK scheme, in order to detect and locate malicious nodes, each node in the forwarding path not only forwards normal packets but also generates an acknowledgment packet for each packet and sends it to the source node along a different path. However, these multihop acknowledgment-based schemes need to transmit a large number of acknowledgment packets, which causes high communication overhead and greatly reduces the network lifetime.

To improve the effect of malicious node detection, Yang et al. [17] proposed a malicious node detection model based on reputation with enhanced low energy adaptive clustering hierarchy (MNDREL). Based on the enhanced routing protocol, the cluster head nodes are selected, and the other nodes form different clusters by choosing the corresponding cluster head. By analyzing the reputation value of the parent node as evaluated by the child node, the malicious nodes in the network are effectively identified. The MNDREL model detects malicious nodes in WSNs with a lower false alarm rate; however, its real-time performance has to be improved. Xiao et al. [18] proposed a sensor network reputation model based on Gaussian distribution (GRFSN). In this model, the trust value of each node is obtained by calculating the weighted sum of direct reputation and indirect reputation and is then compared with the trust threshold; if the trust value of the node is less than the trust threshold, the node is a malicious node. This scheme only needs to determine a trust threshold, but the trust threshold is static, and the rate at which normal nodes are misjudged as malicious nodes is high. In order to detect the untrusted nodes in the network quickly and effectively and ensure the reliable operation of the network, Zheng et al. [19] proposed a network security mechanism based on trust management to deal with the threats faced by WSNs (DNSMTM). Based on the trusted access of nodes, this mechanism first calculates the local trust degree of nodes according to existing interaction behavior and then obtains the comprehensive trust degree of nodes, and the detection of malicious nodes is carried out according to the comprehensive trust degree. The mechanism can effectively detect malicious nodes with a higher detection rate and reduces the energy consumption of nodes.

Liao et al. [20] proposed a hybrid strategy monitoring-forwarding game detection scheme to detect selective forwarding attacks (MSGSFS). In this scheme, a set of strategies is constructed by integrating factors such as packet loss, data corruption, and forwarding delay. The data sending node and its one-hop neighbor nodes select strategies from the set to perform the monitoring-forwarding game and collect the routing trust value of the suspicious node. In order to locate and isolate malicious nodes in the cluster, a distributed watchdog is run on each cluster head node to monitor and record the forwarding behavior of its one-hop neighbor cluster head node. This scheme can effectively alleviate selective forwarding attacks in wireless sensor networks and has lower energy consumption. Zhou et al. [21] proposed an improved trust evaluation model based on Bayesian and Entropy (ITEMBB). In this model, the direct trust value of the node is first calculated, and if the direct trust value is not reliable enough, the indirect trust value of the node is calculated. By integrating the direct trust value and the indirect trust value, a comprehensive trust value is obtained, and entropy is used to assign a greater weight to highly trusted nodes. To a certain extent, the model overcomes the limitations of subjective weight allocation, but the problem of static reputation values has not been solved.

Zhou et al. [22] combined neighbor node monitoring and the watchdog mechanism to propose a cluster-based selective forwarding attack detection scheme (SMCSF). In this scheme, the nodes in the cluster are divided into three types: cluster head nodes, monitoring nodes, and cluster member nodes. The monitoring node selected in the cluster calculates and adjusts the comprehensive reputation of the cluster head nodes and cluster member nodes in the cluster. The monitoring nodes are responsible not only for calculating and adjusting the reputation of the nodes and for judging and detecting malicious nodes in the cluster but also for monitoring whether the cluster head node exhibits malicious behaviors such as data tampering or packet loss during data forwarding. Although this scheme can quickly and accurately locate malicious nodes, the responsibility of the monitoring nodes is too heavy.

Silva et al. [23] proposed a statistics-based scheme for detecting malicious nodes (IDSBS). The scheme matches and detects the abnormal behavior of nodes through a series of predetermined rules. Because there is no interaction between nodes, the false detection rate of the system is high in the initial stage. Liu et al. [24] proposed a classification-based scheme for detecting malicious nodes (MCMND). In this scheme, the multiple attributes of the node are first modeled, and then the known sensor nodes are learned by a likelihood-based multiple classification method. The posterior probability is used to generate a classifier, and any node of unknown type is assigned to the class with the maximum posterior probability, so as to determine whether the node is malicious. However, when the number of active nodes in the network is insufficient or the number of packets processed by nodes is small, the false detection rate is high. To address the problem that existing malicious node detection methods in wireless sensor networks cannot guarantee the fairness and traceability of the detection process, She et al. [25] presented a blockchain trust model (BTM) for malicious node detection in wireless sensor networks. In BTM, malicious nodes are detected in 3D space by using blockchain smart contracts and WSN quadrilateral measurement localization, and the voting consensus results are recorded on the distributed blockchain. The model can effectively detect malicious nodes in WSNs and ensure the traceability of the detection process, but its consensus method is the traditional proof-of-work (PoW) method, which requires relatively large computational power and high energy consumption, so it is not well suited to the running environment of wireless sensor networks.

Li et al. [26] proposed a distributed and randomized detection algorithm to locate the attackers who inject polluted packets (IPAs). In this scheme, each node i maintains a set of suspicious nodes. In the beginning, all the neighbors of node i are added to a set of suspicious nodes; if the packets sent by its neighbor nodes are invalid, then the neighbor nodes that send valid packets are deleted from the suspicious nodes set; after n rounds of detection, the nodes in the set of suspicious nodes are malicious neighbors. Although the scheme can effectively detect malicious nodes in the network, it needs n rounds of detection, which will greatly increase the network communication overhead.

To sum up, the current research schemes each have their own characteristics. Table 1 compares the advantages and disadvantages of each scheme. The detection schemes [14–16] based on multihop acknowledgment need to transmit a large number of acknowledgment packets, which leads to high communication overhead. The detection schemes [17–22] based on trust evaluation need more monitoring nodes, which greatly increases the overhead of the network. In addition, the current detection schemes mainly focus on how to detect and locate malicious nodes in a single path. The HFDLMN scheme proposed in this paper does not need any complex evaluation model to evaluate and calculate the trust value of the node, nor any monitoring nodes, and it can detect and locate malicious nodes in multiple paths.

3. Preliminaries and System Model

3.1. Preliminaries

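Homomorphic Fingerprinting. Hendricks et al. first proposed homomorphic fingerprinting in [27]. The fingerprinting functions of homomorphic fingerprinting also belong to a family of universal hash functions. Let $\mathbb{F}$ denote a field of order $q$, let $K$ be the set of fingerprinting keys, and let $P$ be a deterministic algorithm that outputs monic irreducible polynomials of prime degree $\gamma$ with coefficients in $\mathbb{F}$; the polynomials are chosen with probabilities taken over the choice of the input $r \in K$ uniformly at random. Then a fingerprinting function $fp: K \times \mathbb{F}^{\delta} \to \mathbb{F}^{\gamma}$ can be defined as $fp(r, d) = d(x) \bmod P(r)$, where the fragment $d$ is read as a polynomial $d(x)$ over $\mathbb{F}$. A fingerprinting function $fp$ is homomorphic if $fp(r, d) + fp(r, d') = fp(r, d + d')$ and $b \cdot fp(r, d) = fp(r, b \cdot d)$ for any $r \in K$, $d, d' \in \mathbb{F}^{\delta}$, and $b \in \mathbb{F}$. Let (encode, decode) be a linear erasure code with coefficients $b_{ij}$, for $1 \le i \le n$ and $1 \le j \le m$; if $d_i = \sum_{j=1}^{m} b_{ij} d_j$, then for a homomorphic fingerprinting function $fp$, the following equation holds: $fp(r, d_i) = \sum_{j=1}^{m} b_{ij}\, fp(r, d_j)$, where $r \in K$ and $1 \le i \le n$.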
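The two homomorphic properties can be checked numerically. The following is a minimal sketch, not the division fingerprinting described above: for brevity it swaps in an evaluation-style fingerprint over an assumed prime field GF(P), where a fragment is read as polynomial coefficients and evaluated at the key r. The modulus `P`, the helper names, and the fragment length are all assumptions made for illustration.

```python
import random

P = 2**31 - 1  # assumed prime modulus; the field GF(P) is an illustration choice


def fp(r, d):
    """Evaluation fingerprint: sum of d[k] * r**k mod P, computed by Horner's rule."""
    acc = 0
    for coeff in reversed(d):
        acc = (acc * r + coeff) % P
    return acc


def add(d1, d2):
    """Element-wise sum of two fragments over GF(P)."""
    return [(x + y) % P for x, y in zip(d1, d2)]


def scale(b, d):
    """Multiply every element of a fragment by the field element b."""
    return [(b * x) % P for x in d]


rng = random.Random(7)
r = rng.randrange(1, P)                      # secret fingerprinting key
d1 = [rng.randrange(P) for _ in range(8)]    # two example fragments
d2 = [rng.randrange(P) for _ in range(8)]
b = rng.randrange(1, P)                      # an example coding coefficient

# fp(r, d1) + fp(r, d2) == fp(r, d1 + d2)
additive_ok = fp(r, add(d1, d2)) == (fp(r, d1) + fp(r, d2)) % P
# b * fp(r, d1) == fp(r, b * d1)
scalar_ok = fp(r, scale(b, d1)) == (b * fp(r, d1)) % P
print(additive_ok, scalar_ok)
```

Both checks hold because the fingerprint is a linear map of the fragment for a fixed key, which is the only property the later verification steps rely on.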

3.2. Network Model

The sensor network is composed of ordinary nodes, malicious nodes, and the base station (BS). Before deployment, each node i is assigned a unique identity $ID_i$, a random number $r$ with $r \in K$, and a symmetric key $k_i$ shared with the base station. After the network is deployed to the target area, the nodes do not move. Adopting the method of [13], each node establishes multiple disjoint paths to the base station and sends its data to the base station through these paths. For example, in Figure 1, the source node D and the base station have established n data transmission paths. Assuming that node D wants to send data to the base station, it first divides the data into n fragments that are different from each other and encodes the n fragments into n new fragments; then, using homomorphic fingerprint technology, n packets are generated and sent to the base station along n different paths, respectively. When the base station receives the n packets, if all the packets are valid, it recovers the original data.

3.3. Attack Model and Security Goal

The HFDLMN scheme assumes that any intermediate forwarding nodes can be captured as malicious nodes by the attackers. These malicious nodes can launch pollution attacks by injecting false data into the network, forging or modifying the packets. The HFDLMN scheme does not consider other attacks such as selective forwarding attacks but only considers pollution attacks. When the malicious node in the path receives the data, it will forge or modify the data with probability q and then forward it to the next-hop node. In the HFDLMN scheme, it is assumed that the calculation, storage, and communication capabilities of the base station are not limited, and the attackers can only capture ordinary sensor nodes, but not the base station.

Nowadays, there are several WSN standards (e.g., IEEE 802.15.4) that use different security levels at each layer. For instance, the network part of a packet is signed and encrypted with a network key and a data link layer with a DLL key. When an intermediate node receives a packet to retransmit, the DLL part needs to be verified; if it is not signed, the intermediate node drops the packet. Although the signature and encryption method can verify whether the packet has been modified, it cannot locate the malicious node that modifies the packet. The security goal of the HFDLMN scheme is not only to verify whether the packet is polluted but also to detect and locate the malicious nodes that launch pollution attacks.

4. Homomorphic Fingerprinting-Based Detection and Location of Malicious Nodes

The HFDLMN scheme proposed in this paper is divided into five steps: the source node generates the packets, the intermediate nodes forward the packets, the base station detects the path of the pollution attack, the base station locates the malicious node or malicious link, and the base station recovers the original data.

4.1. Generating the Packets
4.1.1. Generating Data Segmentation

If the source node wants to send the data to the base station, it first divides the data into n fragments that are different from each other, namely, $d_1, d_2, \dots, d_n$.

4.1.2. Coding Data Segmentation

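Then, the source node generates n linearly independent vectors whose elements are randomly picked from the field $\mathbb{F}$; the i-th vector is denoted as $a_i = (a_{i1}, a_{i2}, \dots, a_{in})$, $1 \le i \le n$. According to equation (1), the source node obtains n new fragments, which are denoted as $e_1, e_2, \dots, e_n$:

$$e_i = \sum_{j=1}^{n} a_{ij}\, d_j, \quad 1 \le i \le n. \tag{1}$$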
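The coding step can be sketched as follows. This is an illustration under stated assumptions rather than the scheme's reference implementation: the field is taken to be GF(P) for an assumed prime `P`, fragments are lists of field elements, and the helper names (`rank_mod_p`, `random_coding_vectors`, `encode`) are hypothetical. Linear independence is enforced by redrawing the random matrix until it has full rank.

```python
import random

P = 2**31 - 1  # assumed prime modulus for the field


def rank_mod_p(rows):
    """Rank of a matrix over GF(P), by Gauss-Jordan elimination on a copy."""
    m = [row[:] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] % P), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, P)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % P for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col] % P:
                f = m[i][col]
                m[i] = [(x - f * y) % P for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank


def random_coding_vectors(n, rng):
    """Draw n random vectors of length n until they are linearly independent."""
    while True:
        a = [[rng.randrange(P) for _ in range(n)] for _ in range(n)]
        if rank_mod_p(a) == n:
            return a


def encode(fragments, a):
    """Coded fragment i is the sum over j of a[i][j] * fragments[j], element-wise mod P."""
    n, length = len(fragments), len(fragments[0])
    return [[sum(a[i][j] * fragments[j][k] for j in range(n)) % P
             for k in range(length)] for i in range(n)]


rng = random.Random(3)
fragments = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]  # example data, n = 3
a = random_coding_vectors(3, rng)
coded = encode(fragments, a)
print(len(coded), len(coded[0]))
```

Over a large field a random square matrix is singular only with negligible probability, so the redraw loop almost always exits on the first iteration.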

4.1.3. Generating and Sending the Packets

After coding the data fragments, the source node generates n packets, which are denoted as $P_i = (Seq, fp_i, a_i, e_i)$, $1 \le i \le n$, where the n packets share the same number $Seq$, $fp_i$ denotes the fingerprint, which is computed by the fingerprinting function $fp$ and the random number $r$, $r \in K$, $a_i$ denotes the coding vector of $e_i$, and $e_i$ denotes the new data fragment after coding. The packets $P_i$ are then sent to the base station along n different paths, respectively.

4.2. Forwarding the Packets

All intermediate forwarding nodes maintain a data forwarding table (DFT), which is shown in Table 2. The Seq_Number field stores the packet number, the Finger_printing field stores the fingerprint carried by the packet, the Encoding_Vector field stores the coding vector of $e_i$, and the Encoded_DataBlock field stores the new data fragment $e_i$ after coding. The DFT stores only the three most recently forwarded packets. When the intermediate forwarding node j receives a packet $P_i$, it deletes the oldest record stored in the DFT and stores the currently received packet, so that the DFT always holds the last three forwarded packets and can be queried by the base station; it then forwards $P_i$ to the next intermediate forwarding node.
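A bounded queue is one simple way to model the DFT. The sketch below is an assumption-laden illustration: the class name `ForwardingNode`, the dictionary keys, and the in-memory packet representation are hypothetical, and only the "keep the last three packets" behavior comes from the text.

```python
from collections import deque


class ForwardingNode:
    """Intermediate node that remembers only the most recently forwarded packets."""

    def __init__(self, node_id, capacity=3):
        self.node_id = node_id
        # deque(maxlen=...) silently drops the oldest record when it is full
        self.dft = deque(maxlen=capacity)

    def forward(self, packet):
        """Record the received packet in the DFT, then hand it to the next hop."""
        self.dft.append(packet)
        return packet


node = ForwardingNode("N2")
for seq in range(5):
    node.forward({
        "Seq_Number": seq,
        "Finger_printing": 0,         # placeholder fingerprint value
        "Encoding_Vector": [1, 0],    # placeholder coding vector
        "Encoded_DataBlock": [seq],   # placeholder coded fragment
    })

stored_seqs = [p["Seq_Number"] for p in node.dft]
print(stored_seqs)
```

After five packets the table holds only the records for the last three, which is what the base station would later query.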

4.3. Detecting the Path of Pollution Attack

After the base station receives n packets with the same number from n paths, it first gets and from the and computes respectively; then, it randomly picks from the field and constructs a new vector according to

The base station can check the validity of the n packets according to equation (3). If equation (3) holds, all n packets are valid, and the base station proceeds as described in Section 4.5 to recover the original data; otherwise, malicious nodes have polluted the packets in one or more paths. In that case, the base station checks each packet against equation (4) to find out which packet is polluted. If equation (4) does not hold for a packet $P_i$, then $P_i$ is polluted, and there are malicious nodes in the path that delivered it; the base station then executes Algorithm 1 in Section 4.4 to detect and locate the malicious node.
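The two-stage check (one combined test over all packets, then a per-packet test) can be sketched as below. This is only an illustration of the idea: it assumes that each packet carries the fingerprint of its own coded fragment, uses an evaluation fingerprint over an assumed prime field GF(P) in place of the paper's fingerprinting function, and the names `batch_valid` and `polluted_paths` are hypothetical. The exact form of equations (2) to (4) in the scheme may differ.

```python
import random

P = 2**31 - 1  # assumed prime modulus for the field


def fp(r, d):
    """Linear (homomorphic) evaluation fingerprint of fragment d under key r."""
    acc = 0
    for coeff in reversed(d):
        acc = (acc * r + coeff) % P
    return acc


def batch_valid(r, packets, rng):
    """Combined check: fingerprint of a random linear combination of the
    blocks must equal the same combination of the carried fingerprints."""
    c = [rng.randrange(1, P) for _ in packets]  # random nonzero coefficients
    length = len(packets[0]["block"])
    combined = [sum(ci * p["block"][k] for ci, p in zip(c, packets)) % P
                for k in range(length)]
    expected = sum(ci * p["fp"] for ci, p in zip(c, packets)) % P
    return fp(r, combined) == expected


def polluted_paths(r, packets):
    """Per-packet check: indices of paths whose block does not match its fingerprint."""
    return [i for i, p in enumerate(packets) if fp(r, p["block"]) != p["fp"]]


rng = random.Random(1)
r = rng.randrange(1, P)  # key known to the source and base station only
packets = []
for _ in range(4):       # one packet per path, n = 4
    block = [rng.randrange(P) for _ in range(6)]
    packets.append({"block": block, "fp": fp(r, block)})

clean_ok = batch_valid(r, packets, rng)
# A node on path 2 modifies one symbol of the coded block.
packets[2]["block"][0] = (packets[2]["block"][0] + 1) % P
tampered_ok = batch_valid(r, packets, rng)
bad_paths = polluted_paths(r, packets)
print(clean_ok, tampered_ok, bad_paths)
```

The combined test costs one fingerprint computation in the common case where nothing is polluted; the per-packet test is needed only when the combined test fails.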

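Algorithm 1: Locating the malicious node or malicious link.
 Input: the path of pollution attack $N_1, N_2, \dots, N_m$, the pollution packet $P'$, and the response packet $R_m$
 Output: malicious node or malicious link
 For (i = m; i > 1; i--) do
  Decrypt $R_i$ with the symmetric key $k_i$ shared with node $N_i$
  If cannot get $ID_i$ accurately then
   Return $N_i$
  End if
  Get $P_i$ from $R_i$
  Get $R_{i-1}$ from $R_i$
 End for
 For (i = 1; i ≤ m; i++) do
  If $P_i = P'$ then
   Return $N_{i-1}$ and $N_i$
  End if
 End for
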
4.4. Locating the Malicious Node

When the base station finds the path of the pollution attack and the pollution packet $P'$, it assumes that there are m hops in the attack path from the source node to the base station; the path is represented by $N_1, N_2, \dots, N_m$, where $N_1$ represents the source node and the remaining nodes represent the intermediate forwarding nodes in the path. In order to locate the malicious nodes in the path, starting from the source node, the base station first informs each node $N_i$ to send a response packet $R_i$ to the base station along the attack path in turn, and the response packet is generated according to equation (5):

$$R_i = E_{k_i}\left(ID_i \,\|\, P_i \,\|\, R_{i-1} \,\|\, T\right). \tag{5}$$

In equation (5), $E_{k_i}$ represents encryption with the symmetric key $k_i$ shared by node $N_i$ and the base station, $ID_i$ represents the identity of node $N_i$, $P_i$ represents the packet stored in the DFT of node $N_i$, $R_{i-1}$ represents the response packet sent by the previous hop node $N_{i-1}$, $T$ represents the timestamp, and || represents the concatenation operation.

After the base station receives the response packet $R_m$, it executes Algorithm 1 to locate the malicious node or the malicious link. Working from back to front, it first sequentially decrypts each $R_i$ with the symmetric key $k_i$ shared with the node $N_i$ and obtains $P_i$ and $R_{i-1}$. Then, starting from the node $N_1$, the base station compares each $P_i$ in turn with the pollution packet $P'$ that it received; if they are equal, node $N_{i-1}$ is a malicious node or the link between node $N_{i-1}$ and node $N_i$ is a malicious link.
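The back-to-front unwrapping and the forward comparison can be sketched as follows. Several simplifications are assumed here and are not part of the scheme as stated: the nested encryption is replaced by an HMAC tag over a JSON body (so the sketch demonstrates integrity of the chain, not confidentiality), the function names `make_response` and `locate` are hypothetical, and the rule that the first node holding the polluted packet implicates the hop before it is one reading of the locating step.

```python
import hashlib
import hmac
import json


def make_response(key, node_id, stored_packet, prev_response, timestamp):
    """Wrap this node's stored packet and the previous response under its key."""
    body = json.dumps(
        {"id": node_id, "packet": stored_packet, "prev": prev_response, "ts": timestamp},
        sort_keys=True,
    )
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def locate(path_ids, keys, response, polluted_packet):
    """Unwrap the chain from the last hop to the source, then scan forward."""
    stored = {}
    for node_id in reversed(path_ids):
        if response is None:
            return ("invalid_response", node_id)
        expected = hmac.new(keys[node_id], response["body"].encode(),
                            hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, response["tag"]):
            return ("invalid_response", node_id)  # this node's layer is missing or altered
        body = json.loads(response["body"])
        stored[node_id] = body["packet"]
        response = body["prev"]
    for prev_id, node_id in zip(path_ids, path_ids[1:]):
        if stored[node_id] == polluted_packet:
            return ("suspect_link", prev_id, node_id)
    return ("not_found",)


path = ["S", "A", "B", "C"]                 # source S, then three forwarders
keys = {n: ("key-" + n).encode() for n in path}
good, bad = [1, 2, 3], [1, 2, 4]            # original and polluted packet contents
held = {"S": good, "A": good, "B": bad, "C": bad}  # A modified the packet it forwarded

resp = None
for n in path:
    resp = make_response(keys[n], n, held[n], resp, 100)
result = locate(path, keys, resp, bad)

# Second scenario: node B forwards the chain without adding its own layer.
resp2 = None
for n in ["S", "A", "C"]:
    resp2 = make_response(keys[n], n, held[n], resp2, 100)
result_skipped = locate(path, keys, resp2, bad)
print(result, result_skipped)
```

In the first scenario the scan stops at the first node that holds the polluted packet and reports the link leading into it; in the second, the missing layer is noticed while unwrapping, before any packet comparison takes place.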

4.5. Recovering the Original Data

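After the base station receives n linearly independent packets with the same number from n paths, it checks the validity of the n packets according to equation (3); if equation (3) holds, all n packets are valid, and the base station recovers the original data. It first extracts $a_i$ and $e_i$ from each packet $P_i$ and builds the coefficient matrix T, as shown in equation (6):

$$T = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}. \tag{6}$$

Because the n coding vectors are linearly independent, the coefficient matrix T has full rank, so the base station can compute the inverse matrix $T^{-1}$ and recover the original fragments $d_1, d_2, \dots, d_n$ according to equation (7):

$$(d_1, d_2, \dots, d_n)^{\mathsf{T}} = T^{-1}\,(e_1, e_2, \dots, e_n)^{\mathsf{T}}. \tag{7}$$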
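Recovery amounts to solving a linear system over the field. A minimal sketch, assuming the prime field GF(P) and fragments represented as lists of field elements (both assumptions, as is the function name `recover`), applies Gauss-Jordan elimination to the augmented matrix instead of forming the inverse explicitly; the result is the same.

```python
P = 2**31 - 1  # assumed prime modulus for the field


def recover(T, coded):
    """Solve T * D = E over GF(P) for the original fragments D.

    T is the n x n coefficient matrix, coded holds the n coded fragments.
    Raises StopIteration if T is singular (no pivot can be found).
    """
    n = len(T)
    aug = [list(T[i]) + list(coded[i]) for i in range(n)]  # augmented matrix [T | E]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [(x * inv) % P for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] % P:
                f = aug[i][col]
                aug[i] = [(x - f * y) % P for x, y in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]  # right-hand block now holds D


# Worked example with n = 2: e1 = d1 + 2*d2, e2 = 3*d1 + 4*d2.
T = [[1, 2], [3, 4]]
original = [[5, 6, 7], [8, 9, 10]]
coded = [[21, 24, 27], [47, 54, 61]]
recovered = recover(T, coded)
print(recovered)
```

Elimination on the augmented matrix avoids a separate matrix-vector product and handles all symbol positions of the fragments in one pass.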

5.1. Proof of Malicious Path Detectability

The base station can check the validity of the n packets according to equation (3). If equation (3) holds, all n packets are valid, and the base station recovers the original data; otherwise, the base station determines which packet is polluted according to equation (4). If equation (4) does not hold for a packet $P_i$, then $P_i$ is polluted, and there are malicious nodes in the path that delivered the pollution packet. This section proves the correctness of equations (3) and (4).

Theorem 1. After the base station receives n linearly independent packets with the same number from n paths, if all the packets it receives are valid, then equation (3) holds; if there are t (t<n) packets that are polluted, then equation (3) does not hold.

Proof. (1) If all the packets received by the base station are valid, thenThat is, Eq. (3) holds.
(2) Assuming that one of the n packets received by the base station is polluted by the forwarding malicious node in the path, and the polluted packet is , where are false data injected by the malicious node , denotes the fingerprinting of , computed by fingerprinting function and the random number r, , then,Because the malicious node has no and , it cannot construct and make ; as a result, .
So, without considering the link error of the network, if one or more packets are polluted, then Eq. (3) does not hold.
Theorem 1 is proved.

Theorem 2. After the base station receives n linearly independent packets with the same number from n paths, if all the packets received by the base station are valid, then equation (4) holds; otherwise, equation (4) does not hold, and the path of sending pollution packet is the malicious path.

Proof. (1) Assuming that is valid, then,That is, if the packet has not been modified, , therefore, (4) holds.
(2) Assuming that one of the n packets received by the base station is polluted by the forwarding malicious node in the path, and the polluted packet is , where are false data injected by the malicious node , denotes the fingerprinting of , computed by fingerprinting function and the random number r,, thenBecause the malicious node has no , it cannot construct and make ; as a result, .
So, without considering the link error of the network, if one or more packets are polluted, then (4) does not hold, and the path to send the pollution packet is the malicious path.
Theorem 2 is proved.

5.2. Probability Analysis of Legitimate Nodes Being Misjudged as Malicious Nodes

Because of the link error, a legitimate node will be misjudged as a malicious node by forwarding a packet distorted by the link error. This section will analyze the probability of the legitimate node being misjudged as a malicious node in the case of a link error.

Theorem 3. Assuming that the probability of link error is q, and the number of packets transmitted in the path is S in time period T, the probability of legitimate nodes being misjudged as malicious nodes is

Proof. Let X be the number of times that misjudged packets are detected. X follows the binomial distribution with parameters S and q, that is, $X \sim b(S, q)$, and the distribution law of X is as follows:

$$P\{X = k\} = \binom{S}{k} q^{k} (1 - q)^{S - k}, \quad k = 0, 1, \dots, S.$$

Therefore, the probability of legitimate nodes being misjudged as malicious nodes follows from this distribution law. So, Theorem 3 is proved.
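The binomial law used in the proof is easy to evaluate numerically. The sketch below computes the probability mass function and a tail probability; the threshold `k_min` (how many distorted packets are needed before a node is blamed) is a hypothetical parameter introduced only for illustration, as are the example values of S and q.

```python
from math import comb


def pmf(S, q, k):
    """P{X = k} for X ~ b(S, q): exactly k of S packets hit by a link error."""
    return comb(S, k) * q**k * (1 - q)**(S - k)


def prob_at_least(S, q, k_min):
    """P{X >= k_min}: at least k_min of S packets hit by a link error."""
    return sum(pmf(S, q, k) for k in range(k_min, S + 1))


S, q = 10, 0.1  # example values: 10 packets in the period, 10% link error rate
total = sum(pmf(S, q, k) for k in range(S + 1))
at_least_one = prob_at_least(S, q, 1)
print(round(total, 6), round(at_least_one, 6))
```

With `k_min = 1` the tail reduces to $1 - (1 - q)^S$, the chance that at least one packet in the period is distorted by a link error.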

5.3. Time Complexity Analysis of the Algorithm

If there are malicious nodes in the path, the base station executes Algorithm 1 in Section 4.4 to detect and locate the malicious node or malicious link. This section analyzes the time complexity of Algorithm 1.

The basic operations of Algorithm 1 are extracting each $P_i$ from the response packet $R_m$ and comparing each $P_i$ with the pollution packet $P'$. Extracting the $P_i$ from the response packet requires m loop iterations; therefore, the time complexity of this operation is $O(m)$. Comparing the $P_i$ with the pollution packet $P'$ amounts to searching for $P'$ in the array $P_1, P_2, \dots, P_m$; therefore, the time complexity of this operation is $O(m)$, too. So, the time complexity of the two basic operations together is $O(m) + O(m)$; that is, the time complexity of Algorithm 1 is $O(m)$.

6. Security Analysis

In the HFDLMN scheme, malicious nodes can not only launch pollution attacks by injecting false data into the network or by forging or modifying packets but also modify or delete the response packet sent by the previous node, so as to evade or interfere with the detection of malicious nodes performed by the base station. The pollution attack launched by a malicious node can be detected by equations (3) and (4). This section therefore shows that the HFDLMN scheme resists malicious nodes that try to evade or interfere with the detection of malicious nodes performed by the base station.

Theorem 4. Any malicious node can be detected after modifying, deleting, or not sending response packets.

Proof. In order to interfere with the detection of malicious nodes performed by the base station, when the base station informs each node to send the response packet to the base station along the path in turn, the malicious node $N_i$ can perform the following operations. (a) Attempt to modify the data of the response packet sent by its previous node. However, the response packet is a key chain generated by encrypting the timestamp $T$, the packet $P_{i-1}$, the identity $ID_{i-1}$ of the node $N_{i-1}$, and $R_{i-2}$ with the symmetric key $k_{i-1}$ shared by node $N_{i-1}$ and the base station; that is, $R_{i-1} = E_{k_{i-1}}(ID_{i-1} \,\|\, P_{i-1} \,\|\, R_{i-2} \,\|\, T)$. Because the malicious node does not have the symmetric key shared by the node $N_{i-1}$ and the base station, it cannot modify the data in the response packet sent by its previous node. (b) Try to delete the data in the response packet sent by its previous node. Again, because the malicious node does not have the symmetric key shared by the node $N_{i-1}$ and the base station, it cannot delete the data in the response packet sent by its previous node. (c) Try not to send its own response packet to the base station, that is, directly forward the response packet $R_{i-1}$ received from its previous node $N_{i-1}$ to its next node $N_{i+1}$. In this case, according to Algorithm 1, when the base station tries to decrypt the response packet with the symmetric key $k_i$ shared with the malicious node $N_i$, the decryption fails because the malicious node did not add its own response packet; therefore, it can be determined that the node $N_i$ is a malicious node. (d) Try to send false data to the base station to interfere with the detection of malicious nodes. For example, the malicious node sends unmodified data to the base station; according to Algorithm 1, the base station can correctly locate the malicious link composed of the malicious node and its next-hop node. Similarly, the malicious node can also send modified data to the base station; according to Algorithm 1, the base station can correctly locate the malicious link composed of the malicious node and its next-hop node.
In summary, in the HFDLMN scheme, any malicious node can be detected after modifying, deleting, or not sending response packets.
So, Theorem 4 is proved.

7. Simulation

In this paper, the performance of the HFDLMN scheme is evaluated in terms of the detection probability of the malicious path, the locating probability of the malicious node, and the false detection probability of normal nodes and paths. The simulation is carried out on the OMNeT++ platform, with 100 nodes randomly distributed in a square area of 400 m × 400 m. Each node is assigned a unique ID, the nodes do not move after deployment, and the base station is deployed in the center of the area. By adjusting the communication range of each node, each node has at least four neighbor nodes, and each node establishes four disjoint paths to the base station. Some nodes in the network are randomly selected as data source nodes and malicious nodes, and the others act as intermediate forwarding nodes. The source node sends a packet to the base station by multihop every 1 second, and the length of each packet is 256 bytes. The initial energy of each node is 1 J, and the energy consumption of transmission and reception is 50 nJ/bit. When a malicious node becomes an intermediate forwarding node, it forges or modifies packets with a probability from 0.1 to 0.7. For each parameter setting, the average value over 100 simulation runs is taken. The parameter settings of the simulation are shown in Table 3.

Figure 2 shows the detection probability of a malicious path when one of the four paths from the source node to the base station contains a malicious node that forges or modifies packets with a probability of 0.1, 0.3, 0.5, or 0.7. The number of packets that must be sent to detect the malicious path depends on the probability q that the malicious node modifies data: the higher q is, the fewer packets are needed. For example, when q is 0.3, the source node needs to send 14 packets to detect the malicious path, whereas when q is 0.5, the malicious path is detected after only 9 packets have been sent.
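The trend in Figure 2 can be related to a simple closed form. The sketch below is not the authors' simulation model: it assumes that the malicious node tampers with each packet independently with probability q and that the base station detects every tampered packet, so the path escapes detection only if all k packets pass unmodified. The function name `detection_probability` is hypothetical.

```python
def detection_probability(q: float, packets: int) -> float:
    """Probability that at least one of `packets` packets is tampered with.

    Assumes independent tampering with probability q per packet and
    perfect detection of any tampered packet at the base station.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    if packets < 0:
        raise ValueError("packets must be non-negative")
    return 1.0 - (1.0 - q) ** packets
```

Under this assumption, the two operating points quoted from Figure 2 (14 packets at q = 0.3 and 9 packets at q = 0.5) both correspond to a detection probability above 0.99.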

Figure 3 shows the detection probability of a malicious path when the malicious node forges or modifies packets with a probability of 0.1, 0.3, or 0.5 and the number of paths containing malicious nodes is 2 or 3. The number of packets that must be sent to detect malicious paths depends not only on the probability q that a malicious node modifies data but also on the number of paths containing malicious nodes: the higher q is and the more paths contain malicious nodes, the fewer packets are needed. For example, when q is 0.1 and two paths contain malicious nodes, the source node needs to send 8 packets to detect the malicious path; when q is 0.3 and three paths contain malicious nodes, the malicious path is detected after only 4 packets have been sent.
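The effect of several malicious paths can be explored with a small Monte Carlo experiment. This is a simplified stand-in, not the OMNeT++ setup used in the paper, and it is not expected to reproduce the curves in Figure 3: it assumes that in every round each malicious path tampers with its packet independently with probability q, and that any tampered packet is detected. The function name `simulate_detection` and its parameters are hypothetical.

```python
import random


def simulate_detection(q: float, packets: int, malicious_paths: int,
                       runs: int = 100, seed: int = 0) -> float:
    """Fraction of runs in which at least one tampered packet is observed.

    In each run the source sends `packets` rounds; in every round each of
    the `malicious_paths` paths tampers independently with probability q.
    """
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    detected = 0
    for _ in range(runs):
        trials = packets * malicious_paths
        if any(rng.random() < q for _ in range(trials)):
            detected += 1
    return detected / runs
```

Averaging over `runs` mirrors the averaging over 100 simulations described in the setup; increasing `malicious_paths` raises the number of tampering opportunities per round, which is why fewer packets are needed under this model.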

Figure 4 shows the locating probability of the malicious node when the malicious node forges or modifies packets with a probability of 0.1 or 0.3 and the number of paths containing malicious nodes is 1, 2, or 3. The probability of successfully locating malicious nodes increases as the probability q that a malicious node modifies data increases and as the number of malicious paths increases. For example, when only one path contains a malicious node and q is 0.1, the source node sends 15 packets and the probability of successfully locating the malicious node is about 84%; when two paths contain malicious nodes and q is 0.3, the source node sends only 10 packets and the probability of successfully locating the malicious node is about 94%.

Because of link errors, a legitimate node may forward a packet that has been distorted in transit, so that the legitimate node and its path are misjudged as malicious. Figure 5 shows the probability that a legitimate node and path are misjudged as malicious when the link error probability ranges from 0.005 to 0.06 and 100 packets are transmitted along the path in a given period of time. This false detection probability increases with the link error probability; for example, it is about 5% when the link error probability is 0.01 and about 19% when the link error probability is 0.05.

8. Conclusions

To detect and locate malicious nodes in multiple paths, this paper presents HFDLMN, a malicious node detection and location scheme for wireless sensor networks based on homomorphic fingerprinting and coding technology. In the HFDLMN scheme, the source node generates n packets and sends them to the base station along n paths, respectively; the base station determines whether each path contains malicious nodes by verifying the validity of the packets; if one or more paths contain malicious nodes, the malicious node location algorithm is run to locate the specific malicious nodes in those paths. The HFDLMN scheme needs neither a complex evaluation model for computing node trust values nor any monitoring nodes. By using a key chain, the HFDLMN scheme prevents malicious nodes from evading or interfering with the detection performed by the base station. Theoretical analysis shows that the HFDLMN scheme is secure and effective, and the simulation results demonstrate that it detects malicious paths and malicious nodes with a high detection rate; for example, if two paths contain malicious nodes and the probability q that a malicious node modifies data is 0.3, the source node sends only 10 packets and the probability of successfully locating the malicious node is about 94%. In future work, we aim to extend this scheme to the detection and location of malicious nodes among Internet of Things devices.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this study.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 62002143 and in part by the Natural Science Foundation of Jiangxi Province under Grant 20192BAB217007.