The fault-tolerant routing problem is an important consideration in the design of applications for heterogeneous wireless sensor networks (H-WSNs) and has recently attracted growing research interest. In order to maintain $k$ disjoint communication paths from source sensors to the macronodes, we present a hybrid routing scheme and model, in which multiple paths are calculated and maintained in advance, and alternate paths are created once the previous routing is broken. Then, we propose an immune cooperative particle swarm optimization algorithm (ICPSOA) in the model to provide fast routing recovery and reconstruct the network topology after path failure in H-WSNs. In the ICPSOA, the mutation direction of each particle is determined by a multi-swarm evolution equation, and its diversity is improved by an immune mechanism, which enhances the global search capacity and improves the convergence rate of the algorithm. We then validate this theoretical model with simulation results. The results indicate that the ICPSOA-based fault-tolerant routing protocol outperforms several other protocols owing to its fast routing recovery mechanism, reliable communications, and prolonged WSN lifetime.

1. Introduction

Complex networks have attracted growing research interest in topology structure and dynamic problems. Many kinds of systems can be described with the complex network model, in which several nodes are connected with each other, such as the Internet and wireless sensor networks (WSNs). Owing to their ability to collect data from the environment and report it back to the sink without human supervision [1, 2], WSNs, especially heterogeneous ones, have come to pervade every aspect of our lives, such as habitat monitoring, industrial sensing, and traffic control [3–5]. Heterogeneous WSNs deploy an appropriate number of heterogeneous wireless sensor nodes (called macronodes), which contain devices with more capabilities, storage space, and energy than ordinary nodes. They can not only improve the success rate of data transmission in WSNs but also reduce the energy consumption of transmission, and thus can effectively prolong the network lifetime. The benefits of using heterogeneous WSNs (H-WSNs) have been presented in the literature [6–9]. It is reported that, when properly deployed, heterogeneity can triple the average delivery rate and provide a fivefold increase in the network lifetime [6].

However, in practical applications, unpredictable events such as environmental impairment, broken communication links, and battery depletion may cause the sensor devices to fail, partitioning the network and disrupting network functions. Therefore, fault tolerance becomes a critical issue for the successful communication of H-WSNs. A network topology broken by software or hardware failure of sensor nodes is expected to be reconstructed and self-healed automatically by the fault-tolerant routing technology, so that the network recovers from path failure and the performance of the communication tasks is ensured.

The objective of this paper is to solve the fault-tolerant routing problem for the H-WSNs while maintaining $k$ disjoint communication paths from each source sensor to the macronode it belongs to (called the $k$-disjoint-path routing recovery problem). For this purpose, we propose a swarm intelligence algorithm, the immune cooperative particle swarm optimization algorithm (ICPSOA), to provide fast recovery from path failure in the H-WSNs. In this way, the network can tolerate the failure of up to $k-1$ sensor paths with a reconstructed topology, traditional retransmissions can be decreased, and reliability can be provided with lower energy consumption. Our problem is specifically tailored to the situation in which data is forwarded from sensors to macronodes.

The main contributions of this paper are as follows. Firstly, we formulate the $k$-disjoint-path routing recovery problem for the H-WSNs. Then, in order to maintain $k$ disjoint communication paths from each source sensor to the set of macronodes, we propose the ICPSOA-based protocol to reconstruct the network topology and provide fast routing recovery from path failure in the H-WSNs. The proposed method provides simplicity, robustness, and effectiveness for the routing recovery problem of the WSNs.

The remainder of this paper is organized as follows. Section 2 overviews the related work on the fault-tolerant routing problem, especially the routing recovery problem in the H-WSNs. Section 3 proposes the H-WSNs architecture, the fault model, and the ICPSOA-based approach for solving the routing recovery problem. The simulation results are presented in Section 4. Section 5 concludes the paper and discusses a few future directions for further improving the performance of our approach.

2. Related Work

2.1. The Fault-Tolerant Routing Algorithms of WSNs

Fault-tolerant routing protocols proposed for WSNs can be classified into three groups: (1) proactive routing, called disjoint multipath, in which several paths from the source node to the sink are calculated, maintained in advance, and stored in a routing table, at the cost of greater energy consumption and the need to predict global topology information [10–12]; (2) reactive routing, where all paths are created on demand [7]; and (3) hybrid routing, which is a mix of the above two groups [8, 9].

One of the common fault-tolerant routing solutions is to establish disjoint multipath with a proactive routing mechanism. Disjoint multipath constructs a number of alternative paths which are node/link disjoint with the primary path and with the other alternative paths. Thus, a failure in any or all nodes/links on the primary path does not affect the alternative paths. A network using this multipath scheme with $k$ disjoint paths from source to destination can tolerate at most $k-1$ intermediate network component failures. A secure and energy-efficient multipath routing protocol proposed by Nasser and Chen [10] is effectively resistant to certain attacks that pull all traffic through malicious nodes by advertising an attractive route to the destination.

A considerable amount of work has also been done on the hybrid routing scheme, which combines the multipath scheme and the reactive routing scheme. In this scheme, multiple paths are calculated and maintained in advance, and then alternative paths are created on demand. EARQ (energy-aware routing for real-time and reliable communication) is a hybrid routing scheme proposed by Heo and Hong [9]. Among the paths that deliver a packet in time, it selects a path that expends less energy than the others, which distributes energy expenditure evenly over the sensor nodes. It also provides reliable communication and fast recovery from path failure, because it sends a redundant packet via an alternative path only if the reliability of a path is less than a predefined value. Pandana and Liu [11] proposed an algorithm that assigns a connectivity weight to each node and establishes the most reliable path while preserving the connectivity of the other nodes.

Our work differs from the above existing ones [13–16] by considering a different architecture and routing objective. We consider the H-WSNs architecture with a number of macronodes and are concerned with providing $k$-connectivity from each source node to the set of macronodes; as such, we provide a hybrid routing scheme to maintain the multipath routing. The H-WSNs usually consist of two types of wireless devices [12]: a large number of resource-constrained wireless sensor nodes deployed randomly and a much smaller number of resource-rich macronodes placed at known locations. The macronode network, which provides more energy, transmission bandwidth, computing ability, and storage space, is used to quickly forward sensor data packets to the sink. With this setting, data gathering in the H-WSNs has two steps. Firstly, sensor nodes transmit and relay information on multihop paths toward any macronode. Then, once a packet encounters a macronode, it is forwarded to the sink using fast macronode-to-macronode communication.

Similar hybrid routing schemes for the H-WSNs are as follows. CPEQ (cluster-based periodic, event-driven, and query-based protocol) [17] groups sensor nodes to efficiently relay data to the sink by uniformly distributing energy dissipation among the nodes. It can provide fast broken-path reconfiguration and high reliability in the delivery of event packets and speed up new subscriptions by using the reverse path. Cardei and Yang [18] proposed GATC$_k$ and DATC$_k$ in the H-WSNs, with the objective of minimizing the total energy consumption while providing $k$ independent paths from each node to macronodes. Such a topology provides the infrastructure for fault-tolerant data-gathering applications robust to the failure of up to $k-1$ sensors. Boukerche et al. [8] proposed ICE (intercluster communication-based energy-aware and fault-tolerant protocol), which alternates the nodes responsible for intercluster communication inside one cluster. If one of the multiple paths has faulty nodes, the other ones are used for the propagation of the event notification. However, a fast routing recovery mechanism for path failure has rarely been considered. Further, as the fault-tolerant optimization problem of finding the optimal routing is NP-hard, these heuristic deterministic methods obtain only a near-optimal routing result and easily fall into local optima. So, we employ a swarm intelligence algorithm, the ICPSOA, to improve the performance of solving these problems.

In this paper, we propose an ICPSOA-based fault-tolerant routing algorithm, which reconstructs the network topology of H-WSNs and provides fast recovery from path failure with an alternative path. We also compare the performance of the EARQ, CPEQ, and ICE protocols with that of our approach. As is known, EARQ is an effective fault-tolerant routing protocol for homogeneous WSNs, while ICE and CPEQ provide routing recovery from path failure for H-WSNs. In this way, we can evaluate the fault-tolerant routing recovery mechanism with different network types.


Bionic randomized algorithms based on evolutionary algorithms (EAs) have become important tools for solving complex optimization problems because of their intelligence, wide applicability, and global search ability. However, an algorithm dealing with the fault-tolerant routing problem of WSNs should also support energy saving. In general, better fault-tolerant performance needs more energy consumption. Therefore, we choose a lightweight algorithm based on the particle swarm optimization algorithm (PSOA), which has a simple structure and is easy to realize.

The PSOA is an EA-based method for searching for an optimal solution in a high-dimensional problem space [13], where each particle is a potential solution to the problem under analysis. In updating a population of particles with regard to their internal position and velocity, the PSOA is informed by the experiences of all the particles. It offers a way to find solutions to complex problems using group advantage without a global model or centralized control, and it is suitable for dynamic modeling environments. It has been applied to many optimization problems, such as control problems and protocol design [14]. A remarkable difference between the PSOA and other EA-based algorithms is that the PSOA is very simple and has few parameters to be adjusted. Therefore, in general, it has lower computational complexity.

In the standard PSOA (SPSOA), each particle is a potential solution to the problem. Assume $N$ particles fly in the $D$-dimensional search space; the position of the $i$th particle is $x_i^t=(x_{i1}^t,x_{i2}^t,\ldots,x_{iD}^t)^T$, and its velocity is $v_i^t=(v_{i1}^t,v_{i2}^t,\ldots,v_{iD}^t)^T$. $p_i=(p_{i1}^t,p_{i2}^t,\ldots,p_{iD}^t)$ is the best previous position of the particle, and $p_g$ is the global best position of the whole particle swarm. The velocity and position of each particle are then updated according to [15]

$$
\begin{aligned}
v_{id}^{t+1} &= w\,v_{id}^{t} + c_1\,\mathrm{rand}_1\left(p_{id}^{t}-x_{id}^{t}\right) + c_2\,\mathrm{rand}_2\left(p_{gd}^{t}-x_{id}^{t}\right),\\
x_{id}^{t+1} &= x_{id}^{t} + v_{id}^{t+1},
\end{aligned}
\tag{2.1}
$$

where $1\le d\le D$; $c_1$ and $c_2$ are learning factors, and usually we set $c_1=c_2=2$; $w$ is the inertia weight, used to control the tradeoff between the global and local exploration ability of the swarm. The random numbers $\mathrm{rand}_1$ and $\mathrm{rand}_2$ are uniformly distributed in $[0,1]$.
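As an illustration of (2.1), the following minimal Python sketch performs one velocity and position update for a single particle. The function name `pso_step`, the list-based representation, and the default inertia weight `w=0.7` are assumptions made for this example only; the learning factors follow the text ($c_1=c_2=2$).

```python
import random


def pso_step(x, v, p_best, g_best, w=0.7, c1=2.0, c2=2.0, rng=random):
    """One update of Eq. (2.1) for a single particle; returns (new_x, new_v).

    x, v    -- current position and velocity (lists of length D)
    p_best  -- best previous position of this particle
    g_best  -- global best position of the swarm
    w       -- inertia weight (0.7 is an arbitrary illustrative default)
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()  # rand1, rand2 drawn uniformly
        vd = (w * v[d]
              + c1 * r1 * (p_best[d] - x[d])   # pull toward the personal best
              + c2 * r2 * (g_best[d] - x[d]))  # pull toward the global best
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v
```

When a particle already sits on both its personal best and the global best, only the inertia term remains, so the new velocity is simply `w * v`.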

The SPSOA also exhibits several disadvantages: it sometimes converges to an undesired local optimum, because the diversity of the population decreases in the later iterations of evolution; and optimization stops when a near-optimal solution is reached, so the accuracy of the algorithm is limited. Therefore, a cooperative PSOA (CPSOA), which uses the cooperative behavior of multiple swarms to improve the SPSOA, is proposed in [16]. In the CPSOA, the limitation of an individual can be compensated by a number of other individuals from other symbiotic groups in the interaction. This avoids the misjudgment caused by a single exchange of information [16]. However, the CPSOA still uses the formula of the SPSOA to evolve. The trajectory of each particle cannot yield a high enough diversity of particles to enlarge the search space. Therefore, the CPSOA may still return a suboptimal solution.

For this reason, we draw on the good diversity characteristic of the immune mechanism and develop an immune CPSOA (ICPSOA), in which each particle is considered as an antibody. Particle clone is used to generate a new population with offspring. Mutation is used to diversify the search process. Immune restrain suppresses the inferior particles in order to keep the population stable. Immune memory is used to store the feasible solutions [19]. The affinity between antibody and antigen measures the optimal path, and the affinity between antibodies evaluates the diversity of the population [20]. In the ICPSOA, the mutation direction of the particle (called antibody) is determined by the evolution equation, and its diversity is increased by the immune mechanism [21]. Although the immune mechanism adds time complexity to the system, the proposed ICPSOA largely improves the capability of jumping out of local optima. The use of the ICPSOA for the fault-tolerant routing problem in H-WSNs is presented in the following sections.

3. Fault-Tolerant Routing Problem in H-WSNs Based on the ICPSOA

3.1. Model of the Proposed H-WSNs
3.1.1. The Architecture for the Model of H-WSNs

The architecture for the model of H-WSNs contains two types of wireless sensor devices, as shown in Figure 1. The lower layer is formed by resource-constrained sensor nodes, including a small number of source nodes and other relay nodes. The main tasks performed by the source nodes are sensing, data processing, and data transmission. The tasks performed by the relay nodes are data processing and relaying. The dominant energy consumer is the radio transceiver. The upper layer consists of resource-rich macronodes overlaid on the H-WSNs. Wireless communication links between macronodes have considerably longer ranges and higher data rates, allowing the macronode network to bridge remote regions of the interest area. The tasks performed by a macronode are data aggregation and transmission, complex computations, and decision making. The ICPSOA is also executed by macronodes.

Therefore, in-network data transmission can be performed by forming a spanning tree among all the tree nodes. As shown in Figure 1, transmission starts with the leaf nodes (source nodes) of the tree sending their values to their parent nodes (macronodes, nodes 1, 2, …, 12) in the tree, until the final data is obtained at the root node (sink, node 13). Thus, the overall architecture has a tree of macronodes, and each macronode can serve as the root of a subtree of ordinary nodes. Here, we are only interested in the fault-tolerant routing of sensor-sensor and sensor-macronode communications.

Assume that the network has the following characteristics: (1) the H-WSN is a static network, where the nodes do not move after deployment; (2) every node knows its own position and that of the macronodes and the sink, where the location can be obtained by GPS or by localization protocols for estimating the location of a node; (3) the wireless transmission energy of a macronode can be adjusted based on the distance between the receiver and itself; (4) adjacent nodes acquire the state information of their 1-hop neighbors and of the links between them through periodic broadcasts. The meanings of the symbols used are provided in Table 1.

3.1.2. The $k$-Disjoint-Path Spanning Graph in the Subtree

The subtree $ST_i$ of the network is modeled as a directed, connected graph $G(V,E)$, where $V$ is a finite set of subtree nodes and $E$ is the set of subtree edges representing the connections between these nodes, with source node $n_s\in V$ and macronode (root) $n_r\in\{V-\{n_s\}\}$. $p_i(s,r)$ is a valid path between $n_s$ and $n_r$, and $P(s,r)$ is the set of all the paths $p_i(s,r)$. $n$ ($n\in p_i(s,r)$) represents a node in $p_i(s,r)$, and $e$ ($e\in p_i(s,r)$) represents the direct edge between any two adjacent nodes in $p_i(s,r)$. Then, we can get the $k$-disjoint-path spanning graph in the subtree $ST_i$. The factors affecting the choice of path $p_i(s,r)$ include (1) the available energy function of each node, $\mathrm{ene}(n)$, (2) the distance function of the edge between adjacent nodes, $\mathrm{dist}(e)$, (3) the energy consumption function, $\mathrm{ene}(e)$, and (4) the communication delay function of the node, $\mathrm{delay}(n)$. These parameters determine the fitness function of $p_i(s,r)$:

$$
\mathrm{fitness}(p_i)=\frac{\sum_{n\in p_i(s,r)}\mathrm{ene}(n)}{\omega_1 f_1+\omega_2 f_2+\omega_3 f_3},\qquad
f_1=\frac{\sum_{e\in p_i(s,r)}\mathrm{ene}(e)}{\sum_{e\in E}\mathrm{ene}(e)},\quad
f_2=\frac{\sum_{n\in p_i(s,r)}\mathrm{delay}(n)}{\sum_{n\in V}\mathrm{delay}(n)},\quad
f_3=\frac{\sum_{e\in p_i(s,r)}\mathrm{dist}(e)}{\sum_{e\in E}\mathrm{dist}(e)},
\tag{3.1}
$$

where $f_1$ is the ratio of the energy consumed by the edges of path $p_i$ to the energy consumed by all the edges in the subtree, $f_2$ is the ratio of the delay of the nodes of path $p_i$ to the delay of all the nodes in the subtree, and $f_3$ is the ratio of the distance of the edges of path $p_i$ to the distance of all the edges in the subtree. $\omega_1,\omega_2,\omega_3$ are the weights of the effective energy, delay, and distance constraints in the fitness function, with $\omega_1+\omega_2+\omega_3=1$. We set $\omega_1=0.4$, $\omega_2=0.2$, $\omega_3=0.4$. A higher fitness value indicates a more suitable path.
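To make the roles of the three ratios concrete, here is a small Python sketch of a fitness evaluation in the spirit of (3.1). It reads $f_1$ as edge energy consumption, as the prose describes. The dictionary layout and the names `path_fitness`, `nodes`, and `edges` are hypothetical choices for this example, not part of the protocol.

```python
def path_fitness(path, nodes, edges, weights=(0.4, 0.2, 0.4)):
    """Fitness of one candidate path in the spirit of Eq. (3.1).

    path  -- node ids from the source n_s to the root n_r
    nodes -- {node_id: {"ene": available energy, "delay": delay}} for the subtree
    edges -- {(u, v): {"ene": energy cost, "dist": length}} for the subtree,
             keyed in the direction the path traverses them
    """
    w1, w2, w3 = weights
    hops = list(zip(path, path[1:]))  # consecutive node pairs = edges on the path
    f1 = sum(edges[e]["ene"] for e in hops) / sum(a["ene"] for a in edges.values())
    f2 = sum(nodes[n]["delay"] for n in path) / sum(a["delay"] for a in nodes.values())
    f3 = sum(edges[e]["dist"] for e in hops) / sum(a["dist"] for a in edges.values())
    available = sum(nodes[n]["ene"] for n in path)  # numerator: energy left on the path
    return available / (w1 * f1 + w2 * f2 + w3 * f3)
```

With identical nodes and edges, a direct one-hop path scores higher than a two-hop detour, because the detour's larger cost ratios outweigh the extra available energy it collects.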

As illustrated in Figure 2, we assume $k=3$; the three disjoint paths between the source node (node 2) and the root (node 30) are 2-3-9-15-20-25-28-30, 2-8-13-18-23-30, and 2-7-12-16-21-27-30, respectively. The detailed protocol dealing with the routing recovery problem is presented in the following sections.
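The disjointness of the three example paths can be checked mechanically. The sketch below is a hypothetical helper (not part of the protocol) that tests whether a set of paths shares no intermediate node, with the source and root allowed to be common endpoints.

```python
def are_node_disjoint(paths):
    """Return True if no intermediate node appears on more than one path.

    The first and last node of each path (source and root) may be shared.
    """
    seen = set()
    for path in paths:
        inner = path[1:-1]
        # Reject a node reused across paths or repeated within one path.
        if seen.intersection(inner) or len(set(inner)) != len(inner):
            return False
        seen.update(inner)
    return True


# The three paths listed for Figure 2 (source node 2, root node 30).
FIGURE2_PATHS = [
    [2, 3, 9, 15, 20, 25, 28, 30],
    [2, 8, 13, 18, 23, 30],
    [2, 7, 12, 16, 21, 27, 30],
]
```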

3.1.3. The Proposed Fault Model and Energy Model

We use the simple fault model proposed in [22] and identify node failure with it. The fault model should be simple enough to analyze but also sophisticated enough to capture the fault behavior effectively. The failure probability of the sensor nodes of a subtree is denoted by $p_{\mathrm{node}}$. As we use the more reliable macronodes to withstand failures during the transmission process, the probability of macronode failure is assumed to be $p_{\mathrm{macro}}\approx 0$. If any of the sensor nodes fails, our routing recovery approach for node failure in the subtree is invoked.

We introduce the energy model adopted in [18]. The energy model of a sensor node is

$$
\mathrm{ene}(m,d)=\mathrm{ene}_{tx}(m,d)+\mathrm{ene}_{rx}(m)=\left(a_{11}+a_{2}d^{n}\right)m+a_{12}m,
\tag{3.2}
$$

where $d$ is the distance from the sensor node to the next-hop node, $\mathrm{ene}_{tx}(m,d)$ and $\mathrm{ene}_{rx}(m)$ are the energy consumption of sending and receiving $m$ bits of data, $a_{11}$, $a_{2}$, and $a_{12}$ are the energy consumption parameters of the sending circuit, sending amplifier, and receiving circuit, and $n$ is the channel attenuation index. We also define $\mathrm{ene}_{DF}$ as the energy consumption of data fusion and $\mathrm{ene}_{RT}$ as the energy consumption of updating the routing table. For the ICPSOA, we define the energy consumption of particle update, immune clone, mutation, particle selection, and restrain per iteration as $\mathrm{ene}_{PU}$, $\mathrm{ene}_{IC}$, $\mathrm{ene}_{IM}$, $\mathrm{ene}_{PS}$, and $\mathrm{ene}_{PR}$, respectively. The total energy consumption of the ICPSOA therefore depends on the actual number of iterated generations per round.
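The energy model (3.2) translates directly into code. The sketch below is only an illustration: the function names are hypothetical, and no parameter values are implied, since the text does not give numeric values for $a_{11}$, $a_{2}$, $a_{12}$, or $n$ here.

```python
def tx_energy(m_bits, d, a11, a2, n):
    """Energy to send m_bits over distance d: (a11 + a2 * d**n) * m."""
    return (a11 + a2 * d ** n) * m_bits


def rx_energy(m_bits, a12):
    """Energy to receive m_bits: a12 * m."""
    return a12 * m_bits


def node_energy(m_bits, d, a11, a2, a12, n):
    """Total energy of a relaying sensor node per Eq. (3.2)."""
    return tx_energy(m_bits, d, a11, a2, n) + rx_energy(m_bits, a12)
```

Because the amplifier term grows as $d^{n}$, doubling the hop distance with $n=2$ quadruples that term, which is why hop distance enters the path fitness.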

3.2. Fault-Tolerant Routing Problem Using the ICPSOA

As described in Section 2.2, the ICPSOA is used to provide a fast recovery mechanism from path failure due to physical damage or energy depletion with an alternative path. It chooses a path with optimal fitness from the optional sensor nodes. The ICPSOA is the kernel of our fault-tolerant routing protocol. Its flowchart is shown in Figure 3, and its framework is shown in Algorithm 1. The detailed procedures are described in the following subsections.

Algorithm 1: Framework of the ICPSOA.
Input:  $P_{ST_o}$: the information parameters of the nodes
        $P_{Gen}$: the iterated generations for the searching process
Output: $\mathrm{fitness}(p_i)$: the global optimal fitness
Step 1: Initialization: generate the initial particle swarm parameters.
Step 2: Immunization: immune clone, mutation, particle selection, and restrain.
Step 3: If the termination criterion conditions are satisfied, go to Step 5; else go to Step 4.
Step 4: Update: update the velocity and position of each sub-swarm and particle.
Step 5: Output: output the global optimal fitness of the particle swarm. End.
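The control flow of Algorithm 1 can be sketched as a short loop. In this Python sketch every callable argument is a hypothetical placeholder for the corresponding step. It also assumes that control returns from Step 4 to Step 2, which the listing does not state explicitly, and that the generation budget is decremented once per update.

```python
def icpsoa(initialize, immunize, update, best_fitness, is_optimal, generations):
    """Skeleton of Algorithm 1; all callables are placeholders.

    initialize()        -> initial swarm state          (Step 1)
    immunize(state)     -> state after clone/mutation   (Step 2)
    is_optimal(state)   -> True when fitness is optimal (Step 3)
    update(state)       -> state after velocity/position update (Step 4)
    best_fitness(state) -> global optimal fitness       (Step 5)
    """
    state = initialize()                            # Step 1
    while True:
        state = immunize(state)                     # Step 2
        if is_optimal(state) or generations <= 0:   # Step 3
            return best_fitness(state)              # Step 5
        state = update(state)                       # Step 4
        generations -= 1                            # assumed budget handling
```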

(1) Initialization
The principle of the ICPSOA is to search, respectively, in different $D$-dimensional target spaces using $k$ independent particle swarms. To initialize the algorithm, we set the population size of particles $n$ and the division factor $k$, so that each particle swarm includes $n/k$ particles. Then, the $D$-dimensional vector (the vector of the particle's position and velocity) is divided into $k$ swarms. We define a $[D\times 2n]$ matrix to represent the initial particle swarm, in which the former $n$ columns are the positions of the particles and the latter $n$ columns are their velocities.

In Algorithm 2, $b(t)$ is a complete vector function consisting of all subswarms' optimal position vectors, $x_m^{S_i}$ represents the position vector of the $m$th particle in the $i$th swarm, $p_m^{S_i}$ is the optimal history position vector of the $m$th particle in the $i$th swarm, and $p_g^{S_i}$ represents the optimal experience position vector of the $i$th swarm.

Algorithm 2: Initialization of the ICPSOA.
Input:  $P_{ST_o}$: the information parameters of the nodes
        $n$: the population size of particles
        $k$: the number of particle swarms (division factor)
Output: $S_i$: the vector of the $i$th particle swarm
        $b(t)$: each sub-swarm's optimal position vector function
        The particle's $D$-dimensional vector is divided into $k$ particle swarms.
        $b(L,i)=\left(p_g^{S_1},\ldots,p_g^{S_{i-1}},L,p_g^{S_{i+1}},\ldots,p_g^{S_k}\right)$
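The two operations in the initialization step, dividing the $D$ dimensions among $k$ swarms and assembling the complete vector from the swarms' best parts, can be sketched as follows. The contiguous split and the names `split_dimensions` and `context_vector` are assumptions borrowed from the usual cooperative PSO construction; the text does not prescribe how the dimensions are grouped.

```python
def split_dimensions(D, k):
    """Partition dimension indices 0..D-1 into k contiguous groups, one per swarm."""
    base, extra = divmod(D, k)
    groups, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first groups
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def context_vector(swarm_bests, L, i):
    """Assemble the complete vector: swarm i contributes L, the others their best part.

    swarm_bests -- list of k partial vectors, the best part found by each swarm
    L           -- candidate partial vector for swarm i (0-based index here)
    """
    parts = [L if j == i else best for j, best in enumerate(swarm_bests)]
    return [value for part in parts for value in part]
```

Evaluating a candidate part inside the complete vector is what lets each swarm be scored on the full $D$-dimensional problem although it searches only its own dimensions.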

(2) Immunization
In this step, each particle is considered as an antibody, resulting in the clonal mutation set $C$. The clone number is proportional to the fitness of the particle. The clonal number $N_c$ is usually calculated as

$$
N_c=\alpha N,
\tag{3.3}
$$

where $\alpha$ is the clone factor, which is proportional to the particle's fitness value, and $N$ is the number of particles. The mutation rule can be set up according to experience. The particle mutation rule for the function optimization problem is

$$
c_i=x_i+\beta\,\mathrm{rand},
\tag{3.4}
$$

where $c_i$ is the clonal individual, $x_i$ is the original antibody, $\beta$ represents the mutation factor, and $\mathrm{rand}$ is uniformly distributed in $[0,1]$.

For the particle replacement rule, we need to calculate the antigen stimulus degree of the original particles and select the clonal mutation particles. The Euclidean distance between a particle $c_i$ and an antigen $y_j$ is

$$
d(i,j)=\sqrt{\sum_{t=1}^{n}\left(c_{it}-y_{jt}\right)^{2}}.
\tag{3.5}
$$

Therefore, the stimulus degree of the antibody particle is

$$
A(i,j)=\frac{1}{d(i,j)}.
\tag{3.6}
$$

After that, the stimulus degree of each particle is compared with the stimulus threshold; particles with a higher degree remain in the subswarm, and those with a lower degree are replaced (called restrain). The process of this step is shown in Algorithm 3. Then, go to Step 2 in Algorithm 1.

Algorithm 3: Immunization step of the ICPSOA.
Input:  $n$: the population size of particles
        $P_{Gen}$: the iterated generations for the searching process
        $k$: the number of particle swarms
Output: The allele of the offspring's antibodies (particles)
        For each swarm $i\in[1\cdots k]$
          For each particle $m\in[1\cdots n/k]$
            Clone operation: $N_c=\alpha N$
            Mutation operation: $c_i=x_i+\beta\,\mathrm{rand}$
            Replacement (restrain) operation:
              $d(i,j)=\sqrt{\sum_{t=1}^{n}\left(c_{it}-y_{jt}\right)^{2}}$, $A(i,j)=1/d(i,j)$
              If $A(i,j)>$ threshold, the particle (antibody) is replaced
          End For
        End For
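The clone, mutation, and restrain operations can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the random term of (3.4) is drawn independently per component, the restrain rule follows the prose reading (particles whose stimulus degree reaches the threshold are kept), and all function names are hypothetical.

```python
import math
import random


def clone_and_mutate(x, n_clones, beta, rng=random):
    """Eq. (3.4): produce n_clones mutated copies c = x + beta * rand of antibody x."""
    return [[xd + beta * rng.random() for xd in x] for _ in range(n_clones)]


def stimulus(c, y):
    """Eqs. (3.5)-(3.6): A = 1 / d, with d the Euclidean distance from c to antigen y."""
    d = math.sqrt(sum((cd - yd) ** 2 for cd, yd in zip(c, y)))
    return math.inf if d == 0 else 1.0 / d


def restrain(clones, antigen, threshold):
    """Keep only the clones whose stimulus degree reaches the threshold."""
    return [c for c in clones if stimulus(c, antigen) >= threshold]
```

Since the stimulus degree is the reciprocal of the distance, a clone closer to the antigen receives a higher degree and is more likely to survive the restrain step.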

(3) Termination Criterion
If the solution satisfies the termination criterion, that is, $\mathrm{fitness}(p_i)$ is the optimal fitness or $P_{Gen}$ decreases to zero, the optimal path $p_i$ is taken as the desired optimal solution, and the procedure ends; otherwise, it returns to Step 4 in Algorithm 1. Then, the $k$th path is established.

(4) Update
In this step, the velocity and position of each particle are updated as in (2.1). The process is shown in Algorithm 4. The updating equation of the particles' optimal position vector in each subswarm is

$$
b\left(p_m^{S_i},i\right)=
\begin{cases}
b\left(x_m^{S_i},i\right), & \mathrm{fitness}\left(b\left(x_m^{S_i},i\right)\right)\ge \mathrm{fitness}\left(b\left(p_m^{S_i},i\right)\right),\\
b\left(p_m^{S_i},i\right), & \mathrm{fitness}\left(b\left(x_m^{S_i},i\right)\right)< \mathrm{fitness}\left(b\left(p_m^{S_i},i\right)\right),
\end{cases}
\tag{3.7}
$$

where $1\le i\le k$. The updating equation of the optimal position of each subswarm is

$$
b\left(p_g^{S_i},i\right)=\arg\max_{b\left(p_m^{S_i},i\right)}\mathrm{fitness}\left(b\left(p_m^{S_i},i\right)\right),\quad 1\le m\le \frac{n}{k},\ 1\le i\le k.
\tag{3.8}
$$

Algorithm 4: Update step of the ICPSOA.
Input:  $P_{ST_o}$: the information parameters of the nodes
        $S_i$: the $i$th particle swarm
        $b(L,i)$: each sub-swarm's optimal position vector function
        $P_{Gen}$: the iterated generations for the searching process
Output: $v_{id}^{t+1}$ and $x_{id}^{t+1}$: the velocity and position of each sub-swarm and its particles
        $\mathrm{fitness}(p_i)$: the global optimal fitness
        For each swarm $i\in[1\cdots k]$
          For each particle $m\in[1\cdots n/k]$
            Update the velocity and position of each sub-swarm's particle
            If $\mathrm{fitness}(b(x_m^{S_i},i))\ge \mathrm{fitness}(b(p_m^{S_i},i))$, $b(p_m^{S_i},i)=b(x_m^{S_i},i)$
            Else if $\mathrm{fitness}(b(x_m^{S_i},i))<\mathrm{fitness}(b(p_m^{S_i},i))$, $b(p_m^{S_i},i)=b(p_m^{S_i},i)$
            Calculate $\mathrm{fitness}(p_i)$ of the particle (path)
          End For
          $b(p_g^{S_i},i)=\arg\max_{b(p_m^{S_i},i)}\mathrm{fitness}(b(p_m^{S_i},i))$
        End For

Equation (3.8) indicates that the optimal position of the $i$th subswarm is the personal optimal position with the highest fitness among the particles in that swarm.
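The two update rules (3.7) and (3.8) amount to a personal-best update followed by a best-of-swarm selection. The sketch below illustrates this for one sub-swarm. The name `update_bests` is hypothetical, and `fitness_of` is an assumed callable that evaluates a particle's partial vector inside the complete vector of all swarms.

```python
def update_bests(positions, personal_bests, fitness_of):
    """Apply Eqs. (3.7)-(3.8) to one sub-swarm.

    positions      -- current partial position of each particle in the swarm
    personal_bests -- best partial position seen so far by each particle (updated in place)
    fitness_of     -- callable scoring a partial position within the complete vector
    Returns (personal_bests, swarm_best).
    """
    for m, x in enumerate(positions):
        # Eq. (3.7): adopt the new position when it is at least as fit as the old best.
        if fitness_of(x) >= fitness_of(personal_bests[m]):
            personal_bests[m] = list(x)
    # Eq. (3.8): the swarm's best is the fittest of the personal bests.
    swarm_best = max(personal_bests, key=fitness_of)
    return personal_bests, swarm_best
```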

Among the adjustable parameters, the inertia weight $w$ plays an important role in the convergence of the result. A larger weight helps the particle escape from a local best solution, and a smaller one favors convergence; thus, the inertia weight balances global search and local search. To overcome the limitations of other general strategies, the linear differential decreasing strategy is used [23]. Here, we select $w_{\mathrm{start}}=0.85$ and $w_{\mathrm{end}}=0.35$:

$$
\frac{dw(t)}{dt}=-\frac{2\left(w_{\mathrm{start}}-w_{\mathrm{end}}\right)}{t_{\max}^{2}}\,t,\qquad
w(t)=w_{\mathrm{start}}-\frac{\left(w_{\mathrm{start}}-w_{\mathrm{end}}\right)}{t_{\max}^{2}}\,t^{2}.
\tag{3.9}
$$
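The schedule in (3.9) is easy to tabulate. The following Python sketch evaluates $w(t)$ for a given iteration; the function name is hypothetical, and the default endpoints are the values selected in the text.

```python
def inertia_weight(t, t_max, w_start=0.85, w_end=0.35):
    """Inertia weight at iteration t under the schedule of Eq. (3.9)."""
    return w_start - (w_start - w_end) * t ** 2 / t_max ** 2
```

The weight stays close to `w_start` during the early iterations and falls most steeply near `t_max`, so the swarm explores globally for longer before it settles into local search.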

The computational complexity is an important issue in designing our optimization algorithm. In the $n$th iteration of the ICPSOA, the time to calculate the fitness function for immune clone is $N_c$, the time to calculate the fitness function for particle mutation and selection is $\beta N_c$, and the time to calculate the fitness function for particle update is $t_p$. So the total calculating time $P_n$ in the $n$th iteration satisfies

$$
P_n\le\left[N_c+N_c\beta\right]+t_p=(1+\beta)N_c+t_p.
\tag{3.10}
$$

Therefore, the computational complexity of the ICPSOA is $O(N_c)$, which indicates that, for the same particle population size, the size of the clone group has a direct impact on the search speed of the ICPSOA.

3.3. The ICPSOA-Based Fault-Tolerant Routing Protocol Framework for H-WSNs

The ICPSOA is the kernel of the fault-tolerant routing protocol. As shown in Figure 2, once $n_{\mathrm{fail}}$ (node 18) fails, the macronode $n_r$ (node 30) constructs a subgraph $G'$ ($G'\subset G$) according to the current topology information of the nodes and extracts from $G'$ the set of nodes $N_p$ which can be used to construct an alternative path $p_i(s,r)$. Each node represents a particle, and the population size of particles is $n$. Some nodes of $N_p$ can form a particle sequence $\{n_{p_i 1},n_{p_i 2},n_{p_i 3},\ldots,n_{p_i m}\}$ ($m\le n$) according to their order, which constructs a path $p_i(s,r)$ from the source $n_s$ to $n_r$. The ICPSOA optimizes the particle sequence to obtain the optimal path $p_b(s,r)$ with optimal fitness $\mathrm{fitness}(p_b)$, and $p_b(s,r)$ includes the nodes $\{n_s,n_{p_b 1},n_{p_b 2},\ldots,n_{p_b m},n_r\}$. Each node owns a routing table recording the paths it belongs to and the information of the nodes on these paths. We now demonstrate with an example how the routing recovery process is accomplished in our protocol. In the example (Figure 2), the child node $n_{\mathrm{fail}-c}$ and the parent node $n_{\mathrm{fail}-p}$ of $n_{\mathrm{fail}}$ are nodes 23 and 13, respectively.

Step 1. š‘›failāˆ’š‘ reports the failure of š‘›fail to š‘›š‘Ÿ, and š‘›failāˆ’š‘ reports the failure to š‘›š‘  (node 2), then š‘›š‘  starts up another backup path to transmit data. š‘›š‘  broadcasts a path request (PR) packet, with routing table including its own available energy and coordinate.

Step 2. If an intermediate node š‘›š‘– receives PR, it will relay the packet according to its own state of information: if š‘›š‘– is on one of the other existed š‘˜āˆ’1 paths between š‘›š‘  and š‘›š‘Ÿ, it will ignore the packet; else, š‘›š‘– will calculate dist(š‘’), ene(š‘’) and delay(š‘’) between š‘›š‘– and š‘›š‘–āˆ’1 according to the information provided by š‘›š‘–āˆ’1. Then, it continues to relay packet RP, with routing table including above information, š‘›š‘– and š‘›š‘–āˆ’1ā€™s ID, and its ene(š‘›).

Step 3. Each intermediate node in the subtree repeats Step 2 until $n_r$ receives the PR. $n_r$ then extracts the information, calculates fitness($p_i$) using the ICPSOA, and selects the path $p_b$ with optimal fitness. It then broadcasts a PR_ACK packet whose routing table includes the IDs of the selected nodes $\{n_{p_b1}, n_{p_b2}, n_{p_b3}, \ldots, n_{p_bm}\}$ on $p_b$ (path 2-8-14-19-24-30 in Figure 2).

Step 4. When $n_{p_bi}$ receives PR_ACK, it checks whether its ID is in the packet's routing table. If so, it establishes a connection between its child $n_{p_bi-1}$ and its parent $n_{p_bi+1}$ and delivers PR_ACK to its parent, until $n_s$ receives the packet. Go to Step 5.

Step 5. $n_s$ broadcasts a PR_END packet, and the nodes $n_{p_bi}$ on $p_b$ deliver it to $n_r$. Once $n_r$ receives PR_END, the $k$th path from the source node to its root in the subtree is established, the network topology is reconstructed, and the protocol ends.

During this process, $n_r$ broadcasts one packet and receives three packets, $n_s$ broadcasts two packets and receives two packets, and some of the relay nodes $n_i$ in the subtree deliver four packets. Let the number of hops of $p_i(s,r)$ ($i \in (1, k-1)$) be $N^{i}_{p_i(s,r)}$; then the number of packets received by $n_r$ before running the ICPSOA is $N = N^{i}_{ST_i} - \sum_{i=1}^{k-1} N^{i}_{p_i(s,r)}$. In this way, the energy that $n_r$ consumes for receiving and broadcasting packets can be calculated.
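
The recovery steps above can be illustrated with a minimal Python sketch. It is not the ICPSOA: the swarm optimizer is replaced by exhaustive path enumeration, and the fitness is replaced by hop count, purely so that the example stays small and checkable. The topology, the second path 2-5-11-30, and the function names `candidate_nodes`, `simple_paths`, and `recover` are assumptions made for illustration; only the node IDs 2, 13, 18, 23, 30 and the path 2-8-14-19-24-30 come from the example in the text.

```python
def candidate_nodes(adjacency, failed, other_paths):
    """Nodes allowed to relay a PR: everything except the failed node and
    the intermediate nodes of the other k-1 paths (they ignore the PR)."""
    blocked = {failed}
    for path in other_paths:
        blocked.update(path[1:-1])  # source and root remain usable
    return set(adjacency) - blocked


def simple_paths(adjacency, allowed, source, root):
    """Enumerate loop-free paths from source to root through allowed nodes.
    Exhaustive search: acceptable for a toy subtree, not for a large network."""
    found, stack = [], [[source]]
    while stack:
        path = stack.pop()
        if path[-1] == root:
            found.append(path)
            continue
        for nxt in adjacency[path[-1]]:
            if nxt in allowed and nxt not in path:
                stack.append(path + [nxt])
    return found


def recover(adjacency, failed, other_paths, source, root, cost=len):
    """Return the lowest-cost alternative path, or None if none exists.
    `cost` stands in for the paper's fitness; hop count is an assumption."""
    allowed = candidate_nodes(adjacency, failed, other_paths)
    paths = simple_paths(adjacency, allowed, source, root)
    return min(paths, key=cost) if paths else None


# Hypothetical subtree: links are invented for this example.
TOPOLOGY = {
    2: [5, 8, 13], 5: [2, 11], 8: [2, 14], 11: [5, 30],
    13: [2, 18], 14: [8, 19], 18: [13, 23], 19: [14, 24],
    23: [18, 30], 24: [19, 30], 30: [11, 23, 24],
}
OTHER_PATHS = [[2, 5, 11, 30]]  # an existing disjoint path (assumed)

print(recover(TOPOLOGY, failed=18, other_paths=OTHER_PATHS, source=2, root=30))
```

With node 18 failed and nodes 5 and 11 reserved for the other path, the only remaining route in this toy topology is 2-8-14-19-24-30. In the protocol itself, the choice among several candidate routes is made by the ICPSOA fitness rather than by hop count.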

4. Simulation Results

4.1. Simulation Model

To evaluate the performance of the ICPSOA, we design a corresponding simulation scenario in MATLAB. The simulation runs on Windows XP with an Intel Pentium 4 processor (2.4 GHz) and 2 GB of RAM. The goal of the simulation is to show that the ICPSOA can provide a more stable transport environment in an error-prone network. The results are compared with those of the ICE, CPEQ, and EARQ protocols.

In Table 2, we present the parameters configured for the simulation experiments. The sensor nodes are randomly deployed on area $A$, and the macronodes are located at known coordinates. The simulation runs for 500 rounds, and five packets are delivered in each round. The size of the network is the same for the different algorithms, and the fitness function is then measured. The parameters used for the ICPSOA are function dimension $D = 30$, iterated generations $P_{\text{Gen}} = 1000$, division factor $k = 5$, clone factor $\alpha = 4$, and mutation factor $\beta = 0.5$. According to the description of the ICPSOA in Section 3.1.3, we set the energy consumption per iteration to $\text{ene}_{\text{PU}} = 80$ pJ, $\text{ene}_{\text{IC}} = 5$ pJ, $\text{ene}_{\text{IM}} = 10$ pJ, $\text{ene}_{\text{PS}} = 15$ pJ, and $\text{ene}_{\text{PR}} = 10$ pJ. The metrics used in our experiments are the average number of alive nodes per round, the average energy depletion ratio per round (the energy dissipated divided by the initial energy), the average delay of packet delivery, and the packet delivery ratio (the number of successfully delivered packets divided by the number of required packets).
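
As a reading aid, the settings and metrics listed above can be written down as a short Python sketch. The class and function names are hypothetical, and the sketch assumes that each of the five energy terms is charged exactly once per iteration, which the text does not state explicitly; the metric formulas follow the parenthetical definitions given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IcpsoaParams:
    """ICPSOA settings quoted in the text (field names are illustrative)."""
    dimension: int = 30            # D
    generations: int = 1000        # P_Gen
    division_factor: int = 5       # k
    clone_factor: int = 4          # alpha
    mutation_factor: float = 0.5   # beta


# Per-iteration energy terms in picojoules, as quoted in the text.
ENERGY_PJ = {"PU": 80, "IC": 5, "IM": 10, "PS": 15, "PR": 10}


def icpsoa_energy_pj(generations):
    """Total ICPSOA energy, ASSUMING every term is charged once per iteration."""
    return generations * sum(ENERGY_PJ.values())


def alive_nodes(remaining_energy):
    """Number of nodes whose remaining energy is still positive."""
    return sum(1 for energy in remaining_energy if energy > 0)


def energy_depletion_ratio(initial_energy, remaining_energy):
    """Energy dissipated so far divided by the initial energy."""
    total = sum(initial_energy)
    return (total - sum(remaining_energy)) / total


def packet_delivery_ratio(delivered, required):
    """Successfully delivered packets divided by required packets."""
    return delivered / required if required else 0.0


params = IcpsoaParams()
required = 500 * 5  # 500 rounds, five packets per round
print(icpsoa_energy_pj(params.generations), packet_delivery_ratio(2250, required))
```

Under the stated assumption, one run of 1000 generations costs 120 pJ per iteration, and 2250 delivered packets out of the 2500 required correspond to a delivery ratio of 0.9. The delivered-packet count here is an invented input, not a result from the paper.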

4.2. Evaluation of the Simulation Results

To illustrate the effect of the proposed protocol, we take a snapshot during a simulation. Figure 4 shows a small area (2000 m × 2000 m) containing a subtree with the three existing paths between the source node (node 29) and the macronode (node 18). When an intermediate node (node 24) fails, the source node immediately establishes an alternative path to the macronode to replace the previous third path.

The simulation ends after 1000 rounds. We compare the number of alive nodes per round for the four protocols. As shown in Figure 5 for different network sizes, fewer nodes die under the ICPSOA, ICE, and CPEQ than under EARQ over the same number of rounds. This is because, in contrast to homogeneous WSNs, the nodes in H-WSNs only need to transmit data to the root (macronode) of their subtree, which indirectly shortens the transmission distance from the sensor nodes to the sink and prolongs their lifetime. The fast routing recovery mechanism of the ICPSOA also keeps 5% to 10% more nodes alive than ICE and CPEQ over the same rounds. We therefore compare only the ICPSOA, ICE, and CPEQ, which share the same network style (H-WSNs), in the remaining simulations.

As shown in Figure 6 for different network sizes, the energy depletion ratio of the ICPSOA-based protocol is 5% to 15% smaller than that of ICE and CPEQ, and the gap becomes more pronounced as the size of the cluster increases. There are two reasons. First, each source node has $k$ paths to the macronode, and the total energy consumed is minimized. Second, the ICPSOA can select nodes with better QoS parameters (such as more available energy and a shorter path distance) to establish the alternative path, constructing a more reliable transmission environment that reduces the retransmissions caused by unstable paths and therefore prolongs the network lifetime compared with ICE and CPEQ.

Figure 7 shows the average delay of packet delivery (the average delay of each packet delivered from the source node to the sink). The ICPSOA outperforms ICE and CPEQ in terms of average delay for the same networks, and it maintains a lower delay as the network size grows. The low packet delay can be explained by the multipath property and the shortest alternative path selection of the proposed ICPSOA-based protocol for fault tolerance.

Figure 8(a) shows that, for $p_{\text{node}} = 0.02$, the H-WSNs with the ICPSOA deliver more packets to the sink than the networks with ICE and CPEQ. In most cases, the ICPSOA sends 5% to 15% more packets to the sink. A larger packet delivery ratio indicates a lower packet dropout probability [24, 25] and a better network capability of delivering useful information. This result can be explained by the fact that the ICPSOA provides fast recovery from path failure with an optimal alternative path, which improves the success rate of data transmission. Note that the delivery ratio of the ICPSOA increases as the size of the network grows, which indicates that the ICPSOA-based protocol for H-WSNs is more feasible for practical deployment in large-scale WSNs than ICE and CPEQ.

We also compare the performance trend of the three algorithms under different probabilities of node failure $p_{\text{node}}$. We plot the packet delivery ratio against the number of nodes in Figure 8 for $p_{\text{node}}$ values of 0.02, 0.05, and 0.08. As shown in Figures 8(a), 8(b), and 8(c), the observed packet delivery ratio of the proposed scheme degrades as $p_{\text{node}}$ increases; that is, performance declines as the percentage of failed nodes rises. Nevertheless, the ICPSOA still delivers more packets than ICE and CPEQ for every size of sensor network.
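
To give some intuition for why the delivery ratio falls as $p_{\text{node}}$ rises, the following Python sketch uses a deliberately simple model that is not taken from the paper: relay nodes fail independently with probability $p_{\text{node}}$, a path works only if all of its relays are alive, and a packet is delivered if at least one of the $k$ disjoint paths works, with no recovery. The function names, the relay count of 4, and $k = 3$ are assumptions chosen for illustration.

```python
def path_survival(p_node, relays):
    """Probability that every relay on one path is alive (independent failures)."""
    return (1.0 - p_node) ** relays


def delivery_probability(p_node, relays, k):
    """Probability that at least one of k node-disjoint paths is intact."""
    all_paths_broken = (1.0 - path_survival(p_node, relays)) ** k
    return 1.0 - all_paths_broken


# Failure probabilities used in Figure 8; relays=4 and k=3 are assumed values.
for p_node in (0.02, 0.05, 0.08):
    print(p_node, delivery_probability(p_node, relays=4, k=3))
```

In this toy model the delivery probability decreases monotonically in $p_{\text{node}}$ and increases with $k$, which matches the qualitative trend reported above. It says nothing about the measured gap between the ICPSOA, ICE, and CPEQ, which depends on how quickly each protocol rebuilds a broken path.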

5. Conclusions

We propose an ICPSOA-based fault-tolerant routing protocol for H-WSNs that addresses the problems of energy depletion and packet delivery: it reconstructs the topology and recovers the routing after a path failure, and it conserves energy by avoiding unnecessary retransmissions. The conserved energy can be used to increase the quantity of information received by the sink. The experiments demonstrate the promise of the ICPSOA: the ICPSOA-based protocol achieves better fault tolerance and a longer network lifetime than the EARQ, ICE, and CPEQ protocols. The results illustrate the advantage of H-WSNs and backup disjoint multipath routing, which reduce the risk of data delivery loss and the energy spent on path exploration. The protocol also aims to shorten the packet delivery delay and to even out energy dissipation among the nodes by constructing optimal alternative paths in the H-WSNs with a swarm intelligence algorithm. The strengths of the ICPSOA are its simplicity, robustness, and effectiveness for fast routing recovery compared with other approaches, which make it a potential solution for critical-condition monitoring applications.

As for future work, the following directions are under way. First, the proposed protocol does not address fault-tolerant routing for macronode-to-macronode communication, which could be considered to form a more complete protocol architecture. Second, the computational complexity of the proposed ICPSOA should be further reduced so that it converges faster to a better solution, providing robustness against failures in the network.


This work was supported in part by the Key Project of the National Nature Science Foundation of China (no. 61134009), the National Nature Science Foundation of China (no. 60975059), Specialized Research Fund for the Doctoral Program of Higher Education from Ministry of Education of China (no. 20090075110002), Specialized Research Fund for Shanghai Leading Talents, Project of the Shanghai Committee of Science and Technology (nos. 11XD1400100, 11JC1400200, 10JC1400200, 10DZ0506500).