Abstract
Because the deployment environment of a water quality wireless sensor network makes it difficult to replace or recharge node batteries, the network must make the most of the limited energy available. This work proposes a technique that combines cluster head selection, cluster structure creation, and data transmission into one optimized scheme. First, we optimize the cluster head election probability threshold formula of LEACH and introduce an overlap ratio into the competitive mechanism to avoid excessive overlap between cluster heads; the resulting protocol is called ILEACH. Next, to alleviate the “hot spot” problem caused by multihop transmission, the competitive mechanism of ILEACH is further optimized on the basis of the nonuniform competitive mechanism that the EEUC algorithm uses when electing cluster heads. Meanwhile, in planning the structure between cluster heads, the optimal path is searched using the parallelism of a genetic algorithm with a variable-length path coding strategy. The combination of ILEACH and EEUC-IGA, named the Energy Balance Multihop Clustering Routing Protocol (BEBMCR), avoids the emergence of “hot paths” and reduces running time. Simulation results show that, even in large-scale networks, BEBMCR has a longer stable period, a higher energy utilization rate, and more balanced node energy consumption.
1. Introduction
In recent years, with the rapid development of the Internet of Things (IoT), wireless sensor networks (WSNs), one of the key technologies for realizing the IoT [1], have been widely used. They also bring new ideas to the field of water environment monitoring, since they can be deployed in harsh locations that are difficult for humans to reach and that require long-term, large-area monitoring.
A wireless sensor network is mainly composed of low-power sensor nodes that form a multihop network in a self-organizing manner, with real-time perception, data processing, storage, and transmission capabilities. Since water areas are generally widely distributed, the network is difficult to maintain after deployment and must work in unattended mode [2, 3]. Because the size, weight, and cost of water quality sensor nodes are limited, the nodes are usually powered by batteries with limited energy. Current research on wireless sensor networks therefore focuses mainly on saving energy and extending node lifetime.
1.1. Routing Algorithms of Homogeneous WSNs
Heinzelman et al. (2000) proposed the Low-Energy Adaptive Clustering Hierarchy (LEACH) protocol [4]. This protocol randomly selects some nodes as cluster heads, and each node sends its data to the base station through its cluster head. The random selection of cluster heads balances node energy to a certain extent and prolongs the working life of the network. However, the protocol suffers from excessive energy consumption caused by the random election of cluster heads and by single-hop data transmission. In recent years, scholars have proposed various improvements for these problems. Distance-and-Energy-Aware Routing (DEARER) combines energy and distance to make network energy consumption more balanced [5]. Following the same idea, combining an energy quality function and a distance quality function to determine the cluster head makes cluster head selection more uniform and reasonable and reduces the power consumed in data transmission [6]. However, the data transmission stage is not considered, and a single intercluster route is easily formed. In another approach, new cluster heads are elected only when the energy of the current cluster heads meets a certain criterion, which reduces the excessive energy consumption caused by repeated cluster head elections and extends the network life cycle [7]. On the basis of the original cluster head, a secondary cluster head can be selected as a backup in case the cluster head node fails or cannot complete the work alone, which improves the robustness of network transmission [8]. Elmonser et al. proposed Dynamic Multihop LEACH (DMHLEACH), which combines dynamic clustering, multihop transmission, and node mobility to improve node capabilities but ignores the impact of node locations and node overlap on clustering. For homogeneous WSNs, Shahbaz et al. proposed a multipath routing solution in which the wireless sensor network is clustered using the firefly algorithm; however, the search method relies too heavily on the best individuals, which reduces the convergence speed [9, 10].
1.2. Routing Algorithms of Heterogeneous WSNs
For the two-level energy heterogeneous model, the SEP algorithm was proposed on the basis of LEACH by taking the difference in initial node energy into account. This protocol optimizes the number of cluster heads and gives nodes with higher energy a higher probability of being elected cluster head, so that node energy consumption is more balanced. Its disadvantage is that routing uses single-hop transmission, so transmission energy consumption is high [11]. For the three-level energy heterogeneous model, a weighted cluster head election strategy is selected dynamically according to the initial energy of the node and, combined with the remaining energy of the node, extends the life cycle of the network [12]. The Energy-Efficient Uneven Clustering (EEUC) protocol uses a nonuniform cluster competition mechanism to reduce the energy consumption of cluster heads. Its disadvantage is that the residual energy of the cluster head node and the uneven distribution of cluster heads are not considered [13]. The Distributed Energy-Efficient Clustering (DEEC) algorithm is based on a general multilevel energy heterogeneous model and selects cluster heads according to the current energy of the node and the average remaining energy in the network. However, only the remaining energy is considered; the actual distance from the node to the sink node is not fully taken into account [14]. A centralized nonuniform clustering routing protocol (CEEUC) takes into account the residual energy of nodes and the cost of communication between nodes and cluster heads but does not consider the optimal number of cluster heads [15].
To address the above problems, the proposed scheme makes the following improvements:
(1) The threshold formula for electing cluster heads is optimized according to the remaining energy of the node, the location of the node, and the density of neighbor nodes.
(2) An overlap rate is introduced into the competition mechanism and combined with the concept of competition radius. This limits the overlap of cluster head coverage areas, stabilizes the number of cluster heads in the network, and helps improve the throughput and energy utilization of cluster heads [7, 8].
(3) The topology of the water quality monitoring network is designed using a genetic algorithm, and the algorithm is made suitable for large-scale network architectures by combining single-hop and multihop communication.
(4) The BEBMCR algorithm draws on the EEUC nonuniform competition mechanism in the election of cluster heads [13]. In the selection of the next hop, it searches for the optimal path using the efficiency and parallelism of the improved genetic algorithm and uses the variable-length path coding strategy to integrate the communication distance, the relative remaining energy, and the number of nodes in the cluster.

2. Establishment of Network Model
The homogeneous WSN routing algorithms [4–10] and the heterogeneous WSN routing algorithms [11–15] use the same first-order radio energy consumption model. The energy consumption analysis of the wireless communication model is identical to that of the LEACH algorithm, so the full analysis is not repeated here.
2.1. Energy Consumption Model
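The wireless communication energy consumption model is shown in Figure 1. The energy $E_{Tx}(k, d)$ consumed to transmit $k$ bits of data to the sink node over a distance $d$ is [16, 17]

$$E_{Tx}(k,d)=\begin{cases}kE_{elec}+k\varepsilon_{fs}d^{2}, & d<d_{0},\\ kE_{elec}+k\varepsilon_{mp}d^{4}, & d\ge d_{0},\end{cases} \tag{1}$$

where $E_{elec}$ is the energy consumption of sending 1 bit of data, $\varepsilon_{fs}$ is the amplification characteristic constant of the free-space model, $\varepsilon_{mp}$ is the amplification characteristic constant of the multipath attenuation model, and $d_{0}$ is the distance threshold between the two models.
The calculation formula of $d_{0}$ is as follows:

$$d_{0}=\sqrt{\frac{\varepsilon_{fs}}{\varepsilon_{mp}}} \tag{2}$$

The energy consumed by a node receiving $k$ bits of data is

$$E_{Rx}(k)=kE_{elec} \tag{3}$$

In addition to the above energy consumption, the cluster head node also needs to perform data fusion after receiving the data. The energy consumed by fusing $k$ bits of data is

$$E_{F}(k)=kE_{DA} \tag{4}$$

where $E_{DA}$ is the energy the cluster head node consumes to compress 1 bit of data.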
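The first-order radio model described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation code: the symbol names follow the standard LEACH formulation, and the numeric constants are commonly used illustrative values that are assumed here rather than taken from Table 1.

```python
import math

# Illustrative parameter values often used with this model. They are
# assumptions for this sketch, not the values from the paper's Table 1.
E_ELEC = 50e-9        # J/bit, transmitter/receiver electronics
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier constant
EPS_MP = 0.0013e-12   # J/bit/m^4, multipath amplifier constant
E_DA = 5e-9           # J/bit, data fusion at the cluster head


def threshold_distance(eps_fs=EPS_FS, eps_mp=EPS_MP):
    """Distance d0 at which the free-space and multipath terms are equal."""
    return math.sqrt(eps_fs / eps_mp)


def tx_energy(k_bits, d):
    """Energy (J) to transmit k_bits over a distance of d metres."""
    if d < threshold_distance():
        return k_bits * E_ELEC + k_bits * EPS_FS * d ** 2
    return k_bits * E_ELEC + k_bits * EPS_MP * d ** 4


def rx_energy(k_bits):
    """Energy (J) to receive k_bits."""
    return k_bits * E_ELEC


def fusion_energy(k_bits):
    """Energy (J) a cluster head spends fusing k_bits (assumed linear in k)."""
    return k_bits * E_DA
```

With these assumed constants the threshold distance is roughly 87.7 m, so a 4000-bit packet sent over 50 m falls under the free-space branch.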
2.2. Network Model
The water quality sensor nodes are randomly distributed within a given monitoring area, and the base station is located outside this area. The network has the following properties:
(1) The water quality sensor nodes are deployed at random, and the positions of the sensor nodes and the base station do not change once deployment is complete.
(2) Except for the base station, each sensor node has a unique ID number, and the computing power, initial energy, and storage space of all nodes are identical.
(3) Each water quality sensor node can determine the location of a sending node by analyzing the angle and strength of the received signal.
(4) Each sensor node can adjust its transmission power according to the communication distance.
(5) Each sensor node can automatically switch between working and sleep modes according to its TDMA time slot [18].
2.3. Overlap Ratio
In the ideal state of balanced network energy consumption, the DEEC algorithm calculates the cluster head probability from the average remaining energy and ignores other factors, such as the distance between the node and the base station and the density of adjacent nodes [14]. Taking these factors into account ensures that nodes with high residual energy, a location close to the base station, and a high density of adjacent nodes have a greater probability of becoming cluster heads. However, the coverage areas of the resulting cluster heads may overlap too much, so that the cluster heads are not relatively independent. As shown in Figure 2, CH O_{1} and CH O_{2} both cover a large common area, which increases the energy consumption of the nodes in this range to a certain extent [19].
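Taking into account the overlap of the cluster head competition areas, the final determination of the cluster head is combined with the calculation of the competition radius $R_{c}$. The competition areas of cluster head O_{1} and cluster head O_{2} are circles with radius $R_{c}$, each with area $S_{O}=\pi R_{c}^{2}$, that intersect at points A and B. The area of quadrilateral O_{1}AO_{2}B is denoted $S_{q}$, the areas of sector AO_{1}B and sector AO_{2}B are denoted $S_{1}$ and $S_{2}$, respectively, and the shaded area is denoted $S_{sh}$, so that

$$S_{sh}=S_{1}+S_{2}-S_{q} \tag{5}$$

where $S_{1}$, $S_{2}$, and $S_{q}$ can be expressed as

$$S_{1}=S_{2}=\frac{1}{2}\theta R_{c}^{2},\qquad S_{q}=R_{c}^{2}\sin\theta \tag{6}$$

where $\theta=\angle AO_{1}B$, $d=\sqrt{(x_{1}-x_{2})^{2}+(y_{1}-y_{2})^{2}}$ is the distance between the two centers, and $(x_{1},y_{1})$, $(x_{2},y_{2})$ are the relative position coordinates of O_{1} and O_{2}.
Using the law of cosines in triangle O_{1}AO_{2}, we get

$$\cos\frac{\theta}{2}=\frac{R_{c}^{2}+d^{2}-R_{c}^{2}}{2R_{c}d}=\frac{d}{2R_{c}} \tag{7}$$

The value of $\theta$ can be calculated using formula (7):

$$\theta=2\arccos\frac{d}{2R_{c}} \tag{8}$$

From formulas (5), (6), and (8), the overlap area ratio of the region (the shaded area relative to the area of one competition circle) can be calculated as

$$\eta=\frac{S_{sh}}{S_{O}}=\frac{\theta-\sin\theta}{\pi} \tag{9}$$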
Fixing the overlap area ratio at a chosen value determines the corresponding distance between adjacent cluster heads and hence the range within which a water quality sensor cluster head broadcasts its message. Broadcasting within this defined range, while still allowing the competition radii of adjacent CH nodes to intersect, saves node energy to a certain extent. The main purpose is to keep the cluster heads evenly distributed and to prevent the area covered repeatedly by cluster heads from becoming too large.
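The geometry of this section can be checked numerically. The sketch below assumes two competition circles of equal radius and defines the overlap ratio as the lens-shaped common area divided by the area of one circle; both the function names and that normalization are assumptions made for illustration. The second function inverts the relation by bisection, which is one simple way to find the center distance that yields a chosen ratio.

```python
import math


def overlap_ratio(c1, c2, radius):
    """Common area of two equal circles divided by the area of one circle."""
    d = math.dist(c1, c2)
    if d >= 2 * radius:
        return 0.0                                   # circles do not intersect
    theta = 2 * math.acos(d / (2 * radius))          # central angle over chord AB
    lens = radius ** 2 * (theta - math.sin(theta))   # two sectors minus rhombus
    return lens / (math.pi * radius ** 2)


def distance_for_ratio(target, radius, tol=1e-9):
    """Center distance giving the target ratio (ratio decreases with distance)."""
    lo, hi = 0.0, 2 * radius
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if overlap_ratio((0.0, 0.0), (mid, 0.0), radius) > target:
            lo = mid      # still too much overlap: move the centers apart
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, two circles whose centers are exactly one radius apart share about 39% of a circle's area under this definition.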
3. Improved LEACH Protocol for Cluster Head Election
In the cluster head election phase, existing algorithms either add the remaining energy, position, or density of the node to the cluster head election formula or adopt nonuniform clustering, whose main idea is to reduce the size of clusters close to the sink node. Although the “hot spot” phenomenon is alleviated, when water quality sensors measure water quality indicators in practice, geography and weather cause the competition radii of adjacent CH nodes to intersect. The improved LEACH (ILEACH) described in this section is proposed to address the poor scalability of the above algorithms.
3.1. Cluster Head Selection Stage
According to the LEACH algorithm analysis in Section 2.1, the cluster head election probability assumes an ideal state in which every node has the average energy, and it ignores the distance between the node and the base station, although the energy consumption of a node depends on this distance. In addition, the random selection of cluster heads leads to an uneven distribution of cluster heads in the network. In this section, temporary cluster heads are selected on the basis of the LEACH algorithm, combined with the current energy and location of the node, to obtain a new threshold formula [20]. Figure 3 summarizes the cluster head selection process.
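For reference, the classic LEACH election rule that this section starts from can be sketched as follows. The threshold is the standard one from the LEACH protocol [4]; the function names are hypothetical, and the improved ILEACH threshold, which adds energy, distance, and density terms, is deliberately not modeled here.

```python
import random


def leach_threshold(p, round_no, eligible):
    """Classic LEACH threshold T(n) = p / (1 - p * (r mod 1/p)).

    p         desired fraction of cluster heads
    round_no  current round r
    eligible  True if the node has not been a cluster head in the
              current epoch of 1/p rounds
    """
    if not eligible:
        return 0.0
    period = round(1 / p)
    return p / (1 - p * (round_no % period))


def elects_itself(p, round_no, eligible, rng=random):
    """A node becomes cluster head if its random draw is below T(n)."""
    return rng.random() < leach_threshold(p, round_no, eligible)
```

Because the draw is purely random, nothing in this rule reflects residual energy or position, which is the weakness the improved threshold targets.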
Step 1. Information exchange between node and neighbor nodes
To ensure that all nodes are active, the base station first broadcasts a message to all nodes and then broadcasts an initialization message announcing the start of the cluster head election.
After receiving the message, each node stores the data it contains and estimates its distance to the base station. At the same time, it calculates a value from the remaining energy it has at the beginning and the average remaining energy of the network, and uses this value to decide whether to run for cluster head. The nodes participating in the election broadcast messages with the maximum communication radius to obtain neighbor node information [7].
Step 2. Election of temporary CHs by the new threshold formula
The fitness values of the candidates are compared and the largest is selected; if two fitness values are equal, a second round of calculation is performed. The candidate cluster head with the larger fitness value is selected as the cluster head during the cluster structure formation stage and broadcasts the election message. Takele et al. proposed a new threshold calculation method that considers the current remaining energy of the node, the density of adjacent nodes, the distance between the node and the base station, and the average distance between the node and its adjacent nodes [21]. Node density enters the formula through the number of neighbor nodes within the radius and the number of nearby nodes in a standard cluster.
An area is defined as a dense node area when the distance between nodes is less than the maximum coverage communication range for densely deployed water quality sensor nodes. This range is calculated from the distance between ordinary nodes in the same cluster, the ratio of the number of CH nodes to the number of surviving nodes, and the area of the monitoring region.
The average distance is calculated from the distances between the node and each of its neighbor nodes, and the terms of the threshold formula are combined using weighting coefficients.
In the improved threshold formula, the probability of a node being selected as a temporary cluster head depends on its current remaining energy, its relative density, its distance from the base station, and its average distance to neighboring nodes.
Step 3. The candidate cluster heads that have failed the election temporarily go to sleep
Step 4. Information exchange between temporary CH and neighbor CHs
For specific content, refer to Section 2.3 and Section 3.2.
Step 5. The final CH node is selected, and the final CH node broadcasts the elected message and necessary information.
3.2. Adding the Overlap Ratio to the Competitive Mechanism
In the final determination of the cluster head nodes, ILEACH introduces the overlap rate. Each temporary cluster head takes itself as the center and broadcasts a message with the competition radius. If other temporary cluster heads lie within this radius, their remaining energies are compared: the temporary cluster head with the most remaining energy becomes the final cluster head of the area and broadcasts the elected-cluster-head message to the network, while the temporary cluster heads that lose the competition announce their withdrawal. When temporary cluster heads in the same competition area receive the news that another cluster head has been elected, they automatically withdraw from the competition. The competition radius is determined by the side length of the monitoring area and the theoretical optimal number of CHs.
ILEACH optimizes the probability threshold formula for cluster head selection so that nodes with high residual energy, a short distance to the base station, and a high density of adjacent nodes have more opportunities to become cluster heads. At the same time, the overlap rate is introduced to reduce energy cost. Because the cluster heads do not overlap much, they are relatively independent, which helps improve the throughput and energy utilization of cluster heads.
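The competition among temporary cluster heads described in this section can be approximated by a simple centralized procedure. The sketch below is an assumption-laden simplification of what is a distributed message exchange in the protocol: candidates are processed in order of decreasing remaining energy, and a candidate becomes a final cluster head only if no already-elected head lies within the competition radius. The tuple layout and function name are hypothetical.

```python
import math


def select_final_heads(candidates, radius):
    """Pick final cluster heads from temporary ones.

    candidates: list of (node_id, (x, y), remaining_energy)
    radius:     competition radius shared by all candidates (assumed equal)
    Returns the IDs of the final cluster heads, highest energy first.
    """
    finals = []
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    for node_id, pos, energy in ranked:
        # A candidate withdraws if a higher-energy head is inside its radius.
        if all(math.dist(pos, p) > radius for _, p, _ in finals):
            finals.append((node_id, pos, energy))
    return [node_id for node_id, _, _ in finals]
```

In the example used for checking, nodes 1 and 2 are 5 m apart with a 10 m radius, so only the higher-energy node 2 survives, while the distant node 3 is elected independently.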
4. Algorithm for the “Hot Paths” Problem
To reduce the complexity of the wireless sensor network without affecting the effectiveness of the EEUC protocol, this paper uses a variable-length path coding strategy to improve the genetic algorithm in the routing algorithm that searches for the optimal path. On the one hand, the nonuniformity of the EEUC protocol, that is, the idea of uneven clustering, is retained. On the other hand, the efficiency with which the genetic algorithm finds the optimal value is improved [13, 22].
4.1. Improved EEUC Protocol
The “hot paths” problem arises because cluster heads near the base station consume too much energy, causing these nodes to die quickly. To address it, the election of cluster head nodes is optimized on the basis of EEUC’s nonuniform competition mechanism. Nodes form clusters of different sizes by calculating their own competition radius, so that clusters closer to the base station are smaller. This reduces the energy that such cluster heads spend receiving member node information and leaves them more energy for relaying data between clusters.
Figure 4 is a schematic diagram of temporary cluster head competition. Two temporary cluster heads that lie outside each other's competition radius do not have a competitive relationship. When one temporary cluster head lies within the competition radius of another, the current remaining energy of the two must be compared.
Once the final CHs are determined, the remaining steps follow the ILEACH clustering stage, based on remaining energy, and are not repeated here.
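The nonuniform competition radius borrowed from EEUC [13] is commonly written as R = (1 - c (d_max - d) / (d_max - d_min)) R_0, where d is the distance from the node to the base station. The sketch below illustrates that published form; the parameter names and the default c = 0.5 are assumptions, and any further modification made in BEBMCR is not modeled.

```python
def competition_radius(d_to_bs, d_min, d_max, r0, c=0.5):
    """EEUC-style competition radius.

    d_to_bs  distance from this node to the base station
    d_min    smallest node-to-base-station distance in the network
    d_max    largest node-to-base-station distance in the network
    r0       maximum competition radius
    c        constant in [0, 1] controlling how strongly radii shrink
    """
    return (1 - c * (d_max - d_to_bs) / (d_max - d_min)) * r0
```

The farthest node keeps the full radius r0, while the node closest to the base station gets (1 - c) r0, which is what produces smaller clusters near the base station.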
4.2. Structure between CHs
The encoding method affects the efficiency with which the genetic algorithm finds the optimal value. Binary coding or real-number coding is usually selected, but because the nodes in the network are identified by integers and these integers form the chromosomes of the population, this article chooses the variable-length path coding strategy. For example, the sequence of nodes visited from node 1 to the destination node is used directly as the path code. This coding method yields paths without loops, which clearly improves the search efficiency of the genetic algorithm [23].
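A variable-length path chromosome is simply the ordered list of node IDs on a route. The following sketch shows one way to validate such a chromosome and to generate a random loop-free one for an initial population; the adjacency-dictionary representation and function names are assumptions for illustration, not the paper's implementation.

```python
import random


def is_valid_path(path, neighbors, source, sink):
    """True if path runs source -> sink over existing links with no repeats."""
    if not path or path[0] != source or path[-1] != sink:
        return False
    if len(set(path)) != len(path):          # a repeated node means a loop
        return False
    return all(b in neighbors[a] for a, b in zip(path, path[1:]))


def random_path(neighbors, source, sink, rng, max_tries=1000):
    """Random loop-free path, or None if none is found in max_tries walks."""
    for _ in range(max_tries):
        path = [source]
        while path[-1] != sink:
            options = [n for n in neighbors[path[-1]] if n not in path]
            if not options:                  # dead end: restart the walk
                break
            path.append(rng.choice(options))
        else:
            return path
    return None
```

Chromosomes produced this way may have different lengths, which is the property that gives the coding strategy its name.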
For a path from CH1 to CHn, the fitness function combines the weight of the energy consumption of the communication link, the path consumption of the path, and the busy state of the nodes on it. The busy state of a node is determined by the total energy the node has consumed in network communication, its initial energy, and its distance to the BS. The path consumption is determined by the distance from a CH to the next CH, the distance from the next CH to the BS, and the current energy of the next CH.
Crossover operation: given two paths, randomly select a node of the first path as the crossover point, and then select a node of the second path such that the segment after it, joined to the first part of the first path, forms a feasible path. After the two paths exchange the segments that follow the crossover point, two new paths are formed. In the mutation operation, a node in the path is selected as the mutation point according to its energy; the path from the initial node to the mutation point does not change, and the part after it is replaced, subject to a node energy threshold, to form a new path. The specific process is shown in Figure 5.
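The crossover step can be illustrated for the common case in which both parent paths pass through the same intermediate node. The sketch below swaps the tails after a shared node and keeps a child only if it is loop-free; this shared-node rule is a widely used variant for path-coded genetic algorithms and is assumed here, since the exact crossover condition of the paper is not restated.

```python
import random


def crossover(p1, p2, rng):
    """One-point crossover of two paths at a shared intermediate node.

    Returns two children; a child that would contain a loop is replaced
    by its unchanged parent. Parents without a shared intermediate node
    are returned as they are.
    """
    common = [n for n in p1[1:-1] if n in p2[1:-1]]
    if not common:
        return p1, p2
    node = rng.choice(common)
    i, j = p1.index(node), p2.index(node)
    c1 = p1[:i] + p2[j:]                     # head of p1 + tail of p2
    c2 = p2[:j] + p1[i:]                     # head of p2 + tail of p1

    def loop_free(path):
        return len(set(path)) == len(path)

    return (c1 if loop_free(c1) else p1, c2 if loop_free(c2) else p2)
```

Because both parents start at the same source and end at the same sink, every child produced this way is again a complete route over existing links.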
4.3. Combination of ILEACH and EEUC-IGA
The proposed protocol, the Energy Balance Multihop Clustering Routing Protocol for Large-Scale Water Quality Monitoring (BEBMCR), is as follows.
We combine the advantages of the two algorithms. First, to address the overlap of cluster head broadcast areas, the cluster head election probability of the LEACH protocol is improved according to the remaining energy of the node, the density of neighbor nodes, and the location of the node; combined with the proposed competition radius, this reduces the energy consumption of nodes in the repeatedly covered areas of the network. Second, to address node death caused by energy consumption in the EEUC protocol and the problem that some nodes do not join any cluster, we optimize the genetic algorithm and adopt the path coding strategy, which improves the efficiency with which the genetic algorithm finds the optimal value and reduces the energy consumed by unnecessary network path transmission.
5. Experimental Simulation
5.1. Simulation Setup
To facilitate the creation of a GUI for verifying each algorithm, this article uses the MATLAB environment to implement and simulate the routing protocols and to compare LEACH, ILEACH, EEUC, and BEBMCR. Table 1 shows the environmental parameter values used in the simulation. The comparison covers the number of surviving nodes in the entire network, the network data transmission volume, and the network energy consumption.
The simulation network model is shown in Figure 6. The CHs use multihop data transmission. The graphical view shows the clusters formed in each round and the routing topology of the network, in which the network is divided into clusters. Each cluster head is represented by a red node, and the area around it is its cluster. The whole network has one BS, which collects data from all other nodes. A red line indicates a CH node sending data to the BS, and a green line indicates a transmission path between CH nodes. Together, the red and green lines represent the multihop data transmission from the CHs to the BS.
Figure 6: Simulation network model. (a) EEUC routing protocol. (b) Proposed BEBMCR routing protocol.
Figure 6(a) shows the simulation iteration process of the EEUC routing protocol. The competition range of a node decreases as its distance from the base station decreases. As a result, the closer a cluster is to the base station, the smaller it is and the less energy it consumes. Data processing within the cluster consumes less energy, which leaves more energy for intercluster communication. However, in each round of data transmission in EEUC routing, if the farthest or closest distance to the BS changes, all nodes need to recalculate their competition radius and inform the other nodes in the network. It is therefore difficult to avoid situations in which some nodes do not join any cluster or some cluster heads do not communicate. The second and fourth parts of Figure 6(a) show this problem clearly.
Figure 6(b) shows the simulation iteration process of the proposed BEBMCR routing protocol. The proposed algorithm combines ILEACH and EEUC and adopts the ILEACH competition mechanism when clustering, thereby reducing the energy consumption of the cluster heads close to the base station. The purpose of IGA is to optimize the multihop communication between EEUC clusters and the base station, thereby alleviating the “hot spot” problem in the algorithm.
This paper also compares the classic ACO path search algorithm, the routing protocol based on the traditional genetic algorithm (GA), and the improved genetic algorithm (improved GA) in terms of the time needed to calculate the optimal path [23, 24]. As shown in Figure 7, when the number of surviving water quality nodes is relatively small, all three algorithms find the best path quickly. As the number of water quality nodes increases, the ACO algorithm needs to search paths one by one according to the path cost, while the genetic algorithm finds the best path by exploiting its efficiency and parallelism. Therefore, compared with the ACO algorithm and the GA algorithm, the improved GA algorithm takes much less time to find the optimal path, which reduces the delay of the algorithm. Because the path coding strategy proposed in this paper makes crossover and mutation operations easier to perform and correspondingly reduces the calculation time, the improved GA algorithm improves the efficiency of the algorithm.
5.2. Simulation Results
Figure 8(a) shows that in LEACH, ILEACH, and EEUC, the first dead node appears within the first 300 rounds, whereas in the proposed BEBMCR it appears at about round 350. All water quality sensor nodes have failed after about 400, 500, and 700 rounds for LEACH, ILEACH, and EEUC, respectively, and after about 1000 rounds for our algorithm. The reason is that our algorithm not only considers the remaining energy of the node, the location of the node, and the density of adjacent nodes but also introduces the competition radius. Compared with the above three protocols, the network lifetime of BEBMCR has increased by 30% to 60%.
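Figure 8: Simulation results versus the number of rounds. (a) Surviving nodes. (b) Throughput. (c) Energy consumption.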
Figure 8(b) shows that the throughput of BEBMCR is much higher than that of the other protocols. With limited energy, the network can send more data under the proposed algorithm, so BEBMCR has higher energy efficiency than the other protocols. At the same time, because LEACH elects cluster heads at random, it has the smallest throughput. For example, Figure 8(b) compares the throughputs of LEACH, EEUC, ILEACH, and BEBMCR at round 400.
Figure 8(c) shows how the energy consumption of each routing protocol changes with the number of rounds. The energy consumption of the BEBMCR protocol is always lower than that of the other protocols, because BEBMCR uses IGA to optimize EEUC’s intercluster multihop communication with the base station, which alleviates the “hot spot” problem in the algorithm and provides a reasonable communication mode for the water quality sensor nodes around the sink node.
Figure 9 shows the results when the number of water quality sensors in the monitoring area and the other network conditions are kept unchanged and the network area is increased to two larger sizes. In LEACH, ILEACH, and EEUC, the first node dies within the first 100 rounds; as the network area increases, the network life of BEBMCR remains longer than that of the other three algorithms. Overall, the improvement in the network life of BEBMCR remains above 50%.
Figure 9: Network lifetime for the two enlarged network areas, panels (a) and (b).
Figure 10 shows the results when the field size of the monitoring area and the other network conditions are kept unchanged and the number of water quality sensor nodes is changed to 200 and 400, respectively. In LEACH and EEUC, the first dead node appears within the first 200 rounds; in ILEACH, it appears within the first 300 rounds. As the number of water quality sensor nodes increases, the first dead node in the BEBMCR algorithm still appears at about round 300, and the network lifetime of BEBMCR still shows an increase of about 50%. The number of rounds after which all water quality sensor nodes are dead is basically unchanged for LEACH, ILEACH, and EEUC, while our algorithm runs for more than 1000 rounds. Therefore, BEBMCR remains reliable when a large network contains many water quality sensor nodes.
Figure 10: Network lifetime with (a) 200 and (b) 400 water quality sensor nodes.
6. Conclusion
To extend the running time of the monitoring network and realize long-term online monitoring of river water quality parameters, an energy balance clustering routing algorithm (BEBMCR) for large-scale water quality monitoring is proposed. The main innovations of this article are as follows:
(1) Building on LEACH, the classic clustering routing method, the cluster head election probability is improved according to the remaining energy of the node, its location, and the density of neighboring water quality nodes.
(2) Considering the overlap rate of water quality sensors and the concept of competition radius, the overlap rate is introduced into the competition mechanism to stabilize the number of cluster heads in the network.
(3) The combination of single-hop and multihop communication makes the algorithm suitable for large-scale water quality sensor network structures.
(4) In the selection of the next hop, the optimal path is searched using the efficiency and parallelism of the genetic algorithm, and the variable-length path coding strategy is used to integrate the communication distance, the relative remaining energy, and the number of water quality nodes in the cluster, which avoids the appearance of “hot spots”.
(5) Combining the characteristics of ILEACH and EEUC-IGA, four groups of experiments were compared.
Experimental results show that the routing algorithm is well suited to large-scale network deployment and provides a theoretical basis and reference for river water quality monitoring research. In this paper, the distribution of sensor nodes is fixed and static, but in some cases the nodes need to be mobile. How to keep the routing protocol energy efficient when node positions change continuously is an important issue for future research.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest to report regarding the present study.
Authors’ Contributions
Yulin Gong (corresponding author) is responsible for the conceptualization, formal analysis, funding acquisition, methodology, and project administration; Jiannan Cao for the data curation, software, visualization, and original draft; Chengfei Han for the investigation, resources, and validation; and Yunqing Liu for the supervision, review, and editing.
Acknowledgments
This paper was supported by the Science and Technology Development Project of Jilin Province (201903030800sf).