#### Abstract

Emergency response to hazardous gases in the environment is an important research field in environmental monitoring. In recent years, with the rapid development of sensor technology and mobile device technology, more autonomous search algorithms for hazardous gas emission sources are proposed in uncertain environment, which can avoid emergency personnel from contacting hazardous gas in a short distance. Infotaxis is an autonomous search strategy without a concentration gradient, which uses scattered sensor data to track the location of the release source in turbulent environment. This paper optimizes the imbalance of exploitation and exploration in the reward function of Infotaxis algorithm and proposes a mobile strategy for the three-dimensional scene. In two-dimensional and three-dimensional scenes, the average steps of search tasks are used as the evaluation criteria to analyze the information trend algorithm combined with different reward functions and mobile strategies. The results show that the balance between the exploitation item and exploration item of the reward function proposed in this paper is better than that of the reward function in the Infotaxis algorithm, no matter in the two-dimensional scenes or in the three-dimensional scenes.

#### 1. Introduction

With the development of gas sensor technology and mobile device technology, more and more researchers are facing gas-related research. Tracking the weak information released by the releasing sources in nature and human society through the olfactory tracking device can be regarded as the optimization problem of dynamic multiparameter function. Olfactory searchers can be used in areas related to the gas releases, such as toxic gas detection and location [1], rescue and relief [2], explosives detection and location [3] air pollution source tracking [4], and fire accident [5].

According to the search mode, the release source search algorithm can be divided into three categories: concentration tropism [6, 7], wind tropism [8–10], and information tropism [11]. According to the number of robots, it can be divided into multirobot cooperative search method [12, 13]and single-robot search method. Common multirobot collaboration algorithms include grey wolf algorithm [14], particle swarm optimization algorithm [15], bee colony algorithm [16], and so on. According to the search area, it can be divided into two-dimensional scene search algorithm [17] and three-dimensional scene search algorithm [18–20].

Concentration tropism algorithm and wind tropism algorithm are early olfactory localization algorithms inspired by biological behaviour [21–22], which have a long search path according to the evolution of predation and mate seeking behaviour. In recent years, there are also release source tracking algorithms that combine bioheuristic algorithm with vision [23, 24]. However, concentration tendency algorithms, such as the blue crab algorithm [22], have a patch and discontinuous particle distribution of the release source in a turbulent environment with a high Reynolds number, and the concentration gradient tracked by the mobile searcher is not reliable. Due to the complexity of plumes, it is important to collect as much information as possible. Wind tropism algorithms [8] are mostly applied to expensive commercial anemometers. Information tropism refers to an autonomous search algorithm based on information gain. This algorithm was first proposed by Vergassola et al. [11] in 2007, which provides a new way to solve the tracking problem of release sources. Compared with other kinds of autonomous search algorithms, when the concentration information is sporadic and sparse, even if the threshold value of the sensor carried by the searcher cannot be reached, the mobile searcher can well complete the task of autonomous search. Since then, more and more research studies have been conducted on information orientation algorithm [25, 26].

Masson [27] proposed a search algorithm in the two-dmensional limited environment perception scene, which uses free energy as the information acquisition metric of mobile tracking devices. Rodriguez et al. [28] applied the Infotaxis algorithm to finite two-dimensional lattices with different geometric shapes. In 2014, Ristic et al. [29] proposed an autonomous search algorithm in the two-dimensional scene with obstacles, which is called enfotaxis. Ristic et al. [30] used Rényi divergence as information acquisition to measure the autonomous search method based on sequential Monte Carlo combination in the two-dimensional scene and analyzed it in combination with various reward functions. Hutchinson et al. [31] proposed an information destination algorithm based on Kullback–Leibler divergence as a measure of information acquisition and simulated it in real environment. The results show that the algorithm has good search performance in real environment. Rodríguez et al. [32] proposed a blind information algorithm in the two-dimensional scene and compared the search success rate with the traditional Infotaxis algorithm under a variety of fuzzy environment conditions. Ruddick et al. [33] extended the Infotaxis algorithm to three-dimensional space for application and applied the algorithm to the actual environment for verification. In 2019, Song et al. [34] proposed the minimum free energy search method based on the combination of entropy and potential energy in Monte Carlo. Hai-Feng et al. [35] proposed an underwater chemical plume tracking system, which used the combination of partially observable Markov decision algorithm and artificial potential field algorithm to construct the underwater source probability map. Park and Oh [13] conducted a large number of simulation analyses by applying a sequential Monte Carlo Infotaxis algorithm with particle filter and combined them with a variety of reward functions in a two-dimensional scene by using a multirobot.

Although the literatures mentioned above have improved the Infotaxis algorithm to some extent, they are only improved in a single scene (two-dimensional or three-dimensional scenes), and there is no comparative analysis on the universality of different decision functions and mobile strategies in two-dimensional and three-dimensional scenes. Therefore, this paper proposes a decision function for Infotaxis algorithm and analyzes the search performance of the new modified Infotaxis algorithms in two-dimensional and three-dimensional scenes using different mobile strategies as well as proposes a new mobile strategy for three-dimensional search scenes.

#### 2. Infotaxis Algorithm

Infotaxis algorithm can be divided into three steps: information state, reward functions, and mobile strategy. The search area is gridded. The algorithm initially assumes that each square has the same probability of releasing the source. The mobile searcher evaluates the value of each alternative direction reward function in the mobile strategy and chooses the direction of the maximum information entropy drop as direction, as shown in Figure 1. The mobile searcher divides and updates the probability distribution of each square with each step forward. The darker the square is, the higher the probability that the square is the source of the release. With the decrease of information entropy, the mobile searcher will get closer to the release source.

##### 2.1. Gas Diffusion Model

The premise of the Infotaxis algorithm is that the mobile searcher is able to predict the concentration of particles released by the emission source around the location, which is to predict the concentration of particles released by the emission source everywhere in the space through the gas diffusion model. General gas diffusion model satisfies the following equation:where represents the average wind speed in unit ; refers to the effective isotropic diffusion coefficient in the turbulence model, in unit ; refers to the release rate of the release source; refers to the average life of particles released by the releasing source, in unit ; is Kronecker delta; and refers to the particle concentration at *r* if the release source is at *r*_{s}. After is processed, the two-dimensional turbulent diffusion model is transformed into equation (2), and the three-dimensional turbulent diffusion model is transformed into equation (3).where is the Bessel function with zero order modification; and is the linear distance between the current position of the mobile searcher and the released source. The diffusion model is the expression of the classical turbulent diffusion model in two-dimensional space. The simulation diagram is shown in Figure 2. The coordinate of the release source with the release intensity of is, the wind with the average velocity of blows along the positive direction of axis, the effective diffusion coefficient of isotropy is , and the life of a particle is .

The diffusion model is the expression of the classical turbulent diffusion model in three-dimensional space, reflecting the relationship between the particle concentration of the releasing source and some variables in three-dimensional space. Figure 3 shows the three-dimensional simulation section diagram of the model, and the black arrow represents the wind direction. The simulation range is a three-dimensional space of , which is divided into cubes.The coordinate of the release source is ; other parameters are the same as Figure 2.

##### 2.2. Information Status

The information state is expressed by a probability distribution when determining the location of the release source, and refers to the location of the leakage source. refers to the collection of all measured values of the mobile tracking equipment from the beginning to the time . Using Bayesian framework to update posterior probability distribution, the updating process is shown in the following equation:where is the normalization constant. Because the events of particles released by release source detected and particles released by release source not detected are independent and obey Poisson distribution, the number of plume detection between and time is expressed by , where . is the time interval of detection. is given by the Smoluchowski’s arguments to describe the probability that the mobile searcher with radius of meter detects the particles released by the release source (equation (5)). The expressions of this equation in two-dimensional and three-dimensional environments are, respectively, equations (6) and (7). In particular, in a two-dimensional environment, the value of must be larger than that of .

##### 2.3. Reward Function

Reward function is a criterion used to evaluate information measurement, which determines how the mobile searcher moves. The two traditional decision functions and the reward function proposed in this paper are compared.

###### 2.3.1. Reward Function of Infotaxis

Information entropy can be used to describe the uncertainty of the location of the released source. The higher the uncertainty, the higher the entropy, which is the core idea of the Infotaxis algorithm. The algorithm updates the probability map of release source based on Bayesian equation and calculates the change of information entropy according to the probability to determine the moving direction of the mobile searcher. The expression of information entropy at time is described in the following equation:

The information entropy of the alternative moving position at the next moment is predicted, and the direction in which the information entropy decreases the fastest is chosen as the starting position of the next moving position. The subtraction equation of information entropy is shown in the following equation:where refers to the probability of finding the release source on the probability map; refers to the probability sum of not finding the release source on the probability map; and refers to the decrease of information entropy when the release source is found. When the release source is found in an ideal situation, the information entropy is 0. refers to the decrease of information entropy without finding the release source.

###### 2.3.2. Reward Function of InfotaxisII

When the searcher performs the search task, in the exploitation item of equation (9) is meaningful only when the release source matches the node in the searched area [30]. Therefore, the first item in equation (9) is discarded directly, making . Get the decision function:

###### 2.3.3. Reward Function of Sinfotaxis

As shown in equation (11), is the distance between the location of the current searcher and the assumed release source . is the threshold value that considers the distance to find the release source. In Infotaxis algorithm, the first item of reward function is exploitation, and the second item is exploration. The role of exploitation is to determine the location of the source by using the information obtained so that the explorer is more inclined to move in the direction of the source. The role of exploration is to explore more location areas for the searcher to increase the certainty of the location of the source. Equation (11) increases the role of exploitation, limits the exploration of the searcher, reduces the unnecessary exploration of the unknown area by the searcher, increases the guidance role of the source, and makes the searcher to explore in the direction of releasing the source.

##### 2.4. Mobile Strategy

When the mobile searcher judges each alternative moving direction through the reward function, the next moving direction is selected according to the evaluation result. The set of alternative directions of movement is called mobile strategy. For two-dimensional search, there are at least four alternative directions when the mobile searcher is at ; for three-dimensional search, there are at least six alternative directions when the mobile searcher is at . The reasonable choice of movement mode can effectively reduce the length of the search path of the mobile searcher so that the mobile searcher can locate the release source more quickly. Therefore, this paper will expand the mobile orientation in two-dimensional and three-dimensional scenes and discuss the performance of several Infotaxis algorithms after the increase of mobile orientation.

#### 3. Infotaxis Algorithms for Two-Dimensional Scene

##### 3.1. Mobile Strategy in Two-Dimensional Scene

Most of the previous literatures have carried out a lot of simulation analysis, but there are not many Infotaxis algorithms with different combinations of mobile strategies and reward functions. Different reward functions and different mobile strategies are combined to form the new Infotaxis algorithms, which will produce different search performances. Therefore, in the two-dimensional scene, this paper considers two groups of acceptable mobile strategies, as shown in Figure 4. The location of the mobile search device is marked by the UAV icon, and the black arrow points to the alternative direction of movement of the mobile strategy.

**(a)**

**(b)**

##### 3.2. Simulation Analysis of Infotaxis Algorithm in Two-Dimensional Scene

Let , and is the size of fine granularity of simulation environment, where cannot be negative; when , Sinfotaxis is equal to Infotaxis. In the two-dimensional scene, the distance threshold of Sinfotaxis is . Two mobile strategies mentioned in Figure 4 and three reward functions mentioned above are, respectively, simulated and analyzed. The six combined Infotaxis algorithms are as follows: Figure 5 shows the simulation diagram of traditional Infotaxis adopting the mobile mode of Figure 4(a), and Figure 6 shows the simulation diagram of traditional Infotaxis adopting the mobile mode of Figure 4(b). Figure 7 shows the simulation diagram of InfotaxisII adopting the mobile mode of Figure 4(a), and Figure 8 shows the simulation diagram of InfotaxisII adopting the mobile mode of Figure 4(b). Figure 9 shows the simulation diagram of Sinfotaxis adopting the mobile mode of Figure 4(a) when , and Figure 10 shows the simulation diagram of Sinfotaxis adopting the mobile mode of Figure 4(b) when .

The simulation environment is a two-dimensional region of , and the fine granularity of simulation is . When the linear distance between the mobile searcher and the release source is no more than , the mobile search device is considered to have located the leak source. In the simulation diagram, the black star is the location of the release source, with the coordinates of ; the black dot is the starting location of the mobile search device, with the coordinates of ; the white line is the search path of the mobile search device, and the red triangle represents the particle information released by the release source detected by the mobile search device at this location. The wind direction is positive along the -axis, and the threshold of concentration detected by the mobile search device is . The wind speed is . The existence time of the particle is . The radius of the searcher is . The intensity of the release source is , and the diffusion coefficient is .

The mobile searcher initially estimated the location information estimation map of the release source which satisfies the turbulence diffusion model. As can be seen from Figures 5–10, with the increase of search steps, the mobile searcher judges the source location information around the path through the searched path in the past. With the search continuing, the mobile searcher continuously estimates the release source location accurately and finally determines the location of the leak source. For mobile search equipment, the information around the searched path is mostly known, and the probability distribution of the release source in the unsearched area also satisfies the turbulence diffusion model to some extent.

Figure 11 shows the path process of six Infotaxis algorithms searching the release source in the same experimental environment. It can be seen from Figure 11 that all six Infotaxis algorithms can successfully locate the location of the release source. The total search steps of the six Infotaxis algorithms are 170 (Infotaxis_4),161 (Infotaxis_8), 157 (InfotaxisII_4), 116 (InfotaxisII_8), 174 (Sinfotaxis_4), and 77 (Sinfotaxis_8). It can be seen from Figure 11 that the traditional Infotaxis under the two mobile strategies and the InfotaxisII algorithm based on the four alternative mobile strategies are easy to make the search fall into local extremum, that is, the mobile search device will carry out multiple explorations in a small area. On the whole, the movement mode of the eight alternative positions in the move strategy has more advantages in the process of releasing source search. Because wind is complex and changeable in real environment, this advantage is more obvious when applying the Infotaxis based on eight alternative orientation move strategies.

Figure 12 shows the variation trend of entropy value of releasing source position during the search process of the six Infotaxis algorithms. When the steps of the six Infotaxis algorithms are less than 50, the degree of entropy decrease and the trend of entropy decrease are basically the same. When the six Infotaxis algorithms locate the released source, the final entropy is 5.6246 (Infotaxis_4), 5.3805 (Infotaxis_8), 5.5052 (InfotaxistII_4), 6.0027 (InfotaxisII_8),6.1376 (Sinfotaxis_4), and 6.1911 (Sinfotaxis_8). Figure 10 shows that the entropy value is not the minimum value of the entropy in the search process, when they locate the release source. So it is not advisable to take the entropy value as the search task stop condition in the two-dimensional scene. It can also be seen from Figure 12 that when the mobile searcher predicts and detects the particles released by the release source, the change of entropy will fluctuate; especially when the Infotaxis algorithm falls into a local extreme value, the entropy changing curve will fluctuate greatly. To further compare these six Infotaxis algorithms, this paper randomly selects different starting points and parameters for 100 times of simulation and evaluates them through average steps. As shown in Table 1, it can be seen that Sinfotaxis_8 has good search performance in two-dimensional search.

#### 4. Infotaxis Algorithms for Three-Dimensional Scene

##### 4.1. Mobile Strategy in Three-Dimensional Scene

When analyzing the performance of the Infotaxis algorithm in the three-dimensional scene, this paper proposed three kinds of mobile strategies combined with three kinds of reward functions. The cube of the three mobile strategies is shown in Figure 13, where the mobile searcher is located and marked with UAV icon. The cube marked in blue represents where the mobile searcher is likely to move at the next moment. Three kinds of mobile strategies are proposed according to selecting the probability range of when is different. When, the area is as shown in Figure 13(a), with 6 alternative moving directions. When, the area is as shown in Figure 13(b), with 14 alternative moving directions. And when, the area is as shown in Figure 13(c), with 26 alternative moving directions.

**(a)**

**(b)**

**(c)**

##### 4.2. Simulation Analysis of Infotaxis Algorithm in Three-Dimensional Scene

In this paper, three thresholds were proposed for the Sinfotaxis algorithm in the three-dimensional scene: , , and , named Sinfotaxis, SinfotaxisII, and SinfotaxisIII, respectively, which are combined with the Infotaxis and InfotaxisII proposed before to form five reward functions totally. The five reward functions are combined with the three mobile strategies in Figure 13 to form fifteen Infotaxis algorithms.

The mobile searcher used in the three-dimensional scene reference is usually a small UAV (without considering the rotor’s disturbance to the air flow). The radius of this UAV is generally . Considering that the search range of the three-dimensional scene in real scene is wider than the two-dimensional scene and the required fine granularity is not high, the scope of simulation space is set as three-dimensional scene of with fine granularity of . When the straight-line distance between the mobile search device and the release source is not greater than , it is considered that the mobile search device has located the leak source. The wind direction is positive along the -axis, the threshold value of concentration detected by the mobile search device is , the wind speed is , the particle existence time is , the release source intensity is , and the diffusion coefficient is . The simulation roadmap is shown in Figure 14. The location of the release source is represented by a black star, and the black dot represents the starting point of the mobile searcher. The total number of search steps of the 15 information orientation algorithms is shown in Table 2.

Fifteen Infotaxis algorithms have successfully completed the search task. As can be seen from Figure 14, when the traditional Infotaxis is adopted in the three-dimensional scene, the mobile strategy with 14 alternative directions is relatively best. It has the minimum number of search steps, but there is no phenomenon of falling into local extreme values in the two-dimensional scene. Combined with Table 2, it can be concluded that in this simulation condition, the worst performance of the three mobile strategies in the three-dimensional scene is the mobile strategy (Figure 13(b)) with 6 alternative mobile directions. In terms of reward function, no matter which kind of mobile strategy is adopted, the performance of Sinfotaxis algorithm in the three-dimensional scene is the best among fifteen Infotaxis algorithms, which shows that when threshold value is , the exploitation and exploration in reward function reach a good balance. In contrast, InfotaxisII algorithm with six alternative moving directions is not suitable for tracking release source tracking events in the three-dimensional scene.

Figure 15 shows the entropy changing curve of fifteen Infotaxis algorithms when performing search tasks in the three-dimensional scene. The ratio of the value of equation (7) in the three-dimensional scene to that of equation (6) in the two-dimensional scene is too small. Therefore, when using the Poisson distribution, the number of times the mobile searcher meets the particles released by the release source between its own position and the next position is basically zero, which leads to the change of the entropy value of the Infotaxis algorithm in the three-dimensional scene which is much more stable than that in the two-dimensional scene. In order to obtain more objective data, the fifteen Infotaxis algorithms proposed in this paper were simulated for 50 times when the starting point and parameters of the search were changed, and the average number of search steps in Table 3 was obtained. The results show that SinfotaxisIII_14 has fewer steps to locate the release source than other Infotaxis algorithms.

On the whole, the search path of the mobile strategy with 26 alternative directions is inclined to explore too many unknown areas, and it is not suitable for the event of three-dimensional scene tracking release source. The reason for this phenomenon is that the straight-line distance from each alternative direction to the mobile searcher is different in the mobile strategy with 26 alternative directions, which will cause a certain error. Although this error exists in the movement strategy with fourteen alternative directions, it is not enough to affect the performance of the Infotaxis algorithm.

#### 5. Conclusions

In this paper, a variety of reward functions and mobile strategies for Infotaxis algorithms were discussed in two-dimensional and three-dimensional scenes. The simulation results show that Sinfotaxis_8 algorithm with 8 alternative movement directions has good search performance in the two-dimensional scene, and Sinfotaxis_14 algorithm with 14 alternative movement directions has good search performance in the three-dimensional scene. The results show that the reward function of the proposed Sinfotaxis has good performance compared with other Infotaxis algorithms in both two-dimensional and three-dimensional scenes, indicating that the decision function of Sinfotaxis can reach the balance of exploitation and exploration to a certain extent so that the mobile searcher can locate the release source through fewer steps.

In the process of research, some problems of Infotaxis algorithms are also found. In the context of grid search space, when the mobile searcher adopts mobile strategies with more alternative mobile directions, not all the straight-line distances from alternative mobile directions to mobile tracking devices are exactly the same, and the existence of this difference will cause a difference in the prediction of the reward function of mobile tracking devices in the next step. When applying Infotaxis algorithm in the three-dimensional scene, Poisson distribution is used to estimate the number of particles released by the detected release source, which is not suitable for the case when the release rate of the release source is too small. These are some problems worth solving in future work.

#### Data Availability

The data used to support the findings of this study are included within the article.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Authors’ Contributions

S.F. was responsible for funding acquisition, resources, and conceptualization. D.H. contributed to methodology. D.H. and X.S. contributed to original draft preparation. M.S.Y. and Z.L. reviewed and edited the manuscript. K.X. validated the study.

#### Acknowledgments

This research was funded by the Key Research and Development Project from Hebei Province, China (grant nos. 19210404D and 20351802D), Key Research Project of Science and Technology from the Ministry of Education of Hebei Province, China (grant no. ZD2019010), and Project of Industry-University Cooperative Education of Ministry of Education of China (grant no. 201801335014).