Abstract
In multiagent systems, tracking multiple targets is challenging for two reasons: first, it is nontrivial to dynamically deploy networked agents of different types for utility optimization; second, information fusion for multitarget tracking is difficult in the presence of uncertainties such as data association, noise, and clutter. In this paper, we present a novel distributed control approach for multitarget tracking. The control problem is modelled as a partially observed Markov decision process (POMDP); solving it by searching all possible combinations of control commands is an NP-hard combinatorial optimization problem. To solve this problem efficiently, we assume that the measurement of each agent is independent of other agents’ behavior and provide a suboptimal multiagent control solution by maximizing the local Rényi divergence. In addition, we provide the sequential Monte Carlo (SMC) implementation of the sequential multi-Bernoulli filter so that each agent can utilize the measurements from neighbouring agents to perform information fusion for accurate multitarget tracking. Numerical studies validate the effectiveness and efficiency of our multiagent control approach for multitarget tracking.
1. Introduction
Advances in microelectromechanical systems have significantly boosted the development of multiagent systems in the past two decades. Low-cost agents, for example, robots, unmanned vehicles, or autonomous platforms, with high mobility, various sensing types, and powerful communication ability, are capable of different tasks in complex environments, for example, environmental monitoring, target localization and tracking, and event recognition [1, 2]. Multiagent coordinated control, as a fundamental problem in multiagent systems, has recently received increasing research interest for utility optimization [3–5]. An area that benefits greatly from multiagent systems is target tracking [6]. However, little progress has been made in this direction, since it is an extremely challenging problem in two aspects: multiagent control and information fusion for multitarget tracking.
The multiagent control problem is, in essence, a decision-making problem: fulfilling a given task in an optimal or suboptimal way. In this paper, we study the multiagent control problem specifically for multitarget tracking, which includes accurate estimation of both the number of targets and the location of each target. In the literature, researchers and practitioners have done extensive work on the agent control problem. Olfati-Saber et al. [6] established a consensus-based Kalman filter for distributed single-target tracking. In [7], multiagent control is performed by path following of a virtual leader. Olfati-Saber [8] provided flocking algorithms in both theory and applications to handle large numbers of agents. Unfortunately, none of these approaches is well suited to the multitarget tracking problem.
Recently, the random finite set (RFS) based Bayesian framework has opened doors for multisensor multiobject systems and provides elegant mathematical tools to address the multitarget tracking problem in multiagent systems [9]. The probability hypothesis density (PHD) [10], cardinalized PHD (CPHD) [11], and multi-Bernoulli filters [12] have been developed as approximations under different posterior assumptions. Gaussian mixture and particle implementations of these filters have been shown to be effective in different tracking applications [13–17]. Using tools from finite set statistics (FISST), agent control can be posed as a partially observed Markov decision process (POMDP) [18], which has been shown to be effective in the single-agent case in recent works [19, 20].
In this paper, we extend the work in [19, 20] to a more general problem: multiagent control for multitarget tracking, which is far more difficult than the single-agent control case. We first model the multiagent coordinated control problem as a one-step look-ahead POMDP, since multiple-step look-ahead is computationally intractable, and show that it is an NP-hard combinatorial optimization problem when searching all possible combinations of admissible control commands. Hence, we propose a suboptimal solution under the assumption that the measurements of each agent are independent of other agents’ behavior. The multiagent coordinated control is decoupled into distributed control of each agent by maximizing the local Rényi divergence between the prior and posterior multitarget probability densities. In addition, we present the sequential Monte Carlo (SMC) implementation of the sequential multi-Bernoulli filter for each agent to utilize measurements from neighbouring agents. Numerical simulations demonstrate the effectiveness and efficiency of our approach.
The remainder of this paper is organized as follows. Section 2 presents some preliminary knowledge of RFS based Bayesian framework to lay a foundation for the rest of this paper. In Section 3, we illustrate the distributed agent control approach and present its implementation in detail. Section 4 provides the information fusion scheme for multitarget tracking for each agent to utilize measurements from other agents. Section 5 provides numerical results that verify the proposed agent control and multitarget tracking approach.
2. Preliminary Knowledge of RFS Based Bayesian Framework
This section provides the basic concepts and notation of the RFS based Bayesian framework. Section 2.1 gives a general description of RFSs and of how to model multiple targets by a multi-Bernoulli RFS. RFS based Bayesian filtering is then presented in Section 2.2.
2.1. Multi-Bernoulli RFS
A random finite set (RFS) is a random variable that takes values as unordered finite sets. The randomness of an RFS has two aspects: the set cardinality (the number of elements in the set) is random, and each element in the set is also a random variable. The probabilistic description of RFSs has been studied for various types of probability distributions such as the multi-Bernoulli (or Bernoulli) RFS, the independent identically distributed (IID) cluster RFS, and the Poisson RFS [21]. Here, we introduce the multi-Bernoulli RFS for multitarget state modelling, which offers a better alternative than the Poisson RFS and IID cluster RFS in applications with highly nonlinear models and/or nonhomogeneous sensor types [12].
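Assume the dimension of the target state is $n_x$; then, the target state space is denoted by $\mathcal{X} \subseteq \mathbb{R}^{n_x}$. A multi-Bernoulli RFS $X$ on $\mathcal{X}$ is a union of a fixed number $M$ of independent Bernoulli RFSs $X^{(i)}$ with existence probability $r^{(i)} \in (0, 1)$ and probability density $p^{(i)}$ (defined on $\mathcal{X}$), $i = 1, \ldots, M$; that is, $X = \bigcup_{i=1}^{M} X^{(i)}$.
Using a Bernoulli set to model a single target, the multitarget state $X = \{x_1, \ldots, x_n\}$ can be modelled as a multi-Bernoulli RFS with probability density given in [12] as follows:
$$\pi(X) = \prod_{j=1}^{M}\left(1 - r^{(j)}\right) \sum_{1 \le i_1 \ne \cdots \ne i_n \le M} \; \prod_{j=1}^{n} \frac{r^{(i_j)}\, p^{(i_j)}(x_j)}{1 - r^{(i_j)}} \tag{1}$$
where $r^{(i)}$ and $p^{(i)}$, respectively, represent the existence probability and distribution of the $i$th target and $n \le M$. It is clear that the multitarget density can be completely specified by the multi-Bernoulli parameter set $\{(r^{(i)}, p^{(i)})\}_{i=1}^{M}$. Hence, in the following, we denote the multitarget density at time $k$ by $\pi_k = \{(r_k^{(i)}, p_k^{(i)})\}_{i=1}^{M_k}$ for short.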
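To make the multi-Bernoulli model concrete, the following minimal sketch computes the cardinality distribution implied by a list of existence probabilities and draws one realization of the set. The function names and the toy samplers are illustrative assumptions and are not part of the original work.

```python
import random


def cardinality_pmf(existence_probs):
    """Return [P(|X| = 0), ..., P(|X| = M)] for a multi-Bernoulli RFS.

    Each Bernoulli component exists independently, so the cardinality is a
    sum of independent Bernoulli variables (built here by convolution).
    """
    pmf = [1.0]
    for r in existence_probs:
        nxt = [0.0] * (len(pmf) + 1)
        for n, prob in enumerate(pmf):
            nxt[n] += prob * (1.0 - r)   # this component is absent
            nxt[n + 1] += prob * r       # this component is present
        pmf = nxt
    return pmf


def sample_multi_bernoulli(existence_probs, samplers, rng):
    """Draw one realization: component i contributes one state with probability r_i."""
    return [draw(rng) for r, draw in zip(existence_probs, samplers)
            if rng.random() < r]


# Hypothetical example: two components with 1-D Gaussian state densities.
rng = random.Random(0)
r = [0.9, 0.3]
samplers = [lambda g: g.gauss(0.0, 1.0), lambda g: g.gauss(5.0, 1.0)]
realization = sample_multi_bernoulli(r, samplers, rng)
pmf = cardinality_pmf(r)
```

The expected cardinality equals the sum of the existence probabilities, which is why the Bernoulli parameters alone suffice to estimate the number of targets.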
2.2. Multisource Multiobject Bayesian Framework
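Stochastic filtering in the Bayesian framework has been developed over decades [22]. Under the assumption of a linear model and Gaussian distributions, the Kalman filter was derived [23] and has been widely used for tracking since then. To extend the standard Bayesian framework to a multisource multitarget version, we need the help of RFS modelling. Let $X_k$ and $Z_k$ denote the state set and observation set, respectively, as follows:
$$X_k = \{x_{k,1}, \ldots, x_{k,n_k}\}, \qquad Z_k = \{Z_k^{(1)}, \ldots, Z_k^{(N_s)}\} \tag{2}$$
where $N_s$ is the total number of agents (we treat each agent as a single sensor) and $Z_k^{(s)} = \{z_{k,1}^{(s)}, \ldots, z_{k,m_k^{(s)}}^{(s)}\}$ for $s = 1, \ldots, N_s$. $n_k$ is the time-varying cardinality of targets, while $m_k^{(s)}$ is the cardinality of the measurement set generated by agent $s$.
Using the RFS representation, the movement of multiple objects can be described using two parts: an RFS $S_{k|k-1}(X_{k-1})$ for targets surviving from the previous time step $k-1$ and an RFS $\Gamma_k$ for spontaneous birth targets at the current time $k$. Thus, at time $k$, we have the predicted RFS $X_{k|k-1} = S_{k|k-1}(X_{k-1}) \cup \Gamma_k$. The RFS for measurements of agent $s$ can be represented as a union of two parts: target-generated measurements $\Theta_k^{(s)}(X_k)$ and clutter $K_k^{(s)}$; thus, $Z_k^{(s)} = \Theta_k^{(s)}(X_k) \cup K_k^{(s)}$.
Given the specific type of RFS, the Bayesian framework for optimal estimation via RFS, which has the same form as classical Bayesian filtering, is given as follows:
$$\pi_{k|k-1}(X_k \mid Z_{1:k-1}) = \int f_{k|k-1}(X_k \mid X)\, \pi_{k-1}(X \mid Z_{1:k-1})\, \delta X,$$
$$\pi_k(X_k \mid Z_{1:k}) = \frac{g_k(Z_k \mid X_k)\, \pi_{k|k-1}(X_k \mid Z_{1:k-1})}{\int g_k(Z_k \mid X)\, \pi_{k|k-1}(X \mid Z_{1:k-1})\, \delta X} \tag{3}$$
which represent the prediction and update process of the Bayesian recursion via set integrals, respectively; here $f_{k|k-1}$ is the multitarget transition density and $g_k$ is the multitarget likelihood function. Under different assumptions on the RFS type, the PHD, CPHD, and multi-Bernoulli filters have been derived from (3) by finite set statistics [9]. Notice that the integrals in (3) are FISST set integrals (see [9–11] for details).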
3. POMDP Based Distributed Multiagent Control Approach
In this section, we first formulate multiagent control in the framework of a POMDP in Section 3.1. Section 3.2 adopts the expected Rényi divergence between the prior and posterior multitarget densities as the objective function to be maximized by the control scheme. By assuming that the measurement of each agent is independent of other agents’ behavior, we propose a distributed agent control approach that maximizes the local Rényi divergence in Section 3.3.
3.1. POMDP Based Multiagent Control
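We begin with the notation for using a POMDP as the solution for multiagent control. At time $k$, denote the control command of agent $s$ by $u_k^{(s)} \in U_k^{(s)}$, where $U_k^{(s)}$ is the set of all admissible control commands for agent $s$. Let $\mathbf{u}_k = (u_k^{(1)}, \ldots, u_k^{(N_s)})$ denote the multiagent control command, and let $\mathbf{U}_k$ be the set of all possible control command combinations. Then, $\mathbf{U}_k = U_k^{(1)} \times \cdots \times U_k^{(N_s)}$ for $s = 1, \ldots, N_s$, and $N_s$ is the total number of agents.
Define $\mathcal{R}(\mathbf{u}_k, \pi_{k-1}, Z_k(\mathbf{u}_k))$ as the objective function dependent on the multiagent control command $\mathbf{u}_k$, the multitarget density $\pi_{k-1}$, and the associated measurement set $Z_k(\mathbf{u}_k)$ obtained when control command $\mathbf{u}_k$ is applied. The aim of multiagent control is to find the optimal multiagent control command $\mathbf{u}_k^{*}$ by maximizing (or, for a cost-type objective, minimizing) the statistical expectation of the predefined objective function as
$$\mathbf{u}_k^{*} = \arg\max_{\mathbf{u} \in \mathbf{U}_k} \mathbb{E}\left[\mathcal{R}\left(\mathbf{u},\, \pi_{k-1}(X_{k-1} \mid Z_{1:k-1}, \mathbf{u}_{1:k-1}),\, Z_k(\mathbf{u})\right)\right] \tag{4}$$
where $\pi_{k-1}(X_{k-1} \mid Z_{1:k-1}, \mathbf{u}_{1:k-1})$ represents the multitarget posterior density after applying a sequence of multiagent control commands $\mathbf{u}_{1:k-1}$. Note that the general formulation of a POMDP is an $H$-step future decision process, whose computational cost grows exponentially with the number of future steps. In this paper, we only consider the one-step future decision described by (4) as an approximation.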
3.2. Global Objective Function
As shown in (4), the objective function plays a crucial role in the POMDP based multiagent control problem. Information-theoretic measures are typical objective functions for sensor management. Here, we propose maximizing the information gain between the multitarget prior and posterior densities as the objective function for tracking. The Rényi divergence, also known as the alpha divergence, measures the information gain between any two probability densities. The objective function for multitarget tracking is given as follows:
$$\mathcal{R}(\mathbf{u}_k) = \frac{1}{\alpha - 1} \log \int \left[\pi_k(X_k \mid Z_{1:k-1}, Z_k(\mathbf{u}_k))\right]^{\alpha} \left[\pi_{k|k-1}(X_k \mid Z_{1:k-1})\right]^{1-\alpha} \delta X_k \tag{5}$$
where $\alpha \ge 0$, $\alpha \ne 1$, is a parameter that determines how much we emphasize the tails of the two densities in the metric. Notice that the Rényi divergence becomes the Kullback-Leibler discrimination and the Hellinger affinity, respectively, when $\alpha \to 1$ and $\alpha = 0.5$ [24].
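As a small numerical illustration of the role of the parameter, the sketch below evaluates the Rényi divergence between two discrete distributions; it is a toy stand-in for the set-integral form used in the paper, and it assumes strictly positive probabilities. The function names and the example distributions are hypothetical.

```python
import math


def renyi_divergence(p, q, alpha):
    """Rényi (alpha) divergence between discrete distributions p and q.

    D_alpha(p || q) = 1 / (alpha - 1) * log(sum_i p_i^alpha * q_i^(1 - alpha)).
    Assumes every p_i and q_i is strictly positive and alpha != 1.
    """
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(total) / (alpha - 1.0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence, the limit of the Rényi divergence as alpha -> 1."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


posterior = [0.7, 0.2, 0.1]   # hypothetical posterior over three cells
prior = [0.4, 0.4, 0.2]       # hypothetical prior over the same cells

gain_half = renyi_divergence(posterior, prior, 0.5)     # Hellinger-type gain
gain_near_one = renyi_divergence(posterior, prior, 0.999)
gain_kl = kl_divergence(posterior, prior)
```

With `alpha = 0.5` the value equals minus twice the logarithm of the Bhattacharyya coefficient, and values of `alpha` close to 1 approach the Kullback-Leibler divergence.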
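To compute the expectation of (5), we introduce the SMC implementation of the objective function. At time $k$, assume that the multitarget predicted density is given in SMC form; that is, $\pi_{k|k-1}(X_k \mid Z_{1:k-1}) \approx \sum_{i=1}^{N} w^{(i)} \delta_{X^{(i)}}(X_k)$, where each multitarget particle $X^{(i)} = \{x_1^{(i)}, \ldots, x_{n^{(i)}}^{(i)}\}$, $n^{(i)}$ is the number of targets, $x_j^{(i)}$ for $j = 1, \ldots, n^{(i)}$ represent target positions in the state space, and $w^{(i)}$ is the weight associated with particle $X^{(i)}$. Notice that $X^{(i)}$ is a particle sampled from an RFS, which accounts for the randomness of both the cardinality and the target positions of particle $X^{(i)}$. Given the Bayesian prediction and update equations (3), we obtain
$$\mathcal{R}(\mathbf{u}_k) \approx \frac{1}{\alpha - 1} \log \frac{\sum_{i=1}^{N} w^{(i)} \left[g(Z_k(\mathbf{u}_k) \mid X^{(i)})\right]^{\alpha}}{\left[\sum_{i=1}^{N} w^{(i)}\, g(Z_k(\mathbf{u}_k) \mid X^{(i)})\right]^{\alpha}} \tag{6}$$
where the multisensor multitarget likelihood function dependent on the multiagent control $\mathbf{u}_k$ is given by
$$g(Z_k(\mathbf{u}_k) \mid X) = \prod_{s=1}^{N_s} g^{(s)}\left(Z_k^{(s)}(\mathbf{u}_k) \mid X\right) \tag{7}$$
and the single-sensor multitarget likelihood function dependent on the multiagent control is given in [9] as follows, for $X = \{x_1, \ldots, x_n\}$ and $Z = \{z_1, \ldots, z_m\}$:
$$g^{(s)}(Z \mid X) = \kappa(Z) \prod_{j=1}^{n} \left(1 - p_D^{(s)}(x_j)\right) \sum_{\theta} \prod_{j:\, \theta(j) > 0} \frac{p_D^{(s)}(x_j)\, g^{(s)}(z_{\theta(j)} \mid x_j)}{\left(1 - p_D^{(s)}(x_j)\right) \lambda\, c(z_{\theta(j)})} \tag{8}$$
where $g^{(s)}(z \mid x)$ is the standard single-sensor single-target likelihood function described by the measurement model of agent $s$; $p_D^{(s)}(x)$ is the probability of detection; $\kappa(Z)$ is the probability density of the clutter RFS and $\kappa(Z) = e^{-\lambda} \prod_{z \in Z} \lambda\, c(z)$ for a Poisson RFS; $\lambda$ is the clutter rate, while $c(z)$ is the probability distribution of clutter; $\theta$ represents all possible associations between the particle set $X$ and the measurement set $Z$; that is, $\theta: \{1, \ldots, n\} \to \{0, 1, \ldots, m\}$ such that $\theta(j) = \theta(j') > 0$ implies $j = j'$. Notice that (6) covers the multisensor multitarget case, whereas its derivation is similar to that of the single-sensor case in [19]. Thus, we omit the tedious proof here.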
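The particle form of the reward can be sketched in a few lines. The sketch assumes the usual SMC approximation, in which the reward is the scaled logarithm of the ratio between the weighted sum of the likelihoods raised to the power alpha and the weighted sum of the likelihoods, itself raised to the power alpha. The function name and the toy inputs are hypothetical, and the per-particle likelihood values are taken as given rather than computed from a measurement model.

```python
import math


def smc_renyi_reward(weights, likelihoods, alpha=0.5):
    """Particle approximation of the Rényi reward for one candidate command.

    weights     -- normalized weights of the multitarget particles
    likelihoods -- multitarget likelihood of the predicted measurement set
                   evaluated at each particle (all strictly positive)
    """
    numerator = sum(w * g ** alpha for w, g in zip(weights, likelihoods))
    denominator = sum(w * g for w, g in zip(weights, likelihoods)) ** alpha
    return math.log(numerator / denominator) / (alpha - 1.0)


weights = [0.25, 0.25, 0.25, 0.25]

# A measurement that cannot discriminate between particles carries no information.
flat_reward = smc_renyi_reward(weights, [1.0, 1.0, 1.0, 1.0])

# A measurement that favours one particle yields a positive reward.
peaked_reward = smc_renyi_reward(weights, [0.9, 0.1, 0.1, 0.1])
```

For alpha between 0 and 1 the reward is nonnegative by Jensen's inequality, so a command is preferred when its predicted measurements separate the particles more sharply.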
3.3. Distributed Agent Control
Even though multiagent control can be described as a POMDP given by (4) and (5), it is still intractable to achieve the global optimum of the defined objective function for two reasons: first, searching all admissible control command combinations is an NP-hard combinatorial optimization problem [25]; second, global optimization requires a centralized fusion center that receives information from all agents, which is unrealistic for most large-scale multiagent systems. Hence, we propose a distributed sensor control method that settles for a local optimum of the global objective function, which is computationally tractable and convenient to implement.
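Instead of computing the global information gain, we consider finding the optimal command for each individual agent. Similar to (5), the Rényi divergence for the $s$th agent is given by
$$\mathcal{R}^{(s)}(u_k^{(s)}) \approx \frac{1}{\alpha - 1} \log \frac{\sum_{i=1}^{N} w^{(i)} \left[g(Z_k(u_k^{(s)}) \mid X^{(i)})\right]^{\alpha}}{\left[\sum_{i=1}^{N} w^{(i)}\, g(Z_k(u_k^{(s)}) \mid X^{(i)})\right]^{\alpha}} \tag{9}$$
where $w^{(i)}$ and $X^{(i)}$ are the weights and multitarget particles of the predicted density. Assume that the measurement set generated by one agent is independent of other agents’ behaviour; then, the multisensor multitarget likelihood function in (9) can be written as follows:
$$g(Z_k(u_k^{(s)}) \mid X) = g^{(s)}\left(Z_k^{(s)}(u_k^{(s)}) \mid X\right) \prod_{j \ne s} g^{(j)}\left(Z_k^{(j)} \mid X\right) \tag{10}$$
where $j = 1, \ldots, N_s$. The future measurement is generated assuming no clutter and unity detection rate, as illustrated in [20]. For agent $s$, $Z_k^{(s)}(u_k^{(s)})$ is predicted based on the multitarget state and the possible control command $u_k^{(s)}$, while $Z_k^{(j)}$ for $j \ne s$ is predicted based on the multitarget state and the current location of agent $j$. Therefore, we can find the optimal control command for each individual agent and then combine them to form the total control command set, which is given as follows:
$$u_k^{(s)*} = \arg\max_{u \in U_k^{(s)}} \mathcal{R}^{(s)}(u), \qquad \mathbf{u}_k^{*} = \left(u_k^{(1)*}, \ldots, u_k^{(N_s)*}\right) \tag{11}$$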
Remark 1. Equation (9) is a particular case of (6), obtained by assuming that all agents except agent $s$ stay still at the current time. The proposed distributed control approach is a suboptimal solution of (6) that seeks its local optimum, where the local optimum means that $u_k^{(s)*}$ is optimal given that all other agents keep still.
When computing the optimal command of agent $s$, the distributed agent control given by (11) approximates the likelihood function by using the current locations of agent $j$ for $j \ne s$. Hence, we refer to (9) as the “local Rényi divergence” in this paper.
The proposed distributed control approach can significantly reduce the computational cost so that real-time agent control becomes feasible. For illustration, assume that all agents have the same number of possible control commands, denoted by $N_c$; the computational complexity of our control approach is $O(N_s N_c)$, which is much smaller than the $O(N_c^{N_s})$ required to search all control combinations, especially in large-scale multiagent systems.
Generally speaking, each agent can only communicate with its neighbouring agents and obtain their current locations as well as their measurement models $g^{(j)}$ for each neighbouring agent $j$. Here, we assume that the information received by each agent is accurate without any input saturation. The more challenging case with input saturation, described in [26, 27], is beyond the scope of this paper. The neighbouring relationship may change over time due to the relative movement of agents. Algorithm 1 provides the SMC implementation of the proposed distributed control for each agent by maximizing the local Rényi divergence.
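The difference in search effort can be illustrated with a short sketch. It assumes a per-agent reward function is available (here a hypothetical toy function) and compares the number of reward evaluations of the decoupled search with that of an exhaustive joint search; all names and numbers are illustrative.

```python
def distributed_select(command_sets, local_reward):
    """Pick, for every agent s, the command maximizing its own local reward.

    command_sets -- list with one list of admissible commands per agent
    local_reward -- callable (agent_index, command) -> float
    """
    return [max(commands, key=lambda u, s=s: local_reward(s, u))
            for s, commands in enumerate(command_sets)]


def evaluation_counts(num_agents, num_commands):
    """Reward evaluations needed: (decoupled per-agent search, joint search)."""
    return num_agents * num_commands, num_commands ** num_agents


# Hypothetical toy reward: agent s prefers the command closest to s + 1.
command_sets = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]
best = distributed_select(command_sets, lambda s, u: -abs(u - (s + 1)))

# Three agents with 17 candidate commands each (illustrative numbers only).
decoupled, joint = evaluation_counts(3, 17)
```

The decoupled count grows linearly with the number of agents, whereas the joint count grows exponentially, which is the gap described in the paragraph above.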

4. Multisensor Fusion for Multitarget Tracking
As mentioned before, the multi-Bernoulli filter has an advantage over the PHD/CPHD filter in the SMC implementation for nonlinear problems, since state extraction in the multi-Bernoulli filter does not rely on clustering heuristics but only on the Bernoulli parameters. Hence, the multi-Bernoulli filter has been used extensively in computer vision [16, 17], robot SLAM [28], and sensor networks [29]. In addition, the state-of-the-art development of the multi-Bernoulli filter makes it possible to directly produce tracks of individual targets; this variant is known as the labelled multi-Bernoulli filter in the community [30].
In this section, we first briefly review the cardinality-balanced multi-Bernoulli filter given in [12] in Section 4.1. Then, Section 4.2 provides the sequential update scheme for information fusion of multiple sensors.
4.1. Cardinality-Balanced Multi-Bernoulli Filter
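Prediction. At time $k-1$, if the posterior multitarget density is multi-Bernoulli given by $\pi_{k-1} = \{(r_{k-1}^{(i)}, p_{k-1}^{(i)})\}_{i=1}^{M_{k-1}}$ and the density of new births is also multi-Bernoulli $\{(r_{\Gamma,k}^{(i)}, p_{\Gamma,k}^{(i)})\}_{i=1}^{M_{\Gamma,k}}$, then the predicted density is given by
$$\pi_{k|k-1} = \{(r_{P,k|k-1}^{(i)}, p_{P,k|k-1}^{(i)})\}_{i=1}^{M_{k-1}} \cup \{(r_{\Gamma,k}^{(i)}, p_{\Gamma,k}^{(i)})\}_{i=1}^{M_{\Gamma,k}}$$
where, for survival targets,
$$r_{P,k|k-1}^{(i)} = r_{k-1}^{(i)} \left\langle p_{k-1}^{(i)}, p_{S,k} \right\rangle, \qquad p_{P,k|k-1}^{(i)}(x) = \frac{\left\langle f_{k|k-1}(x \mid \cdot),\, p_{k-1}^{(i)}\, p_{S,k} \right\rangle}{\left\langle p_{k-1}^{(i)}, p_{S,k} \right\rangle}$$
with $f_{k|k-1}(x \mid \cdot)$ the single-target transition density. For newborn targets, $r_{\Gamma,k}^{(i)}$ and $p_{\Gamma,k}^{(i)}$ are the prior existence probability and distribution of the birth model.
Update. At time $k$, if the predicted multitarget density is multi-Bernoulli $\pi_{k|k-1} = \{(r_{k|k-1}^{(i)}, p_{k|k-1}^{(i)})\}_{i=1}^{M_{k|k-1}}$, the output of the corrector is composed of legacy tracks and measurement-updated tracks as
$$\pi_k \approx \{(r_{L,k}^{(i)}, p_{L,k}^{(i)})\}_{i=1}^{M_{k|k-1}} \cup \{(r_{U,k}(z), p_{U,k}(\cdot\,; z))\}_{z \in Z_k}$$
where
$$r_{L,k}^{(i)} = r_{k|k-1}^{(i)}\, \frac{1 - \left\langle p_{k|k-1}^{(i)}, p_{D,k} \right\rangle}{1 - r_{k|k-1}^{(i)} \left\langle p_{k|k-1}^{(i)}, p_{D,k} \right\rangle}, \qquad p_{L,k}^{(i)}(x) = p_{k|k-1}^{(i)}(x)\, \frac{1 - p_{D,k}(x)}{1 - \left\langle p_{k|k-1}^{(i)}, p_{D,k} \right\rangle},$$
$$r_{U,k}(z) = \frac{\displaystyle\sum_{i=1}^{M_{k|k-1}} \frac{r_{k|k-1}^{(i)} \left(1 - r_{k|k-1}^{(i)}\right) \left\langle p_{k|k-1}^{(i)}, \psi_{k,z} \right\rangle}{\left(1 - r_{k|k-1}^{(i)} \left\langle p_{k|k-1}^{(i)}, p_{D,k} \right\rangle\right)^{2}}}{\displaystyle\kappa_k(z) + \sum_{i=1}^{M_{k|k-1}} \frac{r_{k|k-1}^{(i)} \left\langle p_{k|k-1}^{(i)}, \psi_{k,z} \right\rangle}{1 - r_{k|k-1}^{(i)} \left\langle p_{k|k-1}^{(i)}, p_{D,k} \right\rangle}},$$
$$p_{U,k}(x; z) = \frac{\displaystyle\sum_{i=1}^{M_{k|k-1}} \frac{r_{k|k-1}^{(i)}}{1 - r_{k|k-1}^{(i)}}\, p_{k|k-1}^{(i)}(x)\, \psi_{k,z}(x)}{\displaystyle\sum_{i=1}^{M_{k|k-1}} \frac{r_{k|k-1}^{(i)}}{1 - r_{k|k-1}^{(i)}} \left\langle p_{k|k-1}^{(i)}, \psi_{k,z} \right\rangle}, \qquad \psi_{k,z}(x) = g_k(z \mid x)\, p_{D,k}(x),$$
$g_k(z \mid x)$ is the single-target likelihood, $\kappa_k(z)$ is the clutter intensity, and $p_{S,k}$ and $p_{D,k}$ are the probabilities of survival and detection. The inner product $\langle \cdot, \cdot \rangle$ is defined between two real-valued functions $a$ and $b$ by $\langle a, b \rangle = \int a(x)\, b(x)\, dx$. Note that, without loss of generality, we refer to the cardinality-balanced multi-Bernoulli filter as the “multi-Bernoulli” filter for simplicity in this paper.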
4.2. Sequential Multisensor Multi-Bernoulli Filter
In multisensor multitarget tracking scenarios, there is no unified multisensor fusion method that is both tractable and computationally acceptable. Sequential update has been widely used and verified to be a good approximation for information fusion of multiple sensors. Here, multitarget tracking in the multiagent network is implemented via the multi-Bernoulli filter with a sequential update scheme based on the SMC implementation.
Suppose that, at time $k-1$, the posterior multitarget density is given as $\pi_{k-1} = \{(r_{k-1}^{(i)}, p_{k-1}^{(i)})\}_{i=1}^{M_{k-1}}$, and the distribution of each target $p_{k-1}^{(i)}$ is given by a set of weighted particles $\{(w_{k-1}^{(i,j)}, x_{k-1}^{(i,j)})\}_{j=1}^{L_{k-1}^{(i)}}$. Then, we give the SMC implementation of the sequential multisensor multi-Bernoulli filter in Algorithm 2. We refer the readers to [12] for detailed equations.

The superscript $(i, s)$ in Algorithm 2 denotes the predicted $i$th Bernoulli set updated with the $s$th sensor. To avoid unbounded growth of the number of Bernoulli sets, those with existence probability less than a predefined threshold (e.g., 0.001) are removed. Meanwhile, the particle number is limited between $L_{\min}$ and $L_{\max}$, in case sampling is insufficient or resampling allocates too many particles. During the resampling step, the number of particles for each Bernoulli set is proportional to its existence probability. With a given existence threshold of 0.75, those sets with existence probability over 0.75 are declared true tracks, while the others are not. Notice that the multi-Bernoulli filter we adopt here cannot produce tracks directly; it can be replaced by the labelled multi-Bernoulli filter to produce individual tracks at the cost of some extra computation.
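The track-maintenance rules in this paragraph can be sketched as follows. The thresholds 0.001 and 0.75 come from the text, while the function names, the particle budget per unit of existence probability, and the particle limits used in the example are hypothetical placeholders.

```python
def prune_and_extract(components, prune_threshold=0.001, track_threshold=0.75):
    """Apply pruning and track extraction to a list of Bernoulli components.

    components -- list of (existence_probability, state_estimate) pairs
    Returns (kept, tracks): components surviving pruning, and those among
    them whose existence probability exceeds the track threshold.
    """
    kept = [c for c in components if c[0] >= prune_threshold]
    tracks = [c for c in kept if c[0] > track_threshold]
    return kept, tracks


def particle_budget(existence_probability, particles_per_unit, l_min, l_max):
    """Particles allotted to one component: proportional to its existence
    probability, clamped to the interval [l_min, l_max]."""
    proposed = round(existence_probability * particles_per_unit)
    return max(l_min, min(l_max, proposed))


# Hypothetical components: (existence probability, estimated position).
components = [(0.95, (120.0, 340.0)), (0.40, (600.0, 80.0)), (0.0002, (5.0, 5.0))]
kept, tracks = prune_and_extract(components)
budgets = [particle_budget(r, 1000, 300, 1000) for r, _ in kept]
```

Clamping keeps weakly supported components from starving of particles and confident components from consuming an unbounded share of the computation.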
5. Simulation
To demonstrate the performance of the proposed multiagent control approach for multitarget tracking, we present numerical results for a planar multitarget tracking scenario in a multiagent system where three controllable moving observers, two equipped with range-only sensors and one with a bearing-only sensor, are placed in a specified surveillance area of size $[0, 1000\,\mathrm{m}] \times [0, 1000\,\mathrm{m}]$ to estimate the number of targets as well as their positions. Each agent shares its current location, sensor type, and observations with the other agents. Intuitively, the initial positions of the agents have an impact on the agent control and target tracking procedure. However, we do not address the network topology issue in this paper, since our approach is intended to work independently of the network topology, unlike the methods described in [31, 32].
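An unknown and time-varying number of targets is observed in clutter by each agent. Assume that targets move according to the nearly constant velocity model given by
$$x_k = F x_{k-1} + G v_{k-1}$$
where $x_k = [p_{x,k}, \dot{p}_{x,k}, p_{y,k}, \dot{p}_{y,k}]^{T}$; $(p_{x,k}, p_{y,k})$ are planar position and $(\dot{p}_{x,k}, \dot{p}_{y,k})$ are planar velocity, respectively, along the $x$ coordinate and $y$ coordinate. $F = I_2 \otimes \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix}$; $G = I_2 \otimes \begin{bmatrix} T^2/2 \\ T \end{bmatrix}$. $I_2$ is the $2 \times 2$ identity matrix and $\otimes$ denotes the Kronecker product. $T$ is the sampling period, and $v_{k-1} \sim \mathcal{N}(0, Q)$ is IID Gaussian noise. Assume the process noise is time-invariant and identical for both $x$ and $y$; then, $Q = \sigma_v^2 I_2$, where $\sigma_v$ is the standard deviation.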
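One propagation step of a nearly constant velocity model can be sketched as below. The sketch assumes the common discretization in which a random acceleration with standard deviation `sigma` enters position through `T**2 / 2` and velocity through `T`, independently along each axis; this noise-gain form, the function name, and the example values are assumptions rather than values taken from the paper.

```python
import random


def ncv_step(state, period, sigma, rng):
    """Propagate state [px, vx, py, vy] by one sampling period.

    Position advances by velocity * period; a zero-mean Gaussian acceleration
    (standard deviation sigma, independent per axis) perturbs position and
    velocity through the gains period**2 / 2 and period.
    """
    px, vx, py, vy = state
    ax = rng.gauss(0.0, sigma)
    ay = rng.gauss(0.0, sigma)
    return [
        px + period * vx + 0.5 * period ** 2 * ax,
        vx + period * ax,
        py + period * vy + 0.5 * period ** 2 * ay,
        vy + period * ay,
    ]


rng = random.Random(1)

# With zero process noise the motion is exactly constant velocity.
noise_free = ncv_step([0.0, 1.0, 0.0, 2.0], period=2.0, sigma=0.0, rng=rng)

# A short hypothetical trajectory with a small amount of process noise.
state = [100.0, 5.0, 200.0, -3.0]
trajectory = [state]
for _ in range(5):
    state = ncv_step(state, period=1.0, sigma=0.5, rng=rng)
    trajectory.append(state)
```

In an SMC filter, the same step would be applied to every particle of every Bernoulli component during prediction.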
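The measurement of sensor $s$ originating from a target with state $x_k$ is a noisy range or bearing measurement. The measurement model for a range-only sensor is given by
$$z_k^{(s)} = d_k^{(s)} + w_r, \qquad d_k^{(s)} = \sqrt{(p_{x,k} - x_s)^2 + (p_{y,k} - y_s)^2}$$
and, for a bearing-only sensor,
$$z_k^{(s)} = \arctan\frac{p_{y,k} - y_s}{p_{x,k} - x_s} + w_\theta$$
where $(p_{x,k}, p_{y,k})$ is the target position, $(x_s, y_s)$ is the location of agent $s$, and $d_k^{(s)}$ is the Euclidean distance between the sensor and the target. $w_r$ is zero-mean Gaussian noise $\mathcal{N}(0, \sigma_r^2)$ and $w_\theta$ is also zero-mean Gaussian noise $\mathcal{N}(0, \sigma_\theta^2)$. The standard deviations of $w_r$ and $w_\theta$ are, respectively, given by $\sigma_r$ and $\sigma_\theta$ as follows: with , , , and .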
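The two sensor types can be sketched with a pair of measurement functions and a Gaussian likelihood. The sketch assumes the standard range and bearing definitions and uses the four-quadrant `atan2` for the bearing; the function names and the noise standard deviation in the example are hypothetical, and the distance-dependent noise model of the paper is not reproduced here.

```python
import math


def range_measurement(target_xy, agent_xy, noise=0.0):
    """Euclidean distance from the agent to the target, plus additive noise."""
    return math.hypot(target_xy[0] - agent_xy[0], target_xy[1] - agent_xy[1]) + noise


def bearing_measurement(target_xy, agent_xy, noise=0.0):
    """Angle of the target as seen from the agent (radians), plus additive noise."""
    return math.atan2(target_xy[1] - agent_xy[1], target_xy[0] - agent_xy[0]) + noise


def gaussian_likelihood(measurement, predicted, sigma):
    """Single-target likelihood of a scalar measurement under Gaussian noise."""
    residual = measurement - predicted
    return math.exp(-0.5 * (residual / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


agent = (0.0, 0.0)
target = (3.0, 4.0)
r = range_measurement(target, agent)
b = bearing_measurement((1.0, 1.0), agent)

# Likelihood of a slightly noisy range reading, with an illustrative sigma.
lik = gaussian_likelihood(5.2, r, sigma=1.0)
```

For bearings, a practical implementation would also wrap the residual to the interval from minus pi to pi before evaluating the likelihood.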
Targets can appear or disappear in the scene at any time, and survival probability for each existing target. New born targets appear spontaneously according to . Three targets are presented as illustrated in Figure 1, where , , and , respectively, for each target, and are identical for all three targets.
The probability of detection for both the range-only sensor and the bearing-only sensor is modelled by with and . The clutter rate of each sensor per scan. The standard deviation of process noise for both and . Given the current agent location , the set of admissible control commands for each agent is computed as , where and are for angular and radial step size. and here.
The initial locations of agents are shown in Figure 1, and each agent runs for 30 scans with sampling period s. Birth intensity for the multi-Bernoulli filter is approximated using the adaptive target birth intensity sampling technique described in [33]. and are the minimum and maximum particle numbers of each Bernoulli set for track maintenance.
Since each agent shares its location and observations with the other two agents, the agents have nearly the same performance in this scene. Hence, we take agent 3 (bearing-only) as an example to illustrate the tracking procedure. Figure 2 presents three key frames of one trial run of the proposed control and tracking approach. It is clear that, as the agents move, the scattered particles converge to the ground truth locations of the targets. This is because each agent moves in a more informative direction, so that “ghost” particles that are not generated by actual targets can be quickly eliminated by the multisensor fusion scheme. As a result, the estimates of target positions become more accurate over time. In addition, it can be seen from Figure 2(c) that the trajectories of the agents differ because of their distinct initial locations and sensor types.
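Figure 2: Three key frames, panels (a), (b), and (c), of one trial run of the proposed control and tracking approach at three successive time instants.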
Due to the stochastic nature of our control and tracking approach, we use 1000 Monte Carlo runs to evaluate its performance. The optimal subpattern assignment (OSPA) metric, composed of a location error and a cardinality error, is used for tracking performance evaluation [34]. Figure 3 shows the OSPA distance of our approach, (, ) from 1000 Monte Carlo runs, from which we can see that the OSPA distance converges as the agents move over time. The location error is reduced gradually, since each agent obtains more informative measurements that lower the covariance of the position estimate, while the error in the estimated target number quickly converges to a relatively low value. The average computation time for a single run is only approximately 0.82 s (the algorithm is implemented in MATLAB 2012a on a PC with 8 GB RAM and an Intel Core i7-4770K CPU). Several runs have been recorded in videos attached as Supplementary Material (available online at http://dx.doi.org/10.1155/2015/903682) for demonstration.
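For readers unfamiliar with the metric, the OSPA distance between a set of true positions and a set of estimates can be sketched as below. The sketch uses a brute-force search over assignments, which is only practical for a handful of targets, and the cutoff `c` and order `p` in the example are illustrative values, not the ones used in the experiments.

```python
import math
from itertools import permutations


def ospa(truth, estimates, c=100.0, p=1.0):
    """OSPA distance between two finite sets of points (tuples).

    c -- cutoff: maximum per-target error and penalty per cardinality mismatch
    p -- order of the metric
    """
    small, large = sorted((list(truth), list(estimates)), key=len)
    m, n = len(small), len(large)
    if n == 0:
        return 0.0  # both sets are empty
    # Best assignment of the smaller set to distinct elements of the larger set.
    best = min(
        sum(min(c, math.dist(x, large[j])) ** p for x, j in zip(small, assignment))
        for assignment in permutations(range(n), m)
    )
    # Unassigned elements of the larger set each contribute the cutoff penalty.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)


exact = ospa([(0.0, 0.0), (10.0, 0.0)], [(10.0, 0.0), (0.0, 0.0)])
offset = ospa([(0.0, 0.0)], [(3.0, 4.0)])
missed = ospa([(0.0, 0.0), (50.0, 50.0)], [(0.0, 0.0)])
```

The last example shows how a missed target is charged the full cutoff, which is why the metric penalizes errors in both the estimated positions and the estimated number of targets.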
6. Conclusion
In this paper, we propose a novel distributed multiagent control approach that maximizes the local Rényi divergence. The SMC implementation of the sequential multi-Bernoulli filter is provided for each agent to utilize the information from neighbouring agents. Simulation results demonstrate that the proposed approach is capable of distributed multitarget tracking via effective sensor control.
In future work, we plan to use convex relaxation methods to seek the globally optimal solution for multiagent control and compare it with the approach proposed in this paper. We also need to consider more challenging measurement models, such as time-difference-of-arrival or Doppler measurements, which depend on the behavior of multiple agents.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
Liang Ma received financial support from the Program of China Scholarships Council (no. 201206680025). This work is sponsored by the Fundamental Research Funds for the Central Universities (HEUCF130703).
Supplementary Materials
The videos show simulation trial runs for a planar multitarget tracking scenario in a multiagent system where three controllable moving observers, two equipped with range-only sensors (two corners) and one with a bearing-only sensor (center), are placed in a specified surveillance area of size [0, 1000m] × [0, 1000m]. Our goal is to estimate the number of targets as well as their positions by controlling these three agents. Each agent shares its current location, sensor type, and observations with the other agents. Green dots are particles. Colored lines are trajectories of agents. Colored ◆ are current locations of agents. Red ° are ground truth target stop positions, while black * are target estimates.