#### Abstract

This paper addresses the problem of distributed fusion when the conditional independence assumptions on sensor measurements or local estimates are not met. A new data fusion algorithm called Copula fusion is presented. The proposed method is grounded in Copula statistical modeling and Bayesian analysis. The primary advantage of the Copula-based methodology is that it can capture the unknown correlation, which allows one to build joint probability distributions with potentially arbitrary underlying marginals and a desired intermodal dependence. The proposed fusion algorithm requires no a priori knowledge of communication patterns or network connectivity. The simulation results show that Copula fusion yields a consistent estimate for a wide range of process noises.

#### 1. Introduction

The rapid growth of sensing and computational capabilities has advanced the development of distributed estimation [1] and information fusion [2] technologies for wireless sensor networks. Sensor networks provide greater coverage than single sensors by improving detection and estimation accuracy and extending geographical reach. The distributed fusion architecture [3] has many advantages over a centralized one such as lower bandwidth and increased reliability. In a typical distributed sensor network, estimates are often computed locally and transmitted to a fusion site through a communication network for fusion. However, one of the main challenges for distributed fusion in dynamic wireless sensing networks is to model the unknown correlation between the local estimates [4]. In optimal Bayesian fusion, the key is to identify the common information that has to be removed to avoid double counting [5]. The common information is usually from common target process noise or the information shared during the past communication.

In a distributed estimation scenario where the local nodes process their local information and exchange their results (local tracks) with each other for fusion, the common process noise from the target of interest causes dependency between the local estimation errors [6]. One solution proposed for this track-to-track correlation and fusion problem is to calculate the cross-covariances between the local estimation errors by communicating the local historical Kalman gains [7]. In addition to the common process noise, the local estimates may hold common information due to information exchange patterns in a distributed network. The Bayesian fusion equation [5] can be used to derive optimal fusion formulae for the state of interest as long as the measurements are conditionally independent given the state and the common information can be extracted based on the processing architecture [8]. While the removal of duplicate information is straightforward in the theoretical formulation, identifying duplicate information for distributed estimation systems is difficult in dynamic networks. A recognized approach is to use graphical models [9] to represent dependent information with spatial and temporal relationships. A fundamental issue associated with this method is the amount of data needed to express correlations between the local estimates. In particular, long pedigree information might need to be communicated in order to achieve optimality. Since the optimal result is difficult to achieve in practice, several scalable fusion rules such as the channel filter, the covariance intersection method [10], Chernoff fusion, and Bhattacharyya fusion [11] have been proposed to reduce the communication load. Furthermore, it is also difficult to combine data from multiple heterogeneous sources having different probability distributions, some of which may be non-Gaussian. For example, the dependency structure for Gaussian mixture estimates [12] has been demonstrated in a way similar to the linear Gaussian case [6]. Several approximate methods for the fusion of Gaussian mixtures have been presented based on Chernoff fusion [4, 13].

Most traditional fusion approaches rely on the assumption of conditionally independent measurements. When this condition is not met, the Bayesian fusion equation is not optimal. In that case, complete knowledge of the joint probability distribution of the observations is required in order to produce the optimal fusion results. However, the derivation of the joint distribution is intractable in general. We propose using Copula theory to model the correlations between local estimates in a consistent manner.

The primary advantage of the Copula-based methodology is that it can characterize modal dependencies regardless of the respective marginal distributions [14, 15]. This property allows one to build joint probability distributions with potentially arbitrary underlying marginals and a desired intermodal dependence [16–18]. This is particularly suitable for distributed fusion with only knowledge of the marginal distributions given local sensor observations but not sufficient knowledge on their joint distribution.

This paper proposes a new fusion method to deal with unknown correlation in a distributed sensing network. The method will function in a mathematically consistent manner while limiting data exchange and processing requirements. The paper is organized as follows. Section 2 formulates the problem and Section 3 presents the Copula fusion method. The algorithm implementation details are presented in Section 4. Section 5 details the simulation results followed by the concluding remarks given in Section 6.

#### 2. Problem Formulation

##### 2.1. Kinematics Model

The Wiener process is widely used to model unknown inputs (maneuvers) in state estimation and tracking problems. Consider a 1D Wiener random process (or a discrete-time Brownian motion) as follows:

$$x_{k+1} = x_k + v_k, \tag{1}$$

where $x_k$ represents the target position at time step $k$ and $v_k$ is a white Gaussian process noise with variance $q$.

##### 2.2. Sensor Configuration and Measurement Model

The sensor configuration involves two range-measuring sensors, where each sensor collects a set of measurements about the target state $x_k$. The measurement equation is given by

$$z_k^i = x_k + w_k^i, \quad i = 1, 2, \tag{2}$$

where $z_k^i$ represents the measurement at time step $k$ for sensor $i$, and $w_k^i$ is a white Gaussian measurement noise with variance $r_i$ for sensor $i$.
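The target and measurement models of this section can be simulated in a few lines. The following is a minimal sketch, assuming a scalar random-walk target observed directly by two sensors with independent Gaussian noise; the function name `simulate_target_and_sensors` and its parameters (`q`, `r1`, `r2`, `x0`, `seed`) are hypothetical and chosen only for illustration.

```python
import random


def simulate_target_and_sensors(num_steps, q, r1, r2, x0=0.0, seed=0):
    """Simulate a 1D random-walk target and two noisy position sensors.

    q is the process noise variance, r1 and r2 are the measurement noise
    variances of sensor 1 and sensor 2 (all names are illustrative).
    """
    rng = random.Random(seed)
    x = x0
    states, z1, z2 = [], [], []
    for _ in range(num_steps):
        # Random-walk step driven by white Gaussian process noise.
        x += rng.gauss(0.0, q ** 0.5)
        states.append(x)
        # Each sensor observes the same state with its own independent noise.
        z1.append(x + rng.gauss(0.0, r1 ** 0.5))
        z2.append(x + rng.gauss(0.0, r2 ** 0.5))
    return states, z1, z2


if __name__ == "__main__":
    states, z1, z2 = simulate_target_and_sensors(5, q=1.0, r1=0.5, r2=2.0)
    for k, (x, a, b) in enumerate(zip(states, z1, z2), start=1):
        print(f"k={k}  x={x:+.3f}  z1={a:+.3f}  z2={b:+.3f}")
```

Because both sensors observe the same random walk, their local estimation errors share the common process noise, which is the source of the dependence discussed in the rest of the paper.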

##### 2.3. Fusion Architecture

Typically, each sensor computes an estimate of the state of interest based on its local observations and sends the local estimate to a fusion node. The objective of the fusion node is to estimate the fused state given all the information available through communication, namely, $p(x \mid Z_1 \cup Z_2)$. In this paper, we focus on hierarchical fusion where two sensors periodically communicate their local estimates to the fusion site to be combined. Denote $Z_1$ and $Z_2$ as the two information sets consisting of measurements. Fusing the state estimates is straightforward when these pieces of information are mutually exclusive and conditionally independent; that is,

$$p(Z_1, Z_2 \mid x) = p(Z_1 \mid x)\, p(Z_2 \mid x). \tag{3}$$

Then

$$p(x \mid Z_1 \cup Z_2) = \kappa^{-1}\, \frac{p(x \mid Z_1)\, p(x \mid Z_2)}{p(x)}, \tag{4}$$

where the normalizing constant is given by

$$\kappa = \int \frac{p(x \mid Z_1)\, p(x \mid Z_2)}{p(x)}\, dx. \tag{5}$$
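For scalar Gaussian densities, the conditionally independent fusion rule has a closed form: multiplying the two local posteriors and dividing by the common prior adds the local information (inverse variances) and subtracts the prior information once. The sketch below illustrates this special case only; the function name and the `(mean, variance)` tuple convention are assumptions made for this example.

```python
def fuse_conditionally_independent(prior, local_1, local_2):
    """Fuse two scalar Gaussian local posteriors that share a common prior.

    Each argument is a (mean, variance) pair. The rule assumes the two
    information sets are conditionally independent given the state.
    """
    m0, p0 = prior
    m1, p1 = local_1
    m2, p2 = local_2
    # Add the local information and remove the prior that was counted twice.
    info = 1.0 / p1 + 1.0 / p2 - 1.0 / p0
    if info <= 0.0:
        raise ValueError("inputs do not define a proper fused density")
    fused_var = 1.0 / info
    fused_mean = fused_var * (m1 / p1 + m2 / p2 - m0 / p0)
    return fused_mean, fused_var


if __name__ == "__main__":
    mean, var = fuse_conditionally_independent((0.0, 4.0), (1.0, 2.0), (1.5, 2.0))
    print(f"fused mean = {mean:.4f}, fused variance = {var:.4f}")
```

When the conditional independence assumption fails, this rule still returns a number, but the reported variance can be too small, which is the inconsistency that the Copula term is meant to correct.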

When the conditional independence assumption is not met, the fusion equation (4) is only approximate. For example, in a hierarchical fusion architecture, when two sensors communicate at less than full rate, the measurements are correlated given the target state at the previous observation time due to the common target process noise [11]. In that case, complete knowledge of the joint probability distribution of the observations conditioned on the target state is required in order to obtain the optimal result, as shown in the following equation [5]:

$$p(x \mid Z_1 \cup Z_2) = \frac{p(Z_1, Z_2 \mid x)\, p(x)}{p(Z_1, Z_2)}. \tag{6}$$

However, the derivation of the joint distribution may be intractable in general. In the next section, we propose a method based on Copula theory to model the unknown dependence in order to derive the joint probability density.

#### 3. Distributed Fusion with Copula

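Theorem 1 (Sklar’s Theorem for continuous conditional distributions [14, 15]). *Let $H(\cdot, \cdot \mid \mathcal{W})$ be a conditional bivariate distribution function with continuous marginals $F_1(\cdot \mid \mathcal{W})$ and $F_2(\cdot \mid \mathcal{W})$, and let $\mathcal{W}$ be some conditioning set. Then there exists a unique conditional Copula $C(\cdot, \cdot \mid \mathcal{W})$ such that*

$$H(z_1, z_2 \mid \mathcal{W}) = C(u_1, u_2 \mid \mathcal{W}), \tag{7}$$

*where $u_1 = F_1(z_1 \mid \mathcal{W})$ and $u_2 = F_2(z_2 \mid \mathcal{W})$.*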

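Theorem 1 indicates that the unknown joint density term in (6) can be constructed as

$$p(Z_1, Z_2 \mid x) = p(Z_1 \mid x)\, p(Z_2 \mid x)\, c\big(F_1(Z_1 \mid x), F_2(Z_2 \mid x)\big), \tag{8}$$

where $F_1$ and $F_2$ are the marginal distributions and $c$ is the density of the Copula function. Substituting (8) into (6) we have

$$p(x \mid Z_1 \cup Z_2) = \kappa^{-1}\, \frac{p(x \mid Z_1)\, p(x \mid Z_2)}{p(x)}\, c\big(F_1(Z_1 \mid x), F_2(Z_2 \mid x)\big), \tag{9}$$

where $\kappa$ is a normalizing constant.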
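The substitution step can be made explicit with Bayes' rule applied to each marginal likelihood, $p(Z_i \mid x) = p(x \mid Z_i)\, p(Z_i) / p(x)$. The short derivation below is a sketch in assumed notation ($Z_1$, $Z_2$ for the information sets, $c$ for the Copula density, $\kappa$ for the normalizing constant); it is not reproduced from the original text.

```latex
\begin{aligned}
p(x \mid Z_1 \cup Z_2)
  &= \frac{p(Z_1, Z_2 \mid x)\, p(x)}{p(Z_1, Z_2)} \\
  &= \frac{p(Z_1 \mid x)\, p(Z_2 \mid x)\,
           c\big(F_1(Z_1 \mid x), F_2(Z_2 \mid x)\big)\, p(x)}{p(Z_1, Z_2)} \\
  &= \underbrace{\frac{p(Z_1)\, p(Z_2)}{p(Z_1, Z_2)}}_{\text{independent of } x}
     \; \frac{p(x \mid Z_1)\, p(x \mid Z_2)}{p(x)}\,
     c\big(F_1(Z_1 \mid x), F_2(Z_2 \mid x)\big).
\end{aligned}
```

The first factor does not depend on $x$ and is absorbed into the normalizing constant, which leaves the conditionally independent fusion rule multiplied by the Copula density.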

Sklar’s Theorem states that a multivariate joint distribution can be written in terms of univariate marginal distribution functions and a Copula function which describes the dependence structure between the variables. Moreover, when the marginal distribution functions are continuous, the Copula function is unique.

While there exists a unique Copula, its exact form is not available when we want to construct a joint distribution function from only the marginal distribution functions. We must therefore choose a suitable Copula function to “approximate” the unknown true one. There are several families of Copula functions, such as the Gaussian Copula, Archimedean Copulas, and the Student-t Copula. In this work, we choose the Gaussian Copula since the local estimates are also Gaussian. The density of the bivariate Gaussian Copula [16, 17] is

$$c(u_1, u_2) = \frac{1}{\sqrt{1 - \rho^2}} \exp\left( \frac{2 \rho\, a\, b - \rho^2 \left(a^2 + b^2\right)}{2 \left(1 - \rho^2\right)} \right), \tag{10}$$

where $a = \Phi^{-1}(u_1)$ and $b = \Phi^{-1}(u_2)$. Here $\rho$ represents the correlation coefficient, $\Phi$ is the cumulative distribution function of the standard normal distribution $\mathcal{N}(0, 1)$, and $\Phi^{-1}$ is its functional inverse [17].
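The bivariate Gaussian Copula density is easy to evaluate numerically. The sketch below uses the standard closed form of that density together with `statistics.NormalDist` from the Python standard library for the inverse normal CDF; the function name `gaussian_copula_density` is an illustrative choice.

```python
import math
from statistics import NormalDist


def gaussian_copula_density(u1, u2, rho):
    """Density of the bivariate Gaussian Copula at (u1, u2) in (0, 1)^2."""
    if not (0.0 < u1 < 1.0 and 0.0 < u2 < 1.0):
        raise ValueError("u1 and u2 must lie strictly between 0 and 1")
    if not (-1.0 < rho < 1.0):
        raise ValueError("rho must lie strictly between -1 and 1")
    standard_normal = NormalDist()
    # Map the uniform margins to standard normal scores.
    a = standard_normal.inv_cdf(u1)
    b = standard_normal.inv_cdf(u2)
    one_minus_rho2 = 1.0 - rho * rho
    exponent = (2.0 * rho * a * b - rho * rho * (a * a + b * b)) / (2.0 * one_minus_rho2)
    return math.exp(exponent) / math.sqrt(one_minus_rho2)


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6):
        print(rho, gaussian_copula_density(0.9, 0.9, rho))
```

With a zero correlation coefficient the density equals one everywhere, so the joint density reduces to the product of the marginals and the conditionally independent case is recovered.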

This result implies that the Copula fusion equation (9) is similar to the original Bayesian fusion equation given in (4). The difference is the intermodal dependence term represented by the Copula density $c(\cdot, \cdot)$, which is determined by the marginal densities of the local sensor observations.

#### 4. Approximate Implementation of Copula Fusion

##### 4.1. Pseudo Measurements

The exact implementation of the fusion rule in (9) requires transmitting a full pedigree of the measurements from each sensor between communications, which is not practical. In order to implement the Copula fusion rule in an autonomous manner for the Gaussian Copula, we first approximate the information contained in the measurement history of sensor $i$ with a single pseudo measurement $\tilde{z}_i$, as specified in (11) and (12), where $n$ is the number of observations between communications. The derivation of (12) is given in the Appendix. With the pseudo measurement, we can approximate the Copula function (10) and derive the fused result (9) given the two local estimates.

##### 4.2. Correlation Coefficient

The next issue to consider is how to derive a “consistent” correlation coefficient $\rho$. When the system parameters, namely, the fusion rate, the process noise variance $q$, and the measurement noise variances $r_i$, are given, we can perform an exhaustive search over a grid of candidate values of $\rho$ and conduct extensive Monte Carlo simulations to find an appropriate $\rho$ such that the resulting estimation variance is consistent; namely, the result passes the normalized estimation error squared (NEES) [18] test. However, this method only works in simulation, since the true state is not available in practice.
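A consistency check of this kind can be sketched as follows for a scalar state. The code averages the NEES over Monte Carlo runs and compares it with a two-sided chi-square acceptance region. To stay within the standard library, the chi-square quantiles use the Wilson-Hilferty approximation, which is an assumption of this sketch and is adequate only for a large number of runs; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def chi_square_quantile(p, dof):
    """Approximate chi-square quantile (Wilson-Hilferty), for large dof."""
    z = NormalDist().inv_cdf(p)
    h = 2.0 / (9.0 * dof)
    return dof * (1.0 - h + z * math.sqrt(h)) ** 3


def nees_is_consistent(errors, reported_variances, alpha=0.05):
    """Return (average NEES, verdict) for scalar estimation errors.

    errors[j] is the estimation error of Monte Carlo run j and
    reported_variances[j] is the variance reported by the filter in that run.
    """
    n = len(errors)
    average_nees = sum(e * e / p for e, p in zip(errors, reported_variances)) / n
    # For a consistent scalar estimator, n * average_nees is chi-square with n dof.
    lower = chi_square_quantile(alpha / 2.0, n) / n
    upper = chi_square_quantile(1.0 - alpha / 2.0, n) / n
    return average_nees, lower <= average_nees <= upper


if __name__ == "__main__":
    errors = [1.0, -1.0] * 500
    print(nees_is_consistent(errors, [1.0] * 1000))   # variance matches the errors
    print(nees_is_consistent(errors, [0.25] * 1000))  # variance is too optimistic
```

An average NEES well above one indicates that the reported variance is smaller than the actual error variance, that is, the estimator is overconfident.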

In this paper, we conduct an off-line process to approximate the coefficient $\rho$ with the help of the analytical expression for the steady-state estimation variance of MAP fusion derived in [19]. A numerical search algorithm is applied to select the coefficient $\rho$ such that the resulting estimation variance is consistent with the theoretical result produced by MAP fusion.
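The off-line search can be illustrated with a simple grid search that matches a target variance. The sketch below is not the procedure of the paper: it considers a single fusion step, relies on the fact that Gaussian marginals combined with a Gaussian Copula give a bivariate Gaussian likelihood, restricts the grid to nonnegative coefficients, and takes the target variance as an input rather than computing it from the analytical expression in [19]. All function and parameter names are hypothetical.

```python
import math


def fused_variance(prior_var, r1, r2, rho):
    """Posterior variance after fusing two correlated Gaussian pseudo measurements.

    r1 and r2 are the pseudo measurement variances and rho is their
    correlation coefficient; both pseudo measurements observe the same scalar state.
    """
    # Information contributed by the correlated pair: 1' * inv(Sigma) * 1.
    joint_info = (r1 + r2 - 2.0 * rho * math.sqrt(r1 * r2)) / (r1 * r2 * (1.0 - rho * rho))
    return 1.0 / (1.0 / prior_var + joint_info)


def search_rho(target_var, prior_var, r1, r2, grid_size=100):
    """Pick the grid value of rho whose fused variance is closest to target_var."""
    candidates = [i / grid_size for i in range(grid_size)]  # 0.00, 0.01, ..., 0.99
    return min(
        candidates,
        key=lambda rho: abs(fused_variance(prior_var, r1, r2, rho) - target_var),
    )


if __name__ == "__main__":
    rho = search_rho(target_var=12.0 / 7.0, prior_var=4.0, r1=4.0, r2=4.0)
    print(f"selected rho = {rho:.2f}")
```

A larger coefficient means that the two pseudo measurements carry less independent information, so the fused variance grows in this equal-variance example. That is the mechanism by which a suitable coefficient removes the overconfidence of the conditionally independent rule.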

Note that (9) boils down to finding the joint density function of $Z_1$ and $Z_2$. Since we do not keep the full pedigree of the measurements from each sensor, we use a single equivalent measurement to represent the measurement history, at the cost of losing information. MAP fusion seeks the estimate that maximizes $p(x \mid \hat{x}_1, \hat{x}_2)$. The local estimates $\hat{x}_1$ and $\hat{x}_2$ only carry the cumulative effect of the past observations $Z_1$ and $Z_2$, just like the partial information carried in the pseudo measurement, and not the complete information in the full measurement history. Therefore, MAP fusion cannot reconstruct the optimal centralized fusion result and yields only a suboptimal one. So theoretically, with an appropriate $\rho$, we are finding the joint density given the pseudo measurements so that the resulting estimate is consistent. On the other hand, MAP fusion finds the best estimate given the two local estimates, which are equivalent to two pseudo measurements, and the resulting fused estimate is also consistent.

Although it is hard to prove the equivalence analytically, we show numerically that Copula fusion and MAP fusion obtain very similar results. Here we conduct simulations at different fusion rates, where the coefficient $\rho$ is predetermined from Monte Carlo simulations and NEES tests.

As shown in Tables 1–3, the two fusion methods have similar results in both estimation variance and averaged MSE. Because the steady-state estimation variance for MAP fusion can be predicted in analytic form, we can find an appropriate $\rho$ such that the resulting estimation variance from Copula fusion is consistent with the one from MAP fusion.

##### 4.3. Algorithm Summary

In this section, we summarize the Copula fusion algorithm. As shown in Figure 1, at each fusion step, given the two local estimates, we first calculate the pseudo measurements $\tilde{z}_i$, $i = 1, 2$, according to (12). Since the density function of the fused estimate cannot be derived in closed form, we approximate the distribution with a Gaussian distribution through sampling.

Suppose that, at time step $k$, we generate a sampling set $\{x^{(j)}\}_{j=1}^{N}$ based on the prior probability distribution $p(x)$, where $N$ is the number of samples. Then, for each element in the sampling set, we can calculate the likelihood function (weight) of each sample point, namely, $w^{(j)}$, based on (9), where $u_1$ and $u_2$ can be obtained from (11), and the local posterior distributions and the fused prior distribution can be used to calculate $p(x \mid Z_i)$ and $p(x)$, respectively. Finally, we can approximate the resulting PDF as a Gaussian distribution and obtain its mean and variance accordingly.
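The sampling step described above can be sketched as a small importance sampler. The code below is an illustration under stated assumptions, not the implementation used in the paper: the pseudo measurements and their variances are taken as given inputs, each marginal is modeled as a Gaussian centered on the sample, the Copula arguments are expressed directly as normal scores (so no inverse CDF is needed), and the names `copula_fuse`, `z1`, `r1`, `z2`, `r2` are hypothetical.

```python
import math
import random


def gaussian_pdf(z, mean, var):
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def copula_density_from_scores(a, b, rho):
    """Gaussian Copula density written in terms of the normal scores a and b."""
    one_minus_rho2 = 1.0 - rho * rho
    exponent = (2.0 * rho * a * b - rho * rho * (a * a + b * b)) / (2.0 * one_minus_rho2)
    return math.exp(exponent) / math.sqrt(one_minus_rho2)


def copula_fuse(prior_mean, prior_var, z1, r1, z2, r2, rho, n_samples=20000, seed=0):
    """Approximate the fused posterior by a Gaussian through sampling."""
    rng = random.Random(seed)
    prior_sd = math.sqrt(prior_var)
    samples = [rng.gauss(prior_mean, prior_sd) for _ in range(n_samples)]
    weights = []
    for x in samples:
        # Normal scores of the two pseudo measurements given this sample.
        a = (z1 - x) / math.sqrt(r1)
        b = (z2 - x) / math.sqrt(r2)
        # Weight: product of the marginal likelihoods times the Copula density.
        weights.append(
            gaussian_pdf(z1, x, r1) * gaussian_pdf(z2, x, r2)
            * copula_density_from_scores(a, b, rho)
        )
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, samples)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, samples)) / total
    return mean, var


if __name__ == "__main__":
    print(copula_fuse(0.0, 4.0, 2.0, 4.0, 3.0, 4.0, rho=0.0))
    print(copula_fuse(0.0, 4.0, 2.0, 4.0, 3.0, 4.0, rho=0.5))
```

In this all-Gaussian setting the result can be checked in closed form, because a Gaussian Copula with Gaussian marginals is a bivariate Gaussian likelihood. The sampled moments should therefore agree with the analytic ones up to Monte Carlo error, and the fused variance in this example grows as the correlation coefficient increases from zero.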

#### 5. Simulation Results

To validate the proposed Copula fusion algorithm, a simulation scenario is developed with the target and sensor models described in (1) and (2), respectively. The simulation is set up with a hierarchical fusion with feedback architecture. As shown in Figure 2, each sensor obtains a measurement and updates its local track at a fixed sampling interval $T$ (in seconds). The fusion center fuses the local tracks with a period of $nT$, that is, once every $n$ local updates. When the fusion process is completed, the sensors receive feedback from the fusion center.

To verify the performance, we compare the results with the information matrix (IM) fusion algorithm. It is well known that, except in the full-rate fusion case (fusion after each measurement update), IM fusion is only suboptimal and could be inconsistent [19]. A total of 1,000 Monte Carlo simulation trials were conducted for each value of the process noise variance $q$ in the tested range.

Figures 3–6 compare the MSE results of Copula fusion and IM fusion at full, 1/2, 1/4, and 1/8 rate, respectively. It can be seen that the estimation accuracies of the two algorithms are very close.

Figures 7–9 compare the distributions of the fused result at steady state between Copula fusion and IM fusion at 1/2, 1/4, and 1/8 rate communication, respectively. It can be seen that the two posterior distributions are not the same and that the variance for IM fusion is smaller than that of Copula fusion.

Figures 10–12 show the ratios of the perceived variance to the true variance. It can be seen that, except in the full-rate case, the fusion algorithm based on IM fusion is inconsistent when the variance of the process noise (target dynamics) is close to the variance of the measurement noise. Unfortunately, this is a critical operating range. As shown in the figures, the perceived variance is much smaller than the true one for IM fusion.

With the Copula fusion algorithm, the results shown in Figures 10–12 demonstrate that the estimation results are near optimal and “consistent” over a wide operating region with different process noises. This is because the Copula function offers a flexible and reliable representation for modeling the unknown dependency between the local estimates. While the target dynamics and sensor observation models are assumed to be Gaussian in this paper, the proposed general method could be extended to model unknown dependency between potentially arbitrary marginal distributions to fuse data in a dynamic sensing network.

#### 6. Conclusion

This paper presents a novel sensor fusion methodology based on Copula and Bayesian probabilistic theory for track-to-track fusion with unknown dependency. In the method, a mathematical characterization of the dependence structure of the local estimates is constructed using Copula statistical modeling. The recursive version of distributed Copula fusion is implemented by approximating the full pedigree of the local measurements with a pseudo measurement. The simulation demonstrates that, unlike other traditional approaches, the resulting fused estimates are consistent in the sense that the perceived uncertainty characterized by the estimation error variance is close to the true uncertainty. This is particularly important because an overly optimistic uncertainty assessment could mislead a critical decision. Compared with existing work, the proposed Copula fusion requires no detailed knowledge of communication patterns and is potentially applicable to fusion processes with networked disparate sensors. A natural future research direction is to generalize the methodology, coupled with the proven scalable “channel filter” algorithm, to the fusion problems of heterogeneous and correlated measurements in ad hoc sensing networks.

#### Appendix

Given the target and measurement models in (1) and (2), the local Kalman filter [18] is given as whereSubstituting (A.2) into (A.1) we havewhich can be rewritten asSimilarly, we havewhere represents the pseudo measurement from time step to time step and

#### Competing Interests

The authors declare that they have no competing interests.

#### Acknowledgments

Research is partially supported by ARO under Grant no. W911NF-15-1-0409 (K. C. Chang).