An Information Geometric Viewpoint on the Detection of Range Distributed Targets
This paper adopts information geometry to put forward a new viewpoint on the detection of range distributed targets embedded in Gaussian noise with unknown covariance. The original hypothesis testing problem is formulated as the discrimination between the distributions of the measurements and of the noise. The Siegel distance, which is exactly the well-known geodesic distance between the images of the original distributions under an embedding into a higher-dimensional manifold, is used as an intrinsic measure of the difference between multivariate normal distributions. Without assuming uncorrelated measurements, we propose a set of geometric distance detectors that is designed based on the Siegel distance and differs from the generalized likelihood ratio algorithm and other common statistical criteria. As special cases, the classical optimal matched filter, the Rao test, and the Wald test, which have been proven to have the constant false alarm rate (CFAR) property, belong to the set. Moreover, the approach lends itself to an intuitive geometric analysis of how strongly the data contradict the null hypothesis.
1. Introduction
The detection of range distributed targets in the presence of Gaussian noise with unknown covariance is a problem that merits repeated study. In practice, measurements are collected via radar or sonar and are modeled as reflections from targets. Given the collected data, a hypothesis test aims at distinguishing between target returns plus noise and noise only. In 1986, Kelly first proposed his well-known generalized likelihood ratio test (GLRT) for detecting point targets. Later, many advances in the detection of distributed targets were made based on the generalized likelihood ratio algorithm. Recent work, which mainly focuses on adaptive detection, is no exception. In this field, one class of methods deals with completely unknown signals [2, 3]. Specifically, the paper by Conte et al. provides a valuable statistical tool for ensuring the CFAR property. Others consider detecting a rank-one steering vector [4–6], a spread-spectrum signal [7, 8], and so forth. All these detectors have been reported to perform well in their respective settings.
However, a common limitation is that detectors designed by the generalized likelihood ratio algorithm are restricted to the case of deterministic signals, owing to the prescribed form of the distribution under the alternative hypothesis. In this paper, the detection problem with completely unknown target returns is discussed. In addition, no assumption is made as to whether the measurements are correlated. The original hypothesis testing problem is formulated as measuring the difference between the distributions of the measurements and of the noise, which is described intuitively within the framework of information geometry.
The differential geometric approach was first introduced by Rao as a way of constructing distances between parametric density functions. Later papers greatly developed Rao’s concept. Notably, Amari presented statistical models with a differential structure in [10, 11], work that plays an important role in information geometry. The geodesic distance was put forward as the length of the shortest curve between two distributions on the statistical manifold endowed with the Fisher metric and the Levi-Civita connection. A very active area of information geometry concerns the manifold of multivariate normal distributions. On the multivariate normal manifold, the explicit expression of the geodesics was derived by Eriksen and later by Calvo and Oller for a given initial distribution and direction. However, the geodesic distance between multivariate normal distributions has not yet been obtained in closed form, except on the submanifold of multivariate normal distributions with the same mean. The lack of an explicit expression greatly limits the application of the geodesic distance. In 1990, Calvo and Oller proposed the Siegel distance on the multivariate normal manifold via an embedding into the Siegel group. The Siegel distance is the geodesic distance between the images of the original distributions and provides a lower bound for the geodesic distance. Moreover, Calvo et al. proved that the Siegel distance behaves similarly to the geodesic distance in special cases. This suggests that the Siegel distance is a reasonable measure of the difference between general multivariate normal distributions.
In this paper, the key point is how to calculate the Siegel distance between the distributions of the measurements and the noise. Since the target returns are completely unknown, the possible distributions of measurements form a submanifold of multivariate normal distributions. By calculating the Siegel distance from some point on the multivariate normal manifold to the submanifold, a set of detectors are designed with the critical region where denotes the significance level and the constant is chosen to satisfy
From the viewpoint of information geometry, this also provides an intuitive analysis of how strongly the measurements contradict the null hypothesis. Moreover, the optimal matched filter, the Rao test (or modified 2-step GLRT), and the Wald test (or 2-step GLRT), which have been proven to have the CFAR property, can also be derived from the set of detectors.
The outline of the paper is as follows. Section 2 presents the problem formulation. Section 3 reviews some important principles related to the geodesic distance between multivariate normal distributions. Section 4 is devoted to the Siegel distance and the design of geometric distance detectors. Section 5 discusses the choice of the nominal noise covariance matrix. Some intuitive interpretations are presented in Section 6. Section 7 gives a conclusion.
2. Problem Formulation
The problem under investigation is the detection of completely unknown target returns in the presence of Gaussian noise with unknown covariance. We assume the targets are spatially distributed across range cells, from which the primary data , , are collected. The primary data consist of the possible target returns plus Gaussian noise. To acquire information about the noises, the secondary data coming from cells arranged around the targets are supposed to be available. Or rather, the secondary data consist of a training set , , of which each sample shares the same distribution as the Gaussian noise adherent to the primary data. Therefore, the detection problem is formulated aswhere, , , denote the signals, and the noises , , are independent Gaussian vectors with mean zero and the same unknown covariance matrix . All s, s, and s are -dimensional vectors.
Let and . In fact, the hypotheses (3) carry out a comparison between the distributions of and the noise. As usual, considering the distribution of the whole measurement set is verified to be more effective .
Suppose the joint probability distribution of the measurements is Gaussian with an unknown covariance matrix . Let be the realization of random vector . In the real-time processing such as radar target detection, owing to the unknown mean of , the multivariate normal sampling model for is regarded as where denotes the set of symmetric positive-definite matrices. As a comparison, an array of noise with the same size is considered. Let ; then we can specify the sampling distribution of as where denotes the block diagonal matrix having the same block . The matrix , which is obtained from the collected data, contains information about the unknown noise covariance. We will have a discussion in Section 5.
The remaining problem is how to compare the two distributions above. This is essentially a question of distance and can be described intuitively within the framework of information geometry. In this paper, our major concern is to measure the difference between distributions (4) and (5) with methods of information geometry and to give a geometric criterion of how strongly the data contradict the null hypothesis.
Remark 1. In [2, 4, 5], since the distribution of under is specified as it is limited to discuss the case of determinate signals. Thus, many other cases cannot be described by the process of generalized likelihood ratio algorithm; however, the derived detectors still have perfect performances in the simulations of . For our development, the methods of information geometry enlarge the scope of our discussion, since the covariance in (4) has no constraints on the matrix structure.
Remark 2. It is common in the literature to use a distance as a test statistic; see [14, 15]. In fact, the Mahalanobis distance can also be used to derive a good test, but it is not an intrinsic distance in the geometric sense.
3. The Riemannian Geometry of the Multivariate Normal Model
We now review some important notions of information geometry in order to introduce a distance measure on the multivariate normal model.
3.1. Statistical Manifold
The statistical model describes a family of parameterized probability distributions: where is a subset of . Under the condition of diffeomorphism, the family is considered as an -dimensional statistical manifold . Each probability distribution is represented by a point on the manifold with the corresponding coordinate .
3.2. The Fisher Metric and the Geodesic Distance
For a statistical manifold , the Fisher metric aims at defining an inner product on the tangent space . It is given by the Fisher information matrix widely used in information science as where denotes the expectation with respect to the distribution .
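As a concrete illustration of the Fisher metric, consider the univariate normal family in the coordinates $(\mu, \sigma)$. This choice of coordinates, the symbols, and all function names below are assumptions made for illustration and are not notation taken from the text. In standard notation the Fisher information matrix reads $g_{ij}(\theta) = E_\theta[\partial_i \log p(x \mid \theta)\, \partial_j \log p(x \mid \theta)]$, and for this family it is known to equal $\mathrm{diag}(1/\sigma^2, 2/\sigma^2)$. The following minimal sketch checks this by numerical integration.

```python
import numpy as np


def fisher_matrix_univariate_normal(mu, sigma, num_points=200001, width=12.0):
    """Approximate the Fisher information matrix of N(mu, sigma^2) in the
    coordinates (mu, sigma) by integrating products of score components."""
    x = np.linspace(mu - width * sigma, mu + width * sigma, num_points)
    dx = x[1] - x[0]
    density = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    # Score vector: partial derivatives of log p(x | mu, sigma).
    score_mu = (x - mu) / sigma ** 2
    score_sigma = ((x - mu) ** 2 - sigma ** 2) / sigma ** 3
    scores = np.stack([score_mu, score_sigma])  # shape (2, num_points)
    # g_ij = E[score_i * score_j], approximated by a Riemann sum on the grid.
    return (scores * density) @ scores.T * dx


if __name__ == "__main__":
    # Compare with the closed form diag(1 / sigma^2, 2 / sigma^2).
    print(fisher_matrix_univariate_normal(mu=1.0, sigma=2.0))
```

The off-diagonal entries vanish in these coordinates, which is why the univariate normal manifold can be identified, up to scaling, with a hyperbolic half-plane.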
With the Fisher metric, the length of a curve , , on the manifold is defined in  as where the dot denotes the differentiation with respect to the variable and the indices indicate the corresponding components in the given coordinate parameterisations. Suppose a curve joining and ; the length between the endpoints depends on the choice of the curve. The geodesic distance between and is defined as the minimum length over all possible curves connecting the two points on the manifold ; that is,
We must emphasize that the geodesic distance is not only the intrinsic metric on the manifold of parametric probability distributions but also a genuine metric in the sense that it satisfies nonnegativity, symmetry, and the triangle inequality. As a measure between probability distributions, the Kullback-Leibler divergence, well known as the relative entropy, is also popular in information science. However, it is not a genuine metric, since it is not symmetric and does not satisfy the triangle inequality.
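The failure of the Kullback-Leibler divergence to be a metric can be seen with a small example. In standard notation, $D(p \,\|\, q) = \int p(x) \log \frac{p(x)}{q(x)}\, dx$, which for two univariate normal distributions has the closed form $\ln(\sigma_q/\sigma_p) + \frac{\sigma_p^2 + (\mu_p - \mu_q)^2}{2\sigma_q^2} - \frac{1}{2}$. The symbols and the function name below are assumptions introduced for this sketch only.

```python
import math


def kl_normal(mu_p, sigma_p, mu_q, sigma_q):
    """KL divergence D(p || q) between N(mu_p, sigma_p^2) and N(mu_q, sigma_q^2)."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)


if __name__ == "__main__":
    # Asymmetry: swapping the arguments changes the value.
    print(kl_normal(0.0, 1.0, 0.0, 2.0), kl_normal(0.0, 2.0, 0.0, 1.0))
    # Triangle inequality violated for unit-variance normals with means 0, 1, 2:
    # the direct divergence exceeds the sum of the two intermediate ones.
    direct = kl_normal(0.0, 1.0, 2.0, 1.0)
    via_middle = kl_normal(0.0, 1.0, 1.0, 1.0) + kl_normal(1.0, 1.0, 2.0, 1.0)
    print(direct, via_middle)
```

For unit-variance normals the divergence reduces to half the squared mean difference, so the direct value is 2 while the two-step sum is 1.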
3.3. Riemannian Connection and Geodesics
Besides the Fisher metric, another fundamental tensor in Riemannian geometry is the Riemannian connection. The Riemannian connection is a symmetric affine connection that defines a linear one-to-one mapping between tangent spaces. For the Fisher metric, the Riemannian connection exists uniquely. The Riemannian connection coefficients are given in the form of the Christoffel symbols by the Fisher metric  as
Geodesics are defined as the autoparallel curves on the manifold. Usually with the Christoffel symbols, a geodesic is calculated by the equations :It must be pointed out that, on Riemannian manifold, the geodesic distance between two points equals the length of the shortest geodesic segment between them .
3.4. Riemannian Geometry of the Multivariate Normal Manifold
The multivariate normal manifold consists of -dimensional normal distributions parameterized by the mean and covariance matrix and is denoted by
As detailed in , the coordinate system of is given by . Moreover, the tangent space coincides with the product space of -dimensional vectors and symmetric matrices, which is denoted by . According to (8), the Fisher metric is given in the form of inner product:where denotes the point and . According to (12) and (13), the geodesics on the multivariate normal manifold are calculated by solving the following equations:The geodesic distance is calculated by integral (9) along the shortest geodesic. Specifically, when , the geodesic is explicitly given in [14, 20, 21]. The geodesic distance on the univariate normal manifold is given by
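For the univariate case, the closed form of the geodesic (Rao) distance commonly quoted in the literature is $d = 2\sqrt{2}\,\operatorname{artanh}(\delta)$ with $\delta^2 = \frac{(\mu_1 - \mu_2)^2 + 2(\sigma_1 - \sigma_2)^2}{(\mu_1 - \mu_2)^2 + 2(\sigma_1 + \sigma_2)^2}$. Whether (17) was typeset in exactly this form is an assumption, as are the symbols and the function name in the sketch below.

```python
import math


def rao_distance_univariate(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2),
    using the standard closed form for the univariate normal manifold."""
    numerator = (mu1 - mu2) ** 2 + 2.0 * (sigma1 - sigma2) ** 2
    denominator = (mu1 - mu2) ** 2 + 2.0 * (sigma1 + sigma2) ** 2
    delta = math.sqrt(numerator / denominator)
    return 2.0 * math.sqrt(2.0) * math.atanh(delta)


if __name__ == "__main__":
    # Equal means: the distance reduces to sqrt(2) * |ln(sigma2 / sigma1)|.
    print(rao_distance_univariate(0.0, 1.0, 0.0, 2.0), math.sqrt(2.0) * math.log(2.0))
    # Equal variances, shifted mean.
    print(rao_distance_univariate(0.0, 1.0, 1.0, 1.0))
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric in its two arguments and satisfies the triangle inequality.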
Besides the aforementioned basic properties, the geodesic distance defined on the multivariate normal manifold owns the invariance property under the group transformation which acts as where denotes the group of -dimensional regular matrices.
Unfortunately, in the general multivariate case, analytical solutions to the geodesic equations (16) are complicated. A more recent development is the derivation of an explicit formula for the geodesic curves in [12, 13, 22] when the initial point and initial direction of the geodesic are given. Nevertheless, how to obtain the geodesic curve given its endpoints, and hence the geodesic distance between them, remains unsolved. Progress has been made, however, on the set of multivariate normal distributions with a constant mean vector, on which another well-known geometry is defined.
3.5. The Submanifold with Fixed Mean Vector
Consider the submanifold of multivariate normal distributions that share a fixed, constant mean vector. Its structure has been studied in [14, 15, 23]: the geodesics have been given explicitly and the geodesic distance has been computed. In particular, it has been shown to be a totally geodesic submanifold of the multivariate normal manifold. That is, the geodesic joining two of its points lies entirely within it.
Thus, if given , the geodesic from to is given in , by and the initial vector is in the direction of For any , define . Then the geodesic distance is where , , are the eigenvalues of .
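The fixed-mean geodesic distance is straightforward to evaluate numerically. In its standard form it is $\sqrt{\tfrac{1}{2}\sum_i \ln^2 \lambda_i}$, where the $\lambda_i$ are the eigenvalues of $\Sigma_1^{-1/2}\Sigma_2\Sigma_1^{-1/2}$ for the two covariance matrices $\Sigma_1$ and $\Sigma_2$. These symbols and the function name are assumptions made for this sketch. The sketch uses a Cholesky factor instead of the matrix square root, which yields the same eigenvalues.

```python
import numpy as np


def fixed_mean_geodesic_distance(cov1, cov2):
    """Geodesic distance between two normal distributions with a common mean,
    computed from the eigenvalues of cov1^{-1} cov2."""
    chol = np.linalg.cholesky(cov1)
    # chol^{-1} cov2 chol^{-T} has the same eigenvalues as cov1^{-1/2} cov2 cov1^{-1/2}.
    half = np.linalg.solve(chol, cov2)
    whitened = np.linalg.solve(chol, half.T).T
    eigenvalues = np.linalg.eigvalsh((whitened + whitened.T) / 2.0)
    return float(np.sqrt(0.5 * np.sum(np.log(eigenvalues) ** 2)))


if __name__ == "__main__":
    cov_a = np.array([[2.0, 0.3], [0.3, 1.0]])
    cov_b = np.array([[1.0, -0.2], [-0.2, 3.0]])
    print(fixed_mean_geodesic_distance(cov_a, cov_b))
    # Invariance under a congruence by an invertible matrix q.
    q = np.array([[1.0, 2.0], [0.5, -1.0]])
    print(fixed_mean_geodesic_distance(q @ cov_a @ q.T, q @ cov_b @ q.T))
```

The two printed values agree, reflecting the invariance of the distance under the action of regular matrices on the covariance.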
4. The Design of Geometric Distance Detector
Let . Note in (4); thus the possible distributions of constitute a submanifold in . We denote the distribution in (5) as the point in the manifold . Since in general case , it has been mentioned before that the direct acquisition of the geodesic distance between multivariate normal distributions with different mean vectors is hard. Here we introduce the Siegel distance given by Calvo and Oller in , which attempts to calculate the geodesic distance between the images of the original distributions based on embedding into :Then the Siegel distance is defined in  as
Similarly, the Siegel distance between and the submanifold is defined aswhere denotes the normal distribution and . The second equality is due to the invariance properties of the Siegel distance, which is the same as the geodesic distance. The last equality follows from the definition of the Siegel distance in (24).
It has been proven that is isometric to . However, is not a geodesic submanifold in . That is to say, the geodesic curve joining the images and in may contain points outside . Thus, the Siegel distance provides a lower bound for the geodesic distance between multivariate normal distributions, which has also been verified as a distance measure on the multivariate normal manifold.
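The Siegel distance between two fully specified normal distributions can be sketched as follows. The sketch assumes the embedding usually attributed to Calvo and Oller, which sends $(\mu, \Sigma)$ to the $(n+1)\times(n+1)$ positive-definite matrix with blocks $\Sigma + \mu\mu^T$, $\mu$, $\mu^T$ and $1$, followed by the fixed-mean distance formula applied to the embedded matrices. The symbols and function names are assumptions for illustration, and the minimization over the submanifold in (25) is not implemented here.

```python
import numpy as np


def embed(mean, cov):
    """Map (mean, cov) to the block matrix [[cov + m m^T, m], [m^T, 1]]."""
    m = np.asarray(mean, dtype=float).reshape(-1, 1)
    c = np.asarray(cov, dtype=float)
    top = np.hstack([c + m @ m.T, m])
    bottom = np.hstack([m.T, np.ones((1, 1))])
    return np.vstack([top, bottom])


def spd_distance(a, b):
    """sqrt(0.5 * sum(ln^2 lambda_i)), lambda_i being the eigenvalues of a^{-1} b."""
    chol = np.linalg.cholesky(a)
    half = np.linalg.solve(chol, b)
    whitened = np.linalg.solve(chol, half.T).T
    eigenvalues = np.linalg.eigvalsh((whitened + whitened.T) / 2.0)
    return float(np.sqrt(0.5 * np.sum(np.log(eigenvalues) ** 2)))


def siegel_distance(mean1, cov1, mean2, cov2):
    """Siegel distance between N(mean1, cov1) and N(mean2, cov2)."""
    return spd_distance(embed(mean1, cov1), embed(mean2, cov2))


if __name__ == "__main__":
    # Univariate check against the Rao distance between N(0, 1) and N(1, 1),
    # which equals sqrt(2) * ln 2; the Siegel distance should not exceed it.
    siegel = siegel_distance([0.0], [[1.0]], [1.0], [[1.0]])
    rao = np.sqrt(2.0) * np.log(2.0)
    print(siegel, rao)
```

In this univariate example the Siegel distance is slightly smaller than the geodesic distance, consistent with its role as a lower bound, and for equal means the two coincide.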
The solution to (25) is readily found using the following lemma.
Lemma 3 (Pythagorean theorem). Let be a point in and let be a submanifold of . A necessary and sufficient condition for a point to be a stationary point of the function restricted on is for the geodesic connecting and to be orthogonal to at .
The proof of the Pythagorean theorem is common in differential geometry. Then the lemma follows.
Theorem 4. The point in that achieves the minimum of (25) must satisfy for some and .
Proof. By the Pythagorean theorem, we have This theorem follows due to the arbitrariness of .
The matrix given in (30) has at most two nonzero eigenvalues , which must satisfy Note that the eigenvalues and are opposite sign while .
Theorem 5. If and are a solution of the following equations,then the minimum in (25) is achieved byand the Siegel distance between and is
Proof. For the eigenvalues , the corresponding eigenvectors of in (30) can be given as , . Taking orthogonal matrix we haveThen from and we have whereFrom (28) and (39), we can get Therefore, From (41)-(42), we have Equations (43) and (46) imply (33). Since is a complete manifold, there exists a solution to (33) of the two variables and .
Equation (34) follows from (44). By (22), the Siegel distance between and becomes That is (35). This theorem thus holds.
The equations in (33) are nonlinear and can be solved numerically, after which the Siegel distance is calculated by (35). Let . From Theorem 5, it is easy to see that the Siegel distance is closely related to . As stated in Section 1, with the Siegel distance , the detection problem (3) can be carried out with the critical region defined in (1). Note that the matrix in is left undetermined, which does not affect the derivation of . Therefore, a set of geometric distance detectors is obtained via different choices of the matrix S.
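Whatever test statistic is used, the threshold of the critical region has to be set so that the false alarm probability equals the significance level. One generic way to do this is Monte Carlo calibration under the null hypothesis, sketched below. The quadratic form used as the statistic is only a stand-in chosen for illustration, not the Siegel distance of Theorem 5, and the noise covariance is assumed known. All names are hypothetical.

```python
import numpy as np


def quadratic_statistic(x, nominal_cov):
    """Stand-in statistic x^T S^{-1} x for a nominal covariance matrix S."""
    return float(x @ np.linalg.solve(nominal_cov, x))


def calibrate_threshold(statistic, noise_samples, alpha):
    """Return the empirical (1 - alpha) quantile of the statistic over
    noise-only samples, so that roughly a fraction alpha exceeds it."""
    values = np.array([statistic(x) for x in noise_samples])
    return float(np.quantile(values, 1.0 - alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cov = np.array([[2.0, 0.5], [0.5, 1.0]])  # assumed known for the sketch
    noise = rng.multivariate_normal(np.zeros(2), cov, size=20000)
    threshold = calibrate_threshold(lambda x: quadratic_statistic(x, cov), noise, alpha=0.05)
    print(threshold)
```

Any statistic that is an increasing function of the one used here gives the same critical region once its threshold is recalibrated, so the calibration step does not depend on the particular monotone transformation.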
5. The Choice of Matrix S
Let be the realization of random vector for each . Note that Figure 1 illustrates how the Siegel distance varies with the logarithm of increasing. The curve is obvious to demonstrate a monotonic increase. Thus the critical region in (1) is rewritten as where is an increasing function of .
If the underlying covariance is known, it is natural to take as directly. In this case, the optimal matched filter is derived. Otherwise, efficient information about is given by the maximum likelihood estimate based on the received data. When only with the secondary data, is replaced by thus the Wald test belongs to the set of geometric distance detectors. In addition, substituting the maximum likelihood estimate based on the whole set of data in place of , we can find it is the same as the Rao test.
In summary, the three classical tests mentioned above are verified to be members of the set of geometric distance detectors designed based on the Siegel distance.
6. Intuitive Interpretations
In this section, some figures are presented to give intuitive interpretations of how the geometric distance detectors work. As special cases, the optimal matched filter, the Rao test, and the Wald test are described in the form of geometric distance detectors.
Figure 2 illustrates the definition of the geodesic distance from a point on the univariate normal manifold to the submanifold with the same mean. A family of geodesics emanating from the point is shown. Color gradients along the geodesics represent the geodesic distance, given by (17), from the starting point to the current point. The geodesic distance from the point to the submanifold is given by the integral along the geodesic marked in red. A geometric distance detector requires a threshold for the hypothesis testing problem: the null hypothesis is not rejected if the Siegel distance between the noise distribution and the submanifold determined by the measurements is less than the threshold. The critical region in (1) is obtained by projecting the submanifolds that lie beyond the threshold onto the observation space. The same principle applies to the Siegel distance on the multivariate normal manifold. Because they cannot be visualized, the multidimensional cases are not presented here.
Figure 3 shows a simulation with , , since the examples with a higher dimension have similar behaviors but no visualization features. The distributions of noise are displayed at the origin, with the corresponding noise covariance matrix: the underlying covariance and the sample covariance or . All curves were drawn as the projections on the measurement space. It is obvious that the part outside a contour curve represents the critical region of the geometry detector with the corresponding alarm probability listed in the legend. As the intrinsic merit of geometric measure, the magnitude of the Siegel distance reflects how strongly the collected data contradict the null hypothesis . In Figure 3(d), contour curves with the same alarm probability are presented. We can observe that the contour curves of the Siegel distance based on and appear to overlap. This is due to the equivalency of the Rao test and Wald test when . In fact, if we denote the eigenvalues of the matrix as , then, according to [2, 3], the Rao test and Wald test are, respectively, rewritten as When , has only one nonzero eigenvalue. Thus, both the Rao test and the Wald test coincide to test for the nonzero eigenvalue. However, they yield different Siegel distances because of the distinct covariance matrices which are neglected in projection mapping. The acceptance regions of both the Rao test and the Wald test can be improved by increasing the number of secondary data, so as to approach that of the geometric distance detector based on the underlying noise covariance.
As a general rule, the Rao test and Wald test have different acceptance regions. Figure 4 displays another simulation with , . For flat visualizations, the contour curves of the Siegel distance are projected on the space of .
7. Conclusion
In this paper, an information geometric viewpoint on the detection of range distributed targets embedded in Gaussian noise with unknown covariance is put forward. More precisely, we have derived a set of geometric distance detectors, of which the optimal matched filter, the Rao test, and the Wald test are members. This establishes a link between information geometry and hypothesis testing. In future research, other choices of the matrix S can be tested, and it may also be of interest to identify the member of the set of geometric distance detectors with the best performance.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported in part by the National Key Basic Research Program of China under Grant 2013CB329405, the National Natural Science Foundation of China under Grant 61374027, and the Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20130181110042.
References
E. J. Kelly, “An adaptive detection algorithm,” IEEE Transactions on Aerospace and Electronic Systems, vol. 22, no. 2, pp. 115–127, 1986.
P. S. Eriksen, “Geodesics connected with the Fisher metric on the multivariate normal manifold,” in Proceedings of the GST Workshop: Geometrization of Statistical Theory, pp. 225–229, University of Lancaster, Lancaster, UK, October 1987.
C. Lenglet, M. Rousson, R. Deriche, and O. Faugeras, “Statistics on the manifold of multivariate normal distributions: theory and application to diffusion tensor MRI processing,” Journal of Mathematical Imaging and Vision, vol. 25, no. 3, pp. 423–444, 2006.
L. Madsen, The geometry of statistical models [Ph.D. thesis], University of Copenhagen, Copenhagen, Denmark, 1978.
S.-I. Amari, “Theory of information space: a differential-geometrical foundation of statistics,” Post RAAG Reports 106, 1980.