Journal of Electrical and Computer Engineering
Volume 2012, Article ID 628479, 7 pages
Research Article

Target Detection Using Nonsingular Approximations for a Singular Covariance Matrix

1Department of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
2Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
3Signal and Image Centre, Royal Military Academy, 1000 Brussels, Belgium

Received 1 April 2012; Accepted 7 June 2012

Academic Editor: Xiaofei Hu

Copyright © 2012 Nir Gorelik et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Accurate covariance matrix estimation for high-dimensional data can be a difficult problem. A good approximation of the covariance matrix needs in most cases a prohibitively large number of pixels, that is, pixels from a stationary section of the image whose number is several times greater than the number of bands. Estimating the covariance matrix with a number of pixels that is on the order of the number of bands or less will yield not only a poor estimate of the covariance matrix but also a singular covariance matrix, which cannot be inverted. In this paper we investigate two methods that give a sufficient approximation of the covariance matrix while using only a small number of neighboring pixels. The first is the quasilocal covariance matrix (QLRX), which uses the variances of the global covariance instead of the local variances that are too small and would cause a singular covariance. The second is the sparse matrix transform (SMT), which performs a set of K Givens rotations to estimate the covariance matrix. We compare results from target acquisition based on both of these methods. An improvement to the SMT algorithm is suggested.

1. Introduction

The most widely used algorithms for target detection are traditionally based on the covariance matrix [1]. This matrix estimates the direction and magnitude of the noise in an image. The matched filter presented in [1] is

$R = t^T \Phi_G^{-1}(x - m)$, (1)

where $x$ is the examined pixel, $m$ is the estimate of that pixel based on its surroundings, $\Phi_G$ is the global covariance matrix, and $t$ is the target signature. In words, the matched filter will detect the target in a particular pixel $x$ if $x$ differs from its surroundings ($x - m$), unlike the noise (controlled by $\Phi_G^{-1}$), and in the direction of the target. If the target signature is unknown, then the RX algorithm uses the target residual $(x - m)$ as its own match, that is,

$R = (x - m)^T \Phi_G^{-1} (x - m)$. (2)

$\Phi_G$ is traditionally calculated as

$\Phi_G = \frac{1}{N}\sum_{i=1}^{N} (x_i - m)(x_i - m)^T$. (3)

Although this estimate is theoretically justified if the background is stationary, it is often used in cases where this is not true.
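As a concrete illustration, the matched-filter and RX scores above can be sketched in a few lines of NumPy. This is our own minimal sketch, not the authors' implementation; the array shapes and names are assumptions.

```python
import numpy as np

def matched_filter_score(x, m, t, cov_inv):
    """Matched filter of Eq. (1): R = t^T Phi_G^{-1} (x - m)."""
    return t @ cov_inv @ (x - m)

def rx_score(x, m, cov_inv):
    """RX score of Eq. (2): R = (x - m)^T Phi_G^{-1} (x - m)."""
    d = x - m
    return d @ cov_inv @ d

# Global covariance of Eq. (3), estimated from N pixels (rows of X)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))            # toy data: 1000 pixels, 5 bands
m = X.mean(axis=0)
cov = (X - m).T @ (X - m) / X.shape[0]
cov_inv = np.linalg.inv(cov)
```

Because $\Phi_G$ here is estimated from many more pixels than bands, the inverse exists; the rest of the paper deals with the case where it does not.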

In target detection, the image is not normally statistically stationary; it will, however, have quasistationary “patches” that connect to each other at the edges. When one estimates the mean and covariance matrix of the background of a particular pixel, the local neighboring pixels provide a better estimate than the pixels of the entire image. In [2], we show that much better results can be obtained with a “quasilocal covariance matrix” (QLRX). In general terms, it uses the eigenvectors of the global covariance matrix, but the eigenvalues are taken locally. This tends to lower the matched filter scores at edges in the data (where the image passes from one stationary distribution to another) but allows accurate detection in less noisy areas.

The overall question of estimating a covariance matrix from local areas in which the available data are sparse is a well-studied issue in the literature. In particular, in [3], Theiler et al. consider the sparse matrix transform (SMT) for determining a covariance matrix based on limited data. In this paper, we compare the two methods in terms of both their detection ability and their overall efficiency.

2. Local Covariance Matrix

Assume we are given a dataset 𝑋 which is composed of 𝑛 pixels with 𝑝 dimensions. Φ𝐺 is the covariance matrix of this dataset. An SVD (singular value decomposition) can be used to decompose the global covariance matrix [2] into eigenvectors and eigenvalues; we will refer to this as PCA (principal component analysis) space.

To compare the covariance matrix based on the local area surrounding a pixel with a matrix based on all the available data (referred to as global), consider the statistics of the dataset $\tilde{X} = E_G^T X$. Here $E_G$ is the rotation matrix built from the global eigenvectors [2], and $\tilde{X}$ is the dataset after rotation into the PCA subspace.

If $\tilde{X}$ is based on all the pixels in the image, then the covariance matrix of $\tilde{X}$ is a diagonal matrix with the global eigenvalues on the diagonal. However, if $\tilde{X}$ contains only the local surroundings, then the values on the diagonal of its covariance matrix represent the variances of the local data in the directions of the global eigenvectors.

Mathematically, using the dataset $\tilde{X}$, for every pixel we calculate the local covariance $\Phi_L$ from the nearest neighbors. The diagonal $D_L$ of the local covariance matrix is the variance of the neighbors in the PCA subspace:

$D_L = \mathrm{diag}(E_G^T \Phi_L E_G) = \mathrm{diag}(\tilde{\Phi}_L)$. (4)

Since the local covariance is composed from a small number of samples, some of the variances may be inappropriately small or even cause a singular covariance. To avoid singularity, the variance matrix $\Lambda_{QL}$ takes the elementwise maximum of the global eigenvalues ($\Lambda_G$) and the variances of the local covariance in the PCA subspace ($D_L$):

$\Lambda_{QL} = \max(\Lambda_G, D_L)$. (5)

In this way, if the local area of the pixel has a large variance in some bands, it will be whitened by the local variance; bands whose local variances are too small (and could even cause a singular covariance matrix) use the global variance instead. The quasilocal covariance is

$\Phi_{QL} = E_G \Lambda_{QL} E_G^T$. (6)

In the PCA subspace it is simply

$\tilde{\Phi}_{QL} = \Lambda_{QL}$. (7)

If we calculate the RX score in the PCA subspace, we need fewer rotations: we rotate all the data into the PCA subspace only once, and then

$\mathrm{QLRX} = (\tilde{x} - \tilde{m}_L)^T \Lambda_{QL}^{-1} (\tilde{x} - \tilde{m}_L) = \sum_{i=1}^{p} \frac{(\tilde{x}_i - \tilde{m}_{L,i})^2}{\lambda_i}$, (8)

where $\tilde{m}_L$ is the mean of the selected surrounding pixels in the PCA subspace.
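The QLRX computation of Eqs. (4)–(8) can be sketched as follows. The function name is ours; `E_G` and `lambda_G` are assumed to come from an eigendecomposition of the global covariance matrix.

```python
import numpy as np

def qlrx_score(x, neighbors, E_G, lambda_G):
    """Quasilocal RX of Eqs. (4)-(8): global eigenvectors, local variances."""
    x_t = E_G.T @ x                          # pixel in the global PCA subspace
    N_t = neighbors @ E_G                    # neighbors rotated (one per row)
    m_L = N_t.mean(axis=0)                   # local mean in the PCA subspace
    D_L = N_t.var(axis=0)                    # local variances, Eq. (4)
    lam = np.maximum(lambda_G, D_L)          # Eq. (5): guard against singularity
    return np.sum((x_t - m_L) ** 2 / lam)    # Eq. (8)

# The global eigenpairs would come from the whole image, e.g.:
# lambda_G, E_G = np.linalg.eigh(global_cov)
```

Note that when the local neighborhood has zero variance in some direction, Eq. (5) silently falls back to the global eigenvalue, so the division in Eq. (8) is always well defined.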

For subpixel targets, previous work [1] shows that it is better to use the mean of the 8 neighbors $m_8$. This can be done assuming that the target does not affect its surroundings; if we fear that the target has leaked into the surrounding pixels, then we ignore those pixels and use only the pixels external to them for our estimate.

In the sparse matrix transform (SMT) method, we use sparse matrix rotations to find the covariance matrix nearest to the original one that is still nonsingular. We can use SVD to decompose the local covariance:

$\Phi_L = E_L \Lambda_L E_L^T, \quad \Phi_L \in \mathbb{R}^{p \times p}$. (9)

Since every eigenvector matrix $E_L$ (or any unitary matrix) can be written as a product of $K$ sparse orthonormal rotation matrices [3], we can write

$E_L = \prod_{k=K-1}^{0} E_k = E_{K-1} E_{K-2} \cdots E_0$. (10)

Every rotation matrix $E_k$ is a Givens rotation operating on a pair of coordinate indices $(i_k, j_k)$; the rotation acts in the plane spanned by the axes $i_k$ and $j_k$. That is, $E_k$ equals the identity matrix except for the entries

$E_k(i_k, i_k) = E_k(j_k, j_k) = \cos\theta_k, \quad E_k(i_k, j_k) = \sin\theta_k, \quad E_k(j_k, i_k) = -\sin\theta_k$. (11)

With $K = \binom{N}{2}$ rotations we can get from the identity matrix to any rotation matrix.
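A single Givens rotation of Eq. (11) is easy to write down explicitly; a minimal sketch (the function name is ours):

```python
import numpy as np

def givens_rotation(p, i, j, theta):
    """p x p Givens rotation E_k of Eq. (11), acting on coordinates (i, j)."""
    E = np.eye(p)
    c, s = np.cos(theta), np.sin(theta)
    E[i, i] = c
    E[j, j] = c
    E[i, j] = s
    E[j, i] = -s
    return E
```

Each such matrix is orthonormal with determinant one, so any product of them is again a rotation matrix, which is what Eq. (10) exploits.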

The concept of SMT is to start from the identity matrix, rotate two axes at a time toward the eigenvector axes of the local covariance matrix, and stop the rotations when the fit is best without becoming singular.

In other words, from a first set of data we can determine the correlations between the variables. If we did all the possible rotations, we would have diagonalized the matrix, but we cannot do this since we do not have enough data to simultaneously find all the local eigenvectors. Instead, we do these rotations on the most correlated ones, testing our new matrix by the degree that it provides good results on a second dataset. When our correction to the second dataset fails, we stop the diagonalizing procedure.

Mathematically, the rotation matrix $T$ is all the selected rotations combined:

$T = \prod_{k=K-1}^{0} E_k = E_{K-1} E_{K-2} \cdots E_0, \quad K < \binom{N}{2}$. (12)

The variances are the variances of the local covariance matrix along the axes of the rotation matrix $T$, $\Lambda_{SMT} = \mathrm{diag}(T^T \Phi_L T)$, and the inverse covariance matrix is $\Phi_{SMT}^{-1} = T \Lambda_{SMT}^{-1} T^T$. To decide which rotation matrix is best, we use maximum likelihood covariance estimation and “leave-third-out” cross-validation. (Note that leave-one-out cross-validation would give better results but would cost much more computational effort.)

We divide the group of pixels into three groups. We take one third as the tested pixels $Y \in \mathbb{R}^{(n/3) \times p}$ and use the other two thirds to compute the approximation of the covariance $\Phi_{SMT}$. After every rotation we calculate the likelihood that the covariance $\Phi_{SMT}$ correctly describes the group $Y$ (the data have been preprocessed so that $Y$ is zero mean):

$P_{\Phi_{SMT}}(Y) = \frac{1}{(2\pi)^{p/2}\left|\Phi_{SMT}\right|^{1/2}} \exp\left(-\frac{1}{2}\,\mathrm{tr}\left(Y^T \Phi_{SMT}^{-1} Y\right)\right)$. (13)

We do this three times, each time taking out another third as the test data $Y$; after combining the results of the three tests, we find the number of rotations that gives the best result (the highest value of $P(Y)$); we then use the full set and this number of rotations to get the final approximation of the covariance matrix (see Figure 1).
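The likelihood of Eq. (13) is cheap to evaluate in the rotated axes, since $\Phi_{SMT} = T\,\mathrm{diag}(\Lambda_{SMT})\,T^T$. A sketch of the log-likelihood, written with the Gaussian normalization applied to all $n$ test pixels (the function name is ours):

```python
import numpy as np

def smt_loglik(Y, T, lam):
    """Log-likelihood corresponding to Eq. (13) for zero-mean test pixels Y
    (one pixel per row), with Phi_SMT = T diag(lam) T^T."""
    n, p = Y.shape
    Z = Y @ T                             # test data in the SMT axes
    logdet = np.sum(np.log(lam))          # log |Phi_SMT|
    quad = np.sum(Z**2 / lam)             # tr(Y^T Phi_SMT^{-1} Y)
    return -0.5 * (n * p * np.log(2 * np.pi) + n * logdet + quad)
```

Working in log space avoids the underflow that the raw product of densities in Eq. (13) would suffer for large $p$.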

Figure 1: The probability that ΦSMT describes 𝑌 correctly after 𝑘 rotations; the 𝑘 that will be chosen is the one that gives the maximum probability.

To select at each step the rotation that makes the biggest improvement, we perform greedy minimization, that is, we always choose the next rotation that contributes most to reducing the correlation between the data along the axes of the matrix:

$(i_k, j_k) \leftarrow \arg\max_{i,j} \frac{S_{ij}^2}{S_{ii} S_{jj}}$, (14)

where $S$ is the current covariance matrix, $(i, j)$ are the indices of two rows of the matrix, and $S_{ij}, S_{ii}, S_{jj}$ are the matrix entries with those indices.
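Putting the pieces together, one greedy SMT pass might look like the sketch below. It is written under our own assumptions: the rotation angle is the Jacobi angle that zeroes $S_{ij}$, the diagonal variances are assumed strictly positive, and the number of rotations $K$ is taken as given rather than cross-validated.

```python
import numpy as np

def smt(S, K):
    """Greedy SMT sketch, Eqs. (12)-(14): K Givens rotations on covariance S.
    Returns (T, lam) with the approximation Phi_SMT = T diag(lam) T^T."""
    p = S.shape[0]
    S = S.copy()
    T = np.eye(p)
    for _ in range(K):
        # Eq. (14): pick the most correlated coordinate pair
        C = S**2 / np.outer(np.diag(S), np.diag(S))
        np.fill_diagonal(C, 0.0)
        i, j = np.unravel_index(np.argmax(C), C.shape)
        # Jacobi angle that zeroes the off-diagonal entry S[i, j]
        theta = 0.5 * np.arctan2(2 * S[i, j], S[j, j] - S[i, i])
        c, s = np.cos(theta), np.sin(theta)
        G = np.eye(p)
        G[i, i] = c; G[j, j] = c
        G[i, j] = s; G[j, i] = -s
        S = G.T @ S @ G                   # rotate the working covariance
        T = T @ G                         # accumulate T = E_{K-1} ... E_0
    return T, np.diag(S).copy()
```

Each rotation strictly reduces the off-diagonal energy of the working covariance, which is why the greedy criterion of Eq. (14) makes progress toward a diagonal (and hence invertible) approximation.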

After we calculate the covariance matrix for SMT, we can use it for anomaly detection:

$\mathrm{RX}_{SMT} = (x - m_L)^T \Phi_{SMT}^{-1}(x - m_L)$, (15)

where $x$ is the tested pixel and $m_L$ is the mean of the selected surrounding pixels in the PCA subspace.

For subpixel targets, as stated previously, it is better to use the mean of the 8 neighbors $m_8$.

3. Dataset

Two datasets were used (Figure 2); a description of their origin can be found in Table 1 and in greater detail in [4].

Table 1: Datasets information—OBP1 and OBP2 are parts of OBP.
Figure 2: RGB composite of the original data cubes. From left to right: OBP1, OBP2.

The two data cubes (OBP1 and OBP2) are real data from the HyMAP sensor into which anomalies were inserted artificially by linearly mixing the spectrum of a green-paint pixel with the original background pixel. For display purposes, Figure 2 shows images with full-pixel paint spectra. For the evaluation of anomaly detection results, images with a mixing ratio of 0.33 ($P = 33\%$) were used.

4. Results

We now wish to compare the SMT and QLRX algorithms. We will perform RX anomaly detection (2) using the covariance matrices given by each of the algorithms.

Since the dataset being used contains implanted subpixel targets without any danger of overlap into neighboring pixels, the mean in the calculation of (2) was always the mean of the eight nearest neighbors. However, we must determine which neighborhood for the SMT covariance calculation provides the best results.

The first test was done using only the nearest 8 neighbors for the approximation of the covariance; in this test, it is easy to see that the QLRX results are superior to the SMT results. The ROC curves are given in Figure 3. We assume that the area around any examined pixel consists of a square region of dimension OWS by OWS (outer window). The target itself sits in a GWS by GWS region (guard window); for subpixel targets GWS equals 1. The neighboring pixels are thus those pixels located in the OWS by OWS square that are not contained in the inner GWS by GWS square.
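Selecting the OWS/GWS neighborhood described above can be sketched as follows (the function and array names are ours; the image cube is assumed to be indexed rows × columns × bands):

```python
import numpy as np

def window_neighbors(img, r, c, ows, gws):
    """Pixels in the OWS x OWS window centred on (r, c), excluding the
    central GWS x GWS guard window and positions outside the image."""
    ho, hg = ows // 2, gws // 2
    rows, cols = img.shape[:2]
    out = []
    for i in range(r - ho, r + ho + 1):
        for j in range(c - ho, c + ho + 1):
            if not (0 <= i < rows and 0 <= j < cols):
                continue                  # clipped at the image border
            if abs(i - r) <= hg and abs(j - c) <= hg:
                continue                  # inside the guard window
            out.append(img[i, j])
    return np.array(out)
```

For OWS = 3 and GWS = 1 this yields the 8 nearest neighbors used throughout the paper.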

Figure 3: Results of RX algorithm using QLRX and SMT on dataset OBP1 with the stated OWS = 3 and GWS = 1.
Figure 4: OBP1 results with OWS given by the number in the legend.

These figures show the results for $P = 33\%$; the tests for 10, 25, 50, and 100 percent gave similar results.

Results from the OBP2 dataset were comparable.

When the OWS is larger, the results of the SMT improve dramatically.

For the dataset OBP1 we can see a large improvement as OWS increases. For this dataset, QLRX gives better results, especially in the low-CFAR (constant false alarm rate) regime.

For the dataset OBP2, the differences between QLRX and SMT are reduced but still QLRX performs better in the low CFAR regime (see Figure 5).

Figure 5: Similar to Figure 4 for the OBP2 dataset.

Similar results were obtained for the cases in which 10, 25, 50, or 100 percent of the target is in the pixel.

The SMT has two main difficulties. First, the algorithm calculates a new covariance matrix at every point. This calculation needs a sequential set of rotations based on the training set, followed by evaluations on the test set; both sets are taken from the pixel's surroundings only, so no information outside the selected group is used in the calculation. In QLRX, by contrast, the eigenvectors are the same for all points (the eigenvectors of the global covariance); all we need to do is measure the variance of the local area in the spectral directions of the eigenvectors and assemble the new covariance matrix. Second, the calculation of the SMT itself is highly dependent on the size of the “local” area: while a larger area improves the results, it also increases the calculation time (see Table 2).

Table 2: This table shows the time it took to complete the calculation of a dataset.

5. Improvements for SMT

A small change in the published method for doing SMT could lead to a large improvement.

In the original algorithm, the initial assumed axes of the covariance matrix are in the direction of the original dataset. Then the axes are rotated in pairs into the directions of the “local eigenvectors” to create new covariance matrices.

When we stop, some of the axes will be almost in the same direction as the local covariance eigenvectors, and some will be closer to the direction of the original axes.

Since the original directions were random, that is, unrelated to the correlations between the axes, there is no reason that this starting point should be optimal. In particular, would it not make more sense to start from the global eigenvectors and rotate into the local ones? A further benefit of this approach is that we start the rotations from a point that is most probably closer to the optimum (see Figure 6), so we reach the maximum likelihood in fewer rotations. We call this new algorithm SMT-PCA.
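The SMT-PCA idea amounts to one change of basis before the rotations begin. A minimal sketch, assuming some SMT routine with the hypothetical signature `smt(S, K) -> (T, lam)` (all names here are ours):

```python
import numpy as np

def smt_pca(local_cov, E_G, K, smt):
    """SMT-PCA sketch: start the rotations from the global PCA axes E_G
    instead of the original band axes, then refine with K Givens rotations
    via a supplied SMT routine smt(S, K) -> (T, lam)."""
    S0 = E_G.T @ local_cov @ E_G      # local covariance in the global PCA axes
    T, lam = smt(S0, K)               # fewer rotations are needed from here
    return E_G @ T, lam               # combined rotation in the original axes
```

Because $E_G$ and $T$ are both orthonormal, the combined matrix $E_G T$ is again a valid rotation, so the resulting covariance approximation has the same form as in the original SMT.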

Figure 6: When starting from PCA subspace, we will start from a closer point to the maximum so we need fewer rotations; the delta in 𝑘 between the original location to the current one is the rotations done by transforming to the PCA subspace.

For the OBP1 dataset (Figure 7), the results after starting with the subspace based on the global eigenvectors are better than QLRX when OWS is large (7 or 9). SMT-PCA gives better results than SMT for any OWS.

Figure 7: OBP1 results with OWS given by the number in the legend.

For the OBP2 dataset (Figure 8), the results after starting with the subspace based on the global eigenvectors are better than QLRX when OWS is large (7 or 9). SMT-PCA gives better results than SMT for any OWS (for OWS = 3, SMT and SMT-PCA are almost the same).

Figure 8: Similar to Figure 7 for the OBP2 dataset.

Table 3 examines the number of rotations needed by the SMT and by the SMT-PCA.

Table 3: This table shows the number of rotations needed to complete the calculation of a dataset.

6. Conclusions

As a preliminary to our conclusions, note that when we discuss using a small or large number of pixels, in all cases the number of pixels used is less than the number of spectral bands.

Two methods in this paper have been considered for dealing with possibly singular covariance matrices. In the first (QLRX), we use global eigenvectors and local eigenvalues as an approximation of the inverse covariance matrix. In the second (SMT), we use an iterative process to slowly “twist” our axes to come closer to those determined by the data.

In our two datasets, we found that if a small area was used for estimating the background, the QLRX algorithm was superior. For large areas of background, QLRX remains superior, although SMT improves greatly. Defining the ratio

$\frac{p}{n} = \frac{\text{number of pixels}}{\text{number of bands}} \in (0, 1)$, (16)

we found that:
(i) the calculation time of QLRX is much smaller (by two orders of magnitude) than that of both SMT and SMT-PCA;
(ii) the calculation time of SMT-PCA is less than that of the original SMT by about a factor of two;
(iii) SMT-PCA and QLRX both perform better than SMT for any number of pixels;
(iv) for a small number of pixels ($p/n \le 0.1$), QLRX performs better than SMT-PCA;
(v) for a large number of pixels ($0.1 < p/n \le 0.2$), SMT-PCA performs better than QLRX.


Acknowledgments

The authors gratefully recognize the partial support for this work from the Paul Ivanier Center for Robotics Research and Production Management, Beer-Sheva, Israel. The test images are part of the Oberpfaffenhofen HyMAP scene, collected in 2004, during an airborne campaign sponsored by the Belgian Science Policy Office (BelSPO). Flights were operated by the German Aerospace Center (DLR).


References

1. C. E. Caefer, M. S. Stefanou, E. D. Nielsen, A. P. Rizzuto, O. Raviv, and S. R. Rotman, “Analysis of false alarm distributions in the development and evaluation of hyperspectral point target detection algorithms,” Optical Engineering, vol. 46, no. 7, Article ID 076402, 2007.
2. C. E. Caefer, J. Silverman, O. Orthal, D. Antonelli, Y. Sharoni, and S. R. Rotman, “Improved covariance matrices for point target detection in hyperspectral data,” Optical Engineering, vol. 47, no. 7, Article ID 076402, 2008.
3. J. Theiler, G. Cao, L. R. Bachega, and C. A. Bouman, “Sparse matrix transform for hyperspectral image processing,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 3, pp. 424–437, 2011.
4. D. Borghys and C. Perneel, “Study of the influence of pre-processing on local statistics-based anomaly detector results,” in Proceedings of the Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS '10), pp. 1–4, June 2010.