Block Sparse Bayesian Learning over Local Dictionary for Robust SAR Target Recognition
This paper applies block sparse Bayesian learning (BSBL) to synthetic aperture radar (SAR) target recognition. Traditional sparse representation-based classification (SRC) operates on a global dictionary assembled from all the classes, and the similarities between the test sample and the individual classes are then evaluated by the reconstruction errors. In contrast, this paper reconstructs the test sample over local dictionaries formed by individual classes. Owing to the azimuthal sensitivity of SAR images, the linear coefficients over a local dictionary are sparse with a block structure, so BSBL is employed to solve for them. The proposed method can better exploit the representation capability of each class, thus benefiting the recognition performance. Experimental results on the moving and stationary target acquisition and recognition (MSTAR) dataset confirm the effectiveness and robustness of the proposed method.
Synthetic aperture radar (SAR) has been used in Earth observation since it was first developed. Automatic target recognition (ATR) is a specific application of SAR image interpretation, which aims to analyze the targets of interest in images and determine their labels. Since its start in the 1990s, SAR ATR has been studied widely using feature extraction and classification algorithms [1, 2]. Different types of features have been applied to SAR ATR, including geometrical, transformation, and electromagnetic features. Target contour, region, shadow, etc. are typical geometrical features, which describe the size or shape distribution of the target [3–12]. Ding et al. developed a matching algorithm for binary target regions in SAR ATR, which was further improved by Cui et al. using the Euclidean distance transform. The Zernike and Krawtchouk moments were employed to describe the target regions in [5, 6], respectively. Anagnostopoulos extracted outline descriptors for SAR target recognition. Papson validated the utility of the target shadow for SAR ATR. Transformation features are usually obtained by mathematical or signal processing means. The mathematical tools include principal component analysis (PCA), kernel PCA (KPCA), and nonnegative matrix factorization (NMF). In addition, some newly proposed manifold learning methods were demonstrated to be effective for SAR ATR [16–19]. Image decomposition tools including wavelets, the monogenic signal [21, 22], and empirical mode decomposition (EMD) [23, 24] were adopted in SAR ATR with good performance. The scattering center is the typical scattering feature, with several applications in SAR ATR [25–31]. A Bayesian matching scheme of attributed scattering centers was developed for target recognition. Ding et al. used the attributed scattering centers as the basic features and proposed several classification schemes [27, 28]. Zhang proposed a noise-robust method using attributed scattering centers.
Furthermore, the attributed scattering centers were employed to partially reconstruct the target in order to enrich the available training samples [30, 31]. In addition to single-type features, many multifeature SAR ATR methods have been designed in recent works [32–36].
The classification algorithms were mainly introduced from the pattern recognition field. Well-known classifiers, including the support vector machine (SVM) [37, 38], adaptive boosting (AdaBoost), and sparse representation-based classification (SRC) [40–42], were successfully applied to SAR ATR. SVM was first used by Zhao and Principe for SAR target recognition. Since then, SVM has been the most popular classifier for different kinds of features in SAR ATR [3, 5, 38]. Sun et al. applied AdaBoost to SAR target recognition, which enhanced the classification performance by boosting several simple classifiers. Based on compressive sensing theory, SRC was first validated in face recognition and further used in SAR ATR in many related works [40–42]. With the progress in deep learning, many novel networks were developed for SAR target recognition [44–59], among which the convolutional neural network (CNN) is the most widely used. Network architectures including the all-convolutional neural network (A-ConvNet), enhanced squeeze and excitation network (ESENet), gradually distilled CNN, cascade coupled CNN, and multistream CNN were developed and applied. Other works enriched the effective training samples using transfer learning, data augmentation, and so forth, thus improving the classification ability of the networks [51–53]. However, the performance of deep learning models is closely related to the scale of the training set. With scarce training SAR images, the final performance will be significantly impaired.
This paper proposes a novel classification scheme for SAR target recognition by improving traditional SRC. SRC performs a linear representation of the test sample over the global dictionary built from all the training samples. The reconstruction errors of the different classes are then analyzed to obtain the target label. In essence, SRC compares the relative representation capabilities of the classes, but the absolute representation capability of each class is not fully exploited. Therefore, this paper represents the test sample over the local dictionaries of the individual training classes, so that the capability of each class to describe and represent the input sample can be fully investigated. Considering the azimuthal sensitivity of SAR images [60, 61], the test sample is only related to those training samples that share similar azimuths with it. When the atoms in the local dictionary are sorted according to their azimuths, the linear coefficients over the local dictionary are sparse with a block structure; i.e., the nonzero elements concentrate in a small azimuth interval. Accordingly, block sparse Bayesian learning (BSBL) is employed to solve the sparse coefficients on the local dictionary, as it can exploit the block structure with higher precision. Finally, the reconstruction errors of the individual classes are analyzed to determine the target type. To investigate the performance of the proposed method, the moving and stationary target acquisition and recognition (MSTAR) dataset is employed for testing and comparison. The results validate the superiority of the proposed method under the standard operating condition (SOC) and typical extended operating conditions (EOCs).
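SRC can be regarded as a modification of the linear representation problem based on the idea of compressive sensing [40–43]. The sample to be classified is represented over a global dictionary comprising all the training samples, while the linear coefficients are sparse, with only a few nonzero entries. The global dictionary is denoted as $A = [A_1, A_2, \ldots, A_C] \in \mathbb{R}^{d \times N}$, in which $A_i \in \mathbb{R}^{d \times N_i}$ is the local dictionary with $N_i$ atoms of the $i$th class. For the test sample $y \in \mathbb{R}^{d}$, the reconstruction process is as follows:
$$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|y - A\alpha\|_2^2 \le \varepsilon, \tag{1}$$
where $\hat{\alpha}$ denotes the solved coefficient vector and $\varepsilon$ is the allowed reconstruction error.
With the solution $\hat{\alpha}$, the target label of $y$ is determined by calculating and comparing the reconstruction errors of the different classes as follows:
$$r(i) = \|y - A_i \delta_i(\hat{\alpha})\|_2^2, \quad i = 1, 2, \ldots, C, \qquad \mathrm{identity}(y) = \arg\min_{i} r(i), \tag{2}$$
where $\delta_i(\cdot)$ extracts the coefficient vector of the $i$th class.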
In SRC, the reconstruction errors actually embody the relative capabilities of the different classes to reconstruct the test sample. However, the absolute representation capability of each class cannot be effectively exploited. In other words, how well each individual class can reconstruct the test sample on its own should be further evaluated.
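The SRC procedure described in this section can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used in the experiments: the sparse coefficients are approximated with a basic orthogonal matching pursuit, the atoms are normalised to unit length, and the names `omp`, `src_classify`, and `n_nonzero` are hypothetical.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Greedy orthogonal matching pursuit: a sparse alpha with y ~ A @ alpha.
    Assumes the columns of A have unit norm and n_nonzero >= 1."""
    support = []
    alpha = np.zeros(A.shape[1])
    residual = y.astype(float)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(A.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        alpha[:] = 0.0
        alpha[support] = coef
        residual = y - A @ alpha
    return alpha

def src_classify(local_dicts, y, n_nonzero=5):
    """SRC decision: code y over the global dictionary, then compare the
    class-wise reconstruction errors and return the minimum-error class."""
    A = np.hstack(local_dicts)                         # global dictionary
    A = A / np.linalg.norm(A, axis=0, keepdims=True)   # unit-norm atoms (assumption)
    alpha = omp(A, y, n_nonzero)
    errors, start = [], 0
    for A_i in local_dicts:
        stop = start + A_i.shape[1]
        # Keep only the coefficients that belong to class i.
        errors.append(np.sum((y - A[:, start:stop] @ alpha[start:stop]) ** 2))
        start = stop
    errors = np.array(errors)
    return int(np.argmin(errors)), errors
```

Because all classes compete for the same few nonzero coefficients, the errors returned here only rank the classes relative to one another, which is the limitation discussed above.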
3. Block Sparse Bayesian Learning over Local Dictionary
3.1. Sparse Representation over Local Dictionary
Rather than representing the test sample over the global dictionary, this paper represents it over each local dictionary as follows:
$$y = A_i \alpha_i + e_i, \quad i = 1, 2, \ldots, C, \tag{3}$$
where $\alpha_i$ denotes the linear coefficient vector over the $i$th local dictionary and $e_i$ is the reconstruction error.
Figure 1 illustrates four SAR images of the BMP2 target from the MSTAR dataset, which are measured at different azimuths. As shown, SAR images of the same target from notably different azimuths have obviously distinct appearances. Because SAR images are sensitive to azimuth changes, only those training samples (atoms in the dictionary) whose azimuths are close to that of the test sample are useful in the linear reconstruction. By arranging the atoms in the local dictionary in ascending (or descending) order of azimuth, the nonzero elements in $\alpha_i$ tend to concentrate in a small azimuth interval. The resulting $\alpha_i$ is therefore a sparse vector with a block structure. To better reconstruct the test sample, BSBL is employed to estimate $\alpha_i$, as it has been demonstrated to be more suitable for the reconstruction of block sparse signals.
Compared with the traditional SRC based on the global dictionary, the sparse representation over the local dictionary can further exploit the representation abilities of the training classes. The reconstruction error $e_i$ in (3) reflects the absolute capability of the $i$th class to represent the test sample $y$. In addition, with the constraint of azimuthal sensitivity imposed during the linear representation, the reconstruction errors from different targets can be used to make reliable decisions on the target label.
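As a rough illustration of how a local dictionary with azimuth-ordered atoms might be prepared, the sketch below sorts the training samples of one class by azimuth and assigns each atom to an azimuth bin. The fixed bin width of 30° and the function names are assumptions made for this example only; the text does not specify how the blocks are delimited.

```python
import numpy as np

def build_local_dictionary(features, azimuths_deg):
    """Sort the atoms (columns of `features`) of one class by ascending azimuth.

    features:      (d, n_i) array, one column per training sample
    azimuths_deg:  length-n_i sequence of azimuths in degrees
    """
    azimuths_deg = np.asarray(azimuths_deg, dtype=float)
    order = np.argsort(azimuths_deg, kind="stable")
    return features[:, order], azimuths_deg[order]

def azimuth_block_ids(sorted_azimuths_deg, block_width_deg=30.0):
    """Assign each sorted atom to an azimuth bin (hypothetical block definition).

    Atoms that fall in the same bin form one block of the coefficient vector.
    """
    az = np.asarray(sorted_azimuths_deg, dtype=float) % 360.0
    return (az // block_width_deg).astype(int)
```

With the atoms ordered this way, a test sample at a given azimuth is expected to draw its nonzero coefficients mainly from the one or two bins around that azimuth.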
3.2. BSBL Framework
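Assume that $x \in \mathbb{R}^{N}$ is a block sparse signal with the following block structure:
$$x = [\mathbf{x}_1^T, \mathbf{x}_2^T, \ldots, \mathbf{x}_g^T]^T, \quad \mathbf{x}_i \in \mathbb{R}^{d_i}. \tag{4}$$
The signal in the previous equation has $g$ blocks, among which only a few are nonzero. Here, $d_i$ denotes the length of the $i$th block. Usually, the samples in the same block are closely related. To describe the block structure as well as the intrablock correlation, the BSBL framework employs the parameterized Gaussian distribution:
$$p(\mathbf{x}_i; \gamma_i, B_i) = \mathcal{N}(0, \gamma_i B_i), \quad i = 1, 2, \ldots, g. \tag{5}$$
In the previous equation, $\gamma_i$ and $B_i$ are unknown deterministic parameters, in which $\gamma_i$ represents the confidence of the relevance of the $i$th block and $B_i$ captures the intrablock correlation. Assuming that different blocks are mutually independent, the signal model can be rewritten as the following equation:
$$p(x; \{\gamma_i, B_i\}_{i=1}^{g}) = \mathcal{N}(0, \Sigma_0), \tag{6}$$
where $\Sigma_0$ is a block diagonal matrix whose principal diagonal is $\{\gamma_1 B_1, \ldots, \gamma_g B_g\}$.
The observation $y$ is modeled as the following equation:
$$y = \Phi x + n, \tag{7}$$
where $\Phi \in \mathbb{R}^{M \times N}$ is a sensing matrix and $n$ denotes the noise term. The sensing matrix is underdetermined ($M < N$), and the noise is modeled as a zero-mean Gaussian distribution, $p(n; \beta) = \mathcal{N}(0, \beta^{-1} I)$, i.e., with a variance of $\beta^{-1}$, with $\beta$ being an unknown parameter. Therefore, the likelihood is given by
$$p(y \mid x; \beta) = \mathcal{N}(\Phi x, \beta^{-1} I). \tag{8}$$
The main body of the BSBL algorithm iterates between the estimation of the posterior $p(x \mid y; \{\gamma_i, B_i\}, \beta) = \mathcal{N}(\mu, \Sigma)$, with $\Sigma = (\Sigma_0^{-1} + \beta \Phi^T \Phi)^{-1}$ and $\mu = \beta \Sigma \Phi^T y$, and maximizing the likelihood $p(y; \{\gamma_i, B_i\}, \beta) = \mathcal{N}(0, \Sigma_y)$, with $\Sigma_y = \beta^{-1} I + \Phi \Sigma_0 \Phi^T$. The update rules for the parameters $\{\gamma_i, B_i\}$ and $\beta$ are derived using the Type II Maximum Likelihood method, which leads to the following cost function:
$$\mathcal{L}(\{\gamma_i, B_i\}, \beta) = \log |\Sigma_y| + y^T \Sigma_y^{-1} y. \tag{9}$$
Based on the estimates of the parameters $\{\gamma_i, B_i\}$ and $\beta$, the MAP estimate of the coefficient vector is obtained as follows:
$$\hat{x} = \mu = \beta \Sigma \Phi^T y = \Sigma_0 \Phi^T (\beta^{-1} I + \Phi \Sigma_0 \Phi^T)^{-1} y. \tag{10}$$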
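To make the iteration concrete, the following Python sketch implements a deliberately simplified variant of block sparse Bayesian learning with EM updates. It is not the full BSBL algorithm: as assumptions for this example, the intrablock correlation matrices are fixed to the identity (so no correlation is learned) and the noise variance is held fixed rather than estimated. The function name `bsbl_em_simplified` and its parameters are hypothetical.

```python
import numpy as np

def bsbl_em_simplified(Phi, y, block_ids, noise_var=1e-3, n_iter=100):
    """Simplified block SBL: per-block variances gamma learned by EM.

    Assumptions (not the full BSBL): intrablock correlation is the identity
    and the noise variance `noise_var` is fixed.
    Returns the posterior mean of x and the block variances used to compute
    it updated once more by the final EM step.
    """
    block_ids = np.asarray(block_ids)
    blocks = [np.flatnonzero(block_ids == b) for b in np.unique(block_ids)]
    gamma = np.ones(len(blocks))
    m, n = Phi.shape
    mu = np.zeros(n)
    for _ in range(n_iter):
        # Prior covariance is diagonal here: every entry of block i has variance gamma_i.
        prior_var = np.zeros(n)
        for g, idx in zip(gamma, blocks):
            prior_var[idx] = g
        # Covariance of y: noise_var * I + Phi Sigma_0 Phi^T
        Sigma_y = noise_var * np.eye(m) + (Phi * prior_var) @ Phi.T
        # Posterior mean: Sigma_0 Phi^T Sigma_y^{-1} y
        mu = prior_var * (Phi.T @ np.linalg.solve(Sigma_y, y))
        # Diagonal of the posterior covariance:
        # Sigma_0 - Sigma_0 Phi^T Sigma_y^{-1} Phi Sigma_0
        Sy_inv_Phi = np.linalg.solve(Sigma_y, Phi)
        post_var = prior_var - prior_var ** 2 * np.sum(Phi * Sy_inv_Phi, axis=0)
        # EM update of each block variance (mean second moment within the block).
        gamma = np.array([(np.sum(mu[idx] ** 2) + np.sum(post_var[idx])) / len(idx)
                          for idx in blocks])
    return mu, gamma
```

Blocks whose variance shrinks toward zero are effectively switched off, which is how the block sparsity arises without an explicit sparsity penalty. Learning the intrablock correlation and the noise level, as the full framework does, would add further update steps to the loop.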
3.3. Target Recognition
By solving the block sparse coefficients on the local dictionaries, respectively, the reconstruction error of each training class is obtained as follows:
$$r(i) = \|y - A_i \hat{\alpha}_i\|_2^2, \quad i = 1, 2, \ldots, C, \tag{11}$$
where $\hat{\alpha}_i$ is the coefficient vector over the $i$th local dictionary solved by BSBL. Afterwards, the target label is assigned to the minimum-error class as in (2).
Figure 2 illustrates the main idea of the proposed method. During the implementation, PCA is performed as a feature extraction step for both the training and test samples. The detailed steps are summarized as follows:
Step 1: Arrange the training samples of each class in ascending order of azimuth.
Step 2: Represent the test sample on the local dictionaries using BSBL.
Step 3: Reconstruct the test sample with the different classes to obtain the residuals.
Step 4: Make the classification decision according to the minimum error.
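The four steps above can be sketched as a small Python pipeline. This is an illustrative outline rather than the authors' implementation: PCA is computed with a plain SVD, and the coefficient solver is passed in as a function so that a BSBL routine can be plugged in. The default `ridge_solver` is only a placeholder (regularised least squares, not BSBL), and all function and parameter names are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """X: (n_samples, n_pixels). Returns the mean and an (n_pixels, k) projection."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def ridge_solver(A, y, reg=1e-3):
    """Placeholder coefficient solver (NOT BSBL): regularised least squares."""
    return np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ y)

def classify(train_by_class, azimuths_by_class, test_image,
             n_components=10, solver=ridge_solver):
    """Minimum-residual decision over per-class (local) dictionaries."""
    mean, W = fit_pca(np.vstack(train_by_class), n_components)
    y = (test_image - mean) @ W                      # PCA feature of the test sample
    errors = []
    for X_c, az_c in zip(train_by_class, azimuths_by_class):
        order = np.argsort(az_c)                     # Step 1: sort atoms by azimuth
        A_c = ((X_c[order] - mean) @ W).T            # local dictionary, one atom per column
        alpha_c = solver(A_c, y)                     # Step 2: solve the coefficients
        errors.append(np.sum((y - A_c @ alpha_c) ** 2))  # Step 3: residual of class c
    errors = np.array(errors)
    return int(np.argmin(errors)), errors            # Step 4: minimum-error class
```

Note that when a local dictionary has more atoms than feature dimensions, an unconstrained fit can generally reconstruct any sample, so the residuals only become discriminative once the solver enforces sparsity, which is the role BSBL plays in the proposed method.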
With its large volume of measured SAR images, the MSTAR dataset has long been used to evaluate target recognition algorithms. SAR images of the 10 targets shown in Figure 3 are available in the dataset, collected by an X-band radar with a resolution of 0.3 m (cross range) × 0.3 m (range). Samples of each target cover the full 0°–360° range of aspect angles in both the training and test sets. Accordingly, several experimental conditions can be set up to test SAR ATR methods, including SOC and EOCs.
Several reference methods are chosen from existing works for comparison with the proposed method, including SVM, AdaBoost, SRC, and A-ConvNet. These methods aim to improve performance through the classification scheme. SVM, AdaBoost, and SRC are also performed on the PCA feature vectors, consistent with the proposed method, for a fair comparison. A-ConvNet is a CNN-based method trained on the original image pixels. All these methods are implemented by the authors on the same hardware platform as the proposed one.
4.2. Recognition Results
In the following experiments, the SOC is first set up for classification. Afterwards, three different EOCs are set up, including configuration variance, depression angle variance, and noise corruption. The four reference methods are tested under the same conditions and compared with the proposed one.
4.2.1. Recognition under SOC
The conditions for the SOC experiment are set up as in Table 1, which includes the 10 classes of targets in Figure 3. Overall, the training and test samples are assumed to share high similarities. Specifically, the test samples of BMP2 and T72 include two configurations that differ from those in their training sets (denoted by the serial number). The classification results of the proposed method in this case are displayed as a confusion matrix in Figure 4. As shown on the diagonal, the correct recognition rate of every class is higher than 97%. As an overall evaluation, the average rate of correct recognition reaches 98.76%. Table 2 compares the average recognition rates of the proposed method and the reference ones. A-ConvNet is only slightly below the proposed method, owing to the strong classification ability of deep learning models. In comparison with SRC, the recognition performance is greatly enhanced by the proposed method, which validates the effectiveness of BSBL as a classification scheme. With the highest average recognition rate, the proposed method achieves the best performance under SOC.
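For reference, the quantities read off a confusion matrix can be computed as in the short Python sketch below. It assumes that the average recognition rate is the overall fraction of correctly classified test samples, which the text does not state explicitly; the function names are hypothetical and the labels are integer class indices.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

def recognition_rates(cm):
    """Per-class rates (diagonal over row sums) and the overall average rate."""
    per_class = np.diag(cm) / cm.sum(axis=1)
    average = np.trace(cm) / cm.sum()   # assumption: pooled over all test samples
    return per_class, average
```

When the classes have different numbers of test samples, this pooled average differs from the plain mean of the per-class rates, so the definition should be fixed before comparing methods.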
4.2.2. Recognition under EOCs
Different from SOC, EOCs are common in real applications because of variations in the target, background, sensor, etc. As reported in the literature, the MSTAR dataset can be used to set up several EOCs with regard to target configuration, depression angle, and noise. In the following, the proposed method is tested under these three typical EOCs in turn.
4.2.2.1. Configuration Variance
Military vehicles usually have several variants with structural modifications. The training and test samples under configuration variance are set up as in Table 3, with four targets to be classified. Among them, BRDM2 and BTR70 are placed in the training set but have no test samples, in order to increase the classification difficulty. The test configurations of BMP2 and T72 are entirely different from those in their training samples. Figure 5 illustrates four different configurations of T72. As observed, they share similar global appearances but have some local differences. Table 4 gives the assigned labels of all the test samples of BMP2 and T72. Each configuration of BMP2 and T72 is correctly recognized with an accuracy over 96%, and the average recognition rate reaches 97.18%. The average recognition rates of the different methods under configuration variance are compared in Table 5. With the highest rate, the proposed method shows superior robustness over the reference methods. Specifically, in comparison with traditional SRC, the proposed method improves the average recognition rate by a large margin, which demonstrates the high effectiveness of BSBL.
4.2.2.2. Depression Angle Variance
When a SAR image is measured at a depression angle notably different from that of the training samples, it differs considerably from them even at the same azimuth. The training and test samples under large depression angle variance are set up as in Table 6, with three different targets. The training set consists of SAR images of the three targets at a 17° depression angle, while the test set comprises two subsets at 30° and 45°, respectively. Figure 6 illustrates SAR images at the three depression angles, in which their differences can be intuitively observed.
Table 7 compares the average recognition rates of the five methods at the two test depression angles. At 30°, all the methods maintain rates higher than 94%. However, at the 45° depression angle, the rate of every method degrades significantly, to below 73%. The highest rates at both depression angles are achieved by the proposed method, showing its better robustness to large depression angle variance. The rate of A-ConvNet decreases greatly at the 45° depression angle because the training set can hardly reflect the situations occurring in the test samples. As a result, the trained network loses much of its validity. Compared with traditional SRC, BSBL over the local dictionaries effectively improves the final performance.
4.2.2.3. Noise Corruption
Noise is common in measured SAR images and hinders correct target recognition. In previous works, two types of noise have been used to simulate noisy SAR images for classification, namely, additive Gaussian noise and random noise. Figure 7 shows exemplar SAR images with random noise, in which some of the original pixels are replaced with random high values according to the noise level. At each noise level, the performance of the different methods is tested and the results are plotted in Figure 8. As shown, the proposed method achieves the highest average recognition rate at each noise level, showing its superior robustness to noise corruption. As a compressive sensing algorithm, BSBL has good robustness to noise. Similarly, SRC generally achieves better performance than SVM, AdaBoost, and A-ConvNet under noise corruption. Compared with traditional SRC, BSBL contributes to the better performance of the proposed method.
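The random-noise corruption described above can be simulated roughly as follows. This is a sketch under assumptions: the noise level is taken to be the fraction of pixels replaced, and "high values" are drawn uniformly between half of the image maximum and the maximum. Neither choice is specified in the text, and the function name `add_random_noise` is hypothetical.

```python
import numpy as np

def add_random_noise(image, noise_level, rng=None):
    """Replace a fraction `noise_level` of the pixels with random high intensities.

    The replacement range [0.5 * max, max) is an assumption for illustration.
    The input image is left unchanged; a corrupted float copy is returned.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.astype(float)                     # astype returns a copy
    n_corrupt = int(round(noise_level * noisy.size))
    # Choose distinct pixel positions in the flattened image.
    idx = rng.choice(noisy.size, size=n_corrupt, replace=False)
    high = noisy.max()
    noisy.flat[idx] = rng.uniform(0.5 * high, high, size=n_corrupt)
    return noisy
```

For example, `add_random_noise(img, 0.1)` would corrupt 10% of the pixels of `img`; sweeping the level over a range of values produces the kind of test sets plotted against noise level.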
In this paper, BSBL is applied to SAR target recognition and performed on local dictionaries. For each training class, a reconstruction error for the test sample is produced based on the solution from BSBL. These reconstruction errors fully exploit the representation capability of the different classes and can be used to judge the target label. Owing to the azimuthal sensitivity of SAR images, the linear coefficients of the test sample over a local dictionary are block sparse, so BSBL is well suited to solving for them. From the results on the MSTAR dataset, the proposed method achieves an average recognition rate of 98.76% for 10 classes under SOC and 97.18% under configuration variance. The rates at depression angles of 30° and 45° are 97.32% and 72.85%, respectively. The robustness under random noise corruption also exceeds that of the four reference methods. All these comparisons show the superior performance of the proposed method.
The MSTAR dataset used to support the findings of this study is available online at http://www.sdms.afrl.af.mil/datasets/mstar/.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
L. M. Novak, G. J. Owirka, W. S. Brower et al., “The automatic target recognition system in SAIP,” Lincoln Laboratory Journal, vol. 10, no. 2, pp. 187–202, 1997.
J. Pei, Y. Huang, W. Huo, J. Wu, J. Yang, and H. Yang, “SAR imagery feature extraction using 2DPCA-based two-dimensional neighborhood virtual points discriminant embedding,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 9, no. 6, pp. 2206–2214, 2016.
B. Ding, G. Wen, X. Huang, C. Ma, and X. Yang, “Target recognition in synthetic aperture radar images via matching of attributed scattering centers,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 10, no. 7, pp. 3334–3347, 2017.
L. Jin, J. Chen, and X. Peng, “Joint classification of complementary features based on multitask compressive sensing with application to synthetic aperture radar automatic target recognition,” Journal of Electronic Imaging, vol. 27, no. 5, Article ID 053034, p. 1, 2018.
J. J. Thiagarajan, K. N. Ramamurthy, P. Knee, A. Spanias, and V. Berisha, “Sparse representations for automatic target classification in SAR images,” in Proceedings of the 4th International Symposium on Communications, Control and Signal Processing (ISCCSP), pp. 1–4, Limassol, Cyprus, March 2010.
D. E. Morgan, “Deep convolutional neural networks for ATR from SAR imagery,” in Proceedings of the SPIE, pp. 1–13, Baltimore, MD, USA, December 2015.
S. Chen, H. Wang, F. Xu, and Y-Q. Jin, “Target classification using the deep convolutional networks for SAR images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 47, no. 6, pp. 1685–1697, 2016.