Abstract

Dictionary construction is a key factor for sparse representation- (SR-) based algorithms, and learned dictionaries have been shown to be more effective than predefined ones. In this paper, we propose a product dictionary learning (PDL) algorithm for synthetic aperture radar (SAR) target configuration recognition. The proposed algorithm learns the dictionaries from a statistical standpoint to enhance its robustness to noise. Taking the inevitable multiplicative speckle in SAR images into account, it employs the product model to describe SAR images. A more accurate description of the SAR image results in higher recognition rates. The accuracy and robustness of the proposed algorithm are validated on the moving and stationary target acquisition and recognition (MSTAR) database.

1. Introduction

Owing to its ability to work day and night, even under inclement weather conditions, synthetic aperture radar (SAR) has attracted increasing attention in recent years [1]. As one of the most active topics in SAR remote-sensing applications, SAR automatic target recognition (ATR) focuses on recognizing targets of interest from 2-dimensional (2D) high-resolution SAR images. SAR ATR algorithms can be roughly categorized into template-based methods and model-based methods [2, 3]. Compared with template-based methods, model-based ones can achieve better performance.

Model-based methods often involve two related parts, which are feature extraction and classifier design [4]. Plenty of effective features have been exploited over the past decades, such as physical models [5], geometrical characteristics [6], and mathematical features [7]. The performance of these algorithms heavily relies on the precision of feature extraction.

As for the classifier design, advanced classifiers such as the support vector machine (SVM) [8], sparse representation [7, 9–11], and convolutional neural networks (CNN) [1, 5, 8] have been employed. Algorithms based on CNN [1, 5, 8] or other deep learning techniques [3] have developed rapidly. However, due to the complexity of SAR images and the shortage of data, these algorithms usually suffer from overfitting and local minima [2]. Besides, estimating and initializing the associated parameters, such as the learning rate, the number of hidden layers, and the number of hidden layer units, is quite challenging. Linear representation-based classification, which includes collaborative representation- (CR-) based algorithms and sparse representation- (SR-) based algorithms, has been widely studied owing to its great discriminative power. Thanks to the efficient closed-form solution of the CR-based algorithm, satisfying recognition results can be obtained [12–15]. In the SR-based algorithms, the testing sample is sparsely described by using only a few atoms in the dictionary [2, 9, 11], and these algorithms have also achieved strong recognition results. However, SAR image recognition suffers from a shortage of samples, and the SAR image quality cannot be completely guaranteed due to various nonideal factors. We therefore adopt learned dictionaries to overcome this obstacle. Learned dictionaries have been shown to be more powerful than predefined ones and have wider applications [16, 17].

Moreover, better descriptions of SAR images lead to better recognition results, since the essential characteristics of the sample are described more accurately. Due to the coherent imaging mechanism of SAR, the speckle noise in SAR images is multiplicative [18, 19]. Therefore, unlike the existing work in which SAR images are modelled by the additive model [16], we describe SAR images more precisely by the product model. The motivation of the proposed method is to combine the advantages of SR with learned dictionaries and the product model to improve the robustness of recognition under various severe conditions. In other words, we aim to realize SAR target configuration recognition with a dictionary learned in a statistical way. Since we adopt the product model to describe the SAR images, we name the proposed method product dictionary learning (PDL).

Target configuration recognition aims to distinguish samples of the same target type that differ only slightly from one another. It is of crucial importance to application fields such as battlefield interpretation and reconnaissance [2, 10, 16].

2. The Proposed Product Dictionary Learning (PDL) Algorithm

The speckle component in SAR images is inevitable due to the coherent imaging mechanism. As a result, any SAR image $\mathbf{y}$ can be expressed as [18, 19]

$$\mathbf{y} = \mathbf{x} \odot \mathbf{n}, \tag{1}$$

where $\mathbf{x}$ represents the radar cross-section coefficients of the clutter, $\mathbf{n}$ represents the noise component, and the symbol "$\odot$" represents element-wise multiplication. The product model in (1) gives a better description of the SAR image than other models, such as the additive model [18, 19].
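The multiplicative nature of the product model can be illustrated with a small simulation. The sketch below is only an illustration: it assumes unit-mean gamma speckle with `looks` looks (the distribution the paper adopts later for the speckle component), and the constant-RCS patch, the function names, and the sample size are hypothetical choices made here.

```python
import random
import statistics

def simulate_speckle(num_pixels, looks, rng):
    """Draw unit-mean gamma speckle, one sample per pixel (assumed L-look model)."""
    # random.gammavariate(alpha, beta) uses shape alpha and scale beta, so mean = alpha * beta
    return [rng.gammavariate(looks, 1.0 / looks) for _ in range(num_pixels)]

def apply_product_model(rcs, speckle):
    """Element-wise product of the RCS values and the speckle."""
    return [x * n for x, n in zip(rcs, speckle)]

rng = random.Random(0)
looks = 4
rcs = [2.0] * 20000                      # hypothetical constant-RCS patch
speckle = simulate_speckle(len(rcs), looks, rng)
image = apply_product_model(rcs, speckle)

mean_speckle = statistics.fmean(speckle)     # close to 1
var_speckle = statistics.pvariance(speckle)  # close to 1 / looks
mean_image = statistics.fmean(image)         # close to the RCS value
print(mean_speckle, var_speckle, mean_image)
```

Because the speckle multiplies the RCS, the fluctuation of the observed pixel scales with the underlying RCS value, which is the behaviour an additive noise model cannot reproduce.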

Owing to its naturally discriminative property, SR has been widely used in pattern recognition applications [7, 9, 20]. In this paper, we aim to enhance the performance and robustness of SAR image recognition by combining the product model and the SR technique with learned dictionaries. The product model provides an accurate description of the SAR image, and the SR provides the discriminative power.

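Assume that there are $C$ different configurations in the training datasets. For configuration $i$, there are $N_i$ training samples. The training dataset of configuration $i$ can be expressed as $\mathbf{Y}_i = [\mathbf{y}_{i,1}, \ldots, \mathbf{y}_{i,N_i}]$ ($i = 1, \ldots, C$). A given sample $\mathbf{y}$ can be expressed, with the definition $\mathbf{x} = \mathbf{D}\boldsymbol{\alpha}$, as

$$\mathbf{y} = (\mathbf{D}\boldsymbol{\alpha}) \odot \mathbf{n}, \tag{2}$$

where $\mathbf{D}$ is the dictionary, $\boldsymbol{\alpha}$ is the sparse vector of $\mathbf{y}$, and $\mathbf{n}$ is the corresponding noise component.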

In this paper, we try to get the optimized dictionary to realize satisfying recognition performance with robustness. From the view of statistics, searching the dictionary by the maximum likelihood estimation (MLE) can be given by [21]

Due to the fact that has a tightly peaked maximum, we can solve the integration in (3) by approximating the value at its maximum point. Therefore, (3) can be simplified as
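For orientation, a maximum likelihood dictionary estimate of this kind and its peak approximation are commonly written in the following generic form. The symbols $\mathbf{y}$ (sample), $\mathbf{D}$ (dictionary), and $\boldsymbol{\alpha}$ (sparse vector) are notation assumed here, and the block is a sketch of the standard argument, not necessarily identical to (3) and (4).

```latex
% Marginal likelihood of the sample given the dictionary (form of an MLE objective)
\hat{\mathbf{D}}
  = \arg\max_{\mathbf{D}} \log P(\mathbf{y} \mid \mathbf{D})
  = \arg\max_{\mathbf{D}} \log \int P(\mathbf{y} \mid \mathbf{D}, \boldsymbol{\alpha})\,
      P(\boldsymbol{\alpha})\, d\boldsymbol{\alpha}

% If the integrand is tightly peaked, the integral is approximated by its value at the peak
\hat{\mathbf{D}}
  \approx \arg\max_{\mathbf{D}} \max_{\boldsymbol{\alpha}}
      \log \bigl[ P(\mathbf{y} \mid \mathbf{D}, \boldsymbol{\alpha})\, P(\boldsymbol{\alpha}) \bigr]
```

The second form needs only a likelihood model and a prior on the sparse vector, which are the two ingredients modelled next.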

Firstly, we model the likelihood function $P(\mathbf{y} \mid \mathbf{D}, \boldsymbol{\alpha})$. As can be seen from (2), the likelihood function shares the same distribution as the speckle component. It has been shown that the gamma distribution [18, 19] models the speckle component accurately and therefore gives a better description of the SAR image, which in turn contributes to better recognition results. The distribution of the speckle component in SAR images can be expressed as [18, 19]

$$p(n) = \frac{L^{L} n^{L-1}}{\Gamma(L)} \exp(-Ln), \quad n \ge 0, \tag{5}$$

where $L$ is the number of looks of the SAR image and $\Gamma(\cdot)$ denotes the gamma function.

Supposing that the dimensionality of is , i.e., (), from (1) and (2), we have . represents the matrix transposition, is the element of , and is the element of . As a result, can be given by

Hereto, we can get the distribution of the likelihood function:
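The step from the speckle distribution to the likelihood of a sample can be sketched numerically. The code below is a minimal illustration under stated assumptions: the sample is `y = (D alpha) * n` element-wise, the speckle entries are independent unit-mean gamma variables with `looks` looks, and the change of variables `n_j = y_j / x_j` contributes a Jacobian factor `1 / x_j`. All function and variable names are hypothetical, and the resulting expression may differ in form from the paper's (6) and (7).

```python
import math

def speckle_log_pdf(n, looks):
    """Log of the unit-mean gamma density with `looks` looks (assumed speckle model)."""
    return (looks * math.log(looks) + (looks - 1) * math.log(n)
            - looks * n - math.lgamma(looks))

def log_likelihood(y, D, alpha, looks):
    """log p(y | D, alpha) for y = (D alpha) * n with independent gamma speckle n."""
    total = 0.0
    for y_j, row in zip(y, D):
        x_j = sum(d * a for d, a in zip(row, alpha))   # j-th entry of D alpha
        if x_j <= 0.0:
            raise ValueError("the product model needs positive entries in D alpha")
        # change of variables n_j = y_j / x_j contributes the Jacobian 1 / x_j
        total += speckle_log_pdf(y_j / x_j, looks) - math.log(x_j)
    return total

# Sanity check in one dimension: the density of y must integrate to one.
looks = 4
x = 2.0                                   # hypothetical value of D alpha
h = 0.001                                 # midpoint-rule step
integral = sum(
    math.exp(log_likelihood([(k + 0.5) * h], [[1.0]], [x], looks)) * h
    for k in range(40000)
)
print(integral)
```

The check confirms that the expression is a proper density in `y`, which is only the case when the Jacobian term is kept.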

What follows is to derive the prior function $P(\boldsymbol{\alpha})$. An SR-based algorithm requires the sparse vector to be sparse, and the Laplace distribution [22] has been validated to be effective for this purpose. It can be expressed as

$$P(\boldsymbol{\alpha}) \propto \exp\left(-\lambda \|\boldsymbol{\alpha}\|_1\right), \tag{8}$$

where $\exp(\cdot)$ represents the exponential function and $\lambda$ is a constant that can be determined by cross-validation. Here, we give a brief discussion of how (8) works. We would like $P(\boldsymbol{\alpha})$ to be large to satisfy the objective function (4). From the expression of the prior function in (8), $P(\boldsymbol{\alpha})$ is a decreasing function of $\|\boldsymbol{\alpha}\|_1$. Therefore, a larger $P(\boldsymbol{\alpha})$ results in a smaller $\|\boldsymbol{\alpha}\|_1$, and a smaller $\|\boldsymbol{\alpha}\|_1$ implies that most entries of $\boldsymbol{\alpha}$ are zero or close to zero according to the definition of the L1 norm.

(4) can be given by the following formula by combining (7) and (8).

From (9), we can see that the optimized dictionary $\mathbf{D}$ can be obtained by iterating the following two steps: (1) solve $\boldsymbol{\alpha}$ with a fixed $\mathbf{D}$; (2) solve $\mathbf{D}$ with a fixed $\boldsymbol{\alpha}$.

In Step (1), solving with a fixed is not a convex problem, so we can get the solution by employing the multistage convex relaxation method [23], whereas in Step (2), we can adopt the quasi-Newton algorithm [24] to obtain the dictionary. The gradient of the objective function with respect to is given bywhere and is given by
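The dictionary update in Step (2) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the objective is the negative log posterior obtained from a unit-mean gamma likelihood and a Laplace prior with constants dropped, plain gradient descent is swapped in for the quasi-Newton algorithm, the sparse coding of Step (1) is omitted, and all names and numbers are hypothetical. The gradient expression is derived here from that assumed objective and is verified by a finite difference.

```python
import math

def objective(y, D, alpha, looks, lam):
    """Negative log posterior up to constants: gamma likelihood term plus L1 penalty."""
    value = lam * sum(abs(a) for a in alpha)
    for y_j, row in zip(y, D):
        x_j = sum(d * a for d, a in zip(row, alpha))   # j-th entry of D alpha, must be > 0
        value += looks * (math.log(x_j) + y_j / x_j)
    return value

def gradient_wrt_dictionary(y, D, alpha, looks):
    """Gradient of the objective with respect to every dictionary entry."""
    grad = []
    for y_j, row in zip(y, D):
        x_j = sum(d * a for d, a in zip(row, alpha))
        scale = looks * (1.0 / x_j - y_j / x_j ** 2)   # derivative with respect to x_j
        grad.append([scale * a for a in alpha])        # chain rule: dx_j / dD_jk = alpha_k
    return grad

y = [1.8, 0.5]                        # hypothetical sample
D = [[1.0, 0.5], [0.2, 1.5]]          # hypothetical dictionary (rows index pixels)
alpha = [2.0, 0.1]                    # sparse vector, held fixed in this step
looks, lam = 4, 0.1

# Finite-difference check of one gradient entry
eps = 1e-6
D_plus = [row[:] for row in D]
D_minus = [row[:] for row in D]
D_plus[0][1] += eps
D_minus[0][1] -= eps
numeric = (objective(y, D_plus, alpha, looks, lam)
           - objective(y, D_minus, alpha, looks, lam)) / (2 * eps)
analytic = gradient_wrt_dictionary(y, D, alpha, looks)[0][1]

# A few plain gradient-descent steps on D
start = objective(y, D, alpha, looks, lam)
step = 0.01
for _ in range(50):
    G = gradient_wrt_dictionary(y, D, alpha, looks)
    D = [[d - step * g for d, g in zip(row, g_row)] for row, g_row in zip(D, G)]
end = objective(y, D, alpha, looks, lam)
print(numeric, analytic, start, end)
```

A quasi-Newton method uses the same gradient but builds a curvature estimate from successive gradients, so it typically needs far fewer iterations than the fixed-step descent used in this sketch.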

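After all the learned dictionaries have been obtained, recognition can be realized by

$$\operatorname{identity}(\mathbf{y}) = \arg\min_{i} \left\| \mathbf{y} - \mathbf{D}\,\delta_i(\boldsymbol{\alpha}) \right\|_2, \tag{12}$$

where $\mathbf{y}$ is the testing sample, $\operatorname{identity}(\cdot)$ is the determination function that confirms the label of $\mathbf{y}$, $\mathbf{D}$ is the dictionary composed of all the optimized per-configuration dictionaries, $\boldsymbol{\alpha}$ is the sparse vector of $\mathbf{y}$ obtained under dictionary $\mathbf{D}$, and $\delta_i(\cdot)$ is a function that keeps only the entries of $\boldsymbol{\alpha}$ with label $i$ and sets the other entries to zero. The minimum reconstruction error corresponds to the desired label.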
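The decision rule based on the minimum reconstruction error can be sketched in a few lines. The example assumes the Euclidean norm as the error measure and a list `atom_labels` that records which configuration each dictionary atom belongs to. Both are assumptions of this sketch, and the small dictionary and sparse vector are made up for illustration.

```python
import math

def classify_by_residual(y, D, alpha, atom_labels):
    """Return (best_label, residuals) using the class-wise reconstruction error."""
    residuals = {}
    for label in sorted(set(atom_labels)):
        # keep the coefficients of atoms with this label, zero the rest
        kept = [a if lab == label else 0.0 for a, lab in zip(alpha, atom_labels)]
        recon = [sum(d * a for d, a in zip(row, kept)) for row in D]
        residuals[label] = math.sqrt(sum((y_j - r_j) ** 2 for y_j, r_j in zip(y, recon)))
    best = min(residuals, key=residuals.get)
    return best, residuals

# Composed dictionary with four atoms (columns): two per configuration
D = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 1.0, 0.5, 0.0],
     [0.0, 0.0, 1.0, 1.0]]
atom_labels = [0, 0, 1, 1]
alpha = [0.9, 0.1, 0.05, 0.0]         # hypothetical sparse vector of the test sample
y = [0.9, 0.1, 0.05]                  # hypothetical test sample

best, residuals = classify_by_residual(y, D, alpha, atom_labels)
print(best, residuals)
```

Here most of the energy of the sparse vector sits on the atoms of configuration 0, so that configuration reconstructs the sample with the smaller error and is selected.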

The procedure of the proposed method is summarized as follows. (1) The training samples are divided into subsets according to their labels. (2) The SAR images are described by the product model, considering the inevitable speckle component in SAR images; the gamma distribution is chosen to model the likelihood function, and the Laplace distribution is used to model the prior function. (3) The learned dictionaries, one per subset, are obtained through parameter estimation. (4) The learned dictionaries are combined to form a composed dictionary. (5) The sparse vector of the testing sample is obtained under the composed dictionary. (6) Target configuration recognition is achieved according to the reconstruction error. The flowchart of the proposed algorithm is given in Figure 1.

3. Experimental Results and Analysis

The proposed algorithm is tested on the standard moving and stationary target acquisition and recognition (MSTAR) database. We conduct the experiments on different configurations of the targets. Samples with the depression angle 17° are used for training, whereas the ones with the depression angle 15° are used for testing.

Subimages with 64 × 64 pixels are extracted and normalized from the original images [9, 11, 16]. The parameter is determined from the set by using the 3-fold cross-validation. The number of the atoms in each dictionary is empirically set to be 60. Dimensionality reduction is realized by using the independent and identically distributed Gaussian random matrix [11, 20]. The feature dimensionality is set to be 800.
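Dimensionality reduction with an independent and identically distributed Gaussian random matrix amounts to one matrix-vector product per image. The sketch below uses small stand-in sizes so that it runs quickly in pure Python. In the setting described above, the input would be a flattened 64 × 64 subimage and the output dimensionality would be 800. The `1/sqrt(out_dim)` scaling and all names are assumptions of this sketch.

```python
import math
import random

def gaussian_projection_matrix(out_dim, in_dim, rng):
    """i.i.d. zero-mean Gaussian matrix; the 1/sqrt(out_dim) scaling is an assumed convention."""
    std = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, std) for _ in range(in_dim)] for _ in range(out_dim)]

def project(matrix, vector):
    """Matrix-vector product that maps a flattened image to its reduced feature."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

rng = random.Random(0)
in_dim, out_dim = 8 * 8, 16                    # small stand-ins for 64 * 64 and 800
R = gaussian_projection_matrix(out_dim, in_dim, rng)
chip = [rng.random() for _ in range(in_dim)]   # hypothetical normalized subimage, flattened
feature = project(R, chip)
doubled = project(R, [2.0 * v for v in chip])  # the projection is linear
print(len(feature))
```

The same matrix has to be applied to the training and testing samples so that both live in the same reduced feature space.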

The original SR algorithm [20], the monogenic signal-based SR algorithm (MSR) [9], the joint SR algorithm (JSR) [11], the SR-based algorithm using the K-SVD algorithm [25] to learn dictionaries (DL), and the SR-based algorithm using the statistical dictionary learning algorithm [16] to learn dictionaries (SDL) are chosen to be competitors to test the advantage of the proposed method.

3.1. Target Configuration Recognition on Various Targets

First, we conduct the experiment on 7 different configurations belonging to 3 different targets. The dataset description is given in Table 1. To show the advantage of the proposed algorithm, we compare it not only with the abovementioned SR-based algorithms but also with a deep learning-based algorithm named A-convnet [26]. The corresponding results are shown in Table 2. To make the comparison between A-convnet and the proposed algorithm convenient, we refer to the structure of A-convnet constructed in [26, 27]. The structure that gives the best recognition results for the 64 × 64 pixel images is displayed in Figure 2.

From the experimental results, we can see that MSR and JSR perform better than the baseline (SR), thanks to the utilization of extra useful information. All the dictionary learning-based algorithms enhance the performance effectively, because learned dictionaries bring more robustness. Among the three dictionary learning-based algorithms, SDL performs better than DL. The reason is that SDL gives a better description of the SAR image than DL owing to the utilization of the additive model and the Gaussian mixture distribution. As previously discussed, better descriptions lead to better recognition performance. From Table 2, we can find that A-convnet-based recognition obtains the highest recognition rates for BMP2-C21 and BTR70-C71. The proposed PDL achieves the best overall performance among all the algorithms, which is 6.16%, 2.93%, 2.13%, 3.89%, 1.10%, and 1.76% better than SR, MSR, JSR, DL, SDL, and A-convnet, respectively. Thanks to the better description of the SAR images, satisfying recognition results can be obtained by using the proposed algorithm.

3.2. Target Configuration Recognition on One Target

In this part, we test the proposed algorithm on a much more challenging case, and we compare it only with the five SR-based recognition algorithms. We recognize 11 different T72 configurations. The dataset description is given in Table 3, and the corresponding targets are shown in Figure 3. The recognition rates are given in Table 4. In this case with 11 configurations that differ only slightly, the proposed PDL achieves a recognition rate of 88.09%, which is 7.07%, 4.65%, 3.39%, 4.94%, and 1.91% better than SR, MSR, JSR, DL, and SDL, respectively. The reason is that the proposed algorithm describes the characteristics of SAR images more accurately in the statistical way. The comparison of the proposed PDL and SDL further supports the claim that the product model describes SAR images better than the additive model.

3.3. Configuration Recognition under Random Corruption

With learned dictionaries, the recognition algorithms can achieve more robustness. In this section, we test the proposed PDL algorithm under random corruption [9, 20]. We corrupt each SAR image of the datasets by up to 15% of its pixels, in increments of 5%. The randomly chosen pixels are replaced with independent and identically distributed samples from a uniform distribution [9, 20]. The recognition results of the 11 T72 configurations under all the algorithms are given in Figure 4. From the results, we can see that the performance of all the algorithms drops as the corruption percentage increases. However, the reduction for the dictionary learning-based algorithms is much smaller than that for the SR-based algorithms without dictionary learning. The proposed algorithm still achieves the highest recognition rates under all corruption conditions owing to the usage of the product model and the gamma distribution.
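The corruption protocol can be sketched as follows. The range of the uniform distribution is not stated in the text, so the interval `[low, high]` defaults to `[0, 1]` here as an assumption for normalized images. The function name, the returned index list, and the constant test image are hypothetical.

```python
import random

def corrupt_image(pixels, fraction, rng, low=0.0, high=1.0):
    """Replace a random fraction of the pixels with i.i.d. uniform samples.

    Returns the corrupted copy and the sorted indices that were replaced.
    """
    corrupted = list(pixels)
    num_corrupt = round(fraction * len(pixels))
    indices = rng.sample(range(len(pixels)), num_corrupt)   # distinct pixel positions
    for idx in indices:
        corrupted[idx] = rng.uniform(low, high)
    return corrupted, sorted(indices)

rng = random.Random(0)
image = [0.5] * (64 * 64)                 # hypothetical flattened 64 x 64 subimage
counts = {}
for fraction in (0.05, 0.10, 0.15):
    noisy, replaced = corrupt_image(image, fraction, rng)
    counts[fraction] = len(replaced)
print(counts)
```

Sampling the positions without replacement guarantees that the stated percentage of distinct pixels is corrupted at each level.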

3.4. Configuration Recognition with Limited Number of Training Samples

In this part, we test the proposed algorithm on another challenging case with limited training samples. We select 1/2, 1/3, and 1/4 of all the training SAR images, respectively, to learn the dictionaries. The corresponding recognition results of the 11 T72 configurations are displayed in Figure 5. As expected, the performance of all the algorithms degrades as the number of training samples decreases. Thanks to the accurate description of the SAR images by the product model, the proposed PDL algorithm achieves the best performance for all the tested fractions of training samples.
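A reduced training set of this kind can be built by subsampling each configuration separately. The text does not say how the subsets are chosen, so the sketch below assumes uniform random selection without replacement within each configuration. The configuration names and sample counts are hypothetical placeholders, not values from the tables.

```python
import random

def subsample_training(samples_by_config, fraction, rng):
    """Keep a random fraction of the training samples of every configuration."""
    subset = {}
    for config, samples in samples_by_config.items():
        keep = max(1, round(fraction * len(samples)))   # never drop a configuration entirely
        subset[config] = rng.sample(samples, keep)
    return subset

rng = random.Random(0)
# Hypothetical training set: sample identifiers grouped by configuration
training = {
    "config_a": list(range(120)),
    "config_b": list(range(96)),
}
sizes = {}
for name, fraction in (("1/2", 1 / 2), ("1/3", 1 / 3), ("1/4", 1 / 4)):
    reduced = subsample_training(training, fraction, rng)
    sizes[name] = {config: len(chosen) for config, chosen in reduced.items()}
print(sizes)
```

Subsampling per configuration keeps the class balance of the original training set, so the comparison across fractions is not confounded by a changing class distribution.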

3.5. Computational Complexity Analysis

In the end, we evaluate the computational complexity of the proposed PDL algorithm and we compare it with other dictionary learning-based algorithms. The main computational complexity of the dictionary learning-based methods is to obtain the dictionary. Supposing that the size of the dictionary is , where is the dimensionality of the training samples and is the number of the atoms of .

Firstly, we come to evaluate the proposed PDL algorithm. As mentioned in Section 2, we solve the sparse vector by using the multistage convex relaxation method [23] in Step (1) and the computation complexity is [23, 28]. As a result, the computation complexity of obtaining all the sparse vectors corresponding to configuration will be . In Step (2), we solve the dictionary by using the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method of the quasi-Newton algorithm, which requires the computational cost of [24], where is the size of all inputs. Therefore, the computation of Step (2) will be . Generally, for SAR target recognition, the number of samples corresponding to each configuration is less than the feature dimensionality, i.e., . Thus, . Assuming that we need to iterate these two steps times, the computational complexity of getting the optimized dictionary will be .

And, the computational complexity of DL and SDL will be , where is the number of iterations. Detailed discussions can be found in [16].

From the comparison of the computational complexity of the proposed PDL algorithm and the DL algorithm, we can tell that they have the same order.

4. Conclusions

In this paper, we proposed a PDL algorithm to realize SAR target configuration recognition. The dictionaries needed for the SR-based recognition algorithms are obtained from a statistical standpoint to enhance robustness to noise, and the product model is utilized to describe SAR images in view of the multiplicative speckle. Experiments on the MSTAR database have demonstrated that the proposed PDL algorithm improves both the recognition accuracy and the robustness to noise. Comparisons with other state-of-the-art algorithms further support its advantage.

From the experimental results, we can see that the recognition algorithm based on deep learning has advantages on the recognition of some targets over the SR-based ones, and how to combine it with the SR technique to achieve better results is well worth working on.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grants 61701289 and 61601274), the Natural Science Foundation of Shaanxi Province (Grants 2018JQ6083 and 2018JQ6087), the Young Talent Fund of University Association for Science and Technology in Shaanxi (Grant 20190106), and the Fundamental Research Funds for the Central Universities (Grant GK201903084).