Abstract

To address multiple-target recognition in large-scene SAR images with strong speckle, this paper studies a robust full-process method that covers target detection, feature extraction, and target recognition. By introducing a simple 8-neighborhood orthogonal basis, a local multiscale decomposition method that starts from the center of gravity of the target is presented. With this method, an image is processed with a multilevel sampling filter, and the target's multiscale features in eight directions, together with one low-frequency filtering feature, are derived directly by sampling key pixels. In addition, a recognition algorithm that integrates the local multiscale features with a multiscale wavelet kernel classifier is studied, which achieves fast, robust, and accurate classification of multiclass image targets. The classification results and the speckle adaptability analysis show that the algorithm is effective not only for the MSTAR (Moving and Stationary Target Automatic Recognition) target chips but also for automatic recognition of multiclass, multiple targets in large-scene SAR images with strong speckle; the method is also robust to target rotation and scale transformation.

1. Introduction

Synthetic Aperture Radar (SAR) is an important sensor owing to its all-weather, day/night, high-resolution imaging and long standoff capability. With the development of radar technologies and the increasing demand for target identification in radar applications, automatic target recognition (ATR) using SAR has become an important branch of image recognition.

Although there are many research findings in the early SAR ATR field, most algorithms are based on the single-target chips of the MSTAR dataset [1]. The MSTAR public dataset was provided by DARPA (Defense Advanced Research Projects Agency)/AFRL (Air Force Research Laboratory). It is a standard dataset in the SAR ATR community and allows researchers to test and compare their ATR algorithms fairly. The MSTAR data used in this paper consists of image chips with 1-foot resolution in X-band, covering three types of ground military vehicles: BMP2, BTR70, and T72. The chips of each target are separated by 1° azimuth increments within an angular coverage from 0° to 360°. All the chips are taken at depression angles of 17° and 15°. Because the MSTAR dataset provides the target chips directly, with fixed size and fixed target position, the above-mentioned algorithms omit target segmentation and detection. Moreover, as there is only one target in each sample chip, the difficulty of target feature extraction is significantly reduced, which limits the usefulness of these algorithms in practical applications. Such algorithms therefore do not perform SAR image ATR in the full sense.

In real applications, a typical SAR image ATR pipeline, from image input to recognition output, can be divided into four stages: (I) image preprocessing, (II) feature processing, (III) classification, and (IV) postprocessing. In stage I, the main tasks include image filtering and denoising, distortion correction, image segmentation and target detection, region of interest (ROI) discrimination, and image normalization. Stage II commonly contains feature extraction, feature selection, feature dimension reduction, data clustering, and novelty detection. In stage III, the tasks include the selection and design of the classifier, the training of the classifier, and target classification. The postprocessing in stage IV further refines the earlier classification result; this stage focuses on recognition precision and robustness, which are the chief properties of SAR ATR systems.

SAR ATR algorithms must meet some special requirements, especially in feature extraction and classifier design. For example, the feature extraction method must be simple in principle and easy to implement under a real-time requirement; the features must be resistant to noise and clutter; and the features must be robust to translation, rotation, and scale transformation. The classifier, in turn, must have high classification precision and learning efficiency.

An analysis of the literature on SAR ATR [2] shows that feature-based classification methods receive more attention than whole-image-based methods. Many typical feature extraction methods have been used for SAR image classification, such as PCA [3], SDA [4], the shadow contour [5], multilinear subspace learning of tensor objects [6], the neighborhood geometric center scaling embedding method [7], feature selection [8], and feature sparsity [9]. For classifier design and selection, algorithms such as neural networks [10–12], support vector machines [13], and boosting [14] have been applied.

For common off-line recognition tasks, the above-mentioned methods may have some merits and high recognition precision, but most of them need a large number of labeled samples to train an efficient classifier and cannot support a robust, full-process ATR application. Both in the difficulty of implementing feature extraction and in adaptability to speckle, there are still large gaps between these methods and practical applications. In recent years, event detection and event recounting for unconstrained web videos have attracted a lot of attention, and some new algorithms have achieved state-of-the-art performance using zero-example or limited-supervision methods [15, 16]. However, the targets in real-time video frames are clear and free of noise, which is different from SAR image ATR. In SAR images, both the targets and the backgrounds are usually blurred. For the targets in particular, the inevitable speckle, a chaotic phenomenon that results from coherent summation of the backscattered signals, may greatly disturb the target features. The robustness of the algorithm to speckle therefore directly determines the final recognition efficiency. As a result, performance evaluation, especially the robustness and adaptability analysis of the algorithm, is very important work [17].

On this basis, a robust method covering target detection, feature extraction, and target recognition is studied, which can solve multitarget ATR in large-scene SAR images effectively. The contributions of the method mainly include three aspects. Firstly, a robust local multiresolution analysis method for image targets is presented, which allows fast feature extraction; meanwhile, the dimension of the feature vector is lower than that of some other methods with similar performance. To acquire effective features, a multiscale analysis method that starts from the center of gravity of the SAR image target is presented. By introducing a simple 8-neighborhood orthogonal basis, an image can be processed with a multilevel sampling filter, and then the multiscale features in eight directions and one low-frequency filtering feature of the image can be obtained. Furthermore, the feature extraction method can be implemented simply and rapidly and has good characterization performance. Secondly, aiming at the multiscale features, a multiple kernel classifier and the fusion between the multiscale features and the multiple kernels are studied, which gives higher classification precision than the common single-kernel support vector classifier. Compared with traditional methods, the presented algorithm has advantages in fast target detection and in the dimension of the feature vectors. Thirdly, the presented algorithm is a full-process image target recognition method, which covers the steps from multiple-target detection and feature extraction to target recognition in large-scene SAR images and therefore has better practical application value. The relevant references on SAR ATR show that most target classification algorithms are only suitable for step-by-step recognition and single-target images, which is not automatic target recognition in the full sense. The presented algorithm is effective not only for the MSTAR target chips but also for the ATR of multiclass, multiple targets in large-scene SAR images with strong speckle. The method is also strongly robust to rotation and scale transformation.

The remainder of this paper is organized as follows. Section 2 introduces the local multiscale feature extraction method and designs the multiscale wavelet kernel classifier. Section 3 studies the robust target recognition method and the adaptability analysis against speckle. Section 4 reports several experiments that verify the effectiveness of the proposed method. Section 5 concludes the paper.

2. Feature Extraction and Design of Classifier

2.1. Feature Extraction

For SAR image targets (especially vehicle targets with some structural characteristics), without loss of generality, we take the MSTAR dataset as the object of investigation. The targets in MSTAR chips have the following characteristics: the sample images are chips of the same size; there is only one target in each chip; the target lies in the center of the chip; the targets are distributed around the centers of the chips within an angular coverage between 0° and 360°; and the chips have the same resolution and scale.

Most traditional multiresolution analysis methods are realized by filtering and sampling, which effectively solves the problem of generating an orthogonal basis, but the sampling commonly runs from the beginning to the end of the series. These methods have disadvantages for feature extraction, because the obtained orthogonal basis may not give similar expressions for data that share the same local characteristics. Inspired by the sampling filter idea of the traditional wavelet method, a new sampling method using a local extension is adopted to solve this problem. With this method, each sampling process is extended outward from a local point, which guarantees that the generated basis points to the local region and that data with the same local characteristics have similar projection coefficients on the basis. In the detailed implementation, we begin from a local point of the image (e.g., a Point of Interest (POI) or a geometric center point) and apply a fast filter several times; in this way, the multiscale decomposition of the image is obtained, together with a difference of Gaussian (DOG) like scale space of the original image. Finally, by directly sampling key pixels from the multilevel images, we can rapidly obtain the local multiscale feature of the image.

Based on the above analysis, we first construct a multilevel DOG-like scale space from the image. We then design an 8-neighborhood orthogonal basis, with which the multilevel sampling filter can be applied to the image. As a result, the image features in the 8 directions and one low-frequency feature are derived. The structure of the 8-neighborhood orthogonal basis and its frequency spectrum are shown in Figure 1.

For the images in the MSTAR dataset, the targets are first detected using the constant false alarm rate (CFAR) method. Because the targets occupy only the central location of the chips, a fixed-size central area of each chip is, for convenience, taken as the research target.

To obtain robust and effective features, a multiscale analysis method that starts from the center of gravity of the SAR image target is presented. The above-mentioned 8-neighborhood orthogonal basis also serves as a template matrix; a convolution between the original image matrix and the template matrix can be executed using

$$I_s = I \ast T, \tag{1}$$

where $I_s$ is the image after sampling, $I$ is the original image, and $T$ is the template image. In this process, the obtained image can be regarded as a sampling-filtered version of the original image. If the convolution is applied repeatedly, pyramid-shaped multilevel sampling filter images are obtained. Finally, the multiscale features in eight directions and one low-frequency filtering feature can be obtained by directly selecting key pixels in every level of the pyramid images. In this paper, the obtained image is processed by a 4-level local multiresolution decomposition. For feature extraction, we directly choose the pixels in each level of the image as follows:

In the highest level, the image size is 3 × 3; all 9 pixels are chosen, giving a 9-dimension feature.

In the second level, the 8 image blocks of size 3 × 3 corresponding to the 8 peripheral pixels in the highest level are chosen, so the feature dimension is 72.

In the third level, the extraction is similar to the second level: the eight 3 × 3 image blocks corresponding to the block centers in the second level are selected, so the feature dimension is also 72.

In the fourth level, the 8 peripheral central pixels are directly selected, so the feature dimension is 8.

So, for a given image, the total feature dimension is 9 + 72 + 72 + 8 = 161.
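The key-pixel selection described above can be sketched in code. This is only an illustrative reading, not the authors' implementation: it assumes an 81 × 81 patch centered on the target, treats the patch itself as the finest level, uses stride-3 filtering so that the levels are 81, 27, 9, and 3 pixels wide, and substitutes a plain averaging template for the 8-neighborhood basis of Figure 1, which is not reproduced here. The names `downsample3`, `local_multiscale_features`, and `RING` are hypothetical.

```python
import numpy as np

# Offsets of the 8 peripheral cells of a 3x3 neighborhood (center excluded).
RING = [(r, c) for r in range(3) for c in range(3) if (r, c) != (1, 1)]


def downsample3(img, template):
    """Apply a 3x3 template to non-overlapping 3x3 blocks (stride 3)."""
    h, w = img.shape
    blocks = img.reshape(h // 3, 3, w // 3, 3).transpose(0, 2, 1, 3)
    return np.einsum("ijkl,kl->ij", blocks, template)


def local_multiscale_features(patch, template=None):
    """Return the 9 + 72 + 72 + 8 = 161 key-pixel features of an 81x81 patch."""
    if template is None:
        template = np.full((3, 3), 1.0 / 9.0)  # assumed low-pass template
    fine = np.asarray(patch, dtype=float)
    if fine.shape != (81, 81):
        raise ValueError("this sketch assumes an 81x81 patch")
    mid = downsample3(fine, template)      # 27x27
    coarse = downsample3(mid, template)    # 9x9
    top = downsample3(coarse, template)    # 3x3, the highest level

    feats = [top.ravel()]                  # highest level: all 9 pixels
    for r, c in RING:                      # second level: 8 blocks of 3x3
        feats.append(coarse[3 * r:3 * r + 3, 3 * c:3 * c + 3].ravel())
    for r, c in RING:                      # third level: blocks under block centers
        rr, cc = 3 * (3 * r + 1), 3 * (3 * c + 1)
        feats.append(mid[rr:rr + 3, cc:cc + 3].ravel())
    # fourth level: central pixel under each third-level block center
    feats.append(np.array([fine[27 * r + 13, 27 * c + 13] for r, c in RING]))
    return np.concatenate(feats)
```

With these assumptions the feature vector always has 161 entries, matching the dimension counted in the text; how the levels map onto the actual chip size is a choice made for the sketch.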

2.2. Multiscale Wavelet Kernel Support Vector Classifier

Neural network based methods are widely used in hypersonic flight automatic control [18–20] and in tracking control for the Internet of Things [21], but SAR ATR is a highly real-time application, and in most cases not enough SAR target images can be collected. So, the support vector machine is chosen as the classifier, since it is suitable for small samples and can be readily modified for online learning. Because the common support vector classifier (SVC) is slow owing to the solution of a quadratic programming problem, it is hard to apply to some real-time cases. Suykens and Vandewalle [22] proposed an improved SVM method, the least squares support vector machine (LSSVM). By replacing the quadratic programming with the solution of a set of linear equations, LSSVM greatly improves the speed. According to the Structural Risk Minimization principle, the optimization problem of the LSSVC is given as

$$\min_{w,b,e} J(w,e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2}, \quad \text{s.t. } y_i \left[ w^{T} \varphi(x_i) + b \right] = 1 - e_i, \; i = 1, \ldots, N, \tag{2}$$

where $w$ is the normal vector of the decision surface, $\gamma$ is the error penalty parameter, $e_i$ is the error of the $i$th sample, $x_i$ is the $i$th sample, $y_i$ is the corresponding class label, $b$ is the bias term, and $\varphi(\cdot)$ is the feature mapping.

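Transforming (2) into an unconstrained optimization, the Lagrangian function can be defined as

$$L(w,b,e,\alpha) = J(w,e) - \sum_{i=1}^{N} \alpha_i \left\{ y_i \left[ w^{T} \varphi(x_i) + b \right] - 1 + e_i \right\}, \tag{3}$$

where $\alpha_i$ are Lagrangian multipliers. The optimality conditions of this function are the following set of linear equations, instead of the quadratic program in the traditional SVC:

$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i y_i \varphi(x_i), \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i y_i = 0, \quad \frac{\partial L}{\partial e_i} = 0 \Rightarrow \alpha_i = \gamma e_i, \quad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow y_i \left[ w^{T} \varphi(x_i) + b \right] - 1 + e_i = 0. \tag{4}$$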

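By eliminating the variables $w$ and $e$, we get the following equation:

$$\begin{bmatrix} 0 & Y^{T} \\ Y & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{1} \end{bmatrix}, \tag{5}$$

where $Y = [y_1, \ldots, y_N]^{T}$, $\mathbf{1} = [1, \ldots, 1]^{T}$, $\alpha = [\alpha_1, \ldots, \alpha_N]^{T}$, $I$ is the identity matrix, $\Omega_{ij} = y_i y_j K(x_i, x_j)$, and $K$ is the kernel function, $K(x_i, x_j) = \varphi(x_i)^{T} \varphi(x_j)$.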

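Then, solving the linear equations, we get the decision function of the LSSVC as

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b \right). \tag{6}$$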

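Set $A = \Omega + \gamma^{-1} I$; we can get the classifier coefficient vector $\alpha$ and the bias term $b$ from (5):

$$b = \frac{Y^{T} A^{-1} \mathbf{1}}{Y^{T} A^{-1} Y}, \tag{7}$$

$$\alpha = A^{-1} \left( \mathbf{1} - Y b \right). \tag{8}$$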
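To make the training step concrete, the following minimal sketch solves the LSSVC linear system with NumPy and evaluates the resulting decision function. It assumes the standard LSSVM formulation of Suykens and Vandewalle with labels in {−1, +1}; the function names, the RBF kernel used for the demonstration, and the value of `gamma` are illustrative choices, not settings taken from the paper.

```python
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def lssvc_train(X, y, kernel, gamma=10.0):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]."""
    n = len(y)
    omega = np.outer(y, y) * kernel(X, X)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = y
    system[1:, 0] = y
    system[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    solution = np.linalg.solve(system, rhs)
    return solution[1:], solution[0]  # alpha, b


def lssvc_predict(X_new, X, y, alpha, b, kernel):
    """Sign of sum_i alpha_i * y_i * K(x, x_i) + b for each row of X_new."""
    return np.sign(kernel(X_new, X) @ (alpha * y) + b)


# Toy usage: two well-separated clusters.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
y_demo = np.array([-1.0, -1.0, 1.0, 1.0])
alpha_demo, b_demo = lssvc_train(X_demo, y_demo, rbf_kernel)
```

Because training reduces to one dense linear solve, the cost is dominated by factoring an (N + 1) × (N + 1) matrix, which is the speed advantage over quadratic programming that the text refers to.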

The fusion of kernels with multiple scales is a special case of multiple kernel learning [23, 24]. This kernel method is more flexible and offers a more complete choice of scales than other methods, such as the multiscale kernel method. In addition, as wavelet theory and multiscale analysis theory continue to mature, the multiple kernel method gains a sound theoretical background by introducing the scale space, which greatly promotes kernel-based machine learning [25, 26].

The foundation of the multiscale kernel method is finding a set of kernel functions that have multiscale representation capability. Among the widely used kernel functions, the Gaussian radial basis function (RBF) (9) is the most popular one because of its general approximation ability. It is also a typical kernel that can be multiscaled:

$$K(x, x') = \exp\left( -\frac{\| x - x' \|^{2}}{2 \sigma^{2}} \right). \tag{9}$$

Taking the RBF kernel as the example, it can be multiscaled as (10) (suppose the generated kernels are translation invariant):

$$K_i(x, x') = \exp\left( -\frac{\| x - x' \|^{2}}{2 \sigma_i^{2}} \right), \quad i = 1, \ldots, m, \tag{10}$$

where $\sigma_1 < \sigma_2 < \cdots < \sigma_m$. From (10), we can find that when $\sigma_i$ is small, the support vector classifier (SVC) with the RBF kernel can fit samples with drastic variation, and when $\sigma_i$ is larger, the same classifier can classify samples with mild variation well. So the multiscale kernels can obtain better generalization. Inspired by the scale-variation rule of the wavelet transform, the values of $\sigma_i$ can be defined as

$$\sigma_i = 2^{i} \sigma, \quad i = 1, \ldots, m. \tag{11}$$

Another typical multiscale kernel is the wavelet kernel function [27].

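Theorem 1. Let $h(x)$ be a mother wavelet function, and let $a$ and $c$ be the scaling factor and the translation factor, $x, a, c \in R$. If $x, x' \in R^{N}$, then the inner product type wavelet kernel function can be expressed as

$$K(x, x') = \prod_{i=1}^{N} h\left( \frac{x_i - c_i}{a} \right) h\left( \frac{x'_i - c'_i}{a} \right), \tag{12}$$

and the translation-invariant wavelet kernel function is

$$K(x, x') = \prod_{i=1}^{N} h\left( \frac{x_i - x'_i}{a} \right). \tag{13}$$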

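Theorem 2. Considering a common wavelet function (14),

$$h(x) = \cos(1.75 x) \exp\left( -\frac{x^{2}}{2} \right), \tag{14}$$

if $x, x' \in R^{N}$, then the wavelet kernel function is

$$K(x, x') = \prod_{i=1}^{N} \cos\left( 1.75 \, \frac{x_i - x'_i}{a} \right) \exp\left( -\frac{(x_i - x'_i)^{2}}{2 a^{2}} \right). \tag{15}$$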

By changing the values of the scale parameter $a$, wavelet kernel functions of different scales can be constructed.
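As an illustration of how the scale parameter shapes the kernel, the sketch below evaluates a translation-invariant wavelet kernel matrix. It assumes the Morlet-type mother wavelet h(u) = cos(1.75u)·exp(−u²/2) that is commonly used for wavelet kernels; the function name `wavelet_kernel` and the scale values are hypothetical.

```python
import numpy as np


def wavelet_kernel(A, B, a=1.0):
    """Translation-invariant wavelet kernel between rows of A and rows of B.

    Assumes the Morlet-type mother wavelet h(u) = cos(1.75 u) * exp(-u^2 / 2)
    and takes the product of h((x_i - x'_i) / a) over all dimensions i.
    """
    diff = (A[:, None, :] - B[None, :, :]) / a
    return np.prod(np.cos(1.75 * diff) * np.exp(-0.5 * diff ** 2), axis=-1)


# Two scales on the same data: a larger scale gives a smoother, wider kernel.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 3))
K_narrow = wavelet_kernel(X_demo, X_demo, a=0.5)
K_wide = wavelet_kernel(X_demo, X_demo, a=2.0)
```

Each matrix is symmetric with ones on the diagonal, since h(0) = 1; only the off-diagonal decay changes with the scale.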

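In this paper, the multiscale wavelet kernel is introduced into the LSSVC: the single kernel function in the LSSVC decision function is replaced by a combination of wavelet kernels of different scales, which defines the multiscale wavelet kernel LSSVC.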

Furthermore, combining features that have a multiresolution character with multiscale kernel functions is an effective way to improve target recognition accuracy. In this paper, the 4-level local multiscale features and 4-scale wavelet kernels are combined, and the scale of each successive kernel function is doubled. The kernels are weighted with equal coefficients; the schematic diagram is shown in Figure 2.
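One possible reading of this fusion is sketched below: each of the four feature levels gets its own wavelet kernel, the scale doubles from one kernel to the next, and the four kernel matrices are averaged with equal weights. The split of a 161-dimension vector into 9 + 72 + 72 + 8 entries follows the feature dimensions given in Section 2.1, but the pairing of levels with scales, the base scale, and all names here are assumptions made for illustration.

```python
import numpy as np

# Assumed layout of the feature vector: 9 + 72 + 72 + 8 = 161 entries.
LEVEL_SLICES = [slice(0, 9), slice(9, 81), slice(81, 153), slice(153, 161)]


def wavelet_kernel(A, B, a):
    """Morlet-type translation-invariant wavelet kernel (assumed form)."""
    diff = (A[:, None, :] - B[None, :, :]) / a
    return np.prod(np.cos(1.75 * diff) * np.exp(-0.5 * diff ** 2), axis=-1)


def fused_multiscale_kernel(A, B, base_scale=1.0):
    """Equal-weight sum of per-level wavelet kernels with doubling scales."""
    total = np.zeros((A.shape[0], B.shape[0]))
    for j, level in enumerate(LEVEL_SLICES):
        scale = base_scale * 2 ** j  # scale doubles from one kernel to the next
        total += wavelet_kernel(A[:, level], B[:, level], scale)
    return total / len(LEVEL_SLICES)  # equal weights of 1/4
```

A kernel matrix built this way can be passed to any kernel classifier that accepts a precomputed Gram matrix; a sum of valid kernels with nonnegative weights is itself a valid kernel.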

3. Robust ATR Complexity Analysis and Adaptability Analysis

3.1. Robust ATR Method and Complexity Analysis

The robust ATR procedure includes two stages which are the multiscale kernel classifier training and the recognition test; the schematic diagram is shown in Figure 3.

The multiscale kernel classifier training is mainly based on the MSTAR dataset; the steps are as follows.

Step 1. Finish the CFAR detection, respectively, for all the training target chips and obtain the target segmentation results.

Step 2. From the centers of the chips (namely, the centers of targets), execute the 4-level local multiscale decomposition for the target images.

Step 3. Encode the coefficients from the multiscale analysis, and extract the feature vectors of every target.

Step 4. Train the multiscale wavelet kernel classifier using the multilevel feature vectors.

The recognition test is based on the large-scene multiple targets image samples acquired in real-time; the steps are as follows.

Step 1. Perform CFAR detection on the large-scene image and segment the targets and regions of interest (ROI).

Step 2. Process the targets with mathematical morphology operations, eliminate the false alarms, and calculate the gravity center of each target, which is taken as the starting point of the local multiscale decomposition.

Step 3. Conduct the 4-level local multiscale decomposition from the gravity center of each target.

Step 4. Extract the features of each target by sampling and construct the feature vector.

Step 5. Input the feature vectors to the multiscale wavelet kernel classifier, and output the recognition result.
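The first two recognition steps can be sketched as follows. This is a simplified stand-in, not the detector used in the paper: a global mean-plus-k-standard-deviations threshold replaces a true sliding-window CFAR detector, a minimum-area filter replaces the morphological false-alarm removal, and the gravity center is computed as the intensity-weighted centroid of each remaining region. The function name and the parameters `k` and `min_pixels` are hypothetical.

```python
from collections import deque

import numpy as np


def detect_gravity_centers(img, k=3.0, min_pixels=20):
    """Threshold the image, group bright pixels into 8-connected regions,
    drop small regions, and return the intensity-weighted centroid of each."""
    img = np.asarray(img, dtype=float)
    mask = img > img.mean() + k * img.std()  # simplified CFAR-style threshold
    visited = np.zeros(img.shape, dtype=bool)
    h, w = img.shape
    centers = []
    for seed in zip(*np.nonzero(mask)):
        if visited[seed]:
            continue
        visited[seed] = True
        queue, region = deque([seed]), []
        while queue:  # breadth-first flood fill of one region
            r, c = queue.popleft()
            region.append((r, c))
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < h and 0 <= nc < w
                            and mask[nr, nc] and not visited[nr, nc]):
                        visited[nr, nc] = True
                        queue.append((nr, nc))
        if len(region) < min_pixels:  # treat tiny regions as false alarms
            continue
        rows, cols = np.array(region).T
        weights = img[rows, cols]
        centers.append((float(rows @ weights / weights.sum()),
                        float(cols @ weights / weights.sum())))
    return centers
```

Each returned center would then serve as the starting point of the local multiscale decomposition in Step 3.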

In pattern recognition application using SVM, the training and testing are two different processes, so the algorithm complexity should not be understood as a whole. Here we mainly discuss the complexity in the training process, namely, the complexity solving of the quadratic programming problem. For a typical SVM training algorithm, its computational complexity is in the range of , where is the number of support vectors, is the number of samples in training set, and is the dimension of each sample, so in the worst case, the algorithm complexity is . In our algorithm of this paper, the number of training samples ; the sample dimension . Considering the multiple kernel classifier, we introduced kernels to the algorithm, and , and we still can think that the complexity of the multiple kernel training is . So, we can conclude that the computational complexity in the multiple kernel classifier training process is .

In the large-scene multiple targets ATR process, the complexity is mainly dependent on the kernel function computation between the testing sample vector and the support vectors. In this process, the complexity of each target sample is . In the large-scene SAR image, suppose the number of the detected targets is , then the complexity of the whole testing may be . In practical applications, the value usually is small; for example, in the experiments of adaptability analysis in our manuscript, there are 6 targets in the images. So we still can conclude that the complexity of the testing process is .

3.2. Adaptability Analysis Method

Speckle hinders SAR image ATR, but this noise is inevitable and cannot be completely eliminated. So, the adaptability of the target recognition algorithm to speckle directly determines its usability and robustness. The adaptability analysis mainly consists of adding speckle to the large-scene multiple-target image and then analyzing the recognition precision. Each time the speckle adding degree (SAD) is increased by 1, speckle with mean 0 and variance 0.04 is added to the whole image once more. On this basis, parameters such as the mean, the variance, the dynamic range, and the peak signal-to-noise ratio of the image can be calculated. Then, for the image with added speckle, target detection, feature extraction, and classification are executed again, so that the target detection and recognition precision can be studied at different speckle degrees.
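A minimal sketch of this noise-injection procedure is given below. It assumes a multiplicative model, J = I·(1 + n), with n drawn from a zero-mean uniform distribution of variance 0.04, applied once per SAD step; the text fixes only the mean and variance, so the multiplicative form and the uniform distribution are assumptions. The function names and the `peak` value are illustrative.

```python
import numpy as np


def add_speckle(img, sad, var=0.04, rng=None):
    """Apply multiplicative zero-mean noise of variance `var`, `sad` times."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.asarray(img, dtype=float).copy()
    half_width = np.sqrt(3.0 * var)  # uniform on [-a, a] has variance a^2 / 3
    for _ in range(sad):
        noise = rng.uniform(-half_width, half_width, size=out.shape)
        out = out * (1.0 + noise)
    return out


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    diff = np.asarray(reference, dtype=float) - np.asarray(test, dtype=float)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


def image_statistics(img):
    """Mean, variance, and dynamic range, as tracked in the adaptability tests."""
    img = np.asarray(img, dtype=float)
    return img.mean(), img.var(), img.max() - img.min()
```

Running the detection and classification pipeline on `add_speckle(image, sad)` for increasing `sad` reproduces the structure of the adaptability experiment, though not necessarily its exact noise realization.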

4. Experiments

4.1. Single Target Recognition and Adaptability Analysis for Speckle

The single-target recognition and speckle adaptability analysis are based on the MSTAR dataset; the numbers of samples in the 3 classes are shown in Table 1. After feature extraction with local multiscale decomposition for all the sample chips, we design and train the classifier. The multiclass classification problem is transformed into a set of two-class problems by the one-versus-one method.
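The one-versus-one decomposition can be illustrated with a short voting routine. For three classes it uses three pairwise classifiers and assigns the class with the most votes. The pairwise classifiers in the usage example are toy nearest-centroid rules on a scalar feature, introduced only for illustration; in the paper each pair would be handled by a trained binary kernel classifier. All names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pairwise):
    """Majority vote over all class pairs.

    `pairwise[(a, b)]` is a callable that returns either a or b for input x.
    """
    votes = Counter(pairwise[(a, b)](x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


# Toy usage with scalar features and nearest-centroid pairwise rules.
CLASSES = ["BMP2", "BTR70", "T72"]
CENTROIDS = {"BMP2": 0.0, "BTR70": 5.0, "T72": 10.0}


def make_pair_rule(a, b):
    """Return a rule that picks whichever of a, b has the nearer centroid."""
    return lambda x: a if abs(x - CENTROIDS[a]) <= abs(x - CENTROIDS[b]) else b


PAIRWISE = {(a, b): make_pair_rule(a, b) for a, b in combinations(CLASSES, 2)}
```

With K classes this scheme trains K(K − 1)/2 binary classifiers, each on the samples of only two classes.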

Using the training set, we obtain the optimal multiscale wavelet kernel classifier, where the scale factors of wavelet kernel are , , , and . The penalty coefficient . After the features of the testing set are extracted, the feature vectors are sent to the classifier and the recognition precision is output. To analyze the algorithm performance in a distortion-free setting, experiments and performance comparisons between the proposed method and other typical methods are carried out. The feature extraction methods, the feature dimensions, the classifiers, and the classification precision are shown in Table 2.

The experimental results show that the fusion of the multiscale feature and the multiscale kernel classifier gives a very high classification precision of 98.75% when no speckle is added. In addition, the algorithm accesses and stores nearly 3000 SAR images in a short time, which indicates good real-time performance. Compared with traditional methods, the presented algorithm detects targets faster and uses feature vectors of lower dimension.

To analyze the adaptability against noise, speckle with mean 0 and variance 0.04 is added to the MSTAR testing set. Each increase of the SAD by 1 adds the speckle to the image one more time. The final recognition results of the comparison experiments under different SADs are shown in Table 3.

When speckle is added to the testing samples, the recognition precision is 93.41% with SAD = 1. As the speckle is strengthened, the recognition precision drops to 87.62% and 76.48% when SAD = 2 and SAD = 3, respectively, which are still acceptable recognition rates. The reason is that there is only one target in each sample, and the targets are known to lie in the center of the sample chips. So, even though the target structure changes, we can still perform the local multiscale decomposition from the center of the sample and extract appropriate features.

4.2. Large-Scene Multiple Target ATR and Adaptability Analysis for Speckle

For large-scene multiple target ATR, firstly, we must construct the large-scene multiple target images. Here, we randomly select 6 targets being correctly classified in the MSTAR testing set, where 2 targets are selected from each class, respectively, and the targets are embedded in the large-scene clutter images. Using this method, two large-scene multiple target images are formed, where Image I has a size of 512 × 512, with the image parameters of mean , variance , and the dynamic range . Image II has a size of 768 × 768 and , , and , respectively.

In the simulation tests, 3 degrees of speckle are added to the two large-scene multiple-target images; the resulting images with different SADs are shown in Figure 4. Then, using the same target segmentation, mathematical morphology processing, and center of gravity calculation, we obtain the target detection and marking results. Figures 5 and 6 show the target segmentation, detection, and marking results for the two images under speckle adding degree 1 (SAD = 1). Next, starting from the center of gravity, the multiscale decomposition and feature extraction are executed to obtain the feature vectors. Finally, the feature vectors are sent to the multiscale wavelet kernel classifier, which outputs the recognition results.

The large-scene multitarget image parameters and the ATR results under different SADs are recorded in Table 4. The experimental results indicate that when SAD = 1, although the center of gravity of each target is shifted from the geometrical center of the chip after target segmentation, the 6 targets can still be detected and correctly classified, which verifies the effectiveness and robustness of the presented algorithm. As the speckle is strengthened, the number of correctly recognized targets decreases. The targets marked “6” and “1” in Image I are misrecognized with SAD = 2 and SAD = 3. In Image II, only the target marked “6” is misrecognized with these two SADs; the remaining 5 targets are correctly recognized. The main reason is the multiplicative characteristic of speckle, which can drastically change the target structure obtained after target segmentation and mathematical morphology processing, so the center of gravity of the target shifts dramatically. As a result, the local multiscale features are affected. Overall, the feature extraction and classification method has good adaptability to speckle.

4.3. ATR with Scale and Rotation Transformation and Adaptability Analysis for Speckle

In this test, the rotation and scale transformations of the targets are introduced based on Image I and Image II in Section 4.2. For Image I, the targets marked “4,” “2,” and “5” are resized with scale parameters 2, 1.5, and 1.5, respectively, and with a rotation of 30°; then the targets are embedded into the original image and the new image named Image III is constructed. For Image II, the targets marked “1,” “4,” and “5” are resized with scale parameters 2, 1.5, and 1.5, respectively, and with a rotation of 30°; using the same method, Image IV is constructed.

In the tests, 3 degrees of speckle are added to Image III and Image IV. Then, using the same target segmentation, mathematical morphology processing with adjusted parameters, and center of gravity calculation, we obtain the target detection and marking results. Figures 7 and 8 show the target segmentation, detection, and marking results for the two images under SAD = 1. Next, starting from the center of gravity, the multiscale decomposition and feature extraction are executed to obtain the feature vectors. Finally, the feature vectors are sent to the multiscale wavelet kernel classifier, which outputs the recognition results.

Considering only the targets with rotation and scale transformation, the large-scene multitarget image parameters and the ATR results under different SADs are recorded in Table 5. Under the condition of SAD = 1, the 3 transformed targets in each of Image III and Image IV are still correctly recognized, which verifies the effectiveness of the feature extraction method and its robustness to rotation and scale transformation. But as the speckle is strengthened, the recognition rate decreases significantly. For example, when SAD = 2, only 2 targets are correctly recognized in each of Image III and Image IV; when SAD = 3, all 3 targets in Image III are misrecognized, and only 1 target in Image IV is correctly recognized. This shows that the structural features of the targets change strongly after rotation and scaling; at the same time, the speckle has a far greater impact on large-scale targets. As a result, the local multiscale features are severely affected, which leads to the misrecognition of the targets.

To further testify the performance of the algorithm under rotation, scale transformation, and speckle, more large-scene image samples are constructed and tested. Based on Image I and Image II, the targets in the two images are resized with scale parameter 1.5 firstly; then the targets are rotated at 30°, and finally, the speckle with is added into the images. Ultimately, two datasets having 12 large-scene images, respectively, can be constructed from Image I and Image II. In each dataset, there are 72 targets in total with 3 classes.

The experiments are carried out with the two datasets, and the results are shown in Table 6. The data in the table show that the correct recognition rates over the three target classes are 87.5% for Dataset I and 90.3% for Dataset II. The results once again indicate that the local multiscale feature has good adaptability to rotation, scale transformation, and speckle. Meanwhile, the fusion of the local multiscale feature and the multiscale kernel classifier brings better robustness and recognition rates to ATR systems.

5. Conclusions

Compared with other image target recognition tasks such as face recognition, gesture recognition, fingerprint recognition, and gait recognition, the usability and recognition efficiency of SAR ATR are greatly hindered by strong speckle and low image resolution. Based on the fusion of the multiscale feature and the multiscale kernel machine, a robust full-process method is studied that covers target detection, gravity center locating, local multiscale decomposition, feature extraction, and target classification with a multiscale kernel LSSVC, and it can effectively solve multitarget ATR in large-scene SAR images with strong speckle. The adaptability analysis verifies that the algorithm has good adaptability to speckle. Meanwhile, the algorithm is well suited to the requirements of practical applications. The method can also be applied to vehicle target detection with other imaging sensors, such as visible light and infrared images, and it can be used for ship recognition in SAR images with complex sea clutter. On the other hand, to achieve better classification precision, support vector machines, including the classifier presented in this paper, need a large number of labeled samples to train an effective classifier. But labeled sample acquisition is costly, laborious, and time-consuming; how to improve classifier accuracy using unlabeled data has received considerable attention in medical applications and, more recently, in crowdsourcing and in event detection and event recounting on real video datasets [28]. So, in the next step, we look forward to using this idea to significantly improve the performance and the implementation efficiency of SAR ATR.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was jointly supported by the National Natural Science Foundation for Young Scientists of China (Grant nos. 61202332, 61403397, and 61503389), China Postdoctoral Science Foundation (Grant no. 2012M521905), and Natural Science Basic Research Plan in Shaanxi Province of China (Grant no. 2015JM6313).