Abstract

Ground penetrating radar (GPR) is a powerful tool for detecting objects buried underground. However, the interpretation of the acquired signals remains a challenging task since an experienced user is required to manage the entire operation. Particularly difficult is the classification of the material type of underground objects in noisy environment. This paper proposes a new feature extraction method. First, discrete wavelet transform (DWT) transforms A-Scan data and approximation coefficients are extracted. Then, fractional Fourier transform (FRFT) is used to transform approximation coefficients into fractional domain and we extract features. The features are supplied to the support vector machine (SVM) classifiers to automatically identify underground objects material. Experiment results show that the proposed feature-based SVM system has good performances in classification accuracy compared to statistical and frequency domain feature-based SVM system in noisy environment and the classification accuracy of features proposed in this paper has little relationship with the SVM models.

1. Introduction

For its rapid, continuous, nondestructive, efficient, convenient, and high-resolution properties, GPR is widely used in national defense, airport construction, railways, and highways [15]. However, the interpretation of the large amount of acquired and stored GPR data requires a human operator with high skill and experience. In addition, the signals reflected from objects are corrupted with noise that cannot be completely removed. Hence, these problems lead to the growing request for the development of accurate and fast automated subsurface object detection and identification techniques in noisy environment. In recent years, a large number of scholars have proposed a variety of methods of automatic target recognition.

Torrione et al. used histograms of oriented gradients (HOG) features to detect landmines and suggest that other techniques from computer vision might also be successfully applied to target detection in GPR data [6]. Phase profile was used in detection and location of targets underground because phase profile is a function of depth [7]. A time-frequency feature extraction method based on Wigner distribution is proposed [3] and can detect and recognize diseases which have different compositions and shapes. Wahab et al. proposed a new hyperbola fitting technique to estimate the radius of buried utility (pipes and cables) [8]. GPR signal explanation model is established based on the support vector machine and the dyadic wavelet transform (DyWT) theory in [9]. It is applied in railway diseases detection and the results are effective. Ko et al. used principal component analysis (PCA) and Fourier coefficients as features to detect and identify landmines in different environments [10]. Principal components from PCA, Fourier coefficients, and singular values from singular value decomposition are three features which are used to detect landmines in various burial conditions [11]. Most of the above methods concentrate on the detection and localization of buried objects. However, very few works have been completed on the material identification of underground objects [12].

The basic motivation of proposing a novel feature extraction method is to improve automatic material classification accuracy in noisy environment and testify the results have little relationship with the SVM models. In this paper, the novel features, statistical features, and frequency domain features extracted from A-Scan data combined with SVM classifiers for material identification are tested. First of all, we used synthetic data to test them. The synthetic B-Scan (320 A-Scans composed of B-Scan in this paper) data are created by finite-difference time-domain (FDTD) method. The synthetic GPR data after mean filtering is corrupted by different intensity of white Gaussian noise, which denotes different levels of noisy environment. Then, we test them on laboratory data in which we do not know the noise. From the experiments results we can see the good performances of the features proposed in this paper combined with SVM classifiers.

2. Forward Simulation

The evaluation of different feature-based approaches studied in this paper is firstly performed using different generated GPR data synthesized by using the electromagnetic simulator “GprMax2D” [13]. GprMax2D was implemented using the FDTD numerical technique. An approximate solution for Maxwell’s equations is directly obtained in the time domain by discretizing it in both space and time through an iterative process.

In our simulations, several parameters have to be set for the transmission and acquisition system. Parameters for FDTD simulation are listed in Table 1.

One of the established simulation models is shown in Figure 1. The dimensions are 2.5 × 0.8 m. The model consists of three layers: 0.05-meter-thick air (, ), 0.25-meter-thick concrete (, ), and 0.5-meter-thick dry sand (, ). One cylinder is embedded in the model. Center coordinates of the cylinder are (0.95 m, 0.35 m) and the radius is 0.1 m.

There are four models with the same parameters except the material of cylinders. As metal, air, stone, and polyvinyl chloride polymer (pvc) are common materials of objects in subsurface, in this paper the cylinders’ materials embedded in the four models are the four materials, respectively. The parameters of the four materials of cylinders are listed in Table 2.

To save computation time and storage space, only 320 traces that can display the target signal completely are taken and the other 162 traces are removed. The simulation results are shown in Figure 2. The results show that there is a hyperbolic curve in every B-Scan datum. However, the shape of the hyperbolic curve is different more or less. For example, the curvature of the four hyperbolic curves, grey value of hyperbolic curve, and the distance between vertices of hyperbolic curve are different.

If the shapes have big difference we can classify them artificially. On the contrary, we hardly classify them with manpower when the shapes are similar (e.g., Figures 2(c) and 2(d)). In addition, particularly difficult is their classification in noisy environment. Thus, it is important to find an automatic and high accuracy recognition method instead of manpower in noisy environment.

3. Feature Extraction

3.1. Flow Chart

In the process of feature extraction, we first use DWT to transform A-Scan data and extract approximation coefficients. Then, FRFT is applied to transform approximation coefficients into fractional domain. Finally, fractional domain-envelope curve is extracted and we extract feature to construct feature vector. The flow chart of feature extraction is shown in Figure 3.

3.2. Data Preparation

As is known to all, echoes from ground surface are the strongest signals that can decrease the quality of GPR image and interfere with the identification of objects, but they can be removed by mean filtering [14]. Hence, to get rid of the interference of echoes from ground surface we apply mean filter to every B-Scan datum in Figure 2. Figure 4 is the result of Figure 2(a) after mean filtering. From Figure 4 we can see the echoes from ground surface are removed and the hyperbolic curve is more obvious. Besides, the received data in practice contains a lot of other noise which cannot be removed completely and can be regarded as white Gaussian noise to some extent. As the data after the mean filtering hardly contain the noise, to simulate the reality we add different amplitude from 10 dBW to 50 dBW of white Gaussian noise to B-Scan data after mean filtering. Figure 5 shows that the 35 dBW white Gaussian noise is added to Figure 4. Meanwhile, the mean amplitude of object’s signal is 30 dBW.

3.3. Extraction Approximation Coefficients

In this paper, we use DWT extraction approximation coefficients. If the function being expanded is a sequence of numbers, like samples of a continuous function , the resulting coefficients are called the discrete wavelet transform (DWT) of . The DWT transform pair is defined as where , , and are functions of the discrete variable , is the preparation A-Scan data in Section 3.2 and Figure 6 is one A-Scan trace of void that 10 dBW white Gaussian noise is added to Figure 4, is wavelet function, and is scaling function. Normally, we let and select (the length of the discrete samples of ) to be a power of 2 (i.e., ) so that the summations are performed over , , and . The transform itself is composed of coefficients, the minimum scale is 0, and the maximum scale is . The coefficients defined in (1) and (2) are usually called approximation and detail coefficients, respectively. In practice we select a wavelet from ready-made wavelets for a particular problem. Here we chose a wavelet called Daubechies wavelet. The approximation coefficients can be extracted from (1).

3.4. Fractional Domain-Envelope Curve Extraction

The fractional Fourier transform (FRFT) [15] can transform approximation coefficients into fractional domain.

The fractional Fourier transform, a generalization of the Fourier transform (FT), serves as a useful and powerful analyzing tool in optics, communications, signal processing [16], and so forth.

The FRFT of a signal is defined as where is approximation coefficients and the transform kernel is given by where , , and is order of FRFT. Although can take any real number, the period of FRFT is 4. Hence, we usually only consider . The axis is regarded as the fractional domain.

In this paper, we only consider the case of and set the interval of neighbor as 0.05, since the definition can easily be extended outside the interval by noting that is the identity operator for any integer and that the FRFT operator is additive in index; that is, . The FRFT depends on a parameter and can be interpreted as a rotation by an angle in the time-frequency plane [17]. Whenever , that is to say, , (3) reduces to the FT.

The fractional domain-envelope curve can be extracted:

The envelope curve of the four materials can be seen from Figure 7. It is indicated that the different materials have big difference. Hence, it is possible to extract features from the envelope curve to classify material of underground objects.

In [17], we have known that FRFT can be interpreted as a rotation by an angle in the time-frequency plane. That is to say, FRFT provides a signal comprehensive description of the whole process from time domain to frequency domain. In addition, with the from 0 to 1, the FRFT display signal changes gradually from time domain to frequency domain of all variation characteristics. from 0 to 1 and from 1 to 2 are symmetrical. The reason why different materials have different envelope curves may be that variation characteristics are different from time domain to frequency domain. Figure 8 shows the DWT-FRFT of air.

3.5. Feature Vector Construction

In this paper, we will extract four features from envelope curve to construct feature vector. These features’ calculation in MATLAB is as follows:(a)the maximum of envelope curve: (b)the average value of envelope curve: (c)the variance of envelope curve: where is the element in the vector of length ; from Section 3.3 we can see ( and set the interval of neighbor as 0.05);(d)the kurtosis of envelope curve: where and are the fourth moment and second moment of , respectively.

The kurtosis is used to reflect the envelope curve of the top tip forms or flat level. Due to the sharpness of the envelope curves being different, the kurtosis is different.

According to the above, we can construct feature vector:

4. Experimental Results

SVM was designed originally for solving binary classification problem. However, to solve a multiclass SVM problem, two approaches can be used, either by creating several binary classifiers or by using a larger multiclass optimization problem. However, it is more expensive computationally to solve a multiclass optimization problem in one step than a binary problem using the same data size. In this paper, the major multiclass SVM methods, for example, one-against-one method, based on constructing binary classifiers, are used. It generates classifiers; each classifier is trained on data from two classes. Many approaches can then be applied for material classification; the one used in this study is the voting approach using “Max Wins” strategy.

4.1. The Basic Concept of SVM Algorithm

Suppose that each class training set consists of feature vectors with feature space dimension . Each vector represents one of two classes . The linear SVM works by finding an optimal hyperplane between two classes that maximizes the separating margin. For nonlinearly separable data, a kernel method issued to map the data in a higher dimensional feature space, that is, , to become linearly separable and apply the linear SVM algorithm. The optimal hyperplane , characterized by the weight vector (normal to the hyperplane), and the bias can be identified by solving the following optimization problem: where is called slack variables used to account for nonseparable data and the parameter is added to control the tradeoff between the slack variable penalty and the width of the margin. Using a Lagrange functional, this optimization problem can be reformulated in (12), where the Lagrange multipliers () can be calculated by using dual optimization: where .

The problem formulated in (12) is a convex quadratic optimization problem, and can be obtained using the quadratic optimization problem solver. The final discriminant function is identified by and can be known, and hence, the support vector set representing the nonzero Lagrange multipliers can be identified. From the optimization problem formulated in (12), it is clear that the mapped data appear as an inner product only. From Mercer’s theorem, it is known that, for any two points and and a certain mapping function , a kernel function can be used to evaluate the inner product of the mapped points without knowing the mapping function; for example, . Thus, the linear classifier is changed to a nonlinear classifier by substituting the inner product in the optimization problem by the kernel evaluation. One of the most common kernel functions used is the Gaussian function given where parameter controls the Gaussian kernel width.

4.2. SVM Model Building and Training

From Figure 2 we can obtain 320 traces (A-Scans) in each B-Scan datum, but not all of them are reflected from the cylinder embedded in the model. Hence, 40 traces from 153 to 192 which are echoes of the cylinder are selected in each model and the total number of useful traces in the four models is 160. All the selected A-Scan data should be prepared in advance in Section 3.2.

In training of the SVM classifier which is LIBSVM-Faruto Ultimate Version [18], we uniformly select 10, 20, 40, and 80 traces, respectively, from the 160 traces as training samples to obtain four SVM models. The others are testing data. First, statistical feature vector [12], frequency domain feature vector [19], and DWT-FRFT feature vector proposed in this paper are extracted, respectively. denotes a feature vector. Then, a tag () is assigned to each feature vector to indicate the type of each material. Thus, each A-Scan datum can be expressed by a vector . Finally, the different number of training samples is used for training SVM classifier and we get different SVM models.

In addition, the SVM efficiency depends on the kernel’s type selection, the kernel’s parameters, and soft margin parameter . RBF is selected as the kernel function for the SVM model. Best combination of and is selected by a grid search with growing sequences of and .

4.3. Automatic Classification Results and Discussion

Now the trained SVM models are ready for classification of GPR data of different material objects. In this section, several tests are done to examine the classification accuracy of DWT-FRFT feature-based SVM system of different models for identifying different materials of underground objects. The classification accuracy can be computed by where is the number of correct classifications and is total test data. The performances of statistical features and frequency domain features in underground objects material identification using SVM classifiers are also studied and compared.

Figures 9(a)9(d) are the classification accuracy of the three features using different trained SVM models in noisy environment, respectively. From this comparative study, it is inferred that no matter which one SVM model is used, the DWT-FRFT feature-based SVM system outperforms the other two on the whole. On the other hand, it is indicated that the DWT-FRFT feature is more robust against noise effects than the other two. However, one drawback is that the classification accuracy is lower than statistical feature-based SVM system when the noise intensity is about 25 dBW. This may be because, in feature extraction, only the approximation coefficients are extracted, while the useful information in detail coefficients is removed. Besides, the smaller the noise intensity is, the more the useful information in detail coefficients will be discarded. In addition, when the noise intensity is above 40 dBW, the classification accuracy will be below 50% and the credibility is not enough. Thus, it is not important which accuracy is high when the noise intensity is above 40 dBW.

As is known to all, the number of training samples should be as little as possible if the classification accuracy is similar. Figures 10(a)10(c) show the results of performances comparison holding the same feature using different SVM models.

As can be seen from Figure 10(a), we can find that the classification accuracy curves of DWT-FRFT feature-based different SVM models are intensive. It is indicated that the classification accuracy has little relationship with the number of training samples. That is to say, we can select fewer samples to train SVM and can obtain satisfactory results. Although Figure 10(a) is not as intensive as Figure 10(c), the biggest difference of classification accuracy is less than 10% except some one point (e.g., the noise intensity is 40 dBW).

4.4. Experiments with Laboratory Data

In order to verify the performance of DWT-FRFT feature-based SVM system for material classification, experiments are realized on the laboratory data. The GPR system used in this work is LTD-2200 system (900 MHz, 1024 samples per scan, 190 traces, and trace spacing 0.015 m).

There are three different materials objects and they are buried in the same position of a 3 m × 9 m × 1 m bunker, respectively. The depth of objects is 0.3 m. The bunker is filled with uniform sand and the surface is flat. The size of the objects is similar (0.5-meter-long, 0.2-meter-thick). The three objects are copper, stone, and soil, respectively. The raw radargram of copper is shown in Figure 11. 30 traces from 80 to 109 are reflected from object.

First of all, the raw data of three objects is preprocessed by mean filter and the result of radargram of copper is shown in Figure 12. From Figure 12 we can see the signal of object is more obvious and the echoes from ground surface are removed.

Then, the traces reflected from the three objects are selected and we can obtain 90 useful traces. Statistical feature, frequency domain feature, and DWT-FRFT feature are extracted from the traces, respectively. We uniformly select 9, 15, and 30 traces as training samples to obtain three SVM models and the others are testing data, respectively. The results of classification are shown in Table 3. From Table 3, it can be seen that compared to the other two feature-based SVM systems DWT-FRFT feature-based SVM system for material classification performs well in classification accuracy. In addition, the classification accuracy has little relationship with the number of training samples.

5. Conclusion

A novel DWT-FRFT feature-based SVM system for material classification of the underground objects from GPR data proposed in this paper is proved from the experiments of synthetic and the laboratory data to perform well in classification accuracy in noisy environment and the relationship with the number of training samples. From the experimental results we can see that, compared to the statistical feature and frequency domain feature-based SVM system, the proposed feature shows encouraging performances in terms of material classification. In addition, a good performance can be achieved concerning the SVM models.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors wish to thank Shili Guo, Chao Ma, and Mengen Ji of Henan Province Highway Test Co., LTD. This work was supported in part by the International Scientific and Technological Cooperation Projects of China under Grant 2011DFR10480 and Henan Province Research Program of Foundation and Advanced Technology under Grant 102300410113.