Abstract

The performance of applications that use synthetic aperture radar (SAR) data depends strongly on the statistical characteristics of the pixel amplitudes or intensities. In this paper, a new empirical model is proposed to characterize the statistical properties of SAR clutter data over a wide range of homogeneous, heterogeneous, and extremely heterogeneous terrain returns. The well-known $\mathcal{G}^0$ distribution is a special case of the proposed model. We also analytically derive the parameter estimators of the model by applying the “method of log-cumulants” (MoLC). The performance of the proposed model is verified using measured SAR images.

1. Introduction

Synthetic aperture radar (SAR) has become an advanced tool for monitoring land and sea surfaces because, unlike optical sensors, it operates regardless of weather conditions [1, 2]. Interpreting the statistical properties of SAR clutter over various terrain classes is crucial for designing filtering [3], detection [4], segmentation [5], and classification [6–8] algorithms that optimally exploit the content of the processed SAR images. This has led to a growing interest in developing precise models for the statistics of the pixel amplitudes or intensities.

It is well known that SAR images are strongly affected by speckle noise, which arises from the coherent imaging mechanism [1, 2, 9, 10]. This noise degrades the information content of SAR images and limits surveillance performance. To overcome this drawback, much work has been done on modeling amplitude or intensity statistics [3, 5, 6, 8, 11]. Among the existing statistical models, parametric distributions have been intensively investigated because of their high accuracy and flexibility. They are usually obtained by one of two approaches.

The first approach considers the physical mechanism of backscattering from the land surface. The multiplicative model [1, 2], which combines the speckle noise and the terrain backscatter, is commonly used in this class. Generally, the multi-look SAR speckle noise intensity is assumed to obey a gamma distribution. Therefore, the central task in establishing this type of statistical model is to model the backscatter of different terrain classes [1, 2, 11]. Two important distributions, $\mathcal{K}$ and $\mathcal{G}^0$, which correspond to the Gamma and inverse Gamma distributions for the intensity backscatter, respectively, have received a great deal of attention [2, 5–7, 9, 11, 12]. Compared with $\mathcal{K}$, $\mathcal{G}^0$ agrees better with the heavy-tailed behavior of extremely heterogeneous clutter, such as urban areas or other man-made structures [11].

The second approach derives the distributions from a purely mathematical viewpoint, irrespective of the physical properties of radar clutter backscatter. Known examples are the lognormal, Weibull, and, more recently, Fisher [6] (identical to the $\mathcal{G}^0$) distributions. The lognormal and Weibull distributions sometimes fit the SAR histograms of some heterogeneous and ocean regions well [9]. However, they tend to deviate substantially when fitting histograms with heavy tails.

This paper reports an empirical model for characterizing the statistical properties of SAR clutter data, with the aim of modeling more heterogeneous clutter. The proposed model has the $\mathcal{G}^0$ distribution as a special case. Furthermore, using the second-kind statistics theory developed by Nicolas [13], which relies on the Mellin transform [14], that is, the “method of log-cumulants” (MoLC), we derive the parameter estimators of the new distribution model.

In the rest of this paper, the proposed distribution is first given in Section 2. Section 3 derives the corresponding parameter estimators based on the MoLC. Section 4 provides experimental results for the model using measured SAR data, together with comparisons against $\mathcal{G}^0$ fits. The last section concludes the paper and gives perspectives on future work.

2. The Proposed Distribution

The proposed amplitude distribution is defined as where , , and are the power, shape, scale parameters, respectively. indicates the stretching parameter. represents the Gamma function. is the number of looks. The corresponding intensity expression of (1) is further given by

The distribution characterized by (1) or (2) is the proposed model. Specifically, the intensity and amplitude versions, corresponding to (2) and (1), respectively, are named separately to distinguish the intensity statistic from the amplitude statistic. Figure 1 gives some examples of the distribution for various parameter values. From this figure, it can be seen clearly that the shape parameter reflects the degree of homogeneity of the tested returns: the smaller its value, the more inhomogeneous the returns (i.e., the heavier the tails). Meanwhile, the stretching parameter corresponds to a stretching of the SAR image amplitude or intensity, with a strong effect at low values of the return. Moreover, as an independent parameter, the power parameter indicates the overall fluctuation (magnification or shrinkage) of the densities along the vertical axis. The scale parameter controls the peak value of the density.

Although the probability density functions (PDFs) characterized by (1) or (2) are presented as empirical models, they can also be derived within the framework of the multiplicative model [1, 2, 11]. Taking the intensity distribution as an example, the following theorem is established.

Theorem 1. Let and denote the backscattering RCS component and the speckle noise component, respectively, and let denote the observed intensity of the SAR data. The relationship among these three variables is expressed by the multiplicative model as
If obeys and the PDF of is
Then the distribution of the intensity return is characterized by the density shown in (2), that is, .

Proof. Combining (4) and (5) via (3), the PDF of can be easily derived as
A variable change of leads to and , so that (6) turns out to be
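Applying the integral formula $\int_0^{\infty} x^{\nu-1} \exp(-\mu x^{p})\,dx = \frac{1}{p}\,\mu^{-\nu/p}\,\Gamma(\nu/p)$, $\operatorname{Re}\mu > 0$, $\operatorname{Re}\nu > 0$, $p > 0$ [15, 3.478, Equation 1], one can easily obtain the result that (7) is equal to (2).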
Furthermore, the proposed model has the attractive property that the well-known $\mathcal{G}^0$ distribution, presented by Frery et al. [11] to model homogeneous, heterogeneous, and extremely heterogeneous terrains, is a special case of it when and . Thus, the proposed model exhibits a higher fitting ability than $\mathcal{G}^0$.
As derived in Appendix A, the th order moments of the are given by
The th order moments of the corresponding amplitude random variable can easily be given by .
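The multiplicative model used in Theorem 1 can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes unit-mean gamma speckle with the number of looks as its shape parameter, and, purely for illustration, a constant backscatter value (a homogeneous scene). The function name `simulate_intensity` and all numeric values are hypothetical.

```python
import random


def simulate_intensity(backscatter, looks, rng):
    """Draw one intensity sample as backscatter * speckle (multiplicative model)."""
    # Assumption: multi-look speckle is gamma distributed with unit mean,
    # i.e. shape = looks and scale = 1 / looks.
    speckle = rng.gammavariate(looks, 1.0 / looks)
    return backscatter * speckle


rng = random.Random(0)       # fixed seed so the run is repeatable
looks = 4                    # hypothetical number of looks
backscatter = 2.0            # hypothetical constant backscatter (homogeneous scene)

samples = [simulate_intensity(backscatter, looks, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)

# For unit-mean gamma speckle, the sample mean should be close to the
# backscatter and the sample variance close to backscatter**2 / looks.
print(mean, var)
```

Replacing the constant backscatter with a random draw from a heterogeneous backscatter law turns this into a simulation of heterogeneous clutter; the specific backscatter PDF of Theorem 1 is not reproduced here.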

3. The Parameter Estimators of the Proposed Distribution

3.1. The Log-Cumulants of

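Nicolas [13] proposed a parametric PDF estimation technique based on the MoLC for functions defined over $\mathbb{R}^{+}$. Given a positive-valued random variable $x$ with the PDF $p_x(x)$, the second-kind first and second characteristic functions are, respectively, defined as [13] $$\phi_x(s) = \mathcal{M}[p_x(x)](s) = \int_0^{+\infty} x^{s-1} p_x(x)\,dx, \qquad \psi_x(s) = \ln \phi_x(s),$$ where $\mathcal{M}$ is the Mellin transform operator. The $r$th order derivative of $\psi_x(s)$ at $s = 1$ is the log-cumulant of order $r$, that is, $$k_r = \left.\frac{d^{r} \psi_x(s)}{ds^{r}}\right|_{s=1}.$$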

Hereafter, we take the intensity distribution as an example to estimate its parameters , , and ; the amplitude case is processed similarly. Owing to , via (8) and (9), the second-kind second characteristic function of the distribution is which leads to the log-cumulants of the being expressed as where represents the digamma function, and is the th order polygamma function.

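Given a sample set $\{x_i\}$, $i \in [1, N]$, the log-cumulants can be estimated directly by $$\hat{k}_1 = \frac{1}{N}\sum_{i=1}^{N} \ln x_i, \qquad \hat{k}_r = \frac{1}{N}\sum_{i=1}^{N} \left(\ln x_i - \hat{k}_1\right)^{r}, \quad r = 2, 3.$$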
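The sample log-cumulants can be computed directly from the data. The sketch below assumes the standard MoLC estimators: the first log-cumulant is the sample mean of the log-samples, and the second and third are the second and third central sample moments of the log-samples. The function name `sample_log_cumulants` is hypothetical.

```python
import math


def sample_log_cumulants(samples):
    """Return the first three sample log-cumulants of positive-valued samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    if any(x <= 0 for x in samples):
        raise ValueError("MoLC requires strictly positive samples")
    logs = [math.log(x) for x in samples]
    n = len(logs)
    k1 = sum(logs) / n                           # mean of ln x
    k2 = sum((v - k1) ** 2 for v in logs) / n    # second central moment of ln x
    k3 = sum((v - k1) ** 3 for v in logs) / n    # third central moment of ln x
    return k1, k2, k3


# Small usage example with three hand-picked values.
k1, k2, k3 = sample_log_cumulants([1.0, math.e, math.e ** 2])
print(k1, k2, k3)
```

Only orders up to three are returned because, beyond order three, cumulants no longer coincide with central moments, and a different estimator would be needed.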

3.2. The Parameter Estimators of

We notice that (12) is independent of the power parameter and that the log-cumulants of order two and higher in (12) do not depend on the scale parameter. In addition, we regard the number of looks as a known constant, which can be replaced by the equivalent number of looks (ENL) [4, 11] or obtained from prior knowledge about the processed SAR images. This allows us to divide the parameter estimation into three stages.

First, the estimates of the shape and stretching parameters are obtained by numerically solving the following equations:

Second, according to the first equation of (12), the estimate of the scale parameter is

We assume that the observed amplitude PDF comes from the actual data, that is, it corresponds to the histogram of the tested data. The theoretical amplitude PDF is denoted as . As in the previous analysis, since indicates the overall proportional fluctuation of , can be calculated using , , and . Given a sample amplitude set , let the symbol be equal to ; then the estimate of the parameter is simply given by where is a subset of . Here, to obtain a stable fit, we empirically choose all histogram points within the 3 dB bandwidth to calculate , such that .

4. Experimental Results

In this section, we verify the performance of the proposed model. To assess how it performs, several space-borne TerraSAR-X geocoded scenes with various land-cover typologies are reported as examples. Figures 2(a)–2(d) show four typical patches selected from a large low-resolution “Sanchagang” TerraSAR-X image, which are returns from a water body, a drying riverbed, a mountain, and a town, respectively. These four scene types represent homogeneous and heterogeneous terrains. For simplicity, we denote them as “water-body”, “drying riverbed”, “mountain”, and “town.” Additionally, two urban areas shown in Figures 2(e)-2(f), extracted from a large high-resolution “Beijing” TerraSAR-X scene, are used to demonstrate the effectiveness of the proposed model on extremely heterogeneous terrains. Likewise, the two urban images are denoted by “urban1” and “urban2.” The main parameters of the TerraSAR-X systems for the “Sanchagang” and “Beijing” data are listed in Table 1.

The estimated PDFs of the proposed model for the histograms of the six selected areas indicated in Figure 2 are shown in Figure 3. For comparison, the fitting results of the widely used $\mathcal{G}^0$ distribution are also provided, with its parameter estimators taken from Tison et al. [6], who derived them based on the MoLC. Herein, the number of looks in the proposed model is first replaced by the ENL, that is, the product of the effective numbers of looks in azimuth and range [4]. For the “Sanchagang” and “Beijing” TerraSAR-X images used in this investigation, these two values are listed in the last two columns of Table 1, as read from their metadata files. Next, the estimates of all other parameters of the proposed model are obtained with the estimators derived in Section 3 and are shown in Table 2.

Furthermore, to quantitatively assess how well different distributions fit, we define an error ratio factor (ERF) as where is a sample set, represents the compared theoretical PDF, is the basic theoretical PDF, and indicates the actual PDF from the observed data. The symbol denotes the 2-norm. The numerator and denominator in (18) represent the total fitting errors of and , respectively, in approximating . The better the performances of related to fit, the smaller is, and vice versa. Specifically, the ERF equals 1 if both and have identical capability for fitting the measured data.
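An error ratio of this kind is straightforward to compute once the PDFs are sampled on a common grid. The sketch below is illustrative only: the function names `l2_norm` and `error_ratio` are hypothetical, and which PDF goes in the numerator and which in the denominator is left to the caller, since it must follow the definition in (18).

```python
import math


def l2_norm(values):
    """Euclidean (2-) norm of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values))


def error_ratio(pdf_numerator, pdf_denominator, pdf_observed):
    """Ratio of the 2-norm fitting errors of two theoretical PDFs.

    All three inputs are PDF values sampled at the same histogram bin centers.
    """
    if not (len(pdf_numerator) == len(pdf_denominator) == len(pdf_observed)):
        raise ValueError("all PDFs must be sampled on the same grid")
    err_num = l2_norm([a - o for a, o in zip(pdf_numerator, pdf_observed)])
    err_den = l2_norm([b - o for b, o in zip(pdf_denominator, pdf_observed)])
    if err_den == 0.0:
        raise ZeroDivisionError("denominator PDF matches the data exactly")
    return err_num / err_den


# Hypothetical values: the first model is off by 0.02 in one bin, the second by 0.01.
observed = [0.10, 0.20, 0.30]
model_a = [0.12, 0.20, 0.30]
model_b = [0.11, 0.20, 0.30]
ratio = error_ratio(model_a, model_b, observed)
print(ratio)
```

A ratio of 1 means that both models fit the histogram equally well in the 2-norm sense; values on either side of 1 favor one model or the other, depending on the ordering chosen.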

Let represent the basic amplitude PDF, and let be the compared amplitude one. The ERF values of the six areas in this study are given in Figure 3 (see the yellow textbox). It can be clearly seen that the proposed model agrees better with the amplitude histograms of all six terrains than the $\mathcal{G}^0$ distribution, as expected, because the ERF values are larger than 1 for all six areas. The same conclusion is confirmed visually and implies that the proposed model fits more precisely than $\mathcal{G}^0$ over homogeneous, heterogeneous, and extremely heterogeneous regions.

5. Conclusion and Perspective

We have developed an empirical model that exploits knowledge of the statistical characteristics of SAR amplitude or intensity images over a wide range of terrain classes with homogeneous, heterogeneous, and extremely heterogeneous backscattering properties. The parameter estimators of this model, based on the MoLC, are also provided. We then report the fitting performance of the distribution for different land-cover typologies. The experimental results show that the proposed distribution characterizes multilook processed SAR data more accurately than the well-known $\mathcal{G}^0$ distribution.

A preliminary statistical analysis of SAR clutter data is important for designing signal processing algorithms such as speckle filtering, target detection, building extraction, image segmentation, and classification. In the future, the proposed distribution is expected to be useful in these fields. As a first step, we give an analytical derivation for constructing a constant false alarm rate (CFAR) detector to support upcoming studies of target detection in SAR images.

Given the proposed amplitude model shown in (1), its cumulative distribution function (CDF) is written as (see Appendix B) where is the Gauss hypergeometric function. For a given value of the false alarm probability, denoted by , the corresponding CFAR threshold for the distribution can be obtained from

Because the CDF is strictly monotonically increasing, the threshold can be accurately calculated numerically, for example with a simple bisection method [16].
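The bisection search for the threshold can be sketched as follows. The sketch assumes the usual CFAR relation in which the false alarm probability equals one minus the CDF evaluated at the threshold. Because the CDF of the proposed model is not reproduced here, a Rayleigh CDF serves as a stand-in so that the result can be checked against a closed form. The names `cfar_threshold` and `rayleigh_cdf` and the bracket values are hypothetical.

```python
import math


def cfar_threshold(cdf, pfa, lo, hi, tol=1e-10, max_iter=200):
    """Solve 1 - cdf(T) = pfa for T by bisection on the bracket [lo, hi].

    `cdf` must be monotonically increasing on the bracket.
    """
    target = 1.0 - pfa
    if not (cdf(lo) <= target <= cdf(hi)):
        raise ValueError("bracket [lo, hi] does not contain the threshold")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid    # threshold lies in the upper half
        else:
            hi = mid    # threshold lies in the lower half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def rayleigh_cdf(x, sigma=1.0):
    """Stand-in amplitude CDF (Rayleigh); NOT the CDF of the proposed model."""
    return 1.0 - math.exp(-x * x / (2.0 * sigma * sigma))


pfa = 1e-3
threshold = cfar_threshold(rayleigh_cdf, pfa, lo=0.0, hi=50.0)
closed_form = math.sqrt(-2.0 * math.log(pfa))   # Rayleigh closed-form threshold
print(threshold, closed_form)
```

To apply the same routine to the proposed model, `rayleigh_cdf` would be replaced by an implementation of the CDF in (19), which requires a Gauss hypergeometric function evaluation.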

Our future work will focus on demonstrating the performance of the proposed CFAR detector using measured SAR data and on investigating how the proposed distribution performs when extended to other application fields.

Appendices

A. The Derivation of th Order Moments of the Distribution

Via (2), the th order moments of the are expressed as

A variable replacement leads to and ; thus (A.1) can be rewritten as

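According to the integral formula $\int_0^{\infty} \frac{x^{\mu-1}}{(1+\beta x)^{\nu}}\,dx = \beta^{-\mu} B(\mu, \nu-\mu)$, $|\arg \beta| < \pi$, $\operatorname{Re}\nu > \operatorname{Re}\mu > 0$ [15, 3.194, Equation 3], where $B(\cdot,\cdot)$ is the Beta function, one can obtain (8).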

B. The Derivation of the Cumulative Distribution Function of

Based on the definition of the CDF, the CDF of is

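Similarly, utilizing a symbol change , (B.1) turns out to be Likewise, applying the integral formula $\int_0^{u} \frac{x^{\mu-1}}{(1+\beta x)^{\nu}}\,dx = \frac{u^{\mu}}{\mu}\,{}_2F_1(\nu, \mu; 1+\mu; -\beta u)$, $|\arg(1+\beta u)| < \pi$, $\operatorname{Re}\mu > 0$ [15, 3.194, Equation 1], we arrive at (19) by simplifying (B.2).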