Abstract

Soil moisture is the basic condition required for crop growth and development. Gaofen-3 (GF-3) is the first C-band synthetic-aperture radar (SAR) satellite of China, offering broad land and ocean imaging applications, including soil moisture monitoring. This study developed an approach to estimate soil moisture in agricultural areas from GF-3 data. An inversion technique based on an artificial neural network (ANN) is introduced. The neural network was trained and tested on a training sample dataset generated from the Advanced Integral Equation Model. Incidence angle and HH or VV polarization data were used as input variables of the ANN, with soil moisture content (SMC) and surface roughness as the output variables. The backscattering contribution from the vegetation was eliminated using the water cloud model (WCM). The acquired soil backscattering coefficients of GF-3 and in situ measurement data were used to validate the SMC estimation algorithm, which achieved satisfactory results (R2 = 0.736; RMSE = 0.042). These results highlight the contribution of the combined use of the GF-3 synthetic-aperture radar and Landsat-8 images based on an ANN method for improving SMC estimates and supporting hydrological studies.

1. Introduction

Soil moisture content (SMC) is an important parameter in hydrological, biological, agricultural, and other processes [1, 2]. Lower SMC can cause an increase in the bare soil surface, thus aggravating sandstorms [3–5]. Many technological advances now allow efficient acquisition of soil moisture data. On the ground, the International Soil Moisture Network (ISMN) provides a global network of soil moisture in situ observations [6]. This network measures soil moisture at specific locations; thus, the data are in the form of discrete values as opposed to a soil moisture spatial distribution, although they provide temporally continuous observations [7]. Microwave synthetic-aperture radar (SAR) collects data over a large area with high spatial resolution and provides an effective technological means of monitoring and assessing soil moisture.

Radar remote sensing is sensitive to soil moisture, due to the differences in the dielectric constants of soil and water; thus, the dielectric constant is one of the most important factors in the radar backscattering coefficient [8]. Several physical and statistical models have been developed to estimate soil moisture. The best-known physical model is the Advanced Integral Equation Model (AIEM), which simulates the radar backscattering coefficients from sensor and soil parameters (radar wavelength, polarization, incidence angle, soil dielectric constant, and surface roughness) [9]. Statistical models based on experimental measurements are also widely used in soil moisture estimations. For bare soils, the most popular statistical models are those developed by Oh et al., which include an inversion diagram based on the cross-polarized ratio or copolarized ratio [10–12]. The Dubois model uses HH and VV polarization data from multipolarized radar observations to estimate SMC [13].

The above-mentioned models are commonly applied to bare soil and cannot be applied directly in vegetation cover areas, due to the multiple scattering effects of vegetation canopies [14]. The water cloud model (WCM), a semiempirical forward model, generally assumes that the vegetation canopy is a uniform layer of a cloud of water droplets and has been widely used to separate out the contribution of vegetation backscatter [15–18]. To minimize the effects of vegetation, many researchers have attempted to utilize optical remote sensing to obtain additional vegetation information [19–21]. Meanwhile, other studies have demonstrated improved SMC estimation accuracy when optical and SAR data are combined, compared to estimates solely from SAR data [22, 23]. In addition, good performance for soil moisture estimation has been achieved using only one radar channel (one incidence angle and one polarization) [24–26]; in one study, there was no significant improvement in soil moisture estimation using two polarizations (HH and HV, C-band) as opposed to one [27].

The estimation of soil moisture is usually a nonlinear, ill-posed, complex process [28], which makes it suitable for artificial neural network (ANN) application. ANN is a model-free estimator, as it does not rely on an assumed form of the underlying data [29]. The most direct way to train an ANN is using synthetic data generated by theoretical or empirical surface scattering models. The effectiveness of ANN inversion algorithms has been investigated in previous studies [30–32]. Baghdadi et al. [33] tested the performance of an ANN in retrieving soil moisture and surface roughness for several inversion cases, with and without a priori knowledge of soil parameters; the results were promising for ANN soil parameter estimation. El Hajj et al. [34] developed and validated neural networks using synthetic and real databases; the application of both VV and VH showed results similar to those using VH only. Satalino et al. [35] combined an Integral Equation Model (IEM) and neural networks to retrieve SMC over smooth bare soils from ERS-SAR data. Paloscia et al. [36] trained neural networks using a backscattering coefficients database simulated from the IEM and WCM for a wide range of soil parameters; a real database composed of SAR, optical, and in situ measurements was used to validate the developed neural networks, and the results indicated a soil moisture estimation accuracy of 2–5 vol.%. Santi et al. [37] developed ANN-based algorithms for both active and passive microwave acquisitions; the results demonstrated that ANNs are a powerful tool for SMC estimation at both local and global scales. These studies demonstrate the potential of ANNs for retrieving SMC information from SAR remote sensing data.

Gaofen-3 (GF-3) is the first Chinese civil C-band SAR. To date, few soil moisture estimation methods have been developed for the GF-3 satellite. Therefore, we developed an inversion technique based on ANN to estimate SMC over agricultural areas by combining GF-3 and Landsat-8 satellite data. The WCM was first applied to eliminate the effects of vegetation and to obtain the backscattering coefficients of bare soil. Then, the ANN was trained using a sample dataset generated from the AIEM. Finally, field SMC data from an agricultural region with wheat as the main crop type were used to evaluate the potential of the GF-3 sensor for retrieving SMC data.

The remainder of this paper is organized as follows: Section 2 summarizes the study area and datasets. Section 3 discusses the ANN and the inversion methodology. The results are presented in Section 4. Section 5 includes a discussion of our findings and conclusions.

2. Case Study Site and Data Description

2.1. Case Study Site

A study site located in Luancheng County, Shijiazhuang city (centered at 114.65°E and 37.88°N; Figure 1), was chosen to validate the approach for soil moisture estimation. The site is relatively flat, and the main crop type is wheat. The study area has a typical subhumid, north temperate continental monsoon climate. The annual average temperature is approximately 12.8°C, and the annual average precipitation is approximately 474 mm, which is mainly concentrated in July and August. The topsoil type of the agricultural fields is cinnamon soil with a high nutrient content, suitable for crop growth. Although the case study area is not large, it has the representative characteristics of crop-type distribution in the North China Plain.

GF-3 SAR and Landsat-8 OLI images were used in this study. In addition, 38 agricultural fields were selected to conduct in situ measurements, including those related to both soil and vegetation characteristics.

2.2. Data Description

Microwave and optical satellite data were fused to reduce the effects of vegetation cover on the backscattering of soil moisture. Satellite data are listed in Table 1.

2.2.1. GF-3 Data

In this study, one GF-3 SAR product acquired in the Quad-Polarization Strip I (QPSI) mode, processed up to level-1A SLC, was collected on May 27, 2017. Details of the QPSI mode are listed in Table 2. The location of the collected GF-3 SAR data is shown in Figure 1. PolSARpro software was used to calibrate the GF-3 image; the calibration converts the digital number values of the GF-3 image into backscattering coefficients ($\sigma^0$) in linear units. In this study, we focused only on the HH and VV copolarizations. To reduce the effect of speckle noise, the mean backscattering coefficient of each sampling point was calculated from the calibrated GF-3 image by averaging the values of five surrounding pixels. Meanwhile, to reduce the impact of residential areas on the soil moisture mapping, a filtering method was used to mask houses (red patches in Figure 1) and simplify the soil moisture map.
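The distinction between linear units and decibels matters for the speckle averaging described above, because the mean should be taken over linear power values rather than over dB values. The following is a minimal sketch of that step; the function names and the five pixel values are hypothetical and are not taken from the GF-3 product.

```python
import math

def linear_to_db(sigma0_linear):
    """Convert a backscattering coefficient from linear power units to decibels."""
    return 10.0 * math.log10(sigma0_linear)

def db_to_linear(sigma0_db):
    """Convert a backscattering coefficient from decibels to linear power units."""
    return 10.0 ** (sigma0_db / 10.0)

def mean_backscatter_db(pixels_linear):
    """Average calibrated pixels in linear units, then express the mean in dB."""
    mean_linear = sum(pixels_linear) / len(pixels_linear)
    return linear_to_db(mean_linear)

# Hypothetical calibrated values (linear units) for five pixels around one sampling point
window = [0.050, 0.062, 0.047, 0.058, 0.053]
print(round(mean_backscatter_db(window), 2))
```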

2.2.2. Landsat-8 Image

NASA’s Landsat-8 satellite carries two instruments: the Operational Land Imager (OLI) sensor and the Thermal Infrared Sensor (TIRS). It images the land surface using 11 spectral bands in the optical and thermal infrared domains with a spatial resolution of 30 to 100 m and a temporal resolution of 16 days [38]. The Landsat-8 imagery used in this study was downloaded from the United States Geological Survey data archive (https://earthexplorer.usgs.gov/).

We directly downloaded the land surface reflectance product, which had already been preprocessed with radiometric calibration and atmospheric correction. The reflectance values of the near-infrared (NIR) and short-wave infrared (SWIR) bands were used to calculate the normalized difference water index (NDWI) and, from it, the vegetation water content (VWC). Finally, Landsat-8 reflectance data were extracted at the sample points and combined with field measurements to build the relationship between VWC and NDWI.

2.2.3. In Situ Measurement Data

Coincident with the GF-3 and Landsat-8 satellite overpasses, field campaign measurements of soil moisture and roughness, as well as crop biophysical parameters, were conducted over the 38 wheat fields. For each field, three sampling points were randomly selected (point separation of 150 m). Soil volumetric moisture was measured using the oven-drying method (wet weight minus dry weight) at a depth of 0–5 cm, given that the C-band radar signal is most sensitive to surface soil moisture. For each sampling point, measurements were collected at three locations that were uniformly distributed over 8 m (one pixel of the GF-3 satellite). The average soil moisture value of the three locations was considered the soil moisture of the sampling point. The VWC, collected from randomly selected 1.0 × 1.0 m squares, was determined by weighing before and after oven-drying. Soil roughness was measured with a 1 m pin plate, yielding the root mean square height (S) and correlation length (L) of each sampling point. Of the 38 wheat fields, 19 fields (57 points) were chosen to complete the VWC calculation and validate the WCM, and the remaining 19 wheat fields (57 points) were used to validate the soil moisture estimation model using GF-3 data.

3. Methodology

Our approach for the soil moisture estimation uses an ANN technique that combines GF-3 and Landsat-8 satellite data (Figure 2). The ANN was trained and tested on a training sample dataset generated from the AIEM. First, the WCM was used to eliminate the contribution of backscattering coefficients caused by vegetation. Then, the backscattering coefficient of the soil was determined. The obtained soil backscattering coefficients of GF-3 and in situ measurement data were used to validate the SMC estimation algorithm. Finally, SMC was estimated using the trained ANN.

3.1. Calculation of Backscattering Coefficient of Soil
3.1.1. Vegetation Water Content Calculation

VWC (kg/m2) is one of the most important parameters for the successful retrieval of SMC from microwave remote sensing observations [39]. Landsat-8 Operational Land Imager (OLI) data and ground-based VWC measurements were used to establish relationships in our study based on remotely sensed indices.

The NDWI is a widely used and reliable indicator to assess the vegetation water status, which is sensitive to changes in VWC [40]. Gao first proposed the NDWI by combining reflectance at 860 and 1240 nm to monitor VWC [41]. Because large-area VWC is more difficult to obtain, NDWI was used in the current study. NDWI can be calculated as follows:

$$\mathrm{NDWI} = \frac{R_{\mathrm{NIR}} - R_{\mathrm{SWIR}}}{R_{\mathrm{NIR}} + R_{\mathrm{SWIR}}}, \qquad (1)$$

where $R_{\mathrm{NIR}}$ is the reflectance or radiance in the NIR channel and $R_{\mathrm{SWIR}}$ is the reflectance or radiance corresponding to the SWIR wavelength channel (1.2–2.5 µm) [42]. For Landsat OLI, $R_{\mathrm{NIR}}$ and $R_{\mathrm{SWIR}}$ correspond to bands 5 (0.845–0.885 µm) and 6 (1.560–1.660 µm), respectively.
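As a brief illustration, the index can be computed per pixel from the two band reflectances. This is a sketch only: the function name and the reflectance values are hypothetical, and the guard against a zero denominator is an added convenience rather than part of the method.

```python
def ndwi(r_nir, r_swir):
    """NDWI from NIR and SWIR reflectance (Landsat-8 OLI bands 5 and 6)."""
    denom = r_nir + r_swir
    if denom == 0:
        raise ValueError("NIR + SWIR reflectance is zero; NDWI is undefined")
    return (r_nir - r_swir) / denom

# Hypothetical surface reflectance values for one vegetated pixel
print(round(ndwi(0.40, 0.20), 3))
```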

VWC was measured in 38 fields at the first sampling point. The above ground biomass was removed, and fresh and dry weights were used to compute the VWC. The relationship between VWC and NDWI was generated based on the least-squares fitting method, as follows:

$$\mathrm{VWC} = a \cdot \mathrm{NDWI}^2 + b \cdot \mathrm{NDWI} + c, \qquad (2)$$

where $a$ and $b$ are the coefficients and $c$ is a constant calculated based on Landsat-8 OLI land surface reflectance data and ground-based VWC measurements.
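The least-squares step can be sketched as follows, assuming the VWC–NDWI relationship is a second-order polynomial with two coefficients and a constant. The solver below is a plain normal-equations implementation written for illustration. The sample points are synthetic, generated without noise from the coefficient values reported in Section 4.1, so the fit simply recovers them; they are not the field measurements.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    # Normal equations (X^T X) p = X^T y for the basis [x^2, x, 1]
    rows = [[xi * xi, xi, 1.0] for xi in x]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix
    m = [ata[i] + [aty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    # Back substitution
    p = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        acc = sum(m[i][j] * p[j] for j in range(i + 1, 3))
        p[i] = (m[i][3] - acc) / m[i][i]
    return tuple(p)

# Synthetic, noise-free samples (assumption: quadratic form with a=1.56, b=1.27, c=0.49)
ndwi_samples = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50]
vwc_samples = [1.56 * v * v + 1.27 * v + 0.49 for v in ndwi_samples]
a, b, c = fit_quadratic(ndwi_samples, vwc_samples)
print(round(a, 2), round(b, 2), round(c, 2))
```

With real field data the residuals would be nonzero, and the quality of the fit would be reported through R2 as in Section 4.1.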

3.1.2. Water Cloud Model

The WCM, introduced by Attema and Ulaby [15], assumes that vegetation is a source of homogeneous scattering. The radar backscattering coefficient from a canopy can be expressed as the sum of contributions due to (i) volume scattering from the vegetation canopy itself, (ii) surface scattering by the soil attenuated by the vegetation layer, and (iii) multiple interactions between the canopy and the ground surface [14]. For a given incidence angle θ, the WCM can be represented as follows:

$$\sigma^0 = \sigma^0_{\mathrm{veg}} + \sigma^0_{\mathrm{veg+soil}} + \gamma^2 \sigma^0_{\mathrm{soil}}, \qquad (3)$$

where $\gamma^2$ is the two-way vegetation transmissivity. The interactions between vegetation and soil are neglected in the WCM [18, 43]; therefore, the WCM can be reformulated as follows:

$$\sigma^0 = \sigma^0_{\mathrm{veg}} + \gamma^2 \sigma^0_{\mathrm{soil}}, \qquad (4)$$

$$\sigma^0_{\mathrm{veg}} = A \, m_{\mathrm{veg}} \cos\theta \, (1 - \gamma^2), \qquad (5)$$

$$\gamma^2 = \exp\left(-2 B \, m_{\mathrm{veg}} / \cos\theta\right), \qquad (6)$$

where the backscattering coefficient of bare soil, $\sigma^0_{\mathrm{soil}}$, is simulated based on the AIEM, which will be introduced in Section 3.2, and $m_{\mathrm{veg}}$ is the VWC (kg/m2). The incidence angle θ of the GF-3 data used in this study was 24°. A and B are parameters that depend on the canopy type and sensor configuration, which can be calculated by the least-squares method.
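Once A and B are known, removing the vegetation contribution amounts to inverting the simplified WCM for the soil term. The sketch below assumes the commonly used parameterization in which the vegetation term is A·m_veg·cosθ·(1 − γ²) and γ² = exp(−2·B·m_veg/cosθ), with all backscatter values in linear units. The values of A, B, VWC, and the soil backscatter are hypothetical placeholders, not the fitted values of Table 4.

```python
import math

def two_way_transmissivity(b, m_veg, theta_deg):
    """gamma^2 = exp(-2 * B * m_veg / cos(theta))."""
    return math.exp(-2.0 * b * m_veg / math.cos(math.radians(theta_deg)))

def vegetation_backscatter(a, b, m_veg, theta_deg):
    """Volume scattering term: A * m_veg * cos(theta) * (1 - gamma^2)."""
    gamma2 = two_way_transmissivity(b, m_veg, theta_deg)
    return a * m_veg * math.cos(math.radians(theta_deg)) * (1.0 - gamma2)

def total_backscatter(sigma_soil, a, b, m_veg, theta_deg):
    """Forward WCM: vegetation term plus soil term attenuated by gamma^2."""
    gamma2 = two_way_transmissivity(b, m_veg, theta_deg)
    return vegetation_backscatter(a, b, m_veg, theta_deg) + gamma2 * sigma_soil

def soil_backscatter(sigma_total, a, b, m_veg, theta_deg):
    """Inverse WCM: subtract the vegetation term and undo the attenuation."""
    gamma2 = two_way_transmissivity(b, m_veg, theta_deg)
    return (sigma_total - vegetation_backscatter(a, b, m_veg, theta_deg)) / gamma2

# Hypothetical parameters (linear units); theta = 24 degrees as in this study
A, B, VWC, THETA = 0.01, 0.10, 1.5, 24.0
sigma_total = total_backscatter(0.08, A, B, VWC, THETA)
print(round(soil_backscatter(sigma_total, A, B, VWC, THETA), 4))
```

Because γ² shrinks as VWC grows, the same total backscatter maps to different soil values at points with different vegetation water content.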

3.2. Generating the SMC Training Sample Dataset

Bare soil backscattering depends on the dielectric constant and surface roughness, as well as the SAR instrumental parameters [44, 45]. The AIEM, a well-established theoretical model [9], has been widely used as a forward model to simulate the scattering coefficients and emissivity of bare soil surfaces with various ground conditions, due to its precision [44–48]. Therefore, AIEM was selected to generate the SMC training sample dataset.

The Dobson dielectric model is commonly used to describe the relationship between the effective dielectric constant of soil and soil moisture [49–51]. Therefore, we combined the Dobson model and AIEM to integrate soil moisture into the training sample dataset during the generation process. The equation can be conceptually represented as follows:

$$\sigma^0_{PP} = \mathrm{AIEM}(f, \theta, PP, \mathrm{ACF}, s, l, m_v), \qquad (7)$$

where $f$ represents the satellite frequency (5.4 GHz for the GF-3 satellite), $\theta$ is the angle of incidence, PP denotes the polarization state (includes HH and VV polarizations), ACF is the autocorrelation function (an exponential ACF is adopted), $s$ is the root mean square height, $l$ is the correlation length, and $m_v$ is the soil moisture, which the Dobson model converts to the soil dielectric constant.

The incidence angle ranged from 20° to 60° with an interval of 1°. The s and l values were set based on field measurements within 0.5–2.0 cm and 10.0–30.0 cm, respectively. To reduce the number of parameters, surface roughness is expressed as one parameter, Zs ($Z_s = s^2/l$), using an exponential correlation function. Soil moisture ranged from 0.01 to 0.40 m3/m3, with an interval of 0.01 m3/m3. The training sample dataset generated from the AIEM contained 551,040 samples.
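The size of the training set follows from the parameter grid. The sketch below enumerates the grid under two assumptions that are not stated in the text: a step of 0.1 cm for s and a step of 1 cm for l. These steps are chosen because they reproduce the reported sample count; the roughness helper uses the definition Zs = s²/l.

```python
def grid(start, stop, step):
    """Inclusive, evenly spaced grid that avoids floating-point drift."""
    n = round((stop - start) / step) + 1
    return [round(start + i * step, 10) for i in range(n)]

def zs(s_cm, l_cm):
    """Combined roughness parameter Zs = s^2 / l (cm)."""
    return s_cm * s_cm / l_cm

angles = list(range(20, 61))        # 20-60 degrees, 1 degree interval (stated)
s_values = grid(0.5, 2.0, 0.1)      # rms height in cm (0.1 cm step is an assumption)
l_values = grid(10.0, 30.0, 1.0)    # correlation length in cm (1 cm step is an assumption)
smc_values = grid(0.01, 0.40, 0.01) # soil moisture in m3/m3 (stated)

n_samples = len(angles) * len(s_values) * len(l_values) * len(smc_values)
print(n_samples)  # 551040
```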

3.3. Artificial Neural Networks Approach

ANNs can mimic human learning and can build multivariate nonlinear relationships; as such, they have been widely used for estimating land surface parameters from remote sensing data [52]. An ANN is made up of a number of hidden neurons or nodes that work in parallel to convert data from an input layer into an output layer. Each ANN has two modes of operation: training and testing modes. In the training mode, neurons are trained using part of the training sample dataset as a particular input pattern to produce the desired output pattern. In the testing mode, when an input pattern is chosen, the ANN will produce its associated output [53]. The number of neurons associated with the hidden layer varies, depending on the optimum neural network architecture. Training adjusts the connection weights to minimize the error between the ANN output and the target data [54]. The ANN model was developed using the MATLAB software.

The incidence angle and backscattering coefficient (HH or VV) were the input variables; the corresponding SMC and surface roughness were the output variables. The ANN was trained for HH and VV polarizations separately. The architecture was selected by adding or removing hidden layers and neurons; for both HH and VV polarizations, one hidden layer with 30 neurons provided accurate SMC estimation within a reasonable computing time. Therefore, the optimal ANN architecture (Figure 3) was determined to be a three-layer network consisting of an input layer (two neurons: incidence angle and HH or VV backscattering), one hidden layer (30 neurons), and an output layer with two neurons (SMC and surface roughness). Although the structures for both HH and VV polarizations are the same, the trained weights differ between the two. HH and VV backscattering data were used separately as input parameters for their corresponding ANN. The optimum architecture has minimum error and maximum convergence, avoiding any possible overfitting. The training sample dataset generated from the AIEM was randomly divided into two parts: 90% of the cases were used for training the ANN and the remaining 10% of the cases were utilized during the testing process. The Levenberg–Marquardt method, an alternative to the Newton algorithm, was used to calibrate the synaptic coefficients. Linear and tangent-sigmoid transfer functions were associated with the hidden layer and output nodes, respectively.
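To make the bookkeeping concrete, the sketch below performs a random 90/10 split of sample indices and counts the trainable parameters of a fully connected 2–30–2 network with one bias per neuron. It does not reproduce the MATLAB implementation or the Levenberg–Marquardt training; the helper names and the random seed are hypothetical.

```python
import random

def split_indices(n_samples, train_fraction=0.9, seed=0):
    """Randomly partition sample indices into training and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(n_samples * train_fraction))
    return indices[:n_train], indices[n_train:]

def count_parameters(layer_sizes):
    """Weights plus one bias per neuron for a fully connected network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

train_idx, test_idx = split_indices(551040)
print(len(train_idx), len(test_idx))   # 495936 55104
print(count_parameters([2, 30, 2]))    # 152
```

With roughly half a million training samples and only 152 parameters, the network is small relative to the data.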

3.4. Soil Moisture Estimation and Accuracy Assessment

GF-3 satellite data were preprocessed to obtain backscattering coefficients of the agricultural area. The backscattering coefficients of soil were generated based on the WCM to eliminate the backscattering contribution of vegetation. The SMC can then be estimated using the trained ANN with the backscattering coefficients of soil as the input parameter. Direct comparison of in situ SMC measurements with SMC estimations using GF-3 satellite data is a reliable way to assess the accuracy of the proposed SMC estimation algorithm. The precision and accuracy of SMC were assessed using two statistical indices: the R2 value of linear regression and the root mean square error (RMSE). The RMSE values show how far the retrieved SMC values deviate from the in situ measurements. For a perfect fit between retrieved and field-observed SMC data, values of R2 and RMSE should equal 1.0 and 0.0, respectively.
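Both indices are straightforward to compute from paired estimates and observations. In the sketch below, R2 is taken as the squared Pearson correlation, which equals the coefficient of determination of a simple linear regression between the two series; the sample values are hypothetical and serve only to exercise the functions.

```python
import math

def rmse(estimated, observed):
    """Root mean square error between estimated and observed values."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)

def r_squared(estimated, observed):
    """R^2 of the linear regression between the two series (squared Pearson r)."""
    n = len(observed)
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_e * var_o)

# Hypothetical SMC values (m3/m3): estimates carry a constant 0.02 bias
observed = [0.12, 0.18, 0.25, 0.31]
estimated = [0.14, 0.20, 0.27, 0.33]
print(round(r_squared(estimated, observed), 3), round(rmse(estimated, observed), 3))
```

The example shows why the two indices are reported together: a constant bias leaves R2 at 1 while the RMSE is nonzero.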

4. Results

4.1. Vegetation Water Content

VWC is an important parameter of the WCM and is needed for retrieving soil moisture from GF-3 satellite data. NDWI was chosen to generate relationships with VWC based on Landsat-8 OLI land surface reflectance data and ground-based VWC measurements. Then, the coefficients (a and b) and constant (c) of (2), 1.56, 1.27, and 0.49, respectively, were calculated using the least-squares fitting method (R2 = 0.771). The VWC estimation results using the proposed algorithm and Landsat-8 data are shown in Figure 4. In terms of spatial distribution, the large VWC estimates were mainly distributed over the farmland, and the VWC estimates in other areas were smaller. Because wheat growth differs among farmland areas, VWC also varies within the farmland. Therefore, the VWC estimates using Landsat-8 data provide a preliminary indication that the proposed VWC estimation algorithm is reasonable.

4.2. Backscattering Coefficient of Bare Soil
4.2.1. Correlations between In Situ SMC and Corresponding Total Backscatter

A sensitivity analysis between the GF-3 total radar backscatter (HH and VV polarizations) and in situ SMC was conducted based on all the field measurement data to explore whether SMC could be retrieved directly using regression methods, as shown in Figure 5. GF-3 total radar backscatter (HH and VV polarizations) was correlated with SMC, which is consistent with previous findings [55], demonstrating the potential of GF-3 satellite data for SMC retrieval. However, R2 between the total backscatter $\sigma^0$ and in situ SMC, for both HH and VV polarizations, was lower than 0.146 (Table 3), indicating that simple regression methods cannot achieve high-precision inversion of SMC. SMC estimation in vegetation-covered areas is affected by the vegetation canopies, which scatter and attenuate electromagnetic radiation and make it difficult to discriminate the radar return due to soil moisture [56]. Therefore, isolating the contribution of vegetation from the total radar backscatter is crucial for SMC estimation over agricultural areas.

4.2.2. Bare Soil Backscattering Coefficients

The bare soil backscattering coefficient $\sigma^0_{\mathrm{soil}}$ was calculated using the WCM, which assumes vegetation to be a homogeneous scattering source. The simulated bare soil backscattering coefficient was obtained from the AIEM. Then, the parameters A and B of the water cloud model were calculated through the least-squares method (Table 4).

The backscattering coefficients of 19 wheat fields (57 points) were extracted to compare the total backscatter $\sigma^0$ with the bare soil backscatter $\sigma^0_{\mathrm{soil}}$ (Figure 6). The backscattering coefficient value of each point was attenuated after the use of the WCM. However, the degree of attenuation of each point was not the same, mainly because the corresponding VWC was different.

4.2.3. AIEM-Simulated Backscattering Data

SMC estimation was achieved using an ANN method, in which the training sample dataset was generated based on the AIEM. Using the trained ANN, GF-3 satellite-measured soil backscatter data were used as the input and SMC as the output. To verify the reliability of the training sample dataset, the consistency between the simulated backscattering data and the radar-measured soil backscattering data was explored based on the remaining 57 points, which are different from the dataset used for calibrating the WCM, as shown in Figure 7. The results showed that both HH and VV polarization-simulated backscattering data from the AIEM agreed well with GF-3 satellite-measured soil backscattering data calculated using the WCM (R2 = 0.894 for HH and R2 = 0.855 for VV). Therefore, the training sample dataset generated from AIEM was applied to the SMC estimation in this study.

4.3. Soil Moisture

The soil moisture estimation results are shown in Figure 8. The remaining 19 wheat fields, which contained 57 points, were used to directly assess the soil moisture from GF-3 data using the proposed algorithm (Figure 9). There was a good linear relationship between field-measured soil moisture and estimated soil moisture. The soil moisture retrieval accuracy was satisfactory, with R2 and RMSE values of 0.7356 and 0.042 for HH polarization and 0.7096 and 0.051 for VV polarization, respectively. Some of the larger differences between the in situ SMC and the estimated SMC may be attributable to the field sampling period being out of sync with the satellite transit time, as the soil moisture could change over this time period.

In summary, the differences between the measured soil moisture and estimated soil moisture from GF-3 data were small, and the soil moisture retrieval results were satisfactory. These results indicate that the proposed soil moisture method for GF-3 data is reliable and that GF-3 data could achieve acceptable performance for soil moisture estimation. Thus, this approach shows the potential for providing the high-resolution soil moisture dataset for agricultural application, such as farmland soil moisture monitoring.

5. Discussion and Conclusions

We propose a soil moisture retrieval algorithm for agricultural regions that uses GF-3 satellite and Landsat-8 data based on the ANN method. The ANN structure was trained under a large range of land surface parameters, allowing the algorithm to have better adaptability to a variety of underlying conditions. The retrieval results using field soil moisture measurements obtained from an agricultural region with wheat as the dominant crop type showed that the proposed algorithm achieved satisfactory soil moisture estimation accuracy (e.g., RMSE = 0.042). The results indicated that GF-3 satellite data performed well for soil moisture retrieval and that the algorithm has the potential to operationally estimate soil moisture from GF-3 satellite data. The major conclusions of this study are summarized as follows:
(1) VWC is an important factor for accurate retrieval of soil moisture under the vegetation cover. In this study, we combined microwave and optical remote sensing data to eliminate the contribution of backscattering coefficients caused by the vegetation. Landsat-8 OLI land surface reflectance data were chosen to complete the VWC estimation. The remotely sensed NDWI was used to generate the relationship with VWC through ground-based observation data.
(2) An AIEM-Dobson model was built to simulate the training sample dataset based on in situ measurements and GF-3 satellite parameters. This dataset includes the incidence angle, backscattering coefficients, and their corresponding SMC and surface roughness. The WCM was used to calculate the bare soil backscattering coefficient. As the model’s input parameters, the vegetation parameters A and B were calculated using a least-squares method.
(3) The backscattering coefficients of GF-3 satellite data were attenuated to different degrees compared to the total backscattering coefficients after the use of the WCM. Due to the different VWCs of each point, the attenuation degree is also different. The effect of vegetation contribution is large and must be removed before the soil moisture retrieval process; otherwise, it will influence the accuracy of soil moisture retrieval.
(4) The optimal ANN architecture in this study was determined as a three-layer network consisting of an input layer (two neurons: incidence angle and HH or VV backscattering), one hidden layer (30 neurons), and a two-component output layer (SMC and surface roughness). Our results and model sensitivity highlight the contribution of combined GF-3 SAR and Landsat-8 images using an ANN method for improving SMC estimates. HH polarization showed better SMC estimation performance than VV polarization.

Although satisfactory soil moisture retrieval performance was achieved, this study has several limitations. The field measurements carry uncertainties that affect the assessment of soil moisture retrieval algorithms based on remote sensing data. Real soil moisture values from ground-based measurements are difficult to match with pixel-level soil moisture estimates, although averaging multipoint measurements could reduce this error to a certain extent. Further work should focus on validating the proposed soil moisture retrieval algorithm using field soil moisture measurements with less uncertainty under various land cover-type conditions.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Authors’ Contributions

Linlin Zhang, Qingyan Meng, and Qiuxia Xie conceived and designed the experiments. Linlin Zhang and Shun Yao performed the experiments and wrote the paper. Qingyan Meng, Xu Chen, and Ying Zhang contributed to paper revisions.

Acknowledgments

The authors would like to thank the National Key Research and Development Program (China’s 13th Five-Year Plan) (2017YFB0503900 and 2017YFB0503905), the Hainan Province Key Research and Development Plan (ZDYF2018231), and the Hainan Province Natural Science Foundation (417218 and 417219).