#### Abstract

Soil moisture is a crucial factor limiting the growth and survival of plants on the Loess Plateau. It strongly affects plant growth and development as well as the type and distribution of plant communities. The study area is the Jihe Basin on the Loess Plateau, China. Multiple linear regression models with different environmental variables (land use, topographic, and meteorological factors, etc.) were developed to simulate the spatial and temporal changes of soil moisture by integrating field experiments, laboratory analysis, and GIS spatial analysis. Model performance was evaluated in the Jihe Basin against measured soil moisture content. The results show that soil moisture content is positively correlated with soil bulk density, monthly rainfall, topographic wetness index, land use coefficient, and slope aspect coefficient, and negatively correlated with monthly-averaged temperature and the relative elevation coefficient. The selected variables are all related to soil moisture content and account for 75% of its variation; the remaining 25% is related to other factors. A comparison of the simulated and measured values at all sampling points gives an average relative error of 0.09, indicating that the simulation is accurate. The spatial distribution of soil moisture content is significantly affected by land use and topographic factors, and its seasonal variation within the year is pronounced. This seasonal variation is determined by the seasonal variation of rainfall and air temperature (which controls evaporation) and by the vegetation growth cycle. The proposed model can therefore simulate the spatial and temporal variation of soil moisture content and support the development of soil and water loss models at the basin scale.

#### 1. Introduction

The Loess Plateau of China, situated in the upper and middle reaches of the Yellow River, covers about 630,000 km², lies 1200–1600 m above sea level, and is predominantly covered by loess deposits. The region is prone to severe soil erosion caused by both natural factors (e.g., the unique geology and landforms, climate conditions, and vegetation cover limited by water resources) and anthropogenic factors (e.g., poor land use management) [1]. Intensive soil erosion has reduced land productivity and degraded the environment [2–4]. The key to ecological environment construction and sustainable agricultural development is to protect and rationally use water and land resources. As a major limiting factor for plant growth and development on the Loess Plateau, soil moisture is critically important to hydrological processes at a variety of scales, and it is an essential factor affecting rainfall infiltration, runoff, and sediment yield [5–9]; objective estimation of soil moisture is therefore fundamental. However, measured soil moisture data are scarce, and measurements taken just before rainfall in a basin are even harder to obtain. Soil moisture is therefore usually modeled through an empirical relationship with antecedent rainfall [10, 11] or calculated with a daily model based on antecedent rainfall [12]. Such models reflect soil dryness or wetness but not the actual soil moisture content, which significantly affects the results simulated and predicted with them. At the regional scale in particular, many factors affect soil moisture content, such as rainfall, evaporation, soil type, land use, and topography [13–15]; these factors interact with each other, and the influence of some of them is difficult to quantify, which makes soil moisture content difficult to simulate.
Hence, simulations of soil moisture content for large-scale basins are still lacking. Building on previous studies [16–22] and on results from small watersheds [23], this study aimed to estimate soil moisture content in a large-scale basin while considering the various factors that affect it [24–26]. The results lay a foundation for regional soil and water loss models and provide a scientific basis for soil and water conservation and ecological restoration on the Loess Plateau, China.

#### 2. Materials and Methods

##### 2.1. Study Area

The study area is the Xihe Basin, located at 34°20′19″ N–34°38′59″ N and 105°07′50″ E–106°00′45″ E in the third subregion of the Loess Hilly-gully region. An overview of the Xihe Basin is shown in Figure 1. The basin has an area of 1276.73 km² and an average annual rainfall of 558.9 mm. Rainfall varies greatly from year to year and is unevenly distributed within a year, falling mainly from July to September. The terrain is fragmented and crisscrossed by ravines. It is high in the northwest and low in the southeast, with altitudes between 1069 and 2717 m. Soil and water loss is widespread; the erosion types are complex and varied, and the erosion process is concentrated and intense. Water resources remain the main factor restricting soil and water conservation and ecological environment construction in the basin.

##### 2.2. Data Sources

###### 2.2.1. Digital Elevation Model (DEM)

The source data were a 1 : 50,000 topographic map (1954 Beijing coordinate system, 1956 Yellow Sea elevation system, contour interval 20 m, Krasovsky reference ellipsoid). The topographic map was scanned and vectorized in Geoway to generate the required layers of contour lines, elevation points, and slopes. After splicing, the layers were exported as E00 files and converted into coverage format in ArcInfo, where the topological relations were built. Finally, the ANUDEM interpolation software was used, with parameters set according to existing studies, to generate an HC-DEM with 10 m resolution (Figure 1) [27]. The DEM was originally in the Gauss–Krüger projection and was transformed in ArcGIS into the Albers projection, the unified projection used in this study.

###### 2.2.2. Meteorological Data

The daily rainfall and temperature data were provided by the China Meteorological Data Network (http://data.cma.cn/site/index.html) and the book Hydrological Data of the Yellow River Basin. The monthly-averaged rainfall and temperature for November 2007 and May 2008 were spatially interpolated using the Inverse Distance Weighting (IDW) method.
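The IDW step can be illustrated with a minimal sketch. The function name `idw`, the power of 2, and the two example stations are assumptions made for illustration; the source does not state the exponent or search settings that were actually used.

```python
import math

def idw(stations, x, y, power=2.0):
    """Inverse Distance Weighting estimate at the point (x, y).

    stations: list of (x_i, y_i, value_i) in a projected coordinate system (metres).
    power: distance exponent; 2 is a common default and is assumed here.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the target point coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations: (easting m, northing m, monthly rainfall mm)
stations = [(0.0, 0.0, 40.0), (10000.0, 0.0, 60.0)]
midpoint_rainfall = idw(stations, 5000.0, 0.0)
```

Halfway between two stations the weights are equal, so the estimate is the plain average of the two values; closer to either station, that station dominates.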

###### 2.2.3. Topographic Wetness Index

Based on the DEM, the topographic wetness index (Figure 2) was calculated with a multiple flow direction algorithm according to [28]:

$$\omega = \ln\left(\frac{\alpha}{\tan\beta}\right),$$

where $\omega$ is the topographic wetness index, $\alpha$ is the confluence area per unit contour length or per unit grid, and $\beta$ is the local slope gradient. Here, $\alpha$ is calculated as

$$\alpha = \frac{A}{L},$$

where $A$ is the total upstream catchment area of the grid cell and $L$ is the effective contour length in the inflow direction around the grid cell.
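As a worked illustration of the index for a single grid cell, the sketch below evaluates the standard form ln(α / tan β) with α = A / L. The function name and the minimum-slope guard for flat cells are assumptions, not part of the source; the flow routing that produces the upstream area is not shown.

```python
import math

def topographic_wetness_index(upstream_area_m2, contour_length_m, slope_rad,
                              min_slope_rad=1e-3):
    """Topographic wetness index ln(alpha / tan(beta)) for one grid cell.

    alpha is the upstream catchment area per unit effective contour length.
    min_slope_rad is an assumed guard against tan(0) in flat cells.
    """
    alpha = upstream_area_m2 / contour_length_m
    beta = max(slope_rad, min_slope_rad)
    return math.log(alpha / math.tan(beta))

# A cell draining 1000 m^2 across a 10 m contour length on a 45-degree slope
twi_steep = topographic_wetness_index(1000.0, 10.0, math.radians(45.0))
# The same cell on a gentler 5-degree slope is wetter (larger index)
twi_gentle = topographic_wetness_index(1000.0, 10.0, math.radians(5.0))
```

Large upstream areas and gentle slopes both raise the index, which is why valley bottoms score higher than ridges.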

###### 2.2.4. Land Use Coefficient

The land use map of the Xihe Basin was obtained by interpreting TM remote sensing images acquired in 2005. According to previous research results [23], the coefficients expressing the influence of farmland, barren grassland, and woodland on soil moisture content are 1.06, 1.0, and 0.65, respectively; the coefficients of residential land and water bodies were set to 0. The land use coefficient map (Figure 3) was obtained by reclassifying the land use map with these values.
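The reclassification amounts to a lookup from land use class to coefficient. In the sketch below the class labels and the tiny example grid are hypothetical; only the coefficient values come from the text.

```python
# Coefficients from the text; the class labels are illustrative names.
LAND_USE_COEFFICIENT = {
    "farmland": 1.06,
    "barren_grassland": 1.0,
    "woodland": 0.65,
    "residential": 0.0,
    "water": 0.0,
}

def reclassify_land_use(grid):
    """Replace each land use class in a 2-D grid with its coefficient."""
    return [[LAND_USE_COEFFICIENT[cell] for cell in row] for row in grid]

# Hypothetical 2 x 2 land use grid
land_use_grid = [["farmland", "woodland"],
                 ["water", "barren_grassland"]]
coefficient_grid = reclassify_land_use(land_use_grid)
```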

###### 2.2.5. Slope Aspect Coefficient

A representative study of a small watershed [23] showed that for the eight slope aspects (north, northeast, east, southeast, south, southwest, west, and northwest), the slope aspect coefficients affecting soil moisture content are 1, 0.90, 0.77, 0.81, 0.75, 0.79, 0.89, and 0.87, respectively. The slope aspect map was reclassified accordingly to give the slope aspect coefficient map shown in Figure 4.
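One way to assign these coefficients to a continuous aspect raster is sketched below. It assumes that aspect is given in degrees clockwise from north and that each of the eight directions covers a 45° sector centred on it; the source does not state the sector boundaries. Flat cells, which GIS packages usually flag with a special value, would need separate handling.

```python
# Coefficients from the text, ordered N, NE, E, SE, S, SW, W, NW.
ASPECT_COEFFICIENTS = [1.0, 0.90, 0.77, 0.81, 0.75, 0.79, 0.89, 0.87]

def aspect_coefficient(aspect_deg):
    """Map an aspect in degrees (clockwise from north) to its coefficient.

    Assumes 45-degree sectors centred on the eight compass directions,
    so north spans 337.5 to 22.5 degrees.
    """
    sector = int(((aspect_deg % 360.0) + 22.5) // 45.0) % 8
    return ASPECT_COEFFICIENTS[sector]

north_facing = aspect_coefficient(350.0)   # falls in the north sector
south_facing = aspect_coefficient(180.0)   # falls in the south sector
```

Under these coefficients north-facing slopes receive the largest value and south-facing slopes the smallest.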

###### 2.2.6. Relative Elevation Coefficient

Because ravines crisscross the Loess Hilly-gully region, the landform is complex, and the upper, middle, and lower parts of ridges, hills, and gullies cannot be delineated quantitatively with accuracy. Therefore, the relative elevation coefficient of each sampling point within its small watershed was used to represent slope position quantitatively. It is calculated with equation (1):

$$H_r = \frac{H - H_{\text{out}}}{H_{\max} - H_{\text{out}}}, \tag{1}$$

where $H_r$ is the relative elevation coefficient, $H$ is the elevation of the sampling point, $H_{\max}$ is the maximum elevation in the watershed, and $H_{\text{out}}$ is the elevation at the watershed outlet. The calculated relative elevation coefficients are shown in Figure 5.
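A minimal sketch of the coefficient follows. It assumes the normalized form (H − H_out) / (H_max − H_out), which runs from 0 at the watershed outlet to 1 at the highest point; this form is inferred from the variable definitions above and is an assumption. The example elevations reuse the altitude range of the basin purely for illustration.

```python
def relative_elevation_coefficient(elevation, max_elevation, outlet_elevation):
    """Slope position as a 0-1 value: 0 at the outlet, 1 at the highest point.

    The normalized form used here is an assumption based on the
    variable definitions in the text.
    """
    return (elevation - outlet_elevation) / (max_elevation - outlet_elevation)

# Illustrative elevations in metres
mid_slope = relative_elevation_coefficient(1893.0, 2717.0, 1069.0)
```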

###### 2.2.7. Field Collection of Soil Moisture Content

Layout of sampling points. Sampling points were selected according to the combination of land use and landform type so as to cover the primary land use types, the prominent small- and medium-sized landforms, and different slope aspects, gradient classes, and slope positions; 70 points were selected in total. Because the study area is large, data were collected at 43 sampling points in November 2007 and at 27 sampling points in May 2008. At each sampling point, a handheld GPS was used to record the longitude, latitude, and altitude. The soil properties (moisture content and dry bulk density), the topography near the site, soil erosion, and the land use type were also recorded.

Measurement of soil moisture content. Because near-surface soil is the main source of erosion-induced sediment yield, only the soil at a depth of 0–50 cm was studied. A ring knife was used to collect soil over the 0–50 cm depth at 10 cm intervals. The samples were sealed and taken back to the laboratory, and their moisture contents were determined by the drying method (105°C, 10 h). The average soil moisture content of the 0–50 cm depth at each sampling site was taken as the mean of the moisture contents of the individual soil layers.

The mass moisture content was then converted into the thickness of soil water with equation (2):

$$W = \theta \, \rho \, h, \tag{2}$$

where $W$, $\theta$, and $\rho$ refer to the thickness of soil water in the 0–50 cm soil layer (mm), the average mass moisture content of the 0–50 cm soil (as a mass fraction), and the average dry bulk density of the 0–50 cm soil at each sampling point (g/cm³), respectively; $h$ is the thickness of the soil layer (500 mm), and the density of water is taken as 1 g/cm³.
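The conversion can be sketched as follows. The sketch assumes that the mass moisture content is expressed as a fraction (g water per g dry soil), that the density of water is 1 g/cm³, and that the layer is 500 mm thick; the five layer values are hypothetical.

```python
def soil_water_thickness_mm(layer_moisture_fractions, bulk_density_g_cm3,
                            layer_thickness_mm=500.0, water_density_g_cm3=1.0):
    """Convert mass moisture content to an equivalent depth of water (mm).

    layer_moisture_fractions: mass moisture content of each sampled layer (g/g).
    The mean over the layers is used, as in the sampling procedure described above.
    """
    mean_moisture = sum(layer_moisture_fractions) / len(layer_moisture_fractions)
    volumetric = mean_moisture * bulk_density_g_cm3 / water_density_g_cm3
    return volumetric * layer_thickness_mm

# Hypothetical moisture contents for the five 10 cm layers of one site
layers = [0.14, 0.15, 0.16, 0.17, 0.18]
thickness = soil_water_thickness_mm(layers, bulk_density_g_cm3=1.25)
```

With a mean moisture content of 0.16 g/g and a bulk density of 1.25 g/cm³, the 500 mm layer holds about 100 mm of water, the same order of magnitude as the monthly values reported in Section 4.2.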

#### 3. Model Evaluation Method

##### 3.1. Regression Analysis

Regression analysis is a basic method of quantitative analysis. It processes a large body of statistical data to determine the relationship between a dependent variable and a set of independent variables and to establish a regression equation (function expression) that fits the data well. The fitted regression coefficients and the correlation coefficient are then tested for significance; once the fit meets the correlation requirements, the equation can be used to predict changes in the dependent variable.

In this study, the multiple regression tool in Excel was used to analyze the relationship between the soil moisture content (mm), the dependent variable, and seven independent variables: soil bulk density, monthly-averaged rainfall, monthly-averaged temperature, topographic wetness index, land use coefficient, slope aspect coefficient, and relative elevation coefficient.

Multiple regression analysis yields the following evaluation indexes: the multiple correlation coefficient $R$, the coefficient of determination $R^2$, and the significance value of the $F$ statistic. The regression results were assessed from the coefficient of each variable together with $R$, $R^2$, and the significance of the $F$ statistic.
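For readers without Excel, the fitting step can be reproduced with ordinary least squares. The sketch below is a stand-in for the Excel tool, not the authors' procedure: it solves the normal equations in plain Python and reports the coefficient of determination. The function names and the two-predictor synthetic data set are assumptions for illustration only.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(predictor_rows, y):
    """Least-squares fit; returns ([intercept, b1, b2, ...], R^2)."""
    x = [[1.0] + list(row) for row in predictor_rows]
    k = len(x[0])
    xtx = [[sum(r[p] * r[q] for r in x) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(x, y)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Synthetic data generated from y = 2 + 3*x1 - 1*x2 (not the study data)
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y = [3.0, 7.0, 6.0, 11.0, 9.0]
beta, r_squared = fit_ols(rows, y)
```

Solving the normal equations directly is adequate for a handful of predictors; for ill-conditioned data a QR-based routine from a numerical library would be the safer choice.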

##### 3.2. Relative Error

The relative error is the ratio of the absolute error to the measured (true) value, and it reflects the credibility of a result better than the absolute error does:

$$E_i = \frac{\left| S_i - M_i \right|}{M_i}, \tag{3}$$

where $E_i$ is the relative error at sampling point $i$, $M_i$ is the measured value, and $S_i$ is the simulated value.

To reflect the overall reliability, the average relative error over all points is obtained by the following formula:

$$\bar{E} = \frac{1}{n} \sum_{i=1}^{n} E_i, \tag{4}$$

where $\bar{E}$ is the averaged relative error, $E_i$ is the relative error at point $i$, and $n$ is the number of sampling points.
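The two error measures are simple to compute. The sketch below assumes the relative error is taken as an absolute value, |simulated − measured| / measured; the function names and the example values are illustrative and are not taken from Table 1.

```python
def relative_error(measured, simulated):
    """Absolute difference between simulated and measured, relative to measured."""
    return abs(simulated - measured) / measured

def mean_relative_error(measured_values, simulated_values):
    """Average of the point-wise relative errors over all sampling points."""
    errors = [relative_error(m, s)
              for m, s in zip(measured_values, simulated_values)]
    return sum(errors) / len(errors)

# Hypothetical soil water thickness values (mm) at two sampling points
measured = [100.0, 200.0]
simulated = [90.0, 220.0]
average_error = mean_relative_error(measured, simulated)
```

Both points are off by 10% of the measured value, so the average relative error is 0.1, comparable to the 0.09 reported for the fitted model.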

#### 4. Results and Discussion

##### 4.1. Construction of Soil Moisture Content Model

The soil moisture content (mm) is taken as the dependent variable, and the soil bulk density, monthly-averaged rainfall, monthly-averaged temperature, topographic wetness index, land use coefficient, slope aspect coefficient, and relative elevation coefficient are the independent variables. The regression function in the data analysis tool of Excel was used to carry out the multiple regression analysis. Some of the data required for the regression analysis, together with the simulated results, are shown in Table 1.

(1) The model equation obtained through multiple linear regression analysis has the form

$$W = b_0 + b_1 \rho + b_2 P + b_3 T + b_4 \omega + b_5 C_l + b_6 C_a + b_7 H_r, \tag{5}$$

where $W$ is the moisture content of soil (mm); $\rho$ is the bulk density of soil (g/cm³); $P$ is the monthly-averaged rainfall (mm); $T$ is the monthly-averaged temperature (°C); $\omega$ is the topographic wetness index; $C_l$ is the land use coefficient; $C_a$ is the slope aspect coefficient; $H_r$ is the relative elevation coefficient; and $b_0$ to $b_7$ are the fitted regression coefficients. The main statistical parameters of the model equation are shown in Table 2.

(2) The regression analysis gives a multiple correlation coefficient $R$ of 0.84 and a coefficient of determination $R^2$ of 0.75. This implies that the variables under consideration account for 75% of the variation of soil moisture content, while the remaining 25% has to be caused by other factors, which indicates that the factors affecting soil moisture are complex.

(3) The significance value of the $F$ statistic is much smaller than the significance level of 0.05. This shows that the regression is significant and that the selected variables are all related to the soil moisture content.

(4) The $t$-test values corresponding to the regression coefficients indicate that the coefficients differ significantly from zero and can be used to explain the change of soil moisture content. The regression coefficients of soil bulk density, monthly rainfall, topographic wetness index, land use coefficient, and slope aspect coefficient are positive, indicating that soil moisture content is positively correlated with these variables. The regression coefficients of the monthly-averaged temperature and the relative elevation coefficient are negative, indicating that soil moisture content is negatively correlated with these two variables.

(5) Analysis of the errors in the simulation results. Substituting the data at each measuring point in Table 1 into equation (5) gives the simulated soil moisture content at each point, shown in Table 1 (column 10). Substituting the simulated and measured values into equation (3) gives the relative error of the simulated soil moisture content at each sampling point, shown in Table 1 (column 11). Finally, the mean relative error over all points, calculated with equation (4), is 0.09. This shows that the simulation accuracy of soil moisture content is high.

##### 4.2. Simulation of Soil Moisture Content in the Study Area

The seasonal variation of soil moisture content is mainly affected by rainfall and air temperature (evaporation), whereas land use, soil, and topographic factors are relatively stable. The rainfall and temperature of each month were therefore substituted into equation (5) while the other variables were held constant. The seasonal variation of soil moisture content was simulated in this way from April to October 2005, and the simulated results are shown in Figure 6.

**Figure 6: Simulated soil moisture content in 2005: (a) April; (b) May; (c) June; (d) July; (e) August; (f) September; (g) October.**
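The month-by-month procedure, in which the static terms are held fixed and only rainfall and temperature change, can be sketched for a single grid cell as follows. Every number here is hypothetical: the fitted coefficients are not reproduced in this text, so the rainfall and temperature coefficients below are placeholders chosen only to match the reported signs (positive for rainfall, negative for temperature).

```python
def simulate_month(static_part, rainfall_coef, temperature_coef,
                   rainfall_mm, temperature_c):
    """Linear model with the time-invariant terms collapsed into static_part.

    static_part stands for the intercept plus the bulk density, wetness index,
    land use, aspect, and relative elevation terms of one grid cell.
    """
    return (static_part
            + rainfall_coef * rainfall_mm
            + temperature_coef * temperature_c)

# Placeholder values for one cell: NOT the coefficients fitted in the study
STATIC_PART = 100.0
RAINFALL_COEF = 0.25       # positive sign, as reported
TEMPERATURE_COEF = -1.0    # negative sign, as reported

# Hypothetical monthly climate: (rainfall mm, mean temperature in deg C)
climate = {"Apr": (20.0, 10.0), "Jul": (100.0, 22.0)}
monthly_moisture = {
    month: simulate_month(STATIC_PART, RAINFALL_COEF, TEMPERATURE_COEF, p, t)
    for month, (p, t) in climate.items()
}
```

Applying the same function to every cell of the rainfall and temperature rasters for a given month yields one soil moisture map per month, as in Figure 6.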

According to the simulated results shown in Figure 6, the average monthly soil moisture content from April to October is 110.03, 107.02, 94.79, 104.06, 101.81, 109.68, and 125.90 mm, respectively. The changes of average rainfall and temperature from April to October 2005 in the Xihe Basin (Figure 7) show that rainfall is lowest in April and highest in July, with no significant difference among the other months, whereas the temperature rises gradually from May to August and drops gradually from August to October. The simulated monthly soil moisture content shows that its seasonal variation is determined by the seasonal variation of rainfall and temperature (which controls evaporation) and by the vegetation growth cycle. In April, although rainfall is low, soil moisture content is not the lowest because the temperature is relatively low and evaporation is weak. Rainfall increases in May, but because temperature and evaporation increase and vegetation growth becomes vigorous, soil moisture content is lower than in April. From June to August, the temperature rises rapidly, vegetation thrives, and evaporation increases. Because rainfall in June is relatively low, soil moisture content is lowest in June; it rises when rainfall increases in July and falls again in August. From September to October, the temperature declines significantly, and water consumption by vegetation and evaporation decreases significantly; soil moisture content therefore increases gradually, so that the soil moisture content in October was higher (118.30 mm). These results indicate that the seasonal variation of soil moisture content is not synchronous with the rainy season [24] and is also affected by the temperature and the vegetation growth cycle [25].
This is consistent with the conclusions of previous studies [29–32]. It can therefore be concluded that the proposed model equation can satisfactorily simulate the spatial and temporal variation of regional soil moisture content.

#### 5. Conclusions

Based on the measured data, multiple linear regression analysis was carried out to obtain a simulation equation for soil moisture content, and the equation was applied to the study area.

The specific conclusions are as follows:

(1) A method for simulating antecedent soil moisture content at the watershed scale was explored. A multiple regression equation of soil moisture content was established that can simulate the spatial and temporal distribution of soil moisture content.

(2) The proposed method produces reasonable estimates of soil moisture content. The average relative error of all the simulated values is 0.09, indicating that the simulation is accurate.

(3) The maps of soil moisture content for April to October 2005, obtained by applying the proposed method to the study area, show that the spatial distribution of soil moisture content is significantly affected by land use and the topographic index, and that its seasonal variation within the year is significantly affected by temperature (evaporation). The results reflect the spatial and temporal variation of regional soil moisture content.

This study provides methodological support for estimating regional soil moisture content and can be used in the study of regional soil erosion models.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare no conflict of interest.

#### Acknowledgments

This study was supported by the National Key Research and Development Project of China (No. 2019YFC0409202), the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (No. 51721006), the High-level Talent Support Program of North China University of Water Resources and Electric Power, and the Special Support for an Innovative Scientific and Technological Team of Water Ecological Security in the Water Source Area of the Middle Route of the South-to-North Water Diversion in Henan Province.