Abstract

The rapid inversion of soil organic matter (SOM) is important for agricultural soil testing and fertilization, and for protecting and using land resources effectively. This study selected a development base in Lishu County, which has the typical characteristics of China’s black soil, as its research area. A 600 nm “bow curvature difference” spectral index model and a partial least squares regression model were established, and the accuracy of their results was compared. The correlation between the 600 nm “bow curvature difference” spectral index and the soil organic matter under straw-covered no-tillage was analyzed. The average soil organic matter content in the study area was 16.67 g·kg⁻¹, and the organic matter content increased significantly from NT-0 to NT-100, by 16.26 g·kg⁻¹. The study offers insight into improving quantitative inversion methods for estimating soil organic matter.

1. Introduction

China holds enormous crop straw resources, and its annual crop straw output was about in 2021. Crop straw is nutrient-rich and serves as an essential material base in ecological agricultural cycles [1]. The conventional way to dispose of straw is incineration, which pollutes the atmosphere and wastes a huge amount of organic matter and nutrient resources. In contrast, returning straw to the field can improve the organic matter content and the straw decomposition rate. The amount of soil organic matter (SOM), the carbon-containing organic matter in the soil, is a key indicator of both soil fertility and soil degradation status [2]. Therefore, rapid and accurate monitoring of SOM is of great importance for maintaining soil fertility and sustainable agricultural development [3, 4].

In recent years, hyperspectral technology has achieved remarkable results in estimating soil organic matter content. Several methods, including multiple stepwise regression, principal component regression, and partial least squares analysis, are widely used by domestic and international researchers to establish near-infrared (NIR) correction models that predict soil organic matter content. Nowkandeh et al. (2018) used the stepwise regression method, least regression method, partial least squares regression method, and principal component regression method to establish an SOM prediction model [5]. The logarithm of the inverse of the spectral reflectance was used in the stepwise multiple regression model [6]. Scholars have achieved important results in different regions and soil types by relating hyperspectral curves to soil organic matter content. It has been concluded that the absolute value of the SOM correlation coefficient in the 400-800 nm band range is above 0.6 [7, 8]. Various factors, such as the variation of soil organic matter content due to geological differences and the single form of spectral features selected in an experiment, lead to differences in the accuracy of soil spectral prediction. Moreover, there is a lack of comparative studies among the various forecasting models [9]. Therefore, in this study, the “deviation of the arch” (DOA) spectral index and the soil organic matter content of the collected soil samples were used to establish the “deviation of the arch regression” (DOAR) model and the partial least squares regression model. The two modeling methods were analyzed and compared.

Lishu County, Jilin Province, an important commodity grain base, is located in the middle of the Northeast Plain, China. The Lishu core model established in Lishu County has been widely promoted in China for its black soil protection measures and its advanced practical application [10]. This study selected a development base in Lishu County with typical characteristics of China’s black soil area as its research area. A prediction model relating the 600 nm “bow curvature difference” spectral index to soil organic matter was established using the imaging spectral data of the collected soil samples. The results were compared with those of the partial least squares regression model to explore the feasibility of the established spectral index for SOM mapping. The findings of this study provide novel ideas and methods for mapping soil organic matter using satellite remote sensing, which is critical for developing precision agriculture in the future.

2. Experimental Design

2.1. Test Design

Our experiment started in April 2007 and ended in October 2019. Before the experiment began, the sample plot had been under traditional tillage for many years, mainly with maize. In the sample area, four experiments were set up with five different amounts of straw mulch: no-tillage without straw mulch (NT-0), no-tillage +33% straw mulch (NT-33), no-tillage +67% straw mulch (NT-67), and no-tillage +100% straw mulch (NT-100). Using a random group design, each experiment (except for NT-0 experiment) was repeated 4 times, with a total of 20 experimental blocks, each with an area of . The sample area mainly faced northwest, with a 30 m protection row on the north side and a 15.5 m protection row on the west side. The conventionally cultivated land around the sample site was selected as a control reference (CK). The measurement time and methods for CK were the same as those of the sample sites.

In the no-tillage plots, stubble of about 30 cm was left after the autumn harvest each year, and the straw was applied at different mulching amounts: 100% straw mulching is about 7500 kg·hm⁻², and NT-67 and NT-33 received 67% and 33% of the total straw, respectively. In the treatment without mulching, all the straw was removed. The remaining straw was removed, and the plots were sown directly in spring without any prior land preparation. All experiments were compared with conventional tillage samples (CK). In the conventional tillage plot, stubble of about 30 cm was left after the autumn harvest, and all the remaining straw was removed. The stubble was rototilled following normal tillage practice, and the soil was not further disturbed except for seeding and fertilization. The height of the cultivated ridge was about 15 cm, and the spacing between ridges was about 60 cm. The same amount of maize seeds and fertilizer was used in each experiment. The maize cultivar used in the experiment was Xianyu 335, and the fertilizer amounts were 252 kg N·hm⁻², 135 kg P·hm⁻², and 90 kg K·hm⁻². The machine used for sowing was the Jilin Kangda 2BMZF-4 no-till sowing machine, which can complete accurate sowing, fertilization, and soil pressing in a single pass while the surface is covered by straw (Figure 1).

2.2. Test Site

A series of experiments on black soil were conducted at the experimental field located in Gaojia Village, Lishu County, Jilin Province. This area is a development base of the Chinese Academy of Sciences and a protective farming research area (Figure 2). The region has a temperate semihumid monsoon climate with an annual mean temperature of 5.8°C, an annual mean precipitation of 577.2 mm, annual sunshine hours of 2,698.5 h, and a frost-free period of 152 d. Precipitation is mainly concentrated between June and August. The soil parent material of the study area was loamy clay, and the soil type was medium-layer black soil.

2.3. Organic Matter Determination

The measurement date for the corn seedling period was May 22, 2016. Soil from the 0-5 cm tillage layer was collected, bagged, labeled, and brought back to the laboratory, where organic residues and stones were removed from the sample. The sample was dried indoors, hand-rolled, ground, and passed through a 100-mesh sieve. Following the organic matter determination method of Zhao et al. [11], the potassium dichromate oxidation volumetric method with external heating was used in this study. The experimental data were processed and plotted using Microsoft Excel 2020 [11].

2.4. Imaging Spectral Data Acquisition and Preprocessing

The soil sample was packed in an aluminum box with a lid, scraped flat with a straight ruler, and placed on a mobile platform. The imaging spectral data of the samples were collected in a dark room, and the reflectance spectral data were obtained via black-and-white correction. Ben-Dor et al. applied the SOC710VP terrestrial spectrometer to determine the spectrum of soil organic matter and to study the inversion of soil organic matter by imaging spectroscopy [12-14]. In this study, the SOC710VP ground spectrometer was also used to determine the spectral characteristics of the soil. The SOC710VP imaging spectrometer was fixed on the sweeping platform, which was installed on the ground. The lens focus was adjusted to the sampling environment at the time. To ensure the authenticity and validity of the image, a reference plate was used for calibration before image acquisition. The acquisition and transmission of the hyperspectral image information were completed using Hyperspec [15]. The image acquisition process was carried out in a laboratory simulation environment. The field of view of the SOC710VP was 15°, and the aperture of 17 mm Lens_F5.6_9-23-15_LHB0010-02 was selected. The lens was perpendicular to the standard gray board. The digital number (DN) value of the gray board was measured synchronously as the reference for each image spectrum. The objective lens was 130 cm above the sample, and the exposure time was manually adjusted to 14.988 ms. The parameters were adjusted to ensure a sharp image. The serial number of each sample was marked, the sample was placed on the horizontal standard gray board, and the Cube button was clicked to collect the imaging spectral data. The radiance curve was observed during collection to prevent data loss caused by overexposure of the image.
The average reflectance spectral curve of each soil sample was extracted from the collected imaging spectral reflectance data with the Region of Interest (ROI) Tool in the ENVI5.3 software. The bands with large errors at both ends were removed, and a total of 230 bands covering 425-1015 nm were retained for subsequent calculation and analysis (Figure 3).
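The preprocessing chain described above (black-and-white correction, ROI averaging, and trimming to 425-1015 nm) can be illustrated with a minimal sketch. It assumes the common correction form (raw DN minus dark DN) divided by (reference DN minus dark DN), which the text does not spell out; the function names, the `panel_reflectance` parameter, and the synthetic cube are hypothetical and do not reproduce the ENVI or SOC710VP workflow.

```python
import numpy as np

def calibrate_reflectance(raw, white, dark, panel_reflectance=1.0):
    """Black-and-white correction (assumed form): scale raw DN between dark and reference DN."""
    raw = np.asarray(raw, dtype=float)
    white = np.asarray(white, dtype=float)
    dark = np.asarray(dark, dtype=float)
    span = white - dark
    if np.any(span <= 0):
        raise ValueError("reference DN must exceed dark DN in every band")
    return panel_reflectance * (raw - dark) / span

def roi_mean_spectrum(cube, mask):
    """Average the spectra of all pixels inside a boolean ROI mask.

    cube has shape (rows, cols, bands); mask has shape (rows, cols).
    """
    cube = np.asarray(cube, dtype=float)
    return cube[np.asarray(mask, dtype=bool)].mean(axis=0)

def trim_bands(wavelengths, spectrum, lo_nm=425.0, hi_nm=1015.0):
    """Keep only the bands whose center wavelength lies in [lo_nm, hi_nm]."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    keep = (wavelengths >= lo_nm) & (wavelengths <= hi_nm)
    return wavelengths[keep], np.asarray(spectrum, dtype=float)[keep]

# Synthetic demonstration data (not measurements from this study).
wavelengths = np.array([400.0, 500.0, 600.0, 900.0, 1030.0])
raw_cube = np.full((4, 4, 5), 60.0)   # raw DN for a 4 x 4 pixel image, 5 bands
white = np.full(5, 110.0)             # reference board DN per band
dark = np.full(5, 10.0)               # dark-current DN per band

reflectance_cube = calibrate_reflectance(raw_cube, white, dark)
roi = np.zeros((4, 4), dtype=bool)
roi[1:3, 1:3] = True                  # central 2 x 2 pixels as the ROI
mean_spectrum = roi_mean_spectrum(reflectance_cube, roi)
kept_wl, kept_spectrum = trim_bands(wavelengths, mean_spectrum)
```

If the reference board is gray rather than ideal white, its known reflectance would be passed as `panel_reflectance`; that value is not given in the text and is left at 1.0 here.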

2.5. Modeling Method
2.5.1. Construction of Bow Curvature Model

Nonlinear parametric regression and partial least squares regression were used. The nonlinear parametric regression uses as its independent variable the 600 nm “bow curvature difference” spectral index calculated from the spectral data, and is referred to as DOAR (DOA regression model). The 600 nm “bow curvature difference” is the difference between the 600 nm reflectance of each soil spectral curve and the average of its 550 and 650 nm spectral reflectance [16, 17]. Partial least squares regression (PLSR) uses the 230 bands of spectral data as independent variables [18]. By integrating correlation analysis, principal component analysis, and multivariate linear regression analysis, the partial least squares regression model can address multicollinearity among the independent variables. It also allows regression modeling when the number of samples is less than the number of variables [19-23].
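As an illustration of the index definition given above, the following sketch computes the 600 nm “bow curvature difference” as the 600 nm reflectance minus the mean of the 550 and 650 nm reflectance. Picking the band nearest to each target wavelength is an assumption, since the text does not say how bands were matched; the function names and the example spectra are hypothetical.

```python
import numpy as np

def band_reflectance(wavelengths, spectra, target_nm):
    """Return the reflectance at the band closest to target_nm (nearest-band assumption)."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    idx = int(np.argmin(np.abs(wavelengths - target_nm)))
    return np.asarray(spectra, dtype=float)[..., idx]

def bow_curvature_difference(wavelengths, spectra,
                             center_nm=600.0, left_nm=550.0, right_nm=650.0):
    """DOA index: R(center) minus the mean of R(left) and R(right)."""
    r_center = band_reflectance(wavelengths, spectra, center_nm)
    r_left = band_reflectance(wavelengths, spectra, left_nm)
    r_right = band_reflectance(wavelengths, spectra, right_nm)
    return r_center - 0.5 * (r_left + r_right)

# Two made-up spectra sampled at five wavelengths (one row per sample).
wavelengths = np.array([500.0, 550.0, 600.0, 650.0, 700.0])
spectra = np.array([
    [0.08, 0.10, 0.16, 0.18, 0.20],   # bowed upward around 600 nm
    [0.05, 0.06, 0.07, 0.08, 0.09],   # straight line, so no bow
])
doa = bow_curvature_difference(wavelengths, spectra)
```

For the first spectrum the index is 0.16 - (0.10 + 0.18)/2 = 0.02, and for the straight-line spectrum it is zero, so a flatter curve between 550 and 650 nm yields a smaller index.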

The accuracy of the regression model was evaluated based on the root mean square error ($RMSE$; modeling set $RMSE_C$ and prediction set $RMSE_P$), the coefficient of determination (modeling set $R_C^2$, prediction set $R_P^2$), and the residual prediction deviation ($RPD$):

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

$$RPD = \frac{SD}{RMSE_P}$$

In the equations, $\hat{y}_i$ and $y_i$ are the estimated and measured soil organic matter values for sample $i$, respectively; $n$ is the number of samples; $\bar{y}$ is the mean of the organic matter content in the sample set; and $SD$ is the standard deviation of the measured organic matter values in the prediction sample set.

For $RMSE$, a value closer to zero indicates a better fit of the model. For $R^2$, a value closer to 1 indicates a better fit of the model. For $RPD$, the larger the value, the higher the accuracy of the model. The modeling and predictive verification calculations were performed in Matlab R2013a.
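The three accuracy measures named above (root mean square error, coefficient of determination, and residual prediction deviation) can be sketched with their standard textbook definitions. The study's own calculations were done in Matlab R2013a, so this Python version is only an illustration; the use of the sample standard deviation (`ddof=1`) for RPD and the example values are assumptions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and estimated values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rpd(y_true, y_pred):
    """Residual prediction deviation: SD of measured values divided by RMSE."""
    sd = np.std(np.asarray(y_true, dtype=float), ddof=1)  # sample SD (assumed)
    return float(sd / rmse(y_true, y_pred))

# Made-up measured and estimated SOM values in g/kg, each estimate off by 1.
measured = np.array([10.0, 15.0, 20.0, 25.0])
estimated = np.array([11.0, 14.0, 21.0, 24.0])

example_rmse = rmse(measured, estimated)
example_r2 = r_squared(measured, estimated)
example_rpd = rpd(measured, estimated)
```

In this example every residual has magnitude 1, so the RMSE is 1; the total sum of squares is 125, giving a coefficient of determination of 1 - 4/125 = 0.968.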

2.5.2. Sample Set Division and Modeling Prediction

The accuracy of the model is expected to change with the sample set division. To minimize this effect and better analyze the stability of the model, the sample set was randomly divided multiple times. Each time, half of the samples were taken as the modeling set, and the remaining half were used as the prediction verification set. Modeling and predictive verification were carried out for each division, and the $RMSE$, $R^2$, and $RPD$ values were calculated for statistical analysis.
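The repeated random half-split procedure can be sketched as follows. The number of repetitions is not stated in the text, so `n_repeats` is an assumed parameter; the function name, the quadratic fit used in the demonstration, and the synthetic data are likewise hypothetical.

```python
import numpy as np

def repeated_half_split(x, y, fit, predict, n_repeats=100, seed=0):
    """Randomly split samples in half n_repeats times and score each split.

    Returns an array of shape (n_repeats, 3) holding the prediction-set
    RMSE, coefficient of determination, and RPD of each split.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(y)
    half = n // 2
    rows = []
    for _ in range(n_repeats):
        order = rng.permutation(n)
        cal, val = order[:half], order[half:]      # modeling / verification halves
        model = fit(x[cal], y[cal])
        resid = predict(model, x[val]) - y[val]
        rmse_p = np.sqrt(np.mean(resid ** 2))
        r2_p = 1.0 - np.sum(resid ** 2) / np.sum((y[val] - y[val].mean()) ** 2)
        rpd = np.std(y[val], ddof=1) / rmse_p
        rows.append((rmse_p, r2_p, rpd))
    return np.array(rows)

# Synthetic index/SOM pairs with a noisy decreasing trend (illustration only).
rng = np.random.default_rng(42)
doa = np.linspace(0.0, 0.05, 40)
som = 30.0 - 400.0 * doa + rng.normal(0.0, 0.5, size=doa.size)

stats = repeated_half_split(
    doa, som,
    fit=lambda xs, ys: np.polyfit(xs, ys, 2),      # quadratic model (assumed degree)
    predict=np.polyval,
    n_repeats=50,
)
summary = {
    "min": stats.min(axis=0),
    "max": stats.max(axis=0),
    "mean": stats.mean(axis=0),
}
```

Reporting the minimum, maximum, and mean over all splits, as in `summary`, shows how sensitive each measure is to the particular division of the sample set.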

3. Results and Analysis

3.1. Statistical Analysis of Soil Organic Matter Content

The average soil organic matter content in the study area was 16.67 g·kg⁻¹, while the values for NT-0 and NT-100 were 9.18 g·kg⁻¹ and 25.44 g·kg⁻¹, respectively (Table 1). The organic matter content by mass increased significantly from NT-0 to NT-100. The organic matter contents of CK, NT-33, and NT-67 were 16.67 g·kg⁻¹, 15.38 g·kg⁻¹, and 16.48 g·kg⁻¹, respectively. The dispersion of soil organic matter content in the cultivated layer of the study area indicated a relatively high degree of spatial variation, with a coefficient of variation of 80.68%.

3.2. Analysis of Nonlinear Regression Results of “Bow Difference”

The correlation coefficient between soil organic matter content and the “bow curvature difference” was -0.68, indicating an inverse relationship between them. With $R^2$ reaching 0.8247, polynomial functions fit the functional relationship between soil organic matter content and the “bow curvature difference” better than the linear function (Figure 4).
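The comparison between a linear and a polynomial fit can be illustrated with a short sketch. The degree of the polynomial is not stated in the text, so a quadratic is assumed here; the data are synthetic and are not meant to reproduce the reported values of -0.68 or 0.8247.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination of fitted values y_hat against y."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2))

# Synthetic data: SOM falls with the index along a gently curved trend.
rng = np.random.default_rng(0)
doa = np.linspace(0.0, 0.04, 30)
som = 30.0 - 900.0 * doa + 8000.0 * doa ** 2 + rng.normal(0.0, 0.8, doa.size)

# Pearson correlation between the index and SOM (negative for an inverse relation).
pearson_r = float(np.corrcoef(doa, som)[0, 1])

# Least-squares fits of degree 1 (linear) and degree 2 (assumed polynomial).
linear_coeffs = np.polyfit(doa, som, 1)
quadratic_coeffs = np.polyfit(doa, som, 2)

r2_linear = r_squared(som, np.polyval(linear_coeffs, doa))
r2_quadratic = r_squared(som, np.polyval(quadratic_coeffs, doa))
```

Because the linear model is nested in the quadratic one, the in-sample fit of the quadratic can never be worse; this is one reason the comparison in the study is also checked on independent prediction sets.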

Polynomial functions were used to establish predictive models on the randomly divided modeling sets, and the independent samples of the prediction sets were used to verify the results (Table 2). The modeling $R^2$ ranged over 0.76~0.87, with an average of 0.82. The prediction verification $R^2$ ranged over 0.78~0.89, with the average value of 0.71. The average value was 1.76. The prediction verification $R^2$ was greater than the modeling $R^2$, with a minimum value of 0.78, which indicates that the prediction results across the data set divisions were good, and most of the prediction and modeling $R^2$ values were greater than 0.5. The results showed that the organic matter content of the soil samples could be predicted fairly accurately using the spectral index “bow curvature difference” calculated from imaging spectral data.

3.3. Analysis of Spectral Characteristics of Soil Organic Matter

Soil organic matter plays a vital role in enhancing soil fertility. The soil spectral reflectance of CK and of each no-till test field with straw mulch (NT-33, NT-67, NT-100) was significantly lower than that of the no-till test field without mulch (NT-0). The spectral reflectance tended to decrease with increasing straw cover (Figure 5). Overall, the spectral reflectance was lowest for NT-100, followed by NT-67, NT-33, and NT-0, which is consistent with the nonlinear analysis results of the “bow curvature difference.” Among all the treatments, the spectral reflectance of NT-0 differed most from that of the other test fields; this field was not covered with straw during the experiment. Therefore, returning straw to the field has a certain impact on the formation of soil organic matter, as the contrast with NT-0 shows. The soil organic matter and the spectral index are inversely correlated. The content of soil organic matter increases with increasing straw cover, and the spectral reflectance decreases.

3.4. Analysis of Partial Least Squares Regression Results

The statistics of the partial least squares regression results for the 230-band spectral data, with SOM as the dependent variable (Table 3), were compared with the “bow curvature difference” fitting results (Table 2). The maximum value of for PLSR was greater than DOAR; the mean value was less than DOA. The maximum and mean values of and for PLSR were smaller than DOAR. Overall, the nonlinear fitting result of DOA was slightly better than that of PLSR. The spectral index DOA calculated from the spectral data uses only three bands, which are located in the bands most affected by SOM and effectively contain the information about the organic matter. The 230-band spectral data used by PLSR are also informative, but they contain noise that affects the accuracy of organic matter estimation, resulting in a lower PLSR estimation accuracy.
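To make the many-band regression concrete, the following is a minimal single-response PLS (PLS1) sketch using the NIPALS algorithm. It is not the Matlab implementation used in the study: the function names, the number of components, and the noise-free synthetic data (20 samples, 230 bands, three latent factors) are all assumptions chosen for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (coefficients per band, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # centered working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                          # weight: covariance of bands with y
        w /= np.linalg.norm(w)
        t = Xk @ w                             # sample scores
        tt = t @ t
        p = Xk.T @ t / tt                      # band loadings
        c = (yk @ t) / tt                      # regression of y on the scores
        Xk = Xk - np.outer(t, p)               # deflate X
        yk = yk - c * t                        # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # B = W (P^T W)^-1 q
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def pls1_predict(X, coef, intercept):
    """Predict the response from spectra with fitted PLS1 coefficients."""
    return np.asarray(X, dtype=float) @ coef + intercept

# Synthetic spectra driven by three latent factors (illustration only).
rng = np.random.default_rng(1)
n_samples, n_bands, n_latent = 20, 230, 3
scores = rng.normal(size=(n_samples, n_latent))
loadings = rng.normal(size=(n_latent, n_bands))
X = scores @ loadings
y = scores @ np.array([1.0, -2.0, 0.5]) + 5.0

coef, intercept = pls1_fit(X, y, n_components=n_latent)
y_hat = pls1_predict(X, coef, intercept)
max_abs_error = float(np.max(np.abs(y_hat - y)))
```

The example has fewer samples than bands, which ordinary least squares cannot handle directly. With noise-free data the three components recover the response almost exactly; adding band noise unrelated to the response would degrade the fit, which is the effect discussed for the 230-band PLSR model.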

4. Discussions

The predictive accuracy of the soil organic matter estimation model was measured using the root mean square error ($RMSE$) and the relative analysis error ($RPD$) of the prediction and modeling sets. The smaller the $RMSE$, the higher the accuracy of the model. A high $RPD$ indicates that the model has excellent predictive ability, an intermediate $RPD$ indicates that the model can roughly estimate the sample, and a low $RPD$ indicates that the model cannot predict the sample [23, 24]. Zhao et al. [11] found that with the increase of SOM content, the original spectral curve gradually flattened in the range of 550~650 nm, and the bow curvature difference gradually decreased. The SOM is negatively correlated with the bow curvature of the original spectrum. When the bow curvatures in the 600 nm and 800 nm bands were compared for predicting the SOM content of the study area, the model based on the 600 nm bow curvature was more accurate. In this study, the maximum value of $RPD$ in the DOAR model was 3.86, with an average value of 3.56. The maximum value of $RPD$ in the PLSR model was 1.68, with an average value of 1.55, which was less than that of the DOAR model. Therefore, it is concluded that the DOAR model based on the bow curvature can predict the content of SOM, and that its accuracy in predicting the soil organic matter content was better than that of the PLSR model [25, 26].

The polynomial function method can accurately model and predict the soil organic matter with better fitting results. In comparison, the accuracy of the partial least squares model was reduced because the spectral data contained some information that was unrelated to organic matter. The results of this study show that the accuracy of the polynomial function was better than that of partial least squares regression, which is consistent with previous studies [27, 28]. The results also show that there is a strong correlation between the spectral reflectance of the soil and the soil organic matter content [29, 30]. The study also determined a significant linear correlation between the “bow curvature difference” and the organic matter content. The coefficients of determination of the inversion for the modeling and prediction sets reached 0.8747 and 0.89, respectively. This supports the feasibility of estimating soil organic matter content using the hyperspectral characteristics of the soil [9]. Therefore, the spectral index “bow curvature difference” built using three bands of spectral information is better suited for organic matter mapping of hyperspectral data than the partial least squares model. The DOAR model used in this study shows great potential to improve the accuracy of SOM estimation by providing an index that is more sensitive to SOM.

The accuracy of the PLSR model is affected by the large amount of band information used as input, which also contains spectral information unrelated to organic matter and affects statistical trends. The results of this study are consistent with those of Shen et al. [18], but Zheng shows that the linear function fitting results are optimal. A comparison of the results of Shen et al. (2020) and Zhao et al. (2020) showed that the change of organic matter content in the sample has the greatest impact on the “bow curvature difference” [10]. The relationship between the two is considered to be linear. Zhao et al. (2020) and other studies have shown that the PLSR model results have higher prediction accuracy than the DOAR model. This difference may be due to factors such as the type of spectral data, the band range, and the range of SOM variation used in these two studies.

5. Conclusion

In this study, soil organic matter estimation models were established using the nonlinear regression of the “bow curvature difference” and partial least squares regression. Random sampling was used to divide the sample set, and modeling and predictive verification were carried out for each division. The effectiveness and stability of the SOM estimation methods were evaluated through accuracy comparison. The results showed that the organic matter content and the 600 nm “bow curvature difference” are inversely correlated, which is the mechanism behind applying the spectral index to SOM estimation. It is concluded that the “bow curvature difference” spectral index created using three bands of spectral information can be used for SOM mapping of indoor soil imaging spectral data. The spectra selected for the model were all around the 600 nm band. The SOM content is also highly correlated with spectral indices calculated in other spectral ranges, and the spectral feature selected in this study is limited to a single form. Therefore, later studies should construct or calculate a variety of spectral characteristic indices and compare the modeling effects of the different types.

The results show that straw cover with no-tillage has an impact on the organic matter content in the tillage layer. The average organic matter content in the study area was 16.67 g·kg⁻¹. The soil organic matter content increased by 16.26 g·kg⁻¹ from NT-0 to NT-100, and the organic matter content increased with increasing straw cover. Therefore, returning straw to the field is considered to increase the soil organic matter content by promoting the accumulation of organic matter at the soil surface. The organic matter content in the soil layers can be predicted quickly and effectively using the remote sensing inversion technique. The results of remote sensing inversion of soil organic matter in black soil areas can be used for monitoring soil degradation and arable land quality and for estimating the soil organic carbon pool. The study provides technical and data support, as well as improved research methods, for soil resource conservation and sustainable land use in the study area.

Data Availability

The data in this paper are field measurements and can be obtained from the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was funded by the Natural Science Foundation of Jilin Province (no. 20210101398JC).