Abstract

Excess heavy metals such as copper depress normal plant growth, and the harvest from such plants is harmful if it enters the food chain. Spectroscopy is regarded as an efficient noncontact method for detecting heavy metals in vegetation. This paper aims to retrieve the copper ion content of copper-stressed tobacco leaves from hyperspectral data by inverting a modified radiative transfer (RT) model. The dataset, comprising the reflectance spectra, biochemical components, and copper ion contamination of copper-treated leaves, was obtained from a laboratory experiment on copper-treated tobacco. A simultaneous inversion of multiple parameters was first conducted to explore the difficulties of estimating copper ion concentrations without considering the correlation between input parameters. This simultaneous inversion produced an unsatisfactory result, with a correlation coefficient (R) and root-mean-squared error (RMSE) of 0.12 and 0.021, respectively. Then, the sensitivity of the input parameters of the RT model was analyzed. Based on the sensitive bands and the RT model, a concrete procedure for a multiobjective and multistage decision-making method was defined to invert the copper ion content. The accuracy of the inversion results improved significantly, with R and RMSE values of 0.60 and 0.015, respectively. The proposed method takes the correlation among the model parameters into account. Additionally, the method promises to provide a theoretical basis and technical support for heavy metal monitoring using spectroscopy.

1. Introduction

In the past several decades, many resources have been directed toward monitoring vegetation pollution resulting from heavy metal mines [1]. In addition to the traditional geochemical methods [2, 3] that rely on field sampling and laboratory chemical analysis, hyperspectral remote-sensing techniques have received much attention in recent years [4, 5]. This noncontact measurement method is low-cost, effective, and precise and results in no secondary pollution [4]. However, present hyperspectral remote-sensing techniques for vegetation monitoring primarily adopt empirical and semiempirical statistical methods to calculate the heavy metal content. Such methods first use multiple linear stepwise regression analysis to select characteristic bands that are closely related to heavy metal content. They then establish regression equations between the spectral indices and heavy metal contamination to assess the heavy metal content [6–10]. Although this approach is simple and flexible, the regression models have little physical significance, and the model parameters need to be redefined for each data source.

In contrast, a radiative transfer (RT) model-based inversion method may overcome the disadvantages inherent in statistical methods and thus have broader application prospects, because such a method would fully consider the physical mechanisms between the remote-sensing signal and the vegetation leaves. Before conducting the inversion, an RT model that simulates the optical reflectance of heavy metal-polluted leaves by inputting leaf biochemical concentration parameters is needed. This paper adopts a modified PROSPECT model [11] published in our previous work, here called PROSPECTcu. In our previous work, the model was used to simulate copper-stressed plant reflectance and was verified against copper-stressed leaves with 46 groups of reflectance spectra and biochemical information about copper-treated wheat and lettuce [12].

When conducting the inversion process on this model, we found that the model suffers from several difficulties. For example, leaf reflectance between 400 nm and 800 nm is affected by the absorption of chlorophyll and copper ions, and from 800 nm to 2500 nm, leaf reflectance is influenced by the absorption of water, dry matter, and copper ions. Because absorption bands overlap for chlorophyll, water, dry matter, and copper ions, the results from the simultaneous inversion method will inevitably be influenced by the significant correlation between the biochemical components of leaves. Thus, further research is required to develop an approach that will relieve the dependencies and consider the choices of the characteristic bands.

This paper aims to propose a method that overcomes the limitations of the regression method by inverting the PROSPECTcu model. To achieve this aim, we used a dataset of the reflectance spectra, biochemical components, and copper ion contamination of leaves from copper-stressed tobacco. The proposed method identifies the sensitive wavelength range of the PROSPECTcu model, and a multiobjective and multistage decision inversion strategy is used to estimate the copper ion content.

2. Data

Copper is an element indispensable to plant growth, but at excessive concentrations, it damages normal plant growth. Generally, the concentration of Cu is 35 μg/g in clean soil, 100 μg/g in slightly polluted soil, and 400 μg/g in severely polluted soil according to the China environmental standard [13]. Following this standard, copper-polluted tobacco was cultivated in copper-stressed soil in the laboratory of Beijing Normal University from August 2013 to October 2013. The experiment yielded a dataset containing the reflectance spectra and biochemical components of the leaves at different copper-stress levels.

Rustic tobacco was selected for the laboratory experiments on copper treatment. The seeds were sown in plastic pots, each containing approximately 400 g soil. To simulate heavy metal copper contamination, the soil was treated with copper sulfate solutions. Six treatments were designed, including a bare control group and five polluted groups of 100, 200, 300, 400, and 500 mg/kg, and each treatment was tested with three replicates. For this experiment, data were collected regularly, and the specific process of data collection was as follows. First, we collected the reflectance spectra of the leaves. Next, we measured leaf chlorophyll, water, dry matter, and copper contamination using the laboratory method.

The reflectance spectra were measured with an SVC HR-1024 field spectral radiometer (Spectra Vista Corporation, NY, USA), which has a wavelength range of 350 nm to 2500 nm. The machine’s spectral resolution is less than 3.5 nm from 350 nm to 1000 nm, less than 8.5 nm from 1000 nm to 1850 nm, and less than 6.5 nm from 1850 nm to 2500 nm. The spectrometer is equipped with a blade-clamping device (Unit 1539 Plant Probe, Model A122317). We selected three representative leaves from each plant to measure.

Through chemical analyses, we obtained the chlorophyll, water, dry matter, and copper ion contents of the leaves. Chlorophyll is not soluble in water but is soluble in organic solvent. Therefore, chlorophyll was extracted from the copper-treated leaves using 80% acetone. To measure the dry matter and water contents, the harvested leaves were heated at 105°C for 30 min and subsequently dried at 70°C in a furnace to constant weight. The leaf samples (approximately 0.5 g) were digested with concentrated nitric acid and hydrochloric acid. Next, the digested solution was filtered, and the copper concentration was analyzed using atomic absorption spectrophotometry.

3. Method

3.1. Overview of the RT Model and the Inversion Strategy

Although establishing the basis for the copper-stress RT model is outside this paper’s scope, we briefly review the profile of the RT model PROSPECTcu as a theoretical foundation for copper ion contamination inversion. A more detailed description of the proposed model can be found in the literature [8].

The development of the PROSPECTcu model is based on the classic broadleaf radiative transfer model, PROSPECT [11]. Compared with the PROSPECT model, the new model primarily adds the specific absorption coefficient of copper ions. The scattering process is described using a refractive index ($n$) and a leaf structure parameter ($N$). The absorption is modeled with pigment concentration, water content, dry matter content, copper ion contamination, and the corresponding specific spectral absorption coefficients ($K_{ab}$, $K_{w}$, $K_{m}$, and $K_{cu}$). This RT model is used to simulate the reflectance spectra of copper-stressed leaves from 400 nm to 2500 nm and can be further used to determine the concentrations of biochemical components and heavy metals in leaves.
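As an illustration of how the extra absorber enters the model, the original PROSPECT formulation combines the constituents into a single absorption coefficient per elementary layer. Assuming PROSPECTcu simply appends a copper term to that sum (an assumption made here for illustration; the symbols $C_{ab}$, $C_{w}$, $C_{m}$, and $C_{cu}$ for the four contents are also introduced here), the layer absorption coefficient would read:

```latex
k(\lambda) = \frac{C_{ab}\,K_{ab}(\lambda) + C_{w}\,K_{w}(\lambda) + C_{m}\,K_{m}(\lambda) + C_{cu}\,K_{cu}(\lambda)}{N}
```

Because the four terms are summed before the plate model is evaluated, absorbers whose coefficients overlap in the same bands can compensate for one another, which is the source of the parameter correlation discussed in the following sections.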

In our proposed inversion strategy, a simultaneous inversion test for multiple parameters is first conducted without considering the correlation between the model input parameters. Then, the sensitivity of the input parameters of the radiative transfer model for copper-stressed plants is analyzed using an uncertainty and sensitivity matrix. In the selected sensitive bands, we adopt a multiobjective and multistage decision method to calculate the copper ion contamination in leaves.

3.2. Simultaneous Inversion Method

When leaf biochemical component contents are retrieved by inverting an RT model, the process usually follows these steps. First, a cost function that reflects the deviation between the simulated and measured reflectance is constructed. Next, the optimal model solutions are calculated using the least-squares method [14]. The optimal solutions are the model parameter values at which the cost function is minimal. These parameters are then taken as the inverted values of the leaf biochemical component contents.

To illustrate the difficulties in retrieving leaf physicochemical variables simultaneously, we adopted a simultaneous inversion method to calculate the chlorophyll, water, dry matter, and copper ion contents from 400 nm to 2500 nm without considering the choice of characteristic bands or the dependencies between leaf biochemical components. In the PROSPECTcu model, in addition to the four biochemical parameters, two other parameters exist: the leaf structure parameter ($N$) and the refractive index ($n$). When proper wavelengths are selected, the inversion result of $N$ is hardly changed by an increase or decrease in the refractive index value [15]. We chose three wavelengths corresponding to the three maximum reflectance peaks from 800 nm to 1200 nm to obtain the parameter $N$. The cost function is expressed as follows:

$$J(N) = \sum_{i=1}^{3} \left[ R_{\mathrm{mes}}(\lambda_i) - R_{\mathrm{mod}}(\lambda_i, N) \right]^2 \quad (1)$$

where $R_{\mathrm{mes}}(\lambda_i)$ is the measured reflectance of tobacco leaves at $\lambda_i$, and $R_{\mathrm{mod}}(\lambda_i, N)$ is the modeled reflectance of tobacco leaves at $\lambda_i$.

After $N$ and the refractive index are determined, we investigate the pigments and copper ions from 400 nm to 800 nm and water, dry matter, and copper ions from 800 nm to 2500 nm. The cost function can be calculated as follows:

$$J(C) = \sum_{\lambda_i} \left[ R_{\mathrm{mes}}(\lambda_i) - R_{\mathrm{mod}}(\lambda_i, C) \right]^2 \quad (2)$$

where $R_{\mathrm{mes}}(\lambda_i)$ is the measured reflectance of tobacco leaves at $\lambda_i$, $R_{\mathrm{mod}}(\lambda_i, C)$ is the modeled reflectance of tobacco leaves at $\lambda_i$, and $C$ is the corresponding content of leaf biochemical components. During this stage, the chlorophyll content, water content, dry matter content, and copper ion content are treated as free variables, and the other parameters, that is, the structure parameter ($N$) and the refractive index ($n$), are held at fixed values.
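The least-squares inversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PROSPECTcu implementation: `toy_reflectance` is a hypothetical exponential-absorption stand-in for the RT model, the absorption coefficients are made-up numbers in arbitrary units, and a coarse grid search replaces whatever optimizer was actually used.

```python
import math
from itertools import product

# Hypothetical specific absorption coefficients at three bands (made-up values).
K_WATER = [0.2, 1.5, 3.0]
K_COPPER = [0.5, 0.1, 2.0]


def toy_reflectance(water, copper):
    """Toy forward model: reflectance decays exponentially with total absorption."""
    return [0.5 * math.exp(-(kw * water + kc * copper))
            for kw, kc in zip(K_WATER, K_COPPER)]


def cost(measured, modeled):
    """Sum of squared differences between measured and modeled reflectance."""
    return sum((m - s) ** 2 for m, s in zip(measured, modeled))


def invert(measured, water_grid, copper_grid):
    """Return the (water, copper) grid point that minimizes the cost function."""
    return min(product(water_grid, copper_grid),
               key=lambda p: cost(measured, toy_reflectance(*p)))


# Candidate values 0.00, 0.05, ..., 1.00 for both free variables (arbitrary units).
water_grid = [i * 0.05 for i in range(21)]
copper_grid = [i * 0.05 for i in range(21)]

# Synthetic "measurement" generated from known contents.
measured = toy_reflectance(0.40, 0.25)
best_water, best_copper = invert(measured, water_grid, copper_grid)
```

With noise-free synthetic data the grid search recovers the contents used to generate the spectrum. With real spectra and overlapping absorption features, several parameter combinations can give nearly the same cost, which is the difficulty this section sets out to demonstrate.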

3.3. The Multiobjective and Multistage Decision Strategy

The input parameters of the RT model for copper-polluted leaves are correlated to some degree, and the parameters are difficult to estimate without fixing at least one of them [16]. Therefore, we must distinguish the sensitivities of the different free variables and use the prioritized sensitivities to conduct the calculation in a stepwise fashion. We adopt a multiobjective and multistage decision strategy to estimate the copper ion content. The specific inversion process and the related data workflow are shown in Figure 1.

The strategy in Figure 1 starts with the laboratory experiment on copper-stressed tobacco. The experiment dataset includes the reflectance spectra, chlorophyll content, water content, dry matter content, and copper ion content of copper-treated leaves. The strategy first conducts a sensitivity analysis of the model parameters; then, it selects the characteristic bands for copper ion inversion based on the results. For characteristic bands located in the wavelength range from 800 nm to 2500 nm, the water and dry matter contents are calculated using the reflectance in those bands. Similarly, the chlorophyll content is estimated from the reflectance data in the spectral range from 400 nm to 800 nm. The water and dry matter contents are determined using the method shown in the blue dashed box. After the above parameters have been determined, the copper ion content is inverted from its sensitive wavelength bands. The detailed flow and implementation of the copper ion content inversion are shown in the green dashed box. All of the cost functions in the inversion procedure have a form similar to (1) but with different free variables and band numbers.

3.3.1. Sensitivity Analysis of the Model Parameters

The sensitivity of different wavelengths towards chlorophyll, water, dry matter, and copper ion is analyzed using the uncertainty and sensitivity matrix (USM) method proposed by Li et al. in 1997 [17]. This method was originally used to describe the sensitivity of the parameters of the bidirectional reflectance distribution function (BRDF) model in each sampling direction.

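The USM is defined as follows:

$$\mathrm{USM}(p, \lambda) = \frac{R_{\max}(p, \lambda) - R_{\min}(p, \lambda)}{R_{\mathrm{avg}}(\lambda)} \quad (3)$$

where $\mathrm{USM}(p, \lambda)$ is the USM value of the model parameter $p$ at wavelength $\lambda$, $R_{\max}(p, \lambda)$ and $R_{\min}(p, \lambda)$ are the maximum and minimum values, respectively, of the modeled reflectance as a function of parameter $p$ in the uncertainty range, and $R_{\mathrm{avg}}(\lambda)$ is the modeled reflectance when all model parameters are fixed at the statistical average values for each uncertainty range.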

The lower and upper bounds of the uncertainty range for the model parameters are the minimum and maximum measured values, respectively, obtained in laboratory experiments on copper-treated wheat, lettuce, and tobacco. The uncertainty range and the corresponding averaged value of the model parameters are shown in Table 1.
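The USM calculation can be sketched as follows, assuming the ratio form (maximum minus minimum modeled reflectance, divided by the reflectance at the average parameter values). Everything here is illustrative: `toy_reflectance` is a hypothetical stand-in for PROSPECTcu, the copper range and all absorption coefficients are made-up, and the midpoint of each range is used in place of the averaged values of Table 1.

```python
import math

# Uncertainty ranges (min, max). The water range follows the text; the copper
# range is a made-up placeholder.
RANGES = {"water": (0.004, 0.030), "copper": (0.0, 0.2)}
# Made-up specific absorption coefficients at two bands.
K = {"water": [10.0, 40.0], "copper": [1.0, 8.0]}


def toy_reflectance(params, band):
    """Toy forward model: exponential decay with the summed absorption."""
    absorption = sum(K[name][band] * value for name, value in params.items())
    return 0.5 * math.exp(-absorption)


def usm(name, band, steps=20):
    """(R_max - R_min) / R_avg for one parameter at one band."""
    # Assumption: range midpoints stand in for the statistical averages.
    mean_params = {p: (lo + hi) / 2 for p, (lo, hi) in RANGES.items()}
    r_avg = toy_reflectance(mean_params, band)
    lo, hi = RANGES[name]
    values = []
    for i in range(steps + 1):
        params = dict(mean_params)          # all other parameters stay at the mean
        params[name] = lo + (hi - lo) * i / steps
        values.append(toy_reflectance(params, band))
    return (max(values) - min(values)) / r_avg


# Sensitivity of each band to each parameter, as a small matrix.
usm_matrix = {name: [usm(name, band) for band in range(2)] for name in RANGES}
```

A band is a good candidate for inverting a given parameter when its USM value for that parameter is high and its USM values for the other parameters are low.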

3.3.2. Assessment of the Impact of Water and Dry Matter

As shown in Figure 1, when characteristic bands are located in the wavelength range of 800 nm to 2500 nm, we need to assess the influence of water and dry matter contents on the inversion values of copper ion content.

When evaluating the influence of water content on the inversion values of copper ions, the dry matter content is fixed at the corresponding measured values. The values of water content are in the uncertainty range (between 0.004 g·cm−2 and 0.030 g·cm−2), and the sample interval is 0.005 g·cm−2. That is, the water content values are 0.005, 0.010, 0.015, 0.020, 0.025, and 0.030 g·cm−2. For copper ion content, we then acquired six groups of corresponding inversion values with different water contents based on the RT model in the sensitive bands. We compared the inversion values with the measured values of copper ion contents and analyzed the relative error (RE) between them. The RE values were used to estimate the influence of water content on the inversion values for copper ions. The formula used to compute the RE for sample $i$ is as follows:

$$RE_i = \frac{C_{\max,i} - C_{\min,i}}{C_{\mathrm{mes},i}} \times 100\% \quad (4)$$

where $RE_i$ is the relative error of sample $i$, $C_{\max,i}$ and $C_{\min,i}$ are the maximum and minimum inversion values, respectively, of the copper ion content for sample $i$ when the water content changes among the six values described above, and $C_{\mathrm{mes},i}$ is the measured value of the copper ion content for sample $i$.

The method used to estimate the influence of the dry matter content on the inversion values of copper ion is similar to that used for water content. The values of dry matter content are in the uncertainty range (from 0.001 g·cm−2 to 0.004 g·cm−2), and the sample interval is 0.0005 g·cm−2. That is, the values of dry matter contents are 0.0010, 0.0015, 0.0020, 0.0025, 0.0030, 0.0035, and 0.0040 g·cm−2. We also used the RE indicator to evaluate the influence of dry matter content on the inversion values of copper ion contents. The formula used to compute the RE for sample is the same as in (4).
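The sampling grids and the RE indicator described above can be sketched as follows. The helper names are hypothetical, and the RE is assumed to be the spread of the inverted values divided by the measured value, expressed in percent.

```python
def sample_values(start, stop, step):
    """Evenly spaced values from start to stop inclusive, rounded to curb float drift."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(count)]


def relative_error(inverted, measured):
    """Spread of the inverted values relative to the measured value, in percent."""
    return (max(inverted) - min(inverted)) / measured * 100.0


# Water and dry matter levels (g/cm^2) at which the copper inversion is repeated.
water_levels = sample_values(0.005, 0.030, 0.005)
dry_matter_levels = sample_values(0.0010, 0.0040, 0.0005)
```

For each sample, the copper ion content would be inverted once per level, and `relative_error` would then be applied to the resulting list of inverted values together with the measured copper content of that sample.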

3.3.3. Determination of Water Content

When calculating the water content of leaves, the normalized difference water index (NDWI) is the most widely used vegetation index. The NDWI is derived from the normalized difference vegetation index by Gao in 1996 [18]. The critical problem in using the NDWI is the choice of optimum wavelength bands. In Figure 2, we plot the leaf reflectance and the specific absorption coefficient of water. Water has two weak absorption peaks in the region from 850 nm to 1300 nm. Furthermore, the index is more sensitive to water content when it uses the reflectance from the water absorption peak and other high-reflectance plateaus [18]. Therefore, the wavelength range used to calculate the vegetation index is set to 850 nm to 1300 nm.

To determine the most effective bands for the NDWI from 850 nm to 1300 nm, we use a randomized sampling method on simulation data. First, 1000 random sampling points of the above model parameters were generated from a uniform distribution over the uncertainty range. Next, we simulated the leaf reflectance based on the PROSPECTcu model using these 1000 parameter sets. The correlation coefficients ($R$) between all possible two-band normalized vegetation indices and water contents were determined from 850 nm to 1300 nm using the modeled data. Among these vegetation indices, we selected the one with the maximum $R$. Then, we fitted a statistical model between the selected vegetation index and the water content. Finally, the model was applied to the copper-stressed tobacco data. We calculated the vegetation index using the following equation:

$$VI(i, j) = \frac{R_i - R_j}{R_i + R_j} \quad (5)$$

where $VI(i, j)$ is the vegetation index calculated with the reflectance $R_i$ at band $i$ and the reflectance $R_j$ at band $j$.
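The band-pair search described above can be sketched as follows. This is an illustration on synthetic data, not the actual procedure: the four-band exponential model, the second absorber `other`, and all coefficients are made-up stand-ins for PROSPECTcu simulations, and the absolute value of the correlation is used to rank band pairs.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def normalized_index(r_i, r_j):
    """Two-band normalized difference index."""
    return (r_i - r_j) / (r_i + r_j)


def best_band_pair(spectra, water):
    """Return (|R|, i, j) for the band pair whose index correlates best with water."""
    best = (0.0, None, None)
    n_bands = len(spectra[0])
    for i in range(n_bands):
        for j in range(i + 1, n_bands):
            index = [normalized_index(s[i], s[j]) for s in spectra]
            r = abs(pearson_r(index, water))
            if r > best[0]:
                best = (r, i, j)
    return best


# Synthetic simulation set: 1000 uniformly sampled parameter combinations.
random.seed(0)
K_WATER = [1.0, 5.0, 30.0, 2.0]   # made-up water absorption at four bands
K_OTHER = [3.0, 0.5, 0.5, 4.0]    # made-up absorption of a second constituent
water = [random.uniform(0.004, 0.030) for _ in range(1000)]
other = [random.uniform(0.0, 0.2) for _ in range(1000)]
spectra = [[0.5 * math.exp(-(kw * w + ko * o)) for kw, ko in zip(K_WATER, K_OTHER)]
           for w, o in zip(water, other)]

best_r, band_i, band_j = best_band_pair(spectra, water)
```

In this toy setup, bands 1 and 2 respond differently to water but identically to the second constituent, so their normalized index isolates the water signal and the search selects that pair.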

3.3.4. Inversion of Copper Ion Content

For the variables that can significantly affect the inversion results of the copper ion content, we conducted the direct inversion with prior knowledge of those variables. We used the calculated average value ($\mu$) and the standard deviation ($\sigma$) of the parameter to update the uncertainty region, defined as being between $\mu - \sigma$ and $\mu + \sigma$. Then, the value of the variable can be updated by further inverting the PROSPECTcu model using the inverted values of copper ion content in the updated uncertainty region. This process is repeated until the correlation coefficient between the present and former inversion results for copper ions is close to 1. Here, we define a threshold of 0.97. The optimum calculated values of copper ion content are the final inversion results. The cost function used to determine copper ion contamination and parameter content is defined as follows:

$$J(C) = \sum_{\lambda_i} \left[ R_{\mathrm{mes}}(\lambda_i) - R_{\mathrm{mod}}(\lambda_i, C) \right]^2 \quad (6)$$

where $R_{\mathrm{mes}}(\lambda_i)$ is the measured reflectance value at band $\lambda_i$, $R_{\mathrm{mod}}(\lambda_i, C)$ is the modeled reflectance value at band $\lambda_i$, and $C$ is the copper ion content or the parameter content. The wavelength range from 1896 nm to 1973 nm was selected for reasons that will be described in Section 4.2.
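One possible reading of this iterative procedure is sketched below. It is an illustration under several assumptions, not the actual implementation: a two-band model that is linear in both contents replaces PROSPECTcu so that each one-variable least-squares fit has a closed form, the uncertainty region is taken as the mean plus or minus one standard deviation of the current water estimates across samples, and all coefficients and sample values are made-up.

```python
import math
from statistics import mean, pstdev

# Made-up absorption coefficients of a two-band linear toy model.
K_WATER = [30.0, 10.0]
K_COPPER = [1.0, 4.0]


def forward(water, copper):
    """Toy absorbance at each band, linear in both contents."""
    return [kw * water + kc * copper for kw, kc in zip(K_WATER, K_COPPER)]


def fit_one(y, k_free, k_fixed, fixed_value, bounds=None):
    """Least-squares value of one content with the other held fixed."""
    num = sum(kf * (yb - kx * fixed_value)
              for yb, kf, kx in zip(y, k_free, k_fixed))
    value = num / sum(kf * kf for kf in k_free)
    if bounds is not None:
        value = min(max(value, bounds[0]), bounds[1])  # stay inside the region
    return value


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / math.sqrt(sxx * syy)


def iterative_inversion(spectra, water, threshold=0.97, max_iter=20):
    """Alternate copper and water updates until successive copper results agree."""
    copper = [fit_one(y, K_COPPER, K_WATER, w) for y, w in zip(spectra, water)]
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        mu, sigma = mean(water), pstdev(water)
        bounds = (mu - sigma, mu + sigma)              # updated uncertainty region
        water = [fit_one(y, K_WATER, K_COPPER, c, bounds)
                 for y, c in zip(spectra, copper)]
        previous = copper
        copper = [fit_one(y, K_COPPER, K_WATER, w) for y, w in zip(spectra, water)]
        if pearson_r(previous, copper) >= threshold:
            break
    return copper, water, n_iter


# Synthetic samples: identical true water, varying true copper (arbitrary units).
true_copper = [0.05, 0.10, 0.15, 0.20]
spectra = [forward(0.020, c) for c in true_copper]
initial_water = [0.018, 0.022, 0.019, 0.021]   # e.g. imperfect index-based estimates
copper, water, n_iter = iterative_inversion(spectra, initial_water)
```

In this toy case, each pass shrinks the copper error inherited from the imperfect initial water estimates. The real model is nonlinear, so the one-variable fits would be numerical minimizations of the cost function rather than closed-form expressions.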

4. Results

4.1. Results of Simultaneous Inversion

In Figure 3, we compare the measured leaf biochemical component values and the values inverted using the simultaneous method. The inverted results of the leaf biochemical components deviate substantially from the measured values. Unsurprisingly, the inverted results have a low accuracy; that is, they exhibit high root-mean-squared errors (RMSEs) and low correlation coefficients ($R$). The low accuracy can be partly explained by the following factors. First, leaf absorption between 400 nm and 800 nm is affected by both chlorophyll and copper ions. Thus, the inversion results for chlorophyll and copper ions are affected by their mutual interaction. As shown in Figure 3(a), 80% of the inverted values of the copper ion content are greater than the measured results, and 80% of the calculated values of chlorophyll content are less than the measured results in Figure 3(d). Second, the leaf absorption from 800 nm to 2500 nm is primarily caused by water, dry matter, and copper ions. Therefore, the calculation results for these three variables are directly influenced by the correlations between them, leading to large biases between the inverted and true values. All of the inverted values for water are less than the measured values, and all of the inverted values of dry matter are greater than the measured values (Figures 3(b) and 3(c)).

This unreasonable result can be further explained using the spectral absorption features of the above leaf components, as shown in Figure 4. Absorption between 400 nm and 800 nm is affected by chlorophyll and copper ions. Absorption from 800 nm to 2500 nm is mainly affected by water, dry matter, and copper ions. The strongest absorption band of copper ions occurs at a wavelength of approximately 2000 nm. Because the absorption bands of chlorophyll, water, dry matter, and copper ions overlap, the simultaneous inversion method can hardly separate the contributions of these leaf biochemical components to the leaf's reflectance. As a result, to produce accurate results, the sensitive bands and the correlations between the model parameters should be considered before estimating the desired copper ion variable.

4.2. Results of Sensitivity Analysis

The results of the uncertainty and sensitivity matrix are shown in Figure 5. In the wavelength range from 400 nm to 800 nm, leaf reflectance is much more sensitive to copper ion content than to chlorophyll in the copper stress-treated plant samples. Furthermore, over the full wavelength range, the sensitivity to copper ions peaks at approximately 2000 nm. From 800 nm to 2500 nm, dry matter has a very low sensitivity, while copper ions and water maintain relatively high sensitivities, with that of copper higher than that of water.

Based on the sensitivity of the RT model parameters, reflectance data on the sensitivity wavelength range (from 1896 nm to 1973 nm) of copper ions are selected to retrieve the copper ion content. This range is shown using two red dotted lines in Figure 5. In this sensitivity wavelength range, copper ions have the highest sensitivity, dry matter has the lowest, and water exhibits a small sensitivity valley. In the overlapped wavelength region, the water and dry matter contents should be determined first, as shown in the blue dashed box in Figure 1 (Section 3.3).

4.3. Results of the Impact of Water and Dry Matter

The results of the inverted copper ions at different water contents are shown in Figure 6. All of the measured values of water contents are approximately 0.02 g·cm−2. From Figure 6, it can be seen that as the water contents approach the measured values, the inverted values of copper ion contents are closer to the measured values. When the water contents increased from 0.005 g·cm−2 to 0.03 g·cm−2, the inverted results for copper ions decreased by approximately 0.12 μg·cm−2.

We further compare the measured values of copper ion contents and the inverted values at different dry matter contents (see Figure 7). When the dry matter contents increased from 0.001 g·cm−2 to 0.004 g·cm−2, the inverted values of copper ion contents are approximately constant, decreasing by only approximately 0.0037 μg·cm−2.

To illustrate the bias caused by the two variables of water and dry matter, the relative errors of estimated copper ion are shown in Figure 8. The relative errors caused by water are distributed in the 150%–300% region, and the relative errors caused by dry matter are no more than 10%.

All of the data show that the inversion accuracy of the copper ion content is significantly affected by water and much less influenced by dry matter. Therefore, the water content must be determined to accurately estimate the copper ion content. Meanwhile, the less sensitive parameter, dry matter, can be set to a constant value of 0.003 g·cm−2, which is the average value of dry matter in the uncertainty range.

4.4. Estimation Results for Water Content

To identify a feasible pair of spectral bands for the water content, we plotted the correlation coefficients between the vegetation index and water content for the different band combinations as a color image in Figure 9. The correlation coefficients obtained with reflectance below 1100 nm are very low, all less than 0.5. Within the 1100–1300 nm range, higher correlation coefficients occur, with a maximum of 0.994, and the reflectance values at 1149 nm and 1223 nm were therefore selected.

The regression model of the NDWI and water content is shown in Figure 10. A significant linear relationship occurs between water content and the selected vegetation index. The linear regression model is described as follows:

$$C_w = a \cdot \mathrm{NDWI} + b \quad (7)$$

where $C_w$ is the water content, NDWI is the selected vegetation index, and $a$ and $b$ are the fitted regression coefficients (Figure 10).
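Fitting such a linear model by ordinary least squares can be sketched as follows. The function names and the four data points are hypothetical and are not taken from Figure 10.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_water(ndwi, slope, intercept):
    """Apply the fitted linear model to an index value."""
    return slope * ndwi + intercept


# Made-up simulated pairs of index value and water content (g/cm^2).
ndwi_sim = [0.01, 0.02, 0.03, 0.04]
water_sim = [0.010, 0.015, 0.020, 0.025]
slope, intercept = fit_line(ndwi_sim, water_sim)
```

The coefficients fitted on simulated spectra would then be applied to the index computed from each measured tobacco spectrum through `predict_water`.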

Based on the above analysis, we applied the statistical model to obtain the water content, and the result is shown in Figure 11. The scatter plot of Figure 11(a) indicates that the estimated values of water contents are uniformly distributed on both sides of the 1 : 1 line. The correlation coefficient and RMSE between these data are 0.36 and 0.0028 g·cm−2, respectively. The low correlation coefficient is mainly due to the distribution of water content, which is relatively concentrated and is not sufficiently varied in scope. For example, the average value and RMSE of water contents are 0.021 g·cm−2 and 0.00088 g·cm−2, respectively. The relative errors between the estimated and the measured values of the water contents are shown in Figure 11(b). All of the relative errors are lower than 25%, and approximately 60% of the relative errors are lower than 15%. Therefore, the calculated values of the water contents meet the requirement of improved accuracy for estimating copper ions.
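The accuracy indicators used in this comparison can be sketched as follows. The per-sample relative error is assumed here to be the absolute difference between the estimated and measured values divided by the measured value; the function names are hypothetical.

```python
import math


def rmse(estimated, measured):
    """Root-mean-squared error between two equally long sequences."""
    n = len(measured)
    return math.sqrt(sum((e - m) ** 2 for e, m in zip(estimated, measured)) / n)


def relative_errors(estimated, measured):
    """Per-sample absolute relative error in percent."""
    return [abs(e - m) / m * 100.0 for e, m in zip(estimated, measured)]


def fraction_below(errors, limit):
    """Share of samples whose error is below the given limit."""
    return sum(1 for e in errors if e < limit) / len(errors)
```

Calling `fraction_below(relative_errors(estimated, measured), 15)` would give the share of samples with a relative error under 15%, which is the kind of figure quoted from Figure 11(b).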

4.5. Results for Copper Ion Content

The inverted result of the copper ion content is described using the RMSE and RE, which are plotted in Figure 12. From Figure 12(a), it can be seen that the RMSE of copper ion content decreases from 0.145 μg·cm−2 to 0.0087 μg·cm−2 as the inversion iteration number increases. This result indicates that the inversion accuracy of copper ion content can be improved by 90% using the proposed method. According to the histograms for water contents, for most of the samples, the relative error of water content decreased from 20% (blue bar for relative error 1) to approximately 10% (orange bar for relative error 6) as the inversion iteration number increased. Thus, the relative errors of water contents decreased, and the inversion accuracy improved. An exception was observed for sample 5, for which the water content obtained from the NDWI relationship was already highly accurate (RE less than 2%). Therefore, this sample showed little apparent improvement over the iterations of the RT model inversion.

The final results of the measured and inverted values of copper ion contents are shown in Figure 13. From the scatter plot shown in Figure 13(a), it can be seen that the estimated values of copper ion contents are uniformly distributed on both sides of the 1 : 1 line and that the correlation coefficient and RMSE are 0.87 and 0.0087 μg·cm−2, respectively. To show how a proper water content improves the accuracy of the copper ion estimate, we plot in Figure 13(b) the original results obtained with an improper water content. The original estimate has a lower correlation coefficient of 0.56 and a higher RMSE of 0.0145 μg·cm−2. As a result, the copper ion content calculated using the multiobjective and multistage decision method (Figure 13(a)) is more accurate than both the estimate obtained with an improper water content (Figure 13(b)) and the result of the simultaneous inversion method (Figure 3(a)).

5. Conclusions and Discussion

Based on a laboratory experiment on copper-stressed tobacco, we collected a dataset of the leaf reflectance spectra, biochemical components, and copper ion contamination of copper-treated leaves. The input parameter sensitivity of the PROSPECTcu model for copper-stressed plants was subsequently analyzed using the uncertainty and sensitivity matrix. Then, the sensitive wavelength range for copper ion content inversion was selected as 1896 nm to 1973 nm. In this range, the sensitivity of copper ion content is the highest and that of dry matter is the lowest. At the same time, the sensitivity of water exhibits a small valley. Based on these sensitive bands, we defined the concrete procedure of the multiobjective and multistage decision strategy for estimating copper ion content in the sampled leaves. The correlation coefficient and RMSE of the copper ion content calculated using the above method are 0.87 and 0.0087, respectively. The accuracy of the copper ion content calculated using the multiobjective and multistage decision method is higher than that obtained with the simultaneous inversion method, which cannot properly account for water content.

This paper successfully estimates the copper ion content in leaves using the PROSPECTcu model. Further, the developed model of copper-stressed leaves is shown to be suitable for inverting leaf biochemical properties when an appropriate inversion strategy is designed. The poor performance of the simultaneous inversion method suggests that prior knowledge of the model parameters can improve the estimation of leaf copper ion content. Thus, the sensitivity of the model parameters must be fully considered.

Due to the limitations of the experiment conditions, the dataset obtained from the laboratory experiment on copper-stressed tobacco is limited to one type of tobacco plant. Although to a certain extent this limitation affects the inversion accuracy of the copper ion content, it remains feasible to retrieve the copper ion content from an RT model, which is very different from the traditional regression method. We believe that these preliminary research results provide the methodology with a theoretical base and a technical support for heavy metal monitoring using hyperspectral data. However, we are aware that further validation is needed regarding the proposed method’s accuracy using more data for other vegetation types.

Conflicts of Interest

The authors declare that they have no competing interests.

Acknowledgments

This work was supported in part by the Natural Science Foundation of China (41671333) and the National Science Foundation Project (2014FY210800-3).