Abstract

Both real total and aqua regia contents of trace elements in soils are often denoted by the same word “total”, though the results are not identical. Formulas for recalculation of aqua regia contents to real totals would therefore be helpful. Data for the primary formulas were taken from the International Soil-Analytical Exchange Program of Wageningen Evaluating Programs for Analytical Laboratories. The degree of extractability DE of an element in a sample was calculated by weighting the ratios of median aqua regia contents to median real total contents in different periods, with weights proportional to the respective numbers of determinations. According to descending median values of DE (%) in ISE European soil samples the elements are arranged as follows: Hg(98), Cd(94), Cu(91), Zn(90), Mn(89), Ni(88), Co(86), As(82), Mo(82), Pb(79), V(61), Sn(57), Cr(56), Sb(53), Be(51), B(46), U(35), and Ba(21). These values can be used for approximate recalculation of aqua regia contents to real totals and for adjustment of contamination assessment. An attempt to obtain more explicit predictions by fitting regression models is discussed, together with problems related to high leverage and possibly influential points and their possible relation to the specificity of soil composition.

1. Introduction

Classification and choice of analytical methods for geochemical investigations depend on the field of application. According to the completeness of extraction, the analytical methods for determination of trace element contents in soils or sediments can be subdivided into two main groups: (1) methods for real total contents (totals) and (2) methods for partial contents [1].

Aqua regia digestion [2] is widely used for determination of the contents of harmful chemical elements in soil when solving environmental tasks. Unfortunately, an unsupported tradition of denoting aqua regia contents as “total contents” has become established. This denomination is sometimes still used, despite the following statement in the description of the standard aqua regia procedure used for soil improvers [3]: “Elements, extractable in aqua regia cannot be described as “totals.”” Denomination of aqua regia extraction results as “total contents” is especially widespread when aqua regia contents are compared with contents obtained by extraction procedures which do not use strong acids [4–6]. There are also cases when the phrase “total contents” is attributed to aqua regia contents even if the authors are well aware that their research actually deals with “pseudototal contents” [7–9]. Besides, some researchers point out that in topsoil, where residual silicates do not display high metal concentrations, the values obtained by aqua regia digestion are representative of the total metal concentration [10]. According to Sastre et al. [11], aqua regia contents of the main hazardous pollutants Hg, Cd, Zn, Cu, and Pb in environmental samples do not differ much from real total contents obtained by digestion with HF.

Despite the wide use of aqua regia extraction results, in most cases geochemical exploration or mapping cannot do without determination of real total contents [1, 12–17]. According to Reimann et al. [18] “geologist would prefer to use analytical results from “total” dissolution (HF+HClO4 or others acids compositions) or an analytical technique (e.g., XRF or INAA) resulting in total element concentrations.” There is an opinion that strong extractants should be taken as a starting point of monitoring, because they give a worst-case estimate of possible long-term changes [19].

The problem arises when the same term “total content” is used for both real totals and aqua regia contents in soils and the exact procedure is not described. In this case quite different results may be compared and incorrect conclusions drawn. Besides, comparison of soil screening values in various European countries [20] is problematic, because screening values in some countries are based on aqua regia or other weaker extracts, whereas in other countries the total content is required.

As both groups of analysis have their own advantages, it might be useful to determine the relationship between their results. A partial solution of this problem is to present approximate formulas for prediction of real total contents from aqua regia contents or vice versa. The idea to investigate comparability of results obtained by different methods was proposed in an overview of ISO and CEN standards used in the European Union for chemical analysis of solid matrix material [21]. The comparison of aqua regia contents to real total contents (HF-extractable or obtained by XRF) is also topical in international scientific research and geochemical practice [15, 16, 18]. The researchers present correlation coefficients and aqua regia contents of chemical elements as a percentage of the HF-extractable or XRF contents. Despite these achievements, there is still a lack of formulas for recalculation or prediction of one content from the other.

The aim of this research was to present the principle of preparing formulas for the relationship between aqua regia contents determined according to ISO 11466 : 1995 [22] and real total contents of harmful trace elements on the basis of some soil samples from Europe. The novelty of this research is that it is based on measurements of the same samples by a large number of different laboratories which use various digestion/extraction procedures for real total contents and various methods of detection of both real total and aqua regia contents. The International Soil-Analytical Exchange program of Wageningen Evaluating Programs for Analytical Laboratories (WEPAL ISE) provides this possibility.

The tasks were the following: to collect data on medians of real total contents and aqua regia contents [22] of harmful trace elements Ag, As, B, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Se, Sn, U, V, and Zn from annual or quarterly WEPAL ISE reports; to estimate the values of the degree of extractability of harmful elements from each sample based on measurements during a particular period; to calculate the weighted estimate of the degree of extractability of harmful elements from each sample based on all repeated measurements of samples during different periods; to analyse the strength of the correlation between aqua regia contents and real total contents and present formulas for prediction of real totals from aqua regia contents.

The mentioned elements were chosen because their soil screening values for unacceptable risk are listed in normative documents of various countries [20]; they are often shown as phytotoxic [23]; besides, it is sometimes misjudged that there is no substantial difference between the values of their real total and aqua regia contents.

2. Data and Methods

2.1. WEPAL ISE Program as a Source of Data

The most impartial comparison of results obtained by different extraction procedures is achieved when measurements of the same samples are done by as many laboratories as possible which use similar extraction procedures but not necessarily the same equipment. The advantages of WEPAL ISE are the following: the same samples are analysed by many laboratories; aqua regia and real total contents are processed in separate groups; and a great number of chemical elements are analysed.

In WEPAL ISE program four soil or sediment samples of unknown chemical composition are sent every 3 months to the participating laboratories. The samples (100 g each) are dried and represent a fraction <0.5 mm. Homogeneity and stability of the materials distributed are sufficient for the goal of proficiency testing [24]. The participants analyse them according to their own procedures and for those elements and parameters they are interested in. Before the end of each quarter they must send to WEPAL the results and method indicating codes (MICs). MIC is the basis for assigning analytical results of reported determinand to a certain group. The first character indicates the method of extraction or digestion. The next three characters indicate the method of detection [25]. Till 2005, there were 2 different groups: real totals and so-called totals (acid extractable) [26]. The results obtained by aqua regia extraction were in the group of the so-called totals. In 2006, the results obtained by aqua regia extraction using the procedure of ISO 11466 [22] were distinguished into a separate group [27]. However, results obtained by this procedure have never been in the group of real totals.

Part of the procedures for obtaining real total contents is without digestion/extraction: neutron activation, X-ray fluorescence, or some others. Another part requires either acid digestion/extraction with HF and final medium HCl (or H2SO4, or HNO3, or HClO4) or melting. There are more than 30 possible methods of detection of trace elements in prepared solutions [25]. All these methods of detection can be used also for the measurement of aqua regia contents obtained using the procedure described in ISO 11466 [22]. Earlier comparison of proficiency testing results of 7 different methods of detection did not show their significant influence on results [28].

2.2. WEPAL ISE Methodology for Elimination of Outliers

WEPAL ISE participants are informed about their results in quarterly and annual reports which give them valuable information not only about evaluation of their own results but also about other measurements. For each analyte in each group, the reference value for proficiency testing is obtained which is influenced by the algorithm used [29].

One of the possible methodologies is based on estimators for location and scale which give less weight to observations in the tails [30]. This methodology was used by WEPAL ISE for a long time. If the number of laboratory measurements exceeds 7, it is based on consecutive elimination of outliers using medians and median absolute deviations; the final estimate can be the first, the second, or the third median [27]. In 2009, a new robust methodology insensitive to outliers was introduced; it provides the statistical characteristics (mean and standard deviation) of the highest mode of the data set and is based on normal distribution approximation [31]. So since 2010 only the first median has been presented in WEPAL ISE reports.

2.3. Data and Contribution of the Institute of Geology and Geography

Data on real totals and aqua regia (ISO 11466 : 1995 [22]) contents of Ag, As, B, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Se, Sn, U, V, and Zn in 32 samples analysed by WEPAL ISE participants were taken from reports: 4 annual (2006, 2007, 2008, and 2009) [25, 27, 32, 33] and 1 quarterly (2010) [34].

The Institute of Geology and Geography analysed all these samples for determination of real total contents of chemical elements by two different methods. Optical atomic emission spectrophotometry OAES was used until 2009 [17]. Vaporization of 1 g of fine-milled soil was direct and done at 5000–6000°C in the electric arc between three graphite nails; measurement of Ag, B, Ba, Co, Cr, Cu, Mn, Mo, Ni, Pb, Sn, V, and Zn was done by optical spectrophotometer DFS 13. In 2007 energy-dispersive X-ray fluorescence EDXRF (Spectro XEPOS) for determination of As, Ba, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Se, Sn, U, V, and Zn in pressed pellets was started [17].

Since the aim was to analyse only European soil samples, the non-European soil samples and river clay samples were excluded from the WEPAL ISE samples, which left 19 European soil samples. These samples (Table 1) reflect a wide variety of lithological composition (from sandy soil to clay soil or marshland) and some other characteristics (clay particles, loss on ignition, and Ca content).

The intervals of the contents of harmful chemical elements in the study samples are wide and include the median values of the North European agricultural [15], European [16], and Lithuanian soils [13] (Table 2). Median contents of harmful trace elements in the study samples are rather similar to respective medians in European soils. Maximum contents of most trace elements are close to soil screening values which are used in some European Union countries including Lithuania for an intermediate (warning) or unacceptable risk of residential soil use [20, 35].

2.4. Mathematical Treatment of Data

The degree of extractability $DE_i$ of an element in a sample measured in period $i$ was estimated by dividing the median of aqua regia contents [22] by the median of real total contents, both given in WEPAL ISE reports. To obtain the percentage $DE_i$, the following formula was used:

$$DE_i = \frac{C_{AR,i}}{C_{T,i}} \times 100,$$

where $C_{AR,i}$ is the median aqua regia content (mg/kg) given in the report of period $i$ and $C_{T,i}$ is the median real total content (mg/kg) given in the report of period $i$; both medians are usually obtained after elimination of outliers by the WEPAL ISE methodology.

The respective numbers of measuring laboratories used for estimation of the two medians in this period were $n_{AR,i}$ and $n_{T,i}$. The minimum of these two numbers, $n_i = \min(n_{AR,i}, n_{T,i})$, was used to characterise the number of measuring laboratories related to the estimation of $DE_i$.

As most of the samples were analysed repeatedly, it was necessary to combine the information on the same sample analysed during different periods. The following procedure was used for this aim. The percentage degree of extractability $DE$ of an element in a sample was calculated by weighting according to the following formula:

$$DE = \frac{\sum_{i=1}^{k} DE_i \, n_i}{\sum_{i=1}^{k} n_i},$$

where $DE_i$ is the percentage degree of extractability of the element in period $i$, $n_i$ is the number of measuring laboratories related to the estimation of $DE_i$ in this period, and $k$ is the number of periods.
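The weighting procedure can be illustrated with a minimal sketch. It assumes that each period contributes a ratio of medians expressed in percent and a weight equal to the smaller of the two laboratory counts; the function names and the numbers are hypothetical and are not taken from WEPAL ISE reports.

```python
def degree_of_extractability(ar_median, total_median):
    """Percentage DE for one period: 100 * aqua regia median / real total median."""
    return 100.0 * ar_median / total_median


def weighted_de(periods):
    """Weighted DE of one element in one sample over several periods.

    periods: list of (ar_median, total_median, n_ar, n_total) tuples.
    The weight of a period is min(n_ar, n_total), i.e. the smaller
    of the two numbers of measuring laboratories.
    """
    weights = [min(n_ar, n_tot) for _, _, n_ar, n_tot in periods]
    values = [degree_of_extractability(ar, tot) for ar, tot, _, _ in periods]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical medians (mg/kg) and laboratory counts for two periods.
periods = [(18.0, 20.0, 30, 10), (17.0, 20.0, 25, 5)]
print(round(weighted_de(periods), 2))
```

In this example the first period (DE of 90% with weight 10) counts twice as much as the second (DE of 85% with weight 5), so the combined value lies closer to 90%.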

The number of laboratory measurements ($n$) of both real total and aqua regia contents of each element in each sample (indicated as a subscript in Table 1) was obtained by summing up the numbers of measuring laboratories $n_i$ in all periods when the sample was analysed.

The primary matrix of degree of extractability in ISE European soil samples was censored in the following way. If the number of laboratory measurements $n$ fell below a minimum threshold, $DE$ was deleted (too unreliable results). As WEPAL ISE does not perform elimination of outliers when the number of measuring laboratories is below 8 [25], the values of $DE$ when $n < 8$ are also not very reliable, but these results were not deleted from the statistical treatment.

Median $DE$ values were calculated for WEPAL ISE soil samples of 2 European countries (The Netherlands, 12 samples; Switzerland, 6 samples). The respective value for soil from Spain was based on only one sample.

Analogous values in agricultural soil from 10 countries around the Baltic Sea were calculated using results given in the tables of “Agricultural soils in Northern Europe: a geochemical atlas” [15]. The contents obtained by HF treatment and XRF analysis were considered as one group of real totals; therefore the respective medians were averaged. For each of these countries, median aqua regia contents were divided by averaged medians of real totals and multiplied by 100.

EXCEL software was used for calculation of the per-period, per-sample, and per-country degree of extractability values, their statistical characteristics, and preparation of charts and was helpful for calculation of validation parameters. STATISTICA (version 9) software was used for determination of Pearson correlation coefficients and nonparametric Spearman rank correlation coefficients between trace element aqua regia and real total contents and their significance levels [36]. This software as well as SPSS (release 8.00) was used for the selection and modification of a regression model. A detailed discussion is given in the results.

3. Results and Discussion

3.1. Variability of Degree of Extractability

Degree of extractability of As, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Sn, V, and Zn was characterised for all 19 samples and after elimination of a part of the points according to the procedure described in Section 3.2 (Table 3). $DE$ of Se was known only in 17 samples, of B in 15, of U in 15, and of Ag only in 4 samples. Besides, results of different elements in different samples are unequally reliable according to the number of laboratory measurements $n$ (Table 1): some of them are lower than 8. According to the percentage of samples for which $n \geq 8$, the elements can be arranged as follows: Zn, Cr, Ni, Cu, Pb, Mn, Co, As, V, Hg (100%) > Cd, Mo (95%) > Ba (84%) > Be (79%) > Sb (68%) > Sn (58%) > Se (21%) > U (16%) > B (5%) > Ag (0%).

The minimum real total contents of all harmful trace elements, except Se, are lower than their soil geochemical background values [13, 35]. Maximum real total contents of Ag, As, B, Cd, Cr, Mn, Ni, Se, Sn, and V measured in soil samples exceed Lithuanian soil screening values [35].

The median percentages of $DE$ (Table 3) can be used as the first approximation for recalculation of aqua regia content results (ISO 11466 : 1995) to real totals according to the following formula:

$$C_T = \frac{100 \cdot C_{AR}}{DE},$$

where $C_T$ is the real total content of a harmful trace element (mg/kg), $C_{AR}$ is the aqua regia content of this element (mg/kg), and $DE$ is the median percentage of degree of extractability of this element.
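As a first-approximation illustration, the recalculation amounts to dividing the aqua regia content by the median degree of extractability expressed as a fraction. The sketch below uses the median DE percentages listed in this paper for ISE European soil samples; the function name and the example contents are hypothetical.

```python
# Median DE (%) in ISE European soil samples, as listed in the text.
MEDIAN_DE = {
    "Hg": 98, "Cd": 94, "Cu": 91, "Zn": 90, "Mn": 89, "Ni": 88,
    "Co": 86, "As": 82, "Mo": 82, "Pb": 79, "V": 61, "Sn": 57,
    "Cr": 56, "Sb": 53, "Be": 51, "B": 46, "U": 35, "Ba": 21,
}


def real_total_from_aqua_regia(element, aqua_regia_content):
    """Approximate real total content (mg/kg) from aqua regia content (mg/kg)."""
    return 100.0 * aqua_regia_content / MEDIAN_DE[element]


# Hypothetical aqua regia contents (mg/kg).
for element, content in [("Pb", 79.0), ("Cr", 28.0), ("Ba", 42.0)]:
    print(element, round(real_total_from_aqua_regia(element, content), 1))
```

The lower the median DE, the larger the correction: with these example inputs the estimate for Ba is about five times the aqua regia content, whereas for Pb it is only about a quarter higher.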

Degree of extractability of 10 harmful chemical elements from soil of different European countries is variable (Figure 1). Soil samples analysed for “Agricultural soils in Northern Europe: a geochemical atlas” [15] are from 10 countries in the eastern part of Europe; meanwhile, samples analysed by WEPAL ISE program participants are from 3 countries of Western Europe. Two of the latter countries, The Netherlands and Switzerland (Spain was characterised by a single sample), are distinguished by higher $DE$ of As, Cr, Mn, Pb, V, and Zn compared to soil from the 10 countries of the eastern part of Europe. This elevated extractability can be determined both by soil characteristics and geochemical peculiarities in various countries and by differences in methodology. Besides, it is obvious that soils of Switzerland and Spain greatly differ from soils of other countries according to $DE$ of Ba.

Agrarian soils of Lithuania have only slightly higher $DE$ of Ba compared to soil from The Netherlands. The difference between $DE$ of As in soil of The Netherlands and Lithuania is also small; both countries are distinguished from other countries by a higher degree of extractability of this element. Analysis of Spearman rank correlation coefficients R between $DE$ values of As, Ba, Co, Cr, Cu, Mn, Ni, Pb, V, and Zn characterising different countries has revealed a significant positive correlation between soil of The Netherlands and soil of Lithuania. However, soil of Lithuania is even more similar to soil of Poland according to $DE$ values of the above-mentioned elements.

3.2. Equations for Recalculation Aqua Regia Contents to Real Total Contents

According to reliability of results, the chemical elements studied can be subdivided into 2 groups: a less reliably characterised group which includes Sb, Sn, Se, U, B, and Ag, for which less than 75% of samples have a sufficient number of laboratory measurements ($n \geq 8$); and a more reliably characterised group which includes As, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, V, and Zn, for which more than 75% of samples have $n \geq 8$.

Scatterplots of real total versus aqua regia contents of elements from the less reliably characterised group (Figure 2) indicate that presently for most of them (Se, Ag, U, and B) it is difficult to find equations for recalculation of aqua regia contents to real total contents. For Se the task is impossible, because according to all 17 samples both Pearson and Spearman correlation coefficients are insignificant and even negative due to an obvious outlier (sample No. 885, braunerde pseudogley) and a great number of less reliably ($n < 8$) characterised samples. For Ag both correlation coefficients (1.000 and 0.999) are very high, but this element is represented by only 4 samples, each of them with less than 7 laboratory measurements. The high correlation is due to sample No. 995 (sandy soil) with high Ag content. For U and B both correlation coefficients (0.815 and 0.812 for one, 0.811 and 0.929 for the other) are positive and significant, indicating that a relationship between aqua regia and real total contents exists. Still these 2 elements have a low number of reliably ($n \geq 8$) characterised samples: B only one and U only three. Besides, their distribution pattern is complicated; for example, for U the point representing sample No. 889 (soil with high carbonate) and for B the point representing sample No. 962 (sandy clay soil) are at a great distance from other samples. Therefore equations for these chemical elements will also not be presented. From this group, equations will be derived only for Sb and Sn, because their correlation coefficients (0.984 and 0.995 for one, 0.918 and 0.949 for the other) are positive and significant and more than half of the samples are reliably characterised.

The scatterplots of 14 chemical elements from the more reliably characterised group (Figure 3) clearly demonstrate that there is a relationship between their aqua regia and real total contents; significant Pearson and Spearman correlation coefficients confirm this.

According to the correlation coefficient values, all elements from the more reliably characterised group and 2 elements (Sb, Sn) from the less reliably characterised group are arranged as follows: Zn, As(0.998) > Mn(0.995) > Ni(0.993) > Cd, Cu(0.992) > Mo(0.989) > Co(0.988) > Be(0.984) > Hg(0.974) > Pb(0.966) > Sb(0.918) > V(0.907) > Sn(0.891) > Cr(0.867) > Ba(0.810). The equations will be derived for these 16 chemical elements using all 19 samples.

It can be supposed from the respective scatterplots of these elements that the relationships are linear. However, the curve estimation procedure from SPSS (release 8.00) was used to check the possibility of applying several models: linear, 2nd order polynomial, logarithmic, and exponential regression functions. For each model, an ANOVA table was given; the values of the $F$-test were significant for all these models. However, the highest values of adjusted $R^2$ were obtained either for linear or for 2nd order polynomial regression functions (Table 4). It can also be presumed from scatterplots of some chemical elements, especially Cr, that the 2nd order polynomial function might be a more precise model. To check whether the linear or the 2nd order polynomial model is more appropriate for each chemical element, STATISTICA (version 9) software was used. The polynomial regression procedure from the GRM (general regression models) module was chosen using a forward stepwise option (critical $p$ values to enter a new member and to remove an included member were equal to 0.05). It was applied several times.

The preliminary stage (S0) was necessary for calculation of leverage values of points for each element [36]. It is obvious from the scatterplots that there are points with high leverage. Since such points might be influential for the model, it is expedient to eliminate them from the data set. Different critical values are used for revealing the points with high leverage: $2m/N$ [37] and $3m/N$ [37, 38], where $N$ is the number of samples and $m$ is the number of parameters in the model. It is obvious that critical values depend on the model. Since our data set is small ($N = 19$), we have chosen the second criterion. So if the linear regression model $C_T = b_0 + b_1 C_{AR}$ (2 parameters) was chosen in the preliminary stage by forward stepwise polynomial regression, the critical value was $3 \cdot 2/19 \approx 0.316$; meanwhile, if the 2nd order polynomial model $C_T = b_0 + b_1 C_{AR} + b_2 C_{AR}^2$ (3 parameters) was selected, the critical value was $3 \cdot 3/19 \approx 0.474$.
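The leverage screening can be sketched for the straight-line case. The sketch assumes the common rule of thumb that a point has high leverage when its hat value exceeds three times the number of model parameters divided by the number of samples; the function names and the data are hypothetical, and the hat-value expression used here applies only to a linear model with intercept, not to the 2nd order polynomial.

```python
def leverages(x):
    """Hat values of the points for a straight-line fit with intercept."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    return [1.0 / n + (xi - mean) ** 2 / sxx for xi in x]


def high_leverage_points(x, n_params=2, factor=3):
    """Indices of points whose hat value exceeds factor * n_params / n."""
    cutoff = factor * n_params / len(x)
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]


# Hypothetical aqua regia contents (mg/kg): 18 ordinary samples
# and one sample with a much higher content.
aqua_regia = [float(v) for v in range(1, 19)] + [200.0]
print(high_leverage_points(aqua_regia))  # index of the high-content sample
```

With 19 samples and 2 parameters the cutoff is 6/19; only the sample far from the mean of the predictor exceeds it.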

During the preliminary stage S0, for all 16 elements the linear member was always included during the first step, though the $p$ values corresponding to the $F$ values for inclusion of both linear and quadratic members were lower than 0.05. This regularity can be explained by the higher $F$ values calculated for linear members. Meanwhile, during the second step the quadratic member was usually not included because the $p$ value of the $F$-test was higher than 0.05. So forward stepwise polynomial regression resulted in a linear model for most chemical elements (11 of 16); only for Cr, Cu, Hg, Sb, and Sn was the 2nd order polynomial model chosen (Table 5).

To check if the coefficients $b_0$ (intercept), $b_1$ (linear member), and $b_2$ (quadratic member) are significant, the $t$-test was used. If the $p$ values corresponding to the $t$ values are lower than 0.05, the coefficient is significant. For all elements the coefficients $b_1$ were significant both in linear and 2nd order polynomial regression, and the coefficients $b_2$ were significant for Cr, Cu, Hg, Sb, and Sn. Meanwhile, the coefficients $b_0$ were significant only for some chemical elements: Ba, Be, Mn, Mo, and Zn in the linear model and Sb and Sn in the 2nd order polynomial model.

The preliminary stage S0 resulted in elimination of 1-2 points for each of the 16 elements. The sample No. 995 (sandy soil) was eliminated for 8 elements (Cd, Cr, Cu, Hg, Pb, Zn, Sb, and Sn), No. 910 (clay soil) for 6 elements (As, Be, Co, Pb, V, and Zn), No. 872 (braunerde clay) for 2 elements (Ni and Mn), No. 909 (marshland) for only Ba, No. 867 (forest sandy soil) for Mo. For Cr, there are two samples with high leverage values: one of them with its highest content and the other one with its lowest content. The specificity of soil samples for which high leverage influential measurements were revealed is given in Table 6.

After the preliminary stage, the main part of the analysis followed, which consisted of two stages. The first stage S1 was applied to a data set without the samples eliminated in the preliminary stage. If the intercept $b_0$ was not significant ($p > 0.05$), the stepwise polynomial regression was repeated without intercept. In this stage the model of Cr, Cu, Hg, Sb, and Sn changed from the 2nd order polynomial to linear, indicating that elimination of high leverage points often simplifies the model. On the other hand, the model of As and Zn changed from linear to the 2nd order polynomial regression, indicating that the reverse process is also possible. The intercepts $b_0$ in S1 stage were significant for the same elements as in S0 (Ba, Be, Mn, and Mo), except Zn. However, they also became significant for Cu, Pb, and Sn.

At the end of S1 stage, the data were checked for possibly influential points. These can have either unusual predictor values (aqua regia contents) or unusual response values (discrepancy values or outliers). STATISTICA software makes it possible to calculate not only leverage values but also characteristics that are helpful for revealing influential data: standardised residuals, studentised residuals, deleted residuals, studentised deleted residuals, Mahalanobis distances, Cook’s distances, DFFITS, and standardised DFFITS [39]. We have selected only the last of these characteristics. The reason is that it measures the combination of leverage and discrepancy of an observation and that for small to medium sized data sets the cut-off value is simple: if the absolute value of standardised DFFITS exceeds 1, the point is possibly influential [40]. Such points were not found for As, Ba, Co, Cr, Hg, and Mn. Meanwhile, for the other chemical elements possibly influential points were found. Only a part of them were among the samples flagged in the preliminary stage: sample No. 910 was possibly influential for Ni and Mo, No. 909 for V and Zn, and No. 872 for Be. In addition to them, other points were found (Table 5, Figures 2 and 3). For 10 chemical elements the next stage S2 was continued after elimination of possibly influential points.
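The influence check can be sketched as follows for a straight-line model. It is assumed here that the standardised DFFITS reported by STATISTICA corresponds to the usual definition (the externally studentised residual multiplied by the square root of h/(1 - h), where h is the leverage); the function names and the data are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope of a straight line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def dffits(x, y):
    """DFFITS of each point for a straight-line fit with intercept."""
    n, n_params = len(x), 2
    b0, b1 = fit_line(x, y)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    hat = [1.0 / n + (xi - mx) ** 2 / sxx for xi in x]
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in resid)
    result = []
    for e, h in zip(resid, hat):
        # Residual variance estimated without the current point.
        s2_deleted = (sse - e ** 2 / (1.0 - h)) / (n - n_params - 1)
        t = e / math.sqrt(s2_deleted * (1.0 - h))
        result.append(t * math.sqrt(h / (1.0 - h)))
    return result


# Hypothetical contents (mg/kg); the last real total is far above the trend.
aqua_regia = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
real_total = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 12.0]
flagged = [i for i, d in enumerate(dffits(aqua_regia, real_total)) if abs(d) > 1]
print(flagged)
```

A point is flagged when the absolute value exceeds the cut-off of 1 used in the text; the sketch would fail with a division by zero if the remaining points fitted a line exactly, which is not handled here.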

In S2 stage, the stepwise polynomial regression gave results very similar to S1 stage: the model remained the same for all elements, except Zn; the presence of intercept remained for Be, Cu, Mo, Pb, Sb, and Sn. For Zn, the model changed once more from the 2nd order polynomial without intercept to linear without intercept.

After the main part of the analysis, there were mostly linear equations derived either with intercept (Ba, Be, Cu, Mn, Mo, Pb, Sb, and Sn) or without intercept (Cd, Co, Cr, Hg, Ni, V, and Zn); only for As the 2nd order polynomial regression without intercept was obtained (Table 5).

The above-mentioned procedure can formally be applied also to some of the least reliably characterised chemical elements, that is, B and U, in order to find high leverage points or possibly influential points and the type of model. For B with 15 samples, the procedure resulted in a linear regression model without intercept. During the preliminary stage, sample No. 962 was eliminated; meanwhile, during S1 stage sample No. 909 was deleted as possibly influential (Figure 2). For U with 15 samples, the procedure resulted in the 2nd order polynomial regression model without intercept. During the preliminary stage, sample No. 889 (soil with high carbonate) was eliminated; meanwhile during S1 stage sample No. 865 (loamy soil) was deleted as possibly influential (Figure 2). For Se and Ag neither the linear nor the 2nd order model could be applied, because the $p$ values to enter were higher than 0.05. However, outliers are clearly seen in the scatterplots and are confirmed by forced application of the linear model despite the fact that both of its coefficients are not significant.

Cross-validation of results can be done either by leave-one-out or leave-many-out procedures [41, 42]. Due to the small data set the simpler leave-one-out procedure was chosen. The cross-validated correlation coefficient $Q^2$ was calculated according to the following formula:

$$Q^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},$$

where $y_i$ are the measured, $\hat{y}_i$ the predicted, and $\bar{y}$ the averaged values of the dependent variable; summing is over all samples. Since the differences $y_i - \hat{y}_i$ are the deleted residuals, the calculation of $Q^2$ was simplified using the deleted residuals from stepwise polynomial regression. An analogous formula is used for external validation:

$$Q^2_{ext} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y}_{tr})^2},$$

where $y_i$ are the measured values, $\hat{y}_i$ the predicted values, and $\bar{y}_{tr}$ the value averaged over the training set; summing is over the validation samples which were not used in the training set. We presumed that the samples eliminated either in the preliminary stage S0 or in S1 stage can be used for external validation; meanwhile the remaining samples comprise the training set.
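Both validation measures can be sketched for a straight-line model. The sketch assumes the commonly used form of the cross-validated coefficient, one minus the ratio of the sum of squared prediction errors to the sum of squared deviations from the mean, and refits the line for every left-out sample instead of reading deleted residuals from a statistics package; the function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of a straight line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def q2_leave_one_out(x, y):
    """Cross-validated coefficient from leave-one-out predictions."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without sample i, then predict it (deleted residual).
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (b0 + b1 * x[i])) ** 2
    return 1.0 - press / sum((yi - mean_y) ** 2 for yi in y)


def q2_external(x_train, y_train, x_val, y_val):
    """External validation coefficient; deviations use the training-set mean."""
    b0, b1 = fit_line(x_train, y_train)
    mean_train = sum(y_train) / len(y_train)
    press = sum((yv - (b0 + b1 * xv)) ** 2 for xv, yv in zip(x_val, y_val))
    return 1.0 - press / sum((yv - mean_train) ** 2 for yv in y_val)


# Hypothetical training set (mg/kg) and one eliminated high-content sample.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_train = [1.2, 2.1, 3.3, 3.9, 5.2, 6.1]
print(round(q2_leave_one_out(x_train, y_train), 3))
print(round(q2_external(x_train, y_train, [20.0], [21.0]), 3))
```

A value close to 1 means that the predictions are much better than using the mean; a negative value means that they are worse than the mean, which is how negative external validation results should be read.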

Cross-validation was performed after S0, S1, and S2 stages according to measurements in the training set; meanwhile external validation was done only after S1 and S2 stages according to a validation set consisting of only the few samples which had been eliminated in the preceding stages (S0 or S1) (Table 5).

Cross-validation in the preliminary stage S0 showed that for As, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Sn, V, and Zn (all 16 elements), the initial model was suitable, because the cross-validated $Q^2$ values were higher than 0.5: for Cd, Mn, Cu, Mo, Co, Zn, As, Ni, and Sn they exceeded 0.99, for Be, Pb, and V they were from 0.90 to 0.99, and for Hg the value was only slightly lower (0.896). The lowest values were for Ba (0.766), Cr (0.746), and Sb (0.525).

Cross-validation after stage S1 showed that for 6 chemical elements (Sb, Hg, Pb, As, Zn, and Co) the cross-validated $Q^2$ value increased in comparison with the value after S0 stage, indicating some improvement of the model; meanwhile for 10 other chemical elements (Sn, Cr, Ba, Cd, V, Ni, Be, Mo, Cu, and Mn) the $Q^2$ value decreased. The improvement is especially obvious for Sb, Hg, and Pb.

The external validation of elements with improvement of the model (As, Co, Hg, Pb, Sb, and Zn) according to samples eliminated due to high leverage values gave high external validation $Q^2_{ext}$ values ranging from 0.855 for Sb to 0.998 for Co. This fact indicates that elimination of high leverage points was expedient for adjustment of the model; besides, it enables a good forecast even in points with high leverage outside the interval of the training set.

The most obvious decrease of the cross-validated $Q^2$ is for Sn and Cr, followed by Ba, Cd, and V; for Ni, Be, Mo, Cu, and Mn the decrease of $Q^2$ is lower. Since all $Q^2$ values are higher than 0.5, the model derived in stage S1 is suitable for prediction in the interval without high leverage points.

The external validation of elements with a lower cross-validated $Q^2$ value in stage S1 compared to S0 also gave high external validation $Q^2_{ext}$ values ranging from 0.896 for Ba to 1.000 for Cd, indicating the possibility to forecast even outside the interval of the training set. The only exception is Cr, for which the forecast outside the interval of the training set is not recommended, because the $Q^2_{ext}$ value is negative.

For some chemical elements stage S2 was implemented. For most of these elements (Sn, Pb, Zn, Cd, Ni, Be, Mo, and Cu) the cross-validated value in stage S2 was slightly higher than in stage S1, indicating some improvement of the model; only for V and Sb was it slightly lower.

External validation in stage S2 showed that the values are always lower than in stage S1. For Sn and Pb they are lower than 0.47 and for Mo even negative, so prediction outside the interval of the training set is not recommended for these elements. For the other 7 chemical elements (Sb, Zn, Cd, V, Ni, Be, and Cu) prediction outside the interval of the training set is possible, because the values range from 0.775 for Cd to 0.990 for Cu.

3.3. Practical Significance of Results

Our results do not confirm the statement of Sastre et al. [11] that the difference between aqua regia and real total contents of Hg, Cd, Zn, Cu, and Pb in environmental samples is not great. For Pb, our results show that about 20% of its real total content is not extracted. Moreover, our results show great variability of the median degree of extractability among harmful trace elements (Table 3). Arranged by the median degree of extractability (%) in European soil samples analysed by WEPAL ISE participants from 2006 until the first quarter of 2010, the elements are as follows (Table 3): Hg(98), Cd(94), Cu(91), Zn(90), Mn(89), Ni(88), Co(86), As(82), Mo(82), Pb(79), V(61), Sn(57), Cr(56), Ag(54), Sb(53), Be(51), B(46), Se(35), U(35), and Ba(21).
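The approximate recalculation of an aqua regia content to a real total mentioned in the abstract can be sketched as a division by the median degree of extractability. The DE values are those listed above; the function name `estimate_real_total` and the input content are hypothetical, and the result is only a rough estimate because DE varies between samples.

```python
# Median degree of extractability DE (%), as listed above (Table 3)
DE_PERCENT = {
    "Hg": 98, "Cd": 94, "Cu": 91, "Zn": 90, "Mn": 89, "Ni": 88, "Co": 86,
    "As": 82, "Mo": 82, "Pb": 79, "V": 61, "Sn": 57, "Cr": 56, "Ag": 54,
    "Sb": 53, "Be": 51, "B": 46, "Se": 35, "U": 35, "Ba": 21,
}

def estimate_real_total(element, aqua_regia_content):
    """Approximate real total content (same unit as the input)."""
    return aqua_regia_content * 100.0 / DE_PERCENT[element]

# Hypothetical aqua regia Pb content of 30.0 mg/kg
print(round(estimate_real_total("Pb", 30.0), 1))
```

The correction is small for Hg or Cd but large for Ba, where only about a fifth of the real total is extracted.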

Despite the obvious fact that aqua regia extraction differs from digestion with HF in combination with other acids or alkaline melt, the results obtained by aqua regia extraction are often denominated as “total contents.”

The controversial content of some regulatory documents [43] is a consequence of the tradition of denominating both real total and aqua regia contents by the same word “total.”

When total contents of a group of chemical elements are used for risk assessment, the term “total content” should be clearly explained; otherwise the results of risk assessment become incomparable and the problem is further complicated. For example, in Lithuania the soil additive contamination index is based on the real total contents of harmful trace elements and is used for evaluation of possible health disturbances [35, 44]. The index is calculated by summing up the concentration coefficients of the contaminating elements. The concentration coefficients are calculated by dividing the contents of elements in a sample by their background values. The index is determined improperly when the content in the numerator is determined in an aqua regia extract while the content in the denominator is a real total content.

An example is given below to illustrate the problems that can arise when estimating the additive contamination index and the danger category. It is based on the Vilnius territory between the drill works and the former radio engineering works [45]. A new composite sample was taken from one site of this territory in 2008. The site area was about 400 m². It was located 100 m east of the drill works, in the yard of a dwelling house. Real total contents of Ba, Cr, Cu, Mn, Ni, Pb, Se, Sn, U, V, and Zn were determined by EDXRF and those of Ag, B, and Co by OAES. They were the following (mg/kg): Ag 0.34, As 3.2, B 24.8, Ba 359, Co 7.18, Cr 53.0, Cu 28.0, Mn 277, Mo 18.7, Ni 12.8, Pb 39.0, Se 1.27, Sn 9.88, U 2.26, V 44.0, and Zn 160. The concentration coefficients for calculating the index were obtained using particular background values [35]. The value of the index was 42.8, so the site was attributed to the dangerous contamination category. If researchers instead used aqua regia digestion to analyse soil from the same site, approximate estimates of the aqua regia contents could be obtained from the real totals and the medians of the degree of extractability (Table 3). Dividing these estimates by the same background values [35] gives different concentration coefficients. This results in a lower index value of 30.1, so in this case the site would be attributed to the medium dangerous contamination category, and it is quite possible that it would not be selected for management. This example shows the necessity of using the same type of content of harmful trace elements in both the numerator and the denominator when calculating concentration coefficients.
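The mechanism behind this example can be sketched for a few of the elements. The real totals and degrees of extractability are taken from the text, but the background values below are hypothetical placeholders (the values from [35] are not reproduced here), and the index is computed as a plain sum of concentration coefficients following only the description above. The sketch therefore shows the direction of the effect and does not reproduce the values 42.8 and 30.1.

```python
# Real total contents from the example above, mg/kg (subset of elements)
REAL_TOTAL = {"Cu": 28.0, "Mo": 18.7, "Pb": 39.0, "Zn": 160.0}

# Median degree of extractability, % (Table 3)
DE_PERCENT = {"Cu": 91, "Mo": 82, "Pb": 79, "Zn": 90}

# HYPOTHETICAL real total background values, mg/kg (not the values of [35])
BACKGROUND = {"Cu": 10.0, "Mo": 1.0, "Pb": 15.0, "Zn": 30.0}

def aqua_regia_estimate(element, real_total):
    """Approximate aqua regia content from a real total content."""
    return real_total * DE_PERCENT[element] / 100.0

def contamination_index(contents, background):
    """Sum of concentration coefficients (content / background value)."""
    return sum(contents[el] / background[el] for el in contents)

aqua_regia = {el: aqua_regia_estimate(el, c) for el, c in REAL_TOTAL.items()}

index_consistent = contamination_index(REAL_TOTAL, BACKGROUND)
index_mixed = contamination_index(aqua_regia, BACKGROUND)
print(round(index_consistent, 2), round(index_mixed, 2))
```

Because every aqua regia estimate is lower than the corresponding real total, dividing it by a real total background always lowers the index, which is how a site can slip into a lower contamination category.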

The necessity of using identical contents calls for relevant terminology. Here it is useful to recall the definition of total concentration (for inorganics) given in the soil quality vocabulary ISO 11074:2005 [46], which carries the following note: “determination of the total concentration requires use of an instrumental technique such as X-ray fluorescence or a powerful solvent combination such as a mixture of hydrofluoric and perchloric acid, or alkaline melt.” So only real total contents are total contents.

4. Conclusions

Comparison of aqua regia and real total contents of Ag, As, B, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Se, Sn, U, V, and Zn (20 elements) in 19 European soil samples analysed by WEPAL ISE revealed the following results.

(1) Aqua regia digestion extracts nearly 98% (median amount) of the real total content only for Hg. For Cd, Cu, and Zn nearly 10% of the real total content may not be extracted; for Mn, Ni, Co, As, Mo, and Pb from 11% to 21%; for V, Sn, Cr, Sb, Ag, and Be from 39% to 49%; and for B, Se, U, and Ba from 54% to 79%. At present, however, prediction equations can be given only for As, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Sn, V, and Zn (16 elements). The data for Ag, B, Se, and U are unsuitable for prediction equations because too few samples have a sufficient number of laboratory measurements. Such models can be created in the future when sufficient data are added to the data set.

(2) Since the data contain high leverage points and possibly influential points, leave-one-out cross-validation was applied. The cross-validated squared correlation coefficient has shown good internal predictivity of real total contents of As, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Sn, V, and Zn (16 elements) from the respective aqua regia contents by a linear or 2nd order polynomial regression model. Good internal predictivity (values above 0.5) was observed in all stages: the preliminary stage S0, stage S1 after elimination of high leverage points, and the next stage S2 after elimination of possibly influential points.

(3) External validation on the eliminated samples indicates that extrapolating the prediction to aqua regia values higher than the upper limit of the prediction interval in the final stage (S2 or S1) is not recommended for Cr, Sn, Mo, and Pb. On the other hand, such extrapolation is possible for As, Ba, Be, Cd, Co, Cu, Hg, Mn, Ni, Sb, V, and Zn (12 elements).

(4) The high leverage points and possibly influential points should be explained. Though this is outside the scope of the present paper, it has been noticed that the specificity of soil composition (clay content, loss on ignition, and major element contents) should be taken into account.

(5) For As, Ba, Be, Cd, Co, Cr, Cu, Hg, Mn, Mo, Ni, Pb, Sb, Sn, U, V, and Zn, the difference between the median values of the degree of extractability calculated for all data and the respective median values calculated for the data set without high leverage and possibly influential points does not exceed 2%.

(6) Relevant terminology should always be used in the description of analytical methods and results. The contents determined in aqua regia should be denominated strictly as aqua regia contents; only real total contents can be denominated as total contents.

Acknowledgments

This work was funded by Grant no. LEK-03/2012 from the Research Council of Lithuania. The authors thank the WEPAL working group, and especially Bram Eijgenraam, for the possibility of participating in the International Soil-Analytical Exchange Program of Wageningen Evaluating Programs for Analytical Laboratories and for permission to use the data in preparing this paper. The authors are also very grateful to Dr. V. Chrastny for helpful comments and suggestions.