Abstract

This study investigates the ability of field spectra measurements to discriminate between soils from non-sites (natural soils) and from archaeological sites, such as middens (rubbish-dumping areas) and animal byres. First, we tested whether there is a difference in the concentration of elements between different soil types using analysis of variance while random forest (RF) and forward variable selection (FVS) methods were used to select important soil elements for the classification of the archaeological sites. In the second approach, we evaluated the ability of field spectroscopy reflectance measurements to discriminate among nonsites, middens, vitrified dung, and nonvitrified dung byres. The guided regularised random forest (GRRF) was used to identify important wavelengths for the discrimination of abovementioned archaeological and nonarchaeological soils. Thereafter, the selected soil elements and wavelengths were used as input variables in the RF classification algorithm to classify the nonsites, middens, vitrified dung, and nonvitrified dung. The findings reveal that there is a significant difference in the composition of chemical elements and spectral signatures of nonsites, middens, vitrified dung, and nonvitrified dung. In summary, high classification accuracies achieved when using field spectroscopy data prove that remote sensing techniques can be used to exploit the spectral differences among the abovementioned soil types in mapping archaeological feature characteristics of farming communities’ settlements.

1. Introduction

The presence of archaeological materials in the soil has a localised impact on the composition of physical and chemical properties, thus making it different from surroundings [1, 2]. Anthropological activities such as animal penning and rubbish damping change soil structure and colour. For example, refuse middens containing ash are normally characterised by loose fine greyish particles [3], whilst animal penning areas appear grey because of the deposition of high organic animal secretions [4]. Chemically, human activities have an impact on soil organic content, affecting the amount of phosphates [5] and potassium [6]. Middleton and Price [7] found out that there is a high concentration of K, P, and Mg in the hearth area. Huffman et al. [6] studied the formation and difference in chemical composition of nonvitrified dung and vitrified dung deposits in archaeological sites. This alteration of soil physical and chemical properties occurs through weathering and incorporation by depositing matter into an area and tipping the balance in the pedogenesis process [1, 8, 9]. Approaches for identifying different past human activity areas through geochemical analysis are routinely used at the intrasite level [1, 2, 10, 11]. Material remains and changes in soil physical properties such as texture and colour are also of interest at a landscape level for the purpose of the archaeological survey and site identification [4, 12]. Fieldwalking survey is the main method for identifying archaeological features visible on the Earth’s surface [4, 6, 13]. However, in areas where archaeological features are not clearly visible or structural information about the use of space within a site is inconclusive, the concentration of elements in the soil has been analysed in order to identify archaeological sites [14] or different activity areas within the site [11, 15]. The aforementioned traditional archaeological survey methods, geochemical analysis, and field walking are time-consuming and expensive to carry out.

The advent of very high-resolution multispectral sensors in the late 1990s did not only offer the possibility of exploiting higher spectral resolutions but also offered an opportunity for effective data processing methods in archaeological prospection [1618]. Most published research has tested the use of vegetation indices calculated from multispectral remote sensing data to identify archaeological anomalies [1922]. However, the complex nature and small size of most archaeological features limits the wide use of multispectral sensors, which are prone to information loss and confusion due to low spatial and spectral resolution [23, 24]. Moreover, some archaeological features may have subtle chemical and physical differences with their environment, which might be masked by multispectral sensors.

Spectroscopy data which are commonly captured using airborne, handheld, and spaceborne sensors offer hundreds of narrow spectral wavebands. These narrow wavebands allow for an exhaustive discovery of detailed archaeological data that are otherwise missed by the generic wavebands captured by multispectral sensors [25, 26]. However, using spaceborne and airborne data comes with challenges related to the spectral mixture of features and radiometric and wavelength calibration uncertainties which as a result affect the quality of captured data [23, 2729]. Spaceborne and airborne sensors also lack spatial and spectral resolution similar to that of the handheld spectrometer [23, 27, 28, 30]. In consequence, the aforementioned challenges have resulted in the limited use of both airborne and spaceborne hyperspectral data in soil analysis [23].

In recent years, laboratory visible-near-infrared (VISNIR)/short-wave infrared (SWIR) spectroscopy data have been viewed as a potential cost-effective option to the field-based survey and traditional laboratory approach for soil analysis [29, 3135]. This has led to the development of spectral libraries documenting the spectral signatures of different soils and their properties. Researchers have used field spectroscopy data to assess soil properties such as organic content, minerals, texture, and moisture [34, 3640]. Therefore, field spectroscopy data might be useful for discriminating archaeological sites with specific soil properties.

The documentation of the reference data for the spectral reflectance of different archaeological features in a library is lacking. In recent past, only handful research studies have used field spectroscopy/hyperspectral data to document the spectral signatures of vegetation overlaying archaeological materials and investigating the potential of detecting and mapping the vegetation health as a proxy indicator of buried archaeological materials [4144]. Consequently, the role of field spectroscopy in archaeology applications is still poorly explored. For instance, and to the best of our knowledge, there is no study to date examining the use of field spectroscopy in discriminating archaeological surface features using soil characteristics as indicators.

One of the most notable challenges in the use of field spectroscopy is the large data redundancy due to the strong correlation between the spectral features [45]. This high dimensionality requires sufficient training sample and computational process which might be time-consuming and prohibitive in cost [46, 47]. In most archaeology studies, the size of training samples is restricted by the magnitude of archaeological sites or by issues of accessibility. This may result in some problems such as the Hughes phenomenon or “curse of dimensionality,” whereby the accuracy of classification algorithms decreases when working with a limited number of training samples [48, 49]. This is because the ideal number of classification features is restricted by the size of the training sample [48, 50, 51]. As a result, there is a need for the reduction of dimensionality when processing field spectroscopy data in order to avoid the aforementioned challenges. Dimensionality reduction methods improve the discriminative ability of the dataset by decreasing the number of spectral bands without dropping vital information [5254]. Numerous variable selection methods have been used to decrease the high dimensionality in hyperspectral data by selecting the most important bands for data classification. The most commonly used feature selection methods are genetic algorithms [5557] and random forest (RF) [30, 58]. Genetic algorithms (GAs) are based on the process of natural selection, influenced by the principle of survival of the fittest. GA is usually embedded in classifiers such as SVM as a band selection algorithm [55, 57, 59], in order to improve classification accuracy. However, genetic algorithms are vulnerable to random correlations of the features [60] and have high computational demands [55, 61].

RF classifier, which has been described as the best machine learning algorithm for handling high-dimensional data [62], measures the importance of each variable in classification. However, it is prone to producing redundant features because of its biases towards the correlated predictors [63, 64]. In RF, samples for bagging are most commonly dominated by less important features, therefore, degrading the classification accuracy [65]. RF also ranks features without selecting a subset of optimal features [30].

Recently, Deng and Runger [66] developed a guided regularised random forest (GRRF) algorithm aiming at curbing the limitations of the traditional RF algorithm. GRRF eliminates feature redundancy by not selecting features carrying similar information with the already selected ones in a subset at each node [66]. The GRRF algorithm guides the feature selection process in the regularised RF using importance scores from the normal RF [66]. To date, only two studies have used GRRF for the reduction of high dimensionality in hyperspectral data for vegetation studies [67, 68].

This study investigated whether field spectra measurement can discriminate archaeological sites using soil properties as indicators. More precisely, the objectives of the study were to (i) investigate if there is any significant difference in concentration of soil elements across different archaeological sites, namely, middens, vitrified dung byres, and nonvitrified dung byres, (ii) use in situ hyperspectral measurements to discriminate among nonsites (natural soils), middens, vitrified dung, and nonvitrified dung, and (iii) identify important wavelengths for discriminating among the aforementioned features using the guided regularised random forest algorithm.

2. Materials and Methods

2.1. Study Area

The study was conducted in the Mapungubwe cultural landscape, located at the confluence of the Shashi and Limpopo rivers, in the province of Limpopo, South Africa, shown in Figure 1. The Shashi-Limpopo confluence area (SLCA) forms the boundary of three countries: Botswana towards west, South Africa towards south, and Zimbabwe towards north. Geologically, the SLCA lies within the Limpopo mobile belt which joins the Zimbabwe and Kaapvaal cratons [69]. This area is characterised by igneous and sedimentary rocks of the Karoo supergroup [70]. Erosion is rampant, in particular in areas closer to the river channels, therefore, forming sandstone ridges and outcrops which cover most parts of the SLCA with a sparse distribution of volcanic intrusions [71, 72]. Generally, soils in the Limpopo mobile belt include clays and sands originating from the Karoo system [73].

Archaeologically, the study area has been continuously occupied by different groups of farming communities since 900 AD [7477]. These societies practised the central cattle pattern (CCP) settlement system [78]. This is a settlement system whereby animal byres are located at the centre of the settlement, close to the male gathering area [79] (Figure 2). Social changes in the SLCA occurred during the twelfth century AD, with the occupation of Mapungubwe hill, when leaders and commoners became physically separated [76, 81]. Animals were only kept at commoner settlements where the CCP continued to be practised, whilst rulers would reside in stone-walled elevated areas, secluded from the commoners [76]. The archaeological features characterising these sites are vitrified dung byres, nonvitrified dung byres, middens, grain bin stands, and pottery scatters. The great majority of archaeological sites occupied by late farming communities in the SLCA appears as open spaces within woodland vegetation especially those characterised by vitrified and nonvitrified dung. The aforementioned differences in vegetation cover might possibly be influenced by the chemical composition of vitrified and nonvitrified dung sites which was found to be different from that of their surroundings [6, 82].

Middens are areas where the general waste of a household, including remains of unused materials such as broken potsherds, animal bones, beads, and other utensils and ashes from fireplaces, were discarded [83, 84]. Middens differ in size depending on the duration and density of site occupation [75]. Some of the middens in areas classified as capitals, such as K2, reached a diameter of 182.88 meters and a depth of 6 meters [79, 85, 86]. Vitrified dung and nonvitrified dung are two types of dung deposits in the study area, which indicates areas where animals were kept in the settlement. Vitrified dung is formed by the burning of dung deposits which is at least more than a meter in thickness, at very high temperatures (in the region of 1100°C) [87, 88]. Vitrified dung contains high quantities of nitrates and phosphates which makes it impossible for some grasses to grow on them [87]. Nonvitrified dung is characterised by unburned dung deposits. The byres for both livestock and cattle have an average diameter of 3 meters and 18 meters, respectively (T. N. Huffman personal communication, September, 2018).

2.2. Soil Sample Collection and Analysis

Soil samples for three archaeological features (middens, vitrified, and nonvitrified dung areas) and bare soil (nonarchaeological site) from the surrounding natural landscape were collected in February 2017 for lab spectral measurement and chemical analysis. A purposive sampling technique was used during the fieldwork data collection by visiting sites which appear in the literature and are affirmed to have dung deposits and middens [4, 89]. In order to avoid contamination with the archaeological features due to wind and water erosion, bare soil samples were collected from areas far away from the archaeological sites. A total of 356 samples were collected across the study area at 0–20 cm depths which corresponded to the surface horizon at each location. Only top soil was sampled because optical sensors cannot penetrate the surface to map subsurface soil properties; therefore, it will be fruitless to sample horizons beneath the top soil for spectral analysis [89]. In addition, a GPS point for each sample was taken for spatial reference. Between 60 and 117 samples were collected from nonsites, middens, vitrified dung, and nonvitrified dung sites in the field (Figure 3). All the collected soil samples were packaged in zip-lock plastic bags for field spectral measurements and chemical analysis in the laboratory.

2.3. Laboratory Spectral Data Acquisition

Soils collected in the field were air-dried and sieved to 2 mm [39] before being flattened on a black plastic plate to create a smooth surface. Spectral reflectance measurements were carried out in a controlled environment using the analytical spectral device (ASD) FieldSpec® 4 optical sensor with 350-2500 nm spectral range [91, 92]. The analytical spectral device (ASD) FieldSpec® 4 optical sensor capture data insampling intervals of 1.4 nm between 350-1000 nm and 2 nm between 1001-2500 nm [91]. The spectral measurements were taken from the surface of each soil sample at the nadir position with 10 mm field of view using the Hi-Brite contact probe fitted with the 100W halogen reflector lamp. The spectrometer was calibrated using a white spectralon reference panel after every 10 to 15 measurements. Soil samples from each bag were randomly divided into three samples. Three spectral measurements were taken per each sample by randomly moving the probe over the soil surface. The nine (n = 9) spectral measurements were then averaged to represent the whole soil sample (Figure 3). All 2151 bands were included in the analysis because the data were collected within a controlled environment, so there was no need to remove spectral bands to improve the signal-to-noise ratio. The reference data were arbitrarily separated into training (70%) and test (30%) datasets.

2.4. Soil Analysis

The samples were analysed for the composition of 33 elements by ALS in Johannesburg. The samples were air-dried and dry-sieved using a 180-micron screen (Tyler 80 mesh). Thereafter, 0.25 grams of a readied sample was then digested with perchloric, nitric, hydrofluoric, and hydrochloric acids. Thereafter, a dilute hydrochloric acid was mixed with the residue, and the resulting solution was analysed using inductively coupled plasma-atomic emission spectrometry [93].

2.5. Statistical Methods for Soil Analysis

Descriptive statistics were performed for the individual soil classes to get the typical values, mean, median, and standard deviation of the concentration of elements within each class. The variation of the concentration of elements between and within classes was measured using the coefficient of variation and interquartile range. The two aforementioned measures of variation were chosen because of their ability to get rid of the challenges associated with outliers [94, 95]. The coefficient of variation also has the ability to standardise data for comparing the variability of two or more distributions from different or the same data with different means [94, 96]. Levene’s test of homogeneity was used to test for homogeneity of variances, and Welch’s analysis of variance (ANOVA) was used whenever homogeneity was violated. ANOVA was used to test if there are statistically significant differences on the average composition of chemical elements between nonsites and each archaeological features: middens, nonvitrified dung, and vitrified dung [97, 98]. The Games–Howell post hoc test was performed to find out which elements are significantly different across the classes since variances were heterogeneous and sample sizes were unequal. The level at which differences between the means were considered significant was set at . However, the major limitation of ANOVA is that it is not linked to any machine learning classifier; therefore, it does not measure the importance of each element in the prediction model [99].

To curb the aforementioned limitations, RF was combined with forward variable selection (FVS) procedure in an attempt to locate the ideal subset of elements with the least classification error [100]. RF was used for ranking the variables based on the importance score of the mean in decrease accuracy determined using out-of-bag (OOB) data and evaluating the classification accuracy. Then, a stepwise procedure was employed for FVS, whereby elements were added into the model according to their importance beginning with the most important one [101]. The method continued by repetitively constructing new RF models while adding a single element in each iteration and recording the OOB error. The parameters for the number of trees to be grown (Ntree) and the number of variables needed to split each node (Mtry) were optimised using grid search at each iteration. This procedure was reiterated until all elements were utilised; then, the smallest subgroup of elements with the least OOB error was identified. The optimal elements chosen were then used as input variables to construct two different classification models in the RF classifier. In the first model, the subset of optimal elements was used to classify nonsites, middens, vitrified dung, and nonvitrified dung sites. In the second model, the subset of optimal elements was combined with the optimal bands chosen by GRRF and classification of nonsites, middens, vitrified dung, and nonvitrified dung sites was done. This was done in order to check if there would be an improvement in the classification accuracy of the general model when optimum bands and soil elements are combined.

2.6. Using Guided Regularised Random Forest for Variable Selection

GRRF is a feature selection algorithm which applies some form of regularisation to different types of decision trees models, performing a selection of feature subsets [66, 102]. The regularisation is guided by scores of feature importance measured by the traditional RF using the Gini index. Gini importance assesses the level of impurity of each variable in relation to the importance it gained over others, in the sampled set of variables, and selects the optimal split at each node [103, 104]. The Gini index at a node y can be defined as follows:where denotes the proportions of observations for class k at node y. The Gini information gain (hi, ) is then calculated as the difference between the Gini index at node y and the weighted mean of Gini indexes at each child node of y. Gain information of feature hi based on impurity at node y can be defined as follows:where Gini (yL) and Gini (yR) represent the impurities, while and are the weights for the left and right child nodes.

In GRRF, regularisation is added to the gain information from the traditional RF and each individual feature is given a penalty coefficient. The regularised information gain is defined as follows:where is the coefficient of regularisation for and F is the set of features chosen in the preceding nodes. is computed basing on the importance score of from the traditional RF as follows:where is the importance coefficient and is the base coefficient controlling regularisation. Regularisation allows the model to reduce redundancy associated with tree models of feature selection, by only choosing a new feature for splitting data in a tree node if it yields different information from the feature that was chosen in the earlier split. Unlike other feature selection methods, the regularised framework allows for the construction of a single model at a time therefore reducing the time needed to train the model [102]. GRRF also has the ability to select the ideal subset of variables with the lowest misclassification error. In this study, GRRF was used to select the key wavelengths to accurately discriminate among nonsites, middens, vitrified dung, and nonvitrified dung. This was done so as to reduce the high dimensionality inherent within the hyperspectral data [101, 105, 106]. The key wavelengths selected by GRRF were then utilised as input variables in the traditional RF algorithm to discriminate among the nonsites.

2.7. Random Forest Classifier

RF classifier was used to discriminate between nonsites, middens, vitrified dung, and nonvitrified dung using key elements selected by FVS and the key wavelengths selected by GRRF. The RF classification algorithm has been extensively employed in the classification of both hyperspectral and multispectral data [106109]. Generally, RF can be described as an ensemble of classifiers which creates binary decision trees and assigns class basing on majority votes at each node [106]. The decision trees are grown independently of each other using different samples, facilitated by bagging which randomly creates subsets of the original datasets, with replacement, for each node [110]. The variables which are not included in the bootstrap sample, which makes a third of the data, are called the out-of-bag (OOB) sample [110, 111]. Each tree grows without being pruned [110]. However, Ntree, the default number is 500 trees, and Mtry, the default is the square root of the total number of variables , are defined by the user [106]. The Mtry and Ntree have to be optimised, in order to archive high classification accuracies [112]. This study identified the best combination of Mtry and Ntree parameters using a grid search based on the OOB approximation of error [113]. The optimisation of Ntree was done using values ranging between 500 and 10000 at the interval of 500, while the Mtry was optimised using a multiplicative factor of its default.

RF has inherent measures of variable importance and the ability to estimate prediction accuracy [111]. RF estimates prediction accuracy by cross-validating the bagging sample with a third of the data being excluded from the sample [104]. The classification error from these accuracy predictions is called the OOB error. Variable importance helps in understanding the relevance of each variable predictor in data classification. Variable importance measures in RF comprise mean variable importance/mean decrease accuracy (MDA) and Gini importance [103]. MDA was calculated using OOB observations and was used to rank elements used as input variables in the FVS model in this study. High MDA values indicate important variables in the classification, while low values represent variables which are less important in the classification. The Gini index was used to assess the importance of different spectral bands in discriminating different archaeological sites. High Gini impurity indicates heterogeneity between classes, while a lower Gini impurity indicates homogeneity within classes. Therefore, features with a less mean decrease in the Gini index are of less importance in the classification because they do not play any role in splitting the data into classes.

2.8. Accuracy Assessment

The accuracy assessment was done using a holdout dataset created by randomly dividing the laboratory data into 70% training and 30% testing before classifying it. Misclassification was assessed using the OOB error, which is an internal process of estimating the RF error. An error matrix was done to calculate the user’s accuracy, producer’s accuracy, and the overall accuracy for the assessment of classification accuracy of the RF classifier. Kappa coefficient was used to evaluate the agreement between the reference data and the classifier because of its ability to compensate for chance agreement [114]. A perfect agreement is achieved if the kappa value is one or close to one [115].

3. Results

3.1. Statistical Analysis

This study used statistical methods to assess the concentration of elements within natural soils, middens, vitrified dung, and nonvitrified dung. The soil analysis was done on samples taken from each of the abovementioned classes. The descriptive statistics for the concentration of different elements are summarized in Tables 14 for natural soils, vitrified dung, nonvitrified dung, and middens, respectively. The statistical analysis revealed high variability of the concentrations of most elements within each soil class with the coefficient of variation between 12% and 50%. The concentrations of Zn and Cu within the middens were very highly variable with a coefficient variation of 53.6% and 56.9%, respectively. The concentrations of most elements across different classes are also highly variable. The mean concentrations of P, Mn, Ca, Mg, Zn, and Ba were very high in vitrified dung. Nonvitrified dung and middens also had comparatively higher concentrations of the abovementioned elements than nonsites (Tables 14).

The results obtained using Welch’s ANOVA () demonstrate that there are significant differences between the means of different elements across different soil classes. Comparisons of the average concentration of individual elements between pairs of soil classes have shown that only the concentration of phosphorus (P) was significantly different across all four classes vitrified dung, nonvitrified dung, middens, and natural soils (Games–Howell, ). P has shown significant p values, ranging from 0.004 between middens and nonvitrified dung to 0.000 between middens and nonsites. Other elements such as Mg, Na, Be, Fe, Mn, S, and Zn also recorded insignificant differences between two or more soil classes. For example, Mg has shown significant p values, ranging from 0.003 between middens and nonsites to 0.000 between middens and vitrified dung. However, an insignificant p value of 0.379 for the Mg element was recorded between middens and nonvitrified dung.

3.2. Variable Importance and Measurement

MDA measure inbuilt within the RF classification algorithm was used to measure the importance of each element in discriminating amongst nonsites, middens, vitrified dung, and nonvitrified dung sites. Generally, Ca, P, and Sr were the top most important elements in discriminating among the aforementioned classes, as shown in Figure 4. Ga, Bi, and Th were the least important elements in discriminating amongst nonsites, middens, vitrified dung, and nonvitrified dung sites (Figure 4). When assessing the importance of each element in discriminating among individual classes, Sr, Al, and Ca were the most important elements in differentiating sites with vitrified dung from the rest of the sites (Figure 5). P and Mg were important in discriminating nonsites, middens, vitrified dung, and nonvitrified dung.

Basing on the measurements of variable importance (MDA) provided by RF, the FVS procedure was used to find the smallest set of elements that resulted in the highest predictive accuracy in classifying nonsites, middens, vitrified dung, and nonvitrified dung sites using RF. The optimal selected predictor variables (elements) with the lowest OOB error rate (15.38%) were P, Ca, Sr, Mg, Fe, Zn, and Co. When using all elements, the error rate increased to 17.31% (Figure 6). The selected elements were then used as input variables in a RF classifier model for mapping nonsites, middens, vitrified dung, and nonvitrified dung sites.

All the wavelengths captured using the field spectrometer were input into a RF classification model. Mean decrease in the Gini index in an ordinary RF was used to assess the importance of variables in differentiating between vitrified dung, natural soils, middens, and nonvitrified dung sites. In general, the highest mean decrease in the Gini index occurred in the wavelengths within the visible spectrum (350–576 nm), with 513 nm being the most important wavelength of them all. However, important variables cover a wide range of electromagnetic (EM) waves from visible wavelengths to the near-infrared wavelengths with notable peaks, between 350 and 576 nm, 1292 nm–1380 nm, 1575 and 1748 nm, and 1801 and 1808 nm (Figure 7).

The GRRF was used to select the best wavelengths for classifying vitrified dung, middens, nonvitrified dung, and natural soils. This was done using variable importance scores for each wavelength obtained from the normal RF to guide the selection of features in a regularised RF. The best-selected wavelengths were 549 nm and 624 nm within the visible spectrum, while within the near-infrared, 996 nm, 1026 nm, 1665 nm, 1774 nm, 1934 nm, and 2290 nm were chosen, as shown in Figure 8. The selected wavelengths were then inputted into the RF classifier to map nonsites, middens, vitrified dung, and nonvitrified dung sites.

3.3. Accuracy Assessment

The prediction ability of RF was tested using optimal elements selected by the FVS procedure and all elements. A holdout dataset, which was created by randomly dividing data into training (70%) and testing (30%), was used to test the accuracy of both models. RF was optimised using the grid search with the optimum combination of Mtry and Ntree achieving the lowest OOB error of about 15.38%. The findings demonstrate that the sites can be more accurately mapped using the selected seven bands than when using all the thirty-three elements (Table 5). The user’s and producer’s accuracies of the two models are compared in Table 6.

Classifications on the ordinary RF algorithm were also done using optimal bands selected by the GRRF model. The optimum combination of Mtry and Ntree yielded the lowest OOB error of about 0.11. Overall, the classification model achieved an accuracy of 84.76% when using all wavelengths. However, a higher overall accuracy of 87% was achieved when using the optimal bands selected by GRRF (Table 7). Table 8 shows a comparison between the user’s and producer’s accuracies of the aforementioned datasets.

A combination of optimum elements and bands were also put into the ordinary RF classifier to see if they can improve the classification accuracy. The results show that a combination of optimal elements and bands produce a better classification accuracy than predictive models for all bands, all elements, and when only selected elements are used. The model achieved an overall accuracy of 85.71% (Table 9). The optimal Mtry and Ntree for the model produced the lowest OOB error of about 14.29%.

4. Discussion

The main aim of this study was to assess the possibility of using hyperspectral data to discriminate nonsites, middens, nonvitrified dung, and vitrified dung site characteristics of areas previously occupied by farming communities. It also assessed if the aforementioned classes can be distinguished based on their chemical composition. The findings of this study have shown that there is a significant difference in the composition of elements characterising archaeological features. Most importantly, it has also shown that remote sensing techniques can be used to map surface archaeological features. This is an important development in the archaeological survey because the use of remote sensing techniques will enable the fast and cheaper documentation of sites over large areas.

Overall, the results for statistical analysis indicate that there is a significant difference in the concentration of elements in nonsites, middens, vitrified dung, and nonvitrified dung sites. This is because different anthropogenic activities have different effects on the composition of soil elements [15, 116]. The concentration of P was significantly different across the nonsites, middens, vitrified dung, and nonvitrified dung byre. On average, vitrified dung had high concentrations of P followed by nonvitrified dung, middens, and nonsites. This is unsurprising since phosphorus is widely used in archaeological research as an indicator of different human activities [117, 118]. Phosphorous is incorporated into the soil by a number of human activities, which include food preparation, garbage disposal, and animal dung deposits; therefore, there is a need to analyse its concentration in combination with that of other elements in order to identify different activity areas [10, 116]. In this study, phosphorus was incorporated into the soil by deposits of animal dung in byres and ash and organic deposits in middens. Other elements such as Ca and Mg were significantly different among vitrified dung and the nonsites but were not significantly different between middens and nonvitrified dung. Generally, Mg and Ca were highly concentrated in activity areas with vitrified dung, nonvitrified dung, and middens than in nonactivity areas characterised by nonsites. The concentrations of the aforementioned elements within soil classes were also highly variable. This is also supported by the findings from Luzzadder-Beach et al. [5] on the concentration of elements on anthropogenic activity areas in Turkey and Mexico and Huffman et al. [6] on the concentrations of Ca in vitrified dung deposits within the study area. This is influenced by the differences in the concentration of Ca and Mg in grasses consumed by animals and wood ash, which increase their levels in the soil [68, 119, 120]. The differences in concentrations of elements within the same class can also be a result of depositional and postdepositional processes, which affect the overall concentration of elements in the soil, such as human activities, erosion, and leaching [6, 7, 121]. On the contrary, potassium (K) is insignificantly different across all the classes except vitrified dung. This supports Huffman et al. [6] findings that K2O was significantly higher in vitrified dung than in nonvitrified dung. It is still not yet clear as to what causes the high levels of K in the vitrified dung deposits; however, Huffman et al. [6] attributed some of it to mopane logs used to construct the fence around the kraal.

The FVS procedure basing on the importance score of features calculated by ordinary RF was used to select a subset of key elements that can accurately discriminate among different archaeological sites. P, Ca, Sr, Mg, Fe, Zn, and Co were chosen as the optimal elements for discriminating among middens, nonsites, nonvitrified dung, and vitrified dung sites. This is in line with studies by Wilson et al. [2] who found out that the composition of the aforementioned elements is affected by the presence of human activity areas such as middens and kraals. The concentration of the selected elements was also significantly different across different classes as discussed above. The classification done on the RF classifier using chosen optimal elements as input variables in the RF classifier yielded high classification accuracy. This demonstrated that classification algorithms can also be used to predict the sites using their chemical composition. This also confirms findings by Oonk and Spijker [14] that archaeological sites can be predicted by using elements as input variables in classification algorithms.

The spectral analysis results obtained from this study has shown that field spectroscopy data can be utilised to discriminate nonsites, middens, nonvitrified dung, and vitrified dung from each other. The GRRF model was used to reduce high dimensionality in the dataset [66, 68]. This algorithm has produced good results in vegetation mapping but has never been tested in soil analysis [67, 68]. The algorithm selected eight important wavelengths across VIS/IR spectrum, 549 nm, 624 nm, 996 nm, 1026 nm, 1665 nm, 1774 nm, 1934 nm, and 2290 nm, for discriminating among nonsites, middens, nonvitrified dung, and vitrified dung. The selection of wavelengths from the visible spectrum (549 nm and 624 nm) might be influenced by differences in soil organic content. This is consistent with the results of studies by Bartholomeus et al. [122] and Nolet et al. [123] who found that there is a correlation between the absorption of wavelengths within the visible spectrum and the amount of organic matter in the soil. Soil organic content is also associated with soil colour. This is because a high concentration of organic matter results in dark soils and high wavelength absorption [124]. Wavelengths 996 nm and 1026 nm in the near-infrared region can be correlated with the concentration of elements such as Mg and Ca or their compounds. This result is supported by Thomasson et al. [125] who found out that concentration Ca and Mg levels in soil are sensitive to spectral regions between 950 and 1500 nm. The absorptions at other four selected wavelengths 1665 nm, 1774 nm, 1934 nm, and 2290 nm can be associated with the concentration of phosphorus in the samples. The increased levels of phosphorous highly correlate with the absorption of wavelengths between 1500 nm and 2500 nm [126]. Concentrations of calcium phosphate and magnesium phosphates in the soil also show high correlation with reflectance spectra in the aforementioned wavelength regions [127]. However, it has to be noted that different wavelengths within the same spectral region from those chosen by other models in other researches might have been chosen because the model was trained to remove correlation within the bands.

In general, high classification accuracies were achieved when using optimal bands selected by GRRF than when using all the 2151 bands. Low classification accuracy achieved when using all the 2151 bands is caused by high autocorrelation inherent within the field spectroscopy data, which affects the performance of the prediction models [101, 128, 129]. The results of this study demonstrate that removing correlated variables improves classifiers prediction accuracy. Most importantly, high classification accuracies achieved using a subset of optimal wavelengths selected by the GRRF affirms its ability to select important wavelengths that improve the prediction accuracies of RF in classifying archaeological sites [67, 68]. The combination of key elements (n = 7) and key wavelengths (n = 8) has yielded lower classification accuracy compared to the use of key elements (n = 7) and key wavelengths (n = 8) separately. Low classification accuracy achieved when combining key elements with key wavelengths was caused by data redundancy which resulted in noisy classification output [101, 128, 130]. The redundancy comes from the fact that wavelengths provide information about both the physical and chemical properties of the soil [31, 126], which means that combining the elements and spectral data in this study produced redundant data. Above all, the wavelengths selected by the GRRF model corresponds with the band placement of some spaceborne and airborne sensors such as WorldView-2, Landsat 8 OLI Airborne Visible Infrared Imaging Spectrometer (AVIRIS), Hymap, Hyperion, Airborne Prism Experiment (APEX), and Compact High Resolution Imaging Spectrometer (Chris) [131135]. However, spectrometer captures data in finer details with less noise and narrow spectral bands as compared to broad bands captured by common airborne and spaceborne sensors. This is because the sensor abilities, such as spatial sampling and signal-to-noise ratio, decline with increasing altitude [29, 136138]. The aforementioned tradeoff in spectral and spatial resolution may hinder the use of airborne and spaceborne sensors in mapping most of archaeological features because of their subtle spectral differences and small nature compared to the pixel size [139]. Hence, the need to carry out a test study identifying a suitable airborne or spaceborne sensor for mapping archaeological features in the study area.

Generally, high classification accuracies achieved in this study show that it is possible to directly detect archaeological features such as middens, vitrified byres, and nonvitrified byres using field spectroscopy data. This, therefore, promises a cost-effective method, which can be used to carry out archaeological surveys over large areas within a short period when compared to the use of element data and field walking surveys. Surveying using remote sensing techniques will also make documentation and monitoring of archaeological sites located in inaccessible areas such as war zones and places with dangerous wild animals easy.

5. Conclusion

The focus of this study was to investigate whether field spectra measurement can discriminate amongst archaeological sites using soil properties as indicators and identify the important bands for doing so. Statistical methods were used to assess if there is a significant difference in the concentration of elements between archaeological features and natural soils. Based on the outcomes of this study, the following inferences can be made:(1)There is a significant difference in the concentration of elements between nonsites, middens, nonvitrified dung, and vitrified dung sites. This difference in the composition of elements within the aforementioned features can be used to discriminate among them when input into a classification algorithm. P, Ca, Sr, Mg, Fe, Zn, and Co were identified as the important elements for discriminating among nonsites, middens, nonvitrified dung, and vitrified dung sites when used as input variables in a classification model.(2)Field spectroscopy data have the ability to discriminate between nonsites, middens, nonvitrified dung, and vitrified dung sites. This means that nonvitrified dung, vitrified dung, middens, and nonsites have different spectral signatures.(3)Wavelengths within visible-near-infrared spectrum can be used to discriminate among natural soils, middens, vitrified dung, and nonvitrified dung byres. A subset of eight important bands that gave the highest classification accuracy when discriminating among the aforementioned classes was identified across the visible-near-infrared spectrum using GRRF. These were 549 nm and 624 nm within the visible spectrum while within the near-infrared wavelengths 996 nm, 1026 nm, 1665 nm, 1774 nm, 1934 nm, and 2290 nm were chosen.

In summary, the results of this study have shown that there is chemical contrast between archaeological features such as middens, vitrified dung byres, and nonvitrified dung byres and natural soils which make it possible for field spectroscopy to discriminate among them. As such, the potential of remote sensing in detecting and mapping archaeological features with distinct soil physical and chemical characteristics such as the ones used in this study is present. Although the potential is present, further studies are needed to upscale field spectral measurements to different sensor spectral resolutions to ascertain which satellite sensor has the optimum wavelengths for detecting the archaeological sites characterised by middens, vitrified dung, and nonvitrified dung.

Data Availability

The data on soil elements and spectral properties used to support the findings of this study have not been made available because we are still using it for further research.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

Thanks to the training department at the University of Botswana for supporting this research. Gratitude is also given to SANParks for allowing access to Mapungubwe National Park and the DeBeers Group (through Duncan MacFadyen) for allowing access to Venetia Nature Reserve and to use the research facility. The authors are also grateful to Prof. Thomas Huffman for availing his data and devoting his time by taking us through the study area and Ihnos Dhau for his assistance in fieldwork during data collection and data processing in the lab. Special thanks are due to Lesego Madisha (former archaeologist at SANParks, Mapungubwe) for her kind assistance, SANParks’ Cultural Heritage Manager Crispen Chauke, and the Venetia Nature Reserve staff for their help. This work was funded by the University of Botswana and the University of the Witwatersrand.