#### Abstract

Soil-vegetation interrelationships in a secondary forest of South-Southern Nigeria were studied using principal component analysis (PCA) and canonical correlation analysis (CCA). The grid system of vegetation sampling was employed to randomly collect vegetation and soil data from fifteen quadrats of 10 m × 10 m. The PCA results showed that exchangeable sodium, organic matter, cation exchange capacity, exchangeable calcium, and sand content were the major soil properties sustaining the regenerative capacity and luxuriant characteristics of the secondary forest, while tree size and tree density constituted the main vegetation parameters protecting and enriching the soil for its continuous support of the vegetation. The forest had undergone decades of anthropogenic disturbance (food crop cultivation and illegal logging activities) before its acquisition and subsequent preservation by the Cross River State government in 2003. In addition, canonical correlation analysis gave results similar to those of PCA, as it indicated a pattern of relationship between soil and vegetation. The only retained canonical variate revealed a positive interrelationship between organic matter and tree size as well as an inverse relationship between organic matter and tree density. These extracted soil and vegetation variables are therefore important in explaining soil-vegetation interrelationships in the highly regenerative secondary forest.

#### 1. Introduction

Soil and vegetation exhibit an integral relationship, in that soil gives support (moisture, nutrient, and anchorage) to vegetation to grow effectively on the one hand, and on the other hand, vegetation provides protective cover for soil, suppresses soil erosion, and helps to maintain soil nutrient through litter accumulation and subsequent decay (nutrient cycling). Hence, vegetation and soil are interrelated and provide reciprocal effects on each other. Vegetation supports critical functions in an ecosystem at different spatial scales. Vegetation strongly affects soil characteristics, including soil volume, chemistry, and texture, which feed-back to affect various vegetation characteristics, including productivity, structure, and floristic composition [1].

Soil nevertheless is fundamental to ecosystem and agricultural sustainability and production because it supplies many of the essential requirements for plant growth like water, nutrients, anchorage, oxygen for roots, and moderated temperature [2, 3]. Soil serves a vital function in nature, providing nutrients for plant to grow as well as habitat for millions of micro- and macro-organisms. Healthy soil enables vegetation to flourish, releases oxygen, holds water and diminishes destructive storm runoff, breaks down waste materials, binds and breaks down pollutants, and serves as the first course in the larger food chain [4, 5]. The disturbance, compaction, and degradation of soils impact the soil structure and reduce its ability to provide these functions.

The lowland secondary forest of Tinapa Resort is a regenerative forest approaching climax, after series of anthropogenic disturbances notably food crop cultivation, fuelwood gathering, and illegal logging activities. The forest vegetation over nine years of abandonment following its acquisition by the Cross River State government is characterized by a luxuriant forest canopy and diverse tree species. There is therefore a need to identify the basic soil and vegetation parameters encouraging the regenerative capacity of this once degraded forest and impoverished soil [6]. This reciprocal relationship between soil and vegetation demands a multivariate approach in order to determine critical vegetation and soil properties that sustain this integrative association. On this premise, multivariate analytical techniques (principal component analysis, canonical correlation analysis, factor analysis, and canonical correspondence analysis among others) are very useful in the analysis of soil and vegetation as each consists of data corresponding to a large number of variables. Thus, analysis via these techniques produces easily interpretable results [7].

Studies explaining the reciprocal effect of soil and vegetation have been conducted in the past by scholars in the fields of ecology, geography, forestry, and soil science using varying multivariate approaches. For instance, studies on soil-vegetation relationships of saline localities have been documented [8–14]; studies on soil-vegetation relationships in tropical rainforests have also been conducted [15–17]. However, in Nigeria, studies on soil-vegetation interrelationships in the rainforest zones show locational bias, as most of the studies were carried out in the South-Western ecological zone [18].

Perhaps only the study by Ukpong [19] used multivariate analyses to determine soil-vegetation interrelationships in the coastal mangrove swamps of Cross River; however, multivariate analysis of soil-vegetation interrelationships using PCA in the rainforest belt of Cross River State has not been fully documented in the literature. It is against this background that the present study attempts to analyse soil-vegetation interrelationships in the secondary forest of Tinapa Resort, Cross River State. The aim is to identify significant soil properties that influence vegetation regeneration and productivity as well as vegetation parameters that help to protect and nourish the soil to sustain its luxuriant outlook.

#### 2. Materials and Methods

##### 2.1. Study Area

Tinapa is located between latitudes 05° 02′ and 05° 04′ N and longitudes 08° 07′ and 08° 22′ E. The area falls along the coastal fringes of Cross River State, where the rainy season lasts for about 10 months. The proximity of the Atlantic Ocean has a moderating effect on temperature, with a highest average daily maximum of 35°C and a recorded mean actual temperature of 26°C. The area has an average relative humidity of 80–90% at 10 am during the wet season [20]. The vegetation of the area is a mixture of mangrove and rainforest. The rainforest is further subdivided into the lowland rainforest and the freshwater swamp forest. The mangrove swamp is found in the southern fringe of the area and stretches from the freshwater limits to the ocean beaches [6]. The rich and luxuriant vegetation of the area was subjected to severe degradation before the advent of Tinapa; as such, the vegetation comprises secondary forest approaching climax. The soils are generally deep, porous and weakly structured, and well drained with low moderate status [20].

##### 2.2. Sampling Procedure and Analysis

The grid system of vegetation sampling was used to superimpose grids of 2 cm × 2 cm on the lowland forest using the vegetation map of the area; the grid intersections were numbered, and 15 grids were then randomly selected. This approach was used to establish fifteen quadrats of 10 m × 10 m across the area, and in each quadrat, vegetation and soil samples were collected. The floristic and structural vegetation parameters determined included tree density, species composition, tree height, tree size/girth, vegetation/crown cover, species diversity index, and above-ground biomass. Data on vegetation/crown cover were obtained using the line-intercept method [21–23]. Tree size/girth was measured at breast height (1.37 m). Species diversity index was determined using the Shannon-Wiener approach [24, 25]. Tree height was determined using the trigonometric method [26, 27]. The above-ground biomass was estimated using the allometric formula given by FAO (1989) for tropical areas, as cited by Woomer [28]:

$$y = \exp(-2.134 + 2.53 \ln D),$$

where *y* is the above-ground biomass in kilogrammes, exp denotes the exponential function (base *e* = 2.71828), and *D* is the measured diameter at breast height in cm. However, during the collection of vegetation parameters, only mature trees with ≥ 0.30 m girth were enumerated and analyzed.
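The biomass formula and the diversity index mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the girth-to-diameter conversion assumes a circular stem, and the Shannon-Wiener index is written with natural logarithms, which the text does not specify.

```python
import math


def diameter_from_girth(girth_cm: float) -> float:
    """Convert girth (circumference) to diameter, assuming a circular stem."""
    return girth_cm / math.pi


def above_ground_biomass_kg(dbh_cm: float) -> float:
    """Allometric estimate y = exp(-2.134 + 2.53 ln D), with D in cm and y in kg."""
    if dbh_cm <= 0:
        raise ValueError("diameter at breast height must be positive")
    return math.exp(-2.134 + 2.53 * math.log(dbh_cm))


def shannon_wiener(counts) -> float:
    """Shannon-Wiener index H' = -sum(p_i ln p_i) over species with a nonzero count.

    The natural logarithm is an assumption; other bases only rescale H'.
    """
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical usage: a tree at the 0.30 m (30 cm) girth threshold.
threshold_dbh_cm = diameter_from_girth(30.0)
threshold_biomass_kg = above_ground_biomass_kg(threshold_dbh_cm)
```

Summing `above_ground_biomass_kg` over the enumerated trees of a quadrat would give a per-quadrat biomass value, if that is how the quadrat totals were formed.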

In the same way, fifteen (15) soil samples were collected using a soil auger at a rooting depth of 30 cm. The soils were put in labelled polythene bags; they were thereafter air dried and taken to the laboratory at the Department of Agronomy, University of Ibadan, Ibadan, for analysis of soil physical and chemical properties. Particle size composition was determined using the hydrometer method [29]; organic carbon by the Walkley-Black method [30], after which the values obtained were multiplied by 1.72 [31] to convert them to organic matter; total nitrogen by the Kjeldahl method [32]; and available phosphorus by the method of Bray and Kurtz [33]. The soils were leached with 1 N neutral ammonium acetate to obtain leachates used to determine exchangeable bases, following a method adapted from Daly et al. [34]. Soil cation exchange capacity was determined by the summation method, while pH values were determined using a glass electrode testronic digital pH meter with a soil : water ratio of 1 : 2.
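The two derived quantities in this paragraph are simple arithmetic and can be sketched as follows. The function names are hypothetical. The text does not say which terms entered the summation for cation exchange capacity, so including exchangeable acidity alongside the exchangeable bases is an assumption, as is the base saturation formula shown.

```python
def organic_matter_percent(organic_carbon_percent: float, factor: float = 1.72) -> float:
    """Convert Walkley-Black organic carbon (%) to organic matter (%) using the 1.72 factor."""
    return organic_carbon_percent * factor


def cec_by_summation(ca: float, mg: float, k: float, na: float,
                     exchangeable_acidity: float = 0.0) -> float:
    """Sum exchangeable cations (all in the same units, e.g. cmol/kg).

    Including exchangeable acidity is an assumption, not stated in the text.
    """
    return ca + mg + k + na + exchangeable_acidity


def base_saturation_percent(ca: float, mg: float, k: float, na: float, cec: float) -> float:
    """Share of the exchange capacity occupied by exchangeable bases, in percent."""
    return 100.0 * (ca + mg + k + na) / cec
```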

##### 2.3. Data Analysis

Two data matrices representing soil and vegetation characteristics were constructed, and the SPSS for Windows (ver. 17.0) and SAS (ver. 9.0) software packages were used to perform principal component analysis (PCA) and canonical correlation analysis (CCA), respectively. PCA was performed to find the main factors determining the reciprocal effects of soil and vegetation. According to Li et al. [14], principal components are considered useful if their cumulative percentage of variance approaches 80%. The rotated component loadings (correlation coefficients) from the PCA output were used to determine the main soil and vegetation components sustaining the regeneration, enrichment, moisture content, and productivity of forest vegetation. The rotated component loadings for the variables were determined using Varimax rotation (variance maximization); this method was applied because it helps to minimize the complexity of the components by making large loadings larger and small loadings smaller within each component [7, 35, 36]. The idea of Varimax rotation is that each variable should load heavily on as few components as possible to make interpretation easier [37, 38]. Variables were also rotated to obtain new significant and uncorrelated variables called principal components or principal axes, and thereafter, the number of principal components was reduced by eliminating relatively unimportant components [7].
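For readers who want to see what Varimax rotation does to a loading matrix, the following NumPy sketch implements the commonly used iterative SVD form of the algorithm. It is an illustration only: the study used SPSS, the `varimax` function name is hypothetical, and the small `unrotated` matrix contains made-up loadings, not values from this study.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x components) loading matrix by Varimax.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    R = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        L = A @ R
        # Gradient of the Varimax criterion with respect to the rotation.
        grad = A.T @ (L**3 - L @ np.diag((L**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt
        current = s.sum()
        if previous != 0.0 and current < previous * (1 + tol):
            break
        previous = current
    return A @ R, R


# Made-up unrotated loadings for four variables on two components.
unrotated = np.array([
    [0.80, 0.40],
    [0.75, 0.45],
    [0.60, -0.55],
    [0.55, -0.60],
])
rotated, rotation = varimax(unrotated)
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) and the total variance explained are unchanged; only the distribution of variance among components shifts.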

In order to determine the main components, only principal components with eigenvalues greater than 1 were selected; components with an eigenvalue of less than 1 accounted for less variance than did an original variable (which had a variance of 1) and were therefore of little use, so they were not extracted. From each extracted component, variables with coefficients ≥ ±0.70 were selected and considered significant (Johnston, 1980 and Wotling et al., 2003 quoted in [39]). However, in order to determine the basic soil and vegetation variables sustaining these interrelationships, the component defining variables (CDVs), that is, those variables with the highest loadings (correlation coefficients) on each extracted principal component for soil and vegetation, were selected to represent the extracted components because they provide the best relationship [40].
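The eigenvalue-greater-than-1 rule and the ±0.70 loading threshold can be sketched with NumPy as below. This is a hedged illustration rather than the SPSS procedure used in the study: `pca_kaiser` is a hypothetical helper, and `demo` is random data with the same shape as the vegetation matrix (15 quadrats by 7 parameters), not the measured values.

```python
import numpy as np


def pca_kaiser(data, min_eigenvalue=1.0):
    """PCA on the correlation matrix, keeping components with eigenvalue > min_eigenvalue.

    Returns the retained eigenvalues, the unrotated loadings
    (variable-component correlations), and the variance share of each component.
    """
    X = np.asarray(data, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # returned in ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > min_eigenvalue
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum()
    return eigvals[keep], loadings, explained


# Random stand-in data: 15 quadrats x 7 parameters (not the study's measurements).
rng = np.random.default_rng(0)
demo = rng.normal(size=(15, 7))
eigenvalues, loadings, explained = pca_kaiser(demo)

# Boolean mask of loadings that meet the |loading| >= 0.70 significance threshold.
significant = np.abs(loadings) >= 0.70
```

Since each standardized variable contributes a variance of 1, the eigenvalues of the correlation matrix sum to the number of variables, which is why an eigenvalue below 1 means the component explains less than a single original variable.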

In other words, on every extracted component for soil and vegetation, the variable with the highest coefficient was selected to represent that component; for example, tree size was selected on vegetation component I. In addition, canonical correlation analysis (CCA) was performed to examine the main ways in which the properties of soil were related to those of vegetation. The extracted components of soil and vegetation were used to form pairs of linear combinations of the two sets of variables in such a way as to maximize the correlation between each pair, which is the essence of canonical analysis. This analysis provides a clearer picture of the complex interrelationships between soil and vegetation variables. Theoretically, canonical correlation does not distinguish between predictor and criterion variables, but for this study the distinction was made to enhance interpretation: the soil variables were used as predictor variables and the vegetation parameters as criterion variables.
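The idea of forming maximally correlated pairs of linear combinations can be sketched with NumPy using the standard QR-plus-SVD formulation of canonical correlation. This is an illustration under stated assumptions, not the SAS procedure used in the study: the function name is hypothetical, and `soil` and `vegetation` are synthetic arrays with the study's dimensions (15 quadrats, five soil variables, two vegetation variables).

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations and variates for two column-wise variable sets.

    Assumes both centered matrices have full column rank and more rows
    than columns. Returns (correlations, X variates, Y variates).
    """
    Xc = np.asarray(X, dtype=float)
    Yc = np.asarray(Y, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    Yc = Yc - Yc.mean(axis=0)
    qx, rx = np.linalg.qr(Xc)                    # orthonormal basis for each set
    qy, ry = np.linalg.qr(Yc)
    u, s, vt = np.linalg.svd(qx.T @ qy, full_matrices=False)
    x_weights = np.linalg.solve(rx, u)           # coefficients of the linear combinations
    y_weights = np.linalg.solve(ry, vt.T)
    return np.clip(s, 0.0, 1.0), Xc @ x_weights, Yc @ y_weights


# Synthetic stand-in data (not the study's measurements).
rng = np.random.default_rng(1)
soil = rng.normal(size=(15, 5))
vegetation = soil[:, :2] + 0.5 * rng.normal(size=(15, 2))
r, soil_variates, veg_variates = canonical_correlations(soil, vegetation)
```

The number of canonical variates equals the size of the smaller set, so five soil variables and two vegetation variables yield two pairs, ordered from the strongest correlation to the weakest.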

#### 3. Results

##### 3.1. PCA Result on Vegetation Parameters

PCA was performed for seven (7) vegetation parameters across the fifteen (15) sampled quadrats in order to identify critical vegetation factors that protect and enrich the soil. Component loadings (correlation coefficients) and the variances (eigenvalues) for the various vegetation parameters were computed. Table 1 shows results of the ordinary component matrix of vegetation parameters with eigenvalues ≥1. It shows that three (3) vegetation parameters loaded heavily on component I: tree size (0.88), above-ground biomass (0.85), and vegetation cover (0.81). This component accounted for 51.3% of the total variance in the vegetation parameters. On component II, only one variable, tree density (0.71), loaded heavily; this component accounted for 30.1% of the variation in the data set. The lack of spread of variable loadings across the two extracted components and the overwhelming concentration of significant variables in component I affected interpretation as well as understanding of the vegetation data structure. In order to have a fair distribution of variables and to discover the set of vegetation parameters that helps to protect the soil for its continuous nutrient enrichment, the two extracted components were rotated using the Varimax method (Table 2). The rotation did not affect the sum of eigenvalues (cumulative explanation) but altered the distribution of eigenvalues across components and the assignment of variable loadings to components (Tables 1 and 2).

The loadings of rotated components on vegetation parameters are depicted in Table 2. From the table, two components were extracted, and they accounted for 81.3% of the total variance in the vegetation data set. Three vegetation parameters loaded heavily on component I: tree size (0.97), above-ground biomass (0.96), and tree height (0.88). This component was regarded as measuring vegetation structure and accounted for 3.11 of the total eigenvalue loading and 44.5% of the variance in the linear combination of vegetation parameters. On component II, three parameters also loaded heavily: tree density (0.96), species diversity (0.93), and species composition (0.72). This component exemplified floristic attributes, and it accounted for 2.58 of the total eigenvalue loading and 36.9% of the variance in the vegetation dimension. Based on the criterion of component defining variables (CDVs), these results implied that the main vegetation parameters influencing and protecting the soil were tree size and tree density (Table 2).

##### 3.2. PCA Result on Soil Parameters

In the same manner, PCA was performed for thirteen (13) soil properties across fifteen (15) sampled quadrats to determine the main soil factors/variables that facilitate the regenerative capacity of the once disturbed forest to the state of climax. The ordinary component matrix of soil properties is shown in Table 3. Based on the significance threshold for variables, only one soil property, exchangeable potassium (0.71), loaded on component I; this component accounted for 3.34 of the eigenvalue loading and 25.7% of the variance in the soil data. Components II and III each had a single soil property, available phosphorus (0.77) and silt content (0.73), respectively, and they accounted for 18.6% and 14.7% of the variation in soil data. The remaining components (i.e., IV and V) had no soil property loaded on them based on the threshold that only component loadings ≥ ±0.70 are significant, but together they accounted for 20.5% of the variation in the data set. Again, the lack of spread of component loadings across the five extracted components affected interpretation as well as understanding of the dimensions in the soil data. For better distribution of component loadings and interpretation of soil structure, the five components were rotated (Table 4).

Table 4 depicts loadings of rotated components on soil properties. It shows that five components with eigenvalues ≥1 were extracted, and they accounted for 79.5% of the total variance in the original data set. On component I, two soil properties, exchangeable sodium (0.90) and exchangeable magnesium (0.85), loaded heavily; this component measured soil cation concentration and accounted for 2.18 of the eigenvalue loading and 16.8% of the total variance in the soil data set. On component II, two soil properties loaded heavily: organic matter (0.90) and total nitrogen (0.79). This component represented organic accumulation and accounted for 2.18 of the eigenvalue loading and 16.7% of the total variance in the linear combination of soil variables. Component III also had two soil properties that loaded heavily on it; this component accounted for 16.0% of the total explanation in the soil data.

The two soil properties identified on this component were cation exchange capacity (−0.84) and base saturation (0.81); this component in essence measured the effect of CEC content in the soil. On component IV, only exchangeable calcium loaded heavily, with a coefficient of 0.90. This component measured the effect of calcium content and accounted for 15.5% of the variance in soil data. In addition, two soil properties loaded heavily on component V: sand content (−0.87) and clay content (0.82). This component exemplified soil texture and accounted for 14.4% of the total variation in the linear combination of soil variables. Based on this result and the CDV criterion, the basic soil factors that influenced vegetation productivity and sustainability were exchangeable sodium, organic matter, cation exchange capacity, exchangeable calcium, and sand content.
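The selection of component defining variables can be checked directly from the significant rotated loadings reported in the text for Tables 2 and 4. The short sketch below picks, for each component, the variable with the largest absolute loading. The dictionary layout and function name are illustrative choices, only the significant loadings quoted in the text are included, and the label "V" for the soil texture component is inferred from it being the fifth extracted component.

```python
# Significant rotated loadings as quoted in the text (Tables 2 and 4).
rotated_loadings = {
    "vegetation": {
        "I": {"tree size": 0.97, "above-ground biomass": 0.96, "tree height": 0.88},
        "II": {"tree density": 0.96, "species diversity": 0.93, "species composition": 0.72},
    },
    "soil": {
        "I": {"exchangeable sodium": 0.90, "exchangeable magnesium": 0.85},
        "II": {"organic matter": 0.90, "total nitrogen": 0.79},
        "III": {"cation exchange capacity": -0.84, "base saturation": 0.81},
        "IV": {"exchangeable calcium": 0.90},
        "V": {"sand content": -0.87, "clay content": 0.82},  # "V" inferred: fifth component
    },
}


def component_defining_variables(components):
    """Return, per component, the variable with the largest absolute loading."""
    return {
        name: max(variables, key=lambda v: abs(variables[v]))
        for name, variables in components.items()
    }


vegetation_cdvs = component_defining_variables(rotated_loadings["vegetation"])
soil_cdvs = component_defining_variables(rotated_loadings["soil"])
```

Using the absolute value matters for components III and V, where the defining variable (cation exchange capacity, sand content) carries a negative loading.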

##### 3.3. Canonical Correlation Analysis

Canonical correlation analysis (CCA) is one of the most general multivariate techniques and is used to investigate the overall correlation between two sets of variables. It examines the main ways in which two multivariate measures are related as well as the strength and nature of the interrelationships [18, 41]. The basic principle behind canonical correlation is determining how much variance in one set of variables is accounted for by the other set along one or more axes, which are orthogonal (uncorrelated). Unlike many other techniques, CCA does not designate one set of variables as independent and the other as dependent, but for clarity, the predictor-criterion language is used. From the two sets of variables, linear combinations are derived such that the association between them is maximal; these pairs of maximally correlated linear combinations are called canonical variates [18, 42].

The results of correlating the five soil properties with the two dimensions of vegetation characteristics are shown in Table 5. The canonical correlations for the first and second canonical functions (or variates) were 0.99 and 0.97, respectively, and both were significant using Bartlett's [43] test at the 5 percent significance level. However, in the literature, the significance of the canonical correlation is considered insufficient for drawing valid conclusions, and its use for deciding how many canonical variates or functions to retain for inference is contested. The reason is that a significance test tells us nothing about the magnitude of the relationship (i.e., it does not reveal the amount of variance shared by the two sets of variables), and statistical significance is heavily influenced by sample size, since even a weak correlation can be statistically significant when the sample is large (see [41, 44, 45]). On this note, the use of the redundancy coefficient was suggested, as it reveals the amount of variance shared by the two sets of variables. The redundancy coefficient or index is an asymmetric index that measures how much variance in one set of variables (say soil properties) is shared by the variability in the other set of variables (vegetation characteristics) [46]. The redundancy result in Table 5 shows that the redundancy coefficient for the first canonical variate for soil properties indicated that 19 percent of the variance in vegetation characteristics on the first canonical variate was accounted for by the variability in soil properties; likewise, the redundancy coefficient for the second canonical variate for soil properties indicated that 12 percent of the variance in vegetation characteristics was accounted for by the variability in soil properties. The redundancy result for vegetation characteristics equally showed that 83 and 16 percent of the variance in soil properties on the first and second canonical variates were accounted for by the variability in vegetation characteristics.

Based on the magnitude of relationships shared by the two sets of variables across the two canonical variates (considering the redundancy coefficient), the first canonical variate was chosen for further explanation, because it explained a large proportion of the variation in soil and vegetation dimensions. The results in Table 5 also showed that two canonical variates were extracted, and each is identified by soil and vegetation components with loadings exceeding 0.60. The first canonical variate of soil properties loaded positively and heavily on organic matter, while the first linear combination of vegetation characteristics loaded positively and heavily on tree size and negatively on tree density. This implied that a strong positive correlation existed between organic matter concentration and tree size, while the linear association between organic matter and tree density was inverse. In essence, the result of the first canonical function/axis showed that organic matter concentration and tree size were positively and directly related, implying that an increase in organic matter concentration in the soil would result in a corresponding increase in tree size and vice versa. Tree density and organic matter concentration showed an inverse relationship, meaning that an increase in the density of trees, together with the continuous addition of nutrients through decomposition of large tree residue, would reduce the amount of organic matter (OM) in the soil.
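The redundancy coefficient discussed above is commonly computed as the mean squared correlation between a set's variables and a canonical variate, multiplied by the squared canonical correlation. The NumPy sketch below follows that common definition; it is an assumption that the SAS output in Table 5 was produced the same way, and the function name and the tiny demo array are hypothetical.

```python
import numpy as np


def redundancy_index(set_data, own_variate, canonical_r):
    """Redundancy of one variable set for a single canonical function.

    set_data    : (n, p) array of the set's variables
    own_variate : (n,) canonical variate formed from that same set
    canonical_r : canonical correlation of the function
    """
    Z = np.asarray(set_data, dtype=float)
    v = np.asarray(own_variate, dtype=float)
    # Structure loadings: correlation of each variable with its own variate.
    loadings = np.array([np.corrcoef(Z[:, j], v)[0, 1] for j in range(Z.shape[1])])
    variance_extracted = np.mean(loadings**2)
    return variance_extracted * canonical_r**2


# Degenerate check with made-up numbers: a one-variable set is its own variate,
# so the redundancy reduces to the squared canonical correlation.
single = np.array([[1.0], [2.0], [4.0], [7.0]])
demo_redundancy = redundancy_index(single, single[:, 0], canonical_r=0.5)
```

The index is asymmetric because the variance extracted by a variate differs between the two sets, which is how two very high canonical correlations can coexist with quite different redundancy values for soil and vegetation.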

#### 4. Discussion and Conclusion

With the aid of varimax-rotated principal component analysis, the large set of intercorrelated soil and vegetation properties initially thought to determine soil-vegetation interrelationships was reduced to fewer, uncorrelated, and more important variables. This is because PCA is a fact-finding tool that reduces measurement problems, such as bias, and reduces the complexity of correlated data, as it extracts only variables that have significant contributions among a set of variables or principal components which account for most of the variance in the observed variables [47, 48]. The PCA result for this study showed that the reciprocal effects of soil and vegetation were influenced by seven soil and vegetation variables. PCA identified two basic vegetation parameters that sustained and enriched the soil for its continuous support of vegetation regenerative capacity: tree size and tree density. These variables stood out as critical vegetation parameters protecting the soil from varying climatic conditions such as heavy rainstorms, regulating soil erosion, and maintaining soil moisture, among others. These parameters also helped to conserve soil fertility through biomass accumulation and subsequent decay; their combined effects on the soil system were significant through variation in the mineral contents of biomass or litter and the canopy hydrological effects.

Also, the PCA result showed that five fundamental soil properties sustained the regrowth and luxuriant characteristics of the secondary forest: exchangeable sodium, organic matter, cation exchange capacity, exchangeable calcium, and sand content. The findings of this study partly corroborated those of Aweto [18], who identified, based on the threshold of ≥ ±0.70, organic matter status, pH, sand proportion, total nitrogen content, clay, silt, bulk density/porosity, and potassium as paramount soil components, and tree density, tree size/vegetal cover, nanophanerophytes, and species composition as vegetation components influencing soil-vegetation interrelationships. In addition, the result of the canonical correlation analysis indicated a pattern of relationship between soil and vegetation. The only retained canonical variate (the first canonical function) revealed a positive interrelationship between organic matter concentration and tree size as well as an inverse relationship between organic matter concentration and tree density.

This meant that organic matter and tree size were interrelated; the importance of tree size in this association is obvious, as an increase in tree size would lead to an increase in nutrient accumulation in the forest by increasing litter production and protecting the soil against accelerated nutrient (OM) destruction and subsequent loss through erosion. This implied that tree size helped to improve the content of organic matter in the soil by the addition of nutrient in solution form through stem flow as well as through the accumulation and decomposition of biomass (plant residue). Nevertheless, the inverse relationship between organic matter and tree density was expected as simple Pearson’s correlation indicated a negative correlation. The negative relationship was attributed to the high rate of OM addition through the accumulation of large plant residue. According to Foth [49], high organic matter contents in soils are the result of slow decomposition rates rather than high rates of organic matter addition. The present area enjoys high precipitation and temperature, which facilitates the quick decomposition of biomass, thereby increasing the rates of organic matter addition. The results of canonical analysis therefore indicated that organic matter and tree size were positively interrelated, while organic matter and tree density were inversely related. It therefore implied that organic matter and tree size were the major soil and vegetation variables sustaining as well as supporting the regenerative capacity of the secondary forest.

#### Acknowledgments

The authors would like to thank Mr. Igwebuike Ebuka for editing the paper and Mr. Paulinus Igenegbai of the University of Ibadan for analyzing the soil.