Tree height (H) and diameter at breast height (D) are key variables to calculate tree volume and biomass. We developed a height-diameter (H-D) model for Cinnamomum tamala by evaluating 18 nonlinear models. Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Root Mean Square Error (RMSE), mean bias, Mean Absolute Error (MAE), graphical appearance, and biological logic were the criteria used to evaluate the predictive performance of the models. Gompertz model (M14) performed the best for predicting the total height of C. tamala trees with the least RMSE (1.742 m), mean bias (0.012 m), and MAE (1.342 m) and satisfied model assumptions and biological logic. Validation data ranked the Gompertz model as the best model with RMSE (1.546 m), mean bias (-0.106 m), and MAE (1.149 m). Despite the consistent performance of the Gompertz model, it tended to underestimate the height prediction for taller (dominant crown class) and larger trees. Further work on refitting and validation of the proposed model with data from a larger geographic area, wider-ranging sites, and stand conditions is recommended.

1. Introduction

Tree height (H) and diameter at breast height (D) are fundamental variables in most forest inventories which are required to calculate tree volume, biomass, carbon storage, and survival analysis [14]. Accurate in situ measurement of D is easy and cost-effective. However, height measurement is labor-intensive, time-consuming, expensive, and prone to observational and measurement errors [5, 6]. Predicting forest dynamics through growth and yield simulation also requires individual tree level information, such as H and D. A height-diameter (H-D) relationship model can be easily built when both H and D variables are measured. The model then can be used to estimate missing tree height, biomass production, and stand dynamics.

The allometric equation [7] serves as a tool to relate easily measurable morphometric variables (e.g., D) to the total height of the tree. Conventionally, H-D relationship models have been developed and applied to pure even-aged stands or plantations using D as a predictor variable [1, 2]. Recent studies employ other stand attributes (e.g., site quality, stand age, stand density, and dominant height) in mixed-effect H-D relationship models [813]. In mixed effects H-D relationship models, population-averaged (fixed parameters common to population), as well as subject-specific effects as random effects, are allowed. Mixed-effect H-D relationship models improve accuracy over nonlinear models that are based on the minimization of sums of squares.

Generally, the site-specific H-D relationship models that are developed as H-D relationship can vary with differences in age, site quality, competition, stand age, and silvicultural treatment applied [1417]. Although stand specific H-D relationship models are labor-intensive, costly, and time-intensive, these models have been proved to be more accurate in the realistic description of forest structure, growth simulations, and plot-level volume [18]. Due to a limited geographical range of natural C. tamala stands, cross-sectional nature of data, and for simplicity of model use, we employed ordinary nonlinear least square equations. The model developed should be considered as a site-specific model as it was built considering fewer stands.

Cinnamomum tamala T Nees and Eberm (family Lauraceae) is a moderate-sized, evergreen, monoecious tree species distributed at 900–2000 m above sea level in the tropical and subtropical Himalayas [19]. Natural stands of C. tamala, being a moderate shade-tolerant species [20], are mostly found in shady-moist sites and grow well under full illumination. C. tamala is native to Nepal, China, India, and Bhutan [21]. C. tamala trees are extensively managed, in situ and ex situ, for leaf and bark production due to dwindling natural population and high economic potentials [22]. The bark is traded to India [19] and is often used as a substitute for Cinnamomum zeylanicum [23]. Department of forests, Nepal [24], has prioritized C. tamala for research and management because C. tamala bark forms a significant part of national nontimber forest product trade, by both volume and economic value.

Despite receiving a priority for research and management in Nepal and conservation in India, studies on C. tamala forests are limited. Since bark is a highly traded product, H-D relationship model is an essential decision-making tool both for management institutions and government for estimating forest volume and above-ground biomass, growth and yield modeling, and modeling ecophysiological process-based models of forests. However, research in this direction is limited within the distribution range of C. tamala and has mostly focused on conservation, ethnopharmacological properties, dispersion pattern, cultivation, and harvesting practices [2529]. In Nepal, Tree Improvement and Silviculture Component, TISC [30], and Poudel et al. [31] developed bark biomass production models in farm-grown trees and natural forests [32]. Equations for the H-D relationship for Cinnamomum tamala grown in the natural forest have not been reported in past studies. Tree height increases nonlinearly with stem diameter [33, 34]. Therefore, linear models are inadequate for predicting tree height [35]. In this paper, we evaluated 18 nonlinear models to develop the H-D relationship model for C. tamala trees. To the best of our knowledge, this is probably the first attempt to establish an H-D relationship model for natural C. tamala forests. Our model would be beneficial to researchers, managers, and academicians within the distributional range of C. tamala. However, precaution should be used for accurate prediction of height outside the size and height ranges, site, and stand condition that differs from our study.

2. Materials and Methods

2.1. Study Area

The study was conducted at Mijure Danda Village Development Committee (VDC) of Kaski district, Nepal, which extends from 28°13′57′′ to 28°20′57′′N latitude and 84°08′53′′ to 84°12′42′′E longitude. The forest in the study area is managed by the Sikles unit of Annapurna Conservation Area Project (ACAP) and constitutes C. tamala as a dominant species having contiguous distribution pattern and with an Important Value Index (IVI) of 158.0 [27]. An IVI value was calculated by summing relative density, relative frequency, and relative dominance. The most frequent common associate species, with their IVI values, are Schima wallichii (77.89), Castanopsis indica (26.35, Madhuca latifolia (5.29), and Macaranga postulata (4.30) [27]. The study site falls under the humid subtropical monsoon climatic zone with hot and wet summers and relatively cold winters. The average monthly climate normal (1981-2010) temperature (minimum, maximum) to the nearest weather station was 21.1°C (13.4°C, 26.1°C) and precipitation was 325 mm (17.8 mm, 940.3 mm). Average annual precipitation was about 3900 mm year−1 [47].

2.2. Data Collection and Processing

We collected the data through the inventory of natural stands of C. tamala, which includes 30 destructively sampled trees covering a wider range of H and D. Trees were selected for felling covering all the site conditions and tree size (H and D) distributions across the study sites in and outside of sampling plots. Nondestructive sample data came from a phytosociological survey of same forests [27]. We used systematic sampling plots with a size of 10 m × 10 m. Each plot has 5 m × 5 m subplots at northeast and southwest corners and 1 m × 1 m subplots at four corners. Information on species such as H and D were collected from 10 m × 10 m and 5 m × 5 m plots. The combined dataset size (n = 179) is smaller compared to other studies (e.g., [12, 13, 18, 48, 49]), but it can be used to construct H-D relationship models [5052]. The combined dataset was split randomly into fitting (80%) and validation (20%) using R [53]. Summary statistics for H and D for both fitting and validation dataset are presented in Table 1.

2.3. Model Development and Evaluation

Initially, we selected 18 candidate nonlinear H-D relationship models (11 biparametric, and seven triparametric) based on a literature review (Table SA) to fit the model. We assumed that the total tree height is a function of diameter. In the later stage, we included only the models with all significant parameter estimates at 5% level of significance for further evaluation to compare their performance (Table 2). These models take the following functional form:where is the observation of the dependent variable tree height (m), is the observation of the independent variable diameter at breast height (cm), is the vector of parameter to be estimated, is the random error term, f (.) is a nonlinear function, and i is the observation with i = 1, 2, 3, …, n. A constant term 1.3 is the height of stem above ground at which D was measured. We added this term in the function to avoid prediction of zero H when D approaches zero.

The nonlinear function “nls2”, an improved version of the function nls, in statistical software R [54, 55] was used to estimate model parameters using the ordinary least square method. There are several criteria to assess model performance [56]. We evaluated model performance based on numeric analysis of the following: (1) asymptotic t-statistics of parameters, (2) Root Mean Square Error (RMSE: see (2)), also known as a measure of model’s precision, (3) mean bias (see (3)), and Mean Absolute Error (MAE: see (4)), and (4) information criteria—Akaike Information Criteria (AIC: see (5)) [56, 57] and Bayesian Information Criteria [58] (BIC: see (6)), followed by graphical presentation such as histogram, probability plots of residuals, and observed versus model predicted values. Criteria RMSE, AIC, BIC, Mean bias, and MAE are collectively called performance criteria. A coefficient of determination is not advisable to employ for assessing nonlinear regression. A statistic analogous to the coefficient of determination also called a fit index or modeling efficiency [14, 59] could be used. However, in this study, we did not use the fit index to evaluate our models as a fit index also suffers all the weaknesses of the coefficient of determination [14].where are the observed and predicted values of height, respectively; n is the total number of observations used in model fitting; ln is natural logarithm; RSS is the residual sums of squares; and k is the number of parameters in the model.

Models with least RMSE, AIC, BIC, mean bias, and MAE are considered to perform the best [60]. Reliability and validity of the model increase with its performance outside the test data [6163]. However, due to limitation of dataset, model validation with an independent dataset covering more extensive geographical regions and management condition across the distributional ranges of species can be a potential topic for future research.

We evaluated models based on the significance of model parameters, performance criteria, and practical application. After initial performance screening, models were subjected to further comparison by ranking the performance criteria and considering the practical application of the model. The models with the least value of performance criteria (RMSE, AIC, BIC, mean bias, and MAE) received the highest ranking and vice versa. The rank value of each criterion was added to get the total sum of ranking, which was then ranked in ascending order—the lower the value, the better the model. The model with the smallest rank value was considered the best H-D relationship model of C. tamala naturally grown in mid-hill of Nepal.

3. Results

3.1. Descriptive Statistics and Parameter Estimates

Model parameters were estimated using nonlinear least squares techniques for the model fitting dataset. Descriptive statistics includes mean ± st.dev (minimum, maximum) for both diameter and height [D (cm): 13.26 ± 7.72 and H (m): 8.07 ± 3.26] for fitting dataset; and [D (cm): 11.14 ± 7.47 and H (m): 8.07 ± 3.15 for validation dataset]. All models were evaluated against multiple model performance criteria. Almost all (15) models (Table 3) showed significant parameter estimates (p<0.05, unless otherwise mentioned) and were considered for further evaluation.

3.2. Model Performance and Selection

Parameter estimates of models indicate that the best model cannot be determined solely based on significant t-statistics. We additionally considered performance criteria values to assess the model’s performance. Table 4 presents the parameter estimates and fit statistics associated with the selected models for fitting and validation dataset.

Among 15 models, M14 produced the least bias (0.012 m), followed by M1 (0.016 m). The values of RMSE ranged from 1.742 m (M14) to 2.42 m (M15). M14 yielded the least MAE (1.342 m) and AIC , while M15 produced the highest AIC and BIC . M14 performed significantly better than other models, produced similar parameter estimates for validation dataset (Table 6), and similar fit statistics for fitting and validation dataset (Table 4).

We ranked models based on values of performance criteria and calculated average rank on sum ranks of performance criteria. M14 performed the best (rank = 1) for the fitting dataset, while M14, M12, and M4 performed the best with equal rank (rank = 1) for the validation dataset (Table 5). Although M14, M12, and M4 have identical rank for validation dataset, we considered M14 the best model because it performed the best (rank = 1) for fitting dataset too, while M4 and M12 were the third best models for fitting dataset. Parameter estimates of M4 and M12 for validation dataset (Table SB), graph of the model, and residual distribution (Figures SA and SB) confirm that the models M4 and M12 are the best alternatives to model M14. Interestingly, M12 an M4 yielded strikingly similar performance statistics (identical to the three-decimal place) for model fitting, validation, and combined dataset (Tables 4 and SB).

3.3. Residual Analysis and Shapiro-Wilk Test

To come up with the best model, we used test for residual normality. Residuals graphs (histograms, probability plots) and plot of fitted line overlaid on the observed heights data were produced for a visual check. Figure 1 presents the plot of residuals against the observed diameters of fitting data. Most of the models showed no systematic departure from random pattern.

The Shapiro-Wilk test of residuals of the best three models [M14- (W = 0.990, p = 0.4143) for fitting and (W = 0.978, p = 0.6858) for validation; M12- (W = 0.984, p = 0.627) for fitting and (W = 0.984, p = 0.871) for validation; M4- (W = 0.984, p = 0.627) for fitting and (W = 0.984, p = 0.871)] showed a p value greater than 0.05, which suggests that residuals do not violate the assumption of normal random error. Figure 2 shows curves of the best fitting model M14 for fitting and validation dataset and histogram of residuals with a superimposed normal curve. Normally, the model tended to overestimate for shorter trees with larger diameter size.

3.4. Paired t-Tests

The paired t-test of M14 with observed height values revealed insignificant statistics [t (142) = 0.084, p = 0.933]. Similarly, the paired t-test of observed height values with predicted heights of M12 [t (142) = 0.1734, p = 0.8625] and M4 [t (142) = 0.1368, p = 0.8914] for fitting dataset also resulted in statistically insignificant statistics. To determine the better model between M12 and M4, we compared the predicted height from both models between them and with the predicted height from the best model, i.e., M14. Our results of paired t-test to compare the predicted heights showed significant statistics between M4 and M12 [t (142) = 3.1143, p = 0.002] and between M12 and M14 [t (142) = -2.8968, p = 0.0044]. The M4 showed no significant difference in mean predicted heights from the M14 [t (142) = -0.8743, p = 0.3983]. This suggests that the M14 has more accurate prediction power followed by M4.

4. Discussion

Almost all the candidate models performed well in explaining H-D relationship for C. tamala trees grown under natural forest. Model selection based on multiple criteria suggested that M14, M12, and M4 were more effective in predicting C. tamala tree height compared to other candidate models. The Gompertz model (M14) demonstrated competitive performance statistics for both fitting and validation dataset. Although Gompertz model follows growth theory ([8], Nord-Larsen and Cao 2006; Schmidt et al. 2011), it generally underestimates tree heights [10, 52]. Several studies have used the Gompertz model to describe the height-diameter relationship (Rupšys et al. 2011; Özçelik et al. 2013; [52, 64]). Zhang (1997) selected six models for evaluating H-D relationship and found that the Gompertz model underestimates tree heights for larger sizes trees (D>100 cm). Similarly, in our case, the Gompertz model tended to underestimate heights (10.84%) for taller trees with a larger diameter (H ≥12 m and D ≥17 cm). C. tamala trees from dominant crown category grow taller, larger, and faster. Trees under shade allocate more biomass to diameter growth than height as height growth requires more competitive advantage [65, 66]. This indicates that C. tamala H-D relationship at our study site might have been mediated by shade tolerance or crown dominancy. Studies have shown that shade tolerance and climate jointly effect on allometric variation [6668]. However, assessing the possible variation in C. tamala H-D relationship is beyond the scope of this study due to the small size of our dataset.

H-D relationship models are often developed based on sampling data from permanent sample plots that are monitored for a relatively extended period [59, 64, 69, 70]. Studies that employed longitudinal data along with other plot-level covariates in nonlinear mixed H-D relationship models show improved model performance compared to ordinary nonlinear least square models [18, 52, 59, 71, 72]. Although the inclusion of stand variables may improve the prediction power of the model, this often requires greater sampling efforts especially when data collection involves an invasive sampling approach. Our study did not include such covariates because of the cross-sectional nature of data that were collected from both inside and outside of sample plots. Future researchers can test the validity of our model and improve the predictive performance utilizing stand and plot-level covariates when available.

Predictive performance of the models beyond the observed range was not possible as the fitting dataset has larger sizes and taller trees. Independent datasets from different geographical locations and management systems with larger and taller trees than those used in this study could confirm the predictive power of the model [14]. Motallebi and Kangur [73], using the annual height and diameter increment data of 114 trees (observation periods spanned from 40 to 115 years) from three European countries, indicated a variable allometric relationship (significant coefficients across sites) between tree height and diameter which did not follow elasticity or geometric scaling rules across sites. H-D relationship varies among ecoregions due to differences in bioclimatic conditions [48, 69, 74] and among management regimes [32]. Therefore, H-D relationship of species should be evaluated across climatic and management systems to understand the changes in the stem allometry of species. As C. tamala prefers moisture rich areas, water stress could change the H-D relationship. Therefore, the inclusion of water availability in original H-D relationship models and a mixed-effect modeling approach could further optimize the model’s performance. Since this study’s fitting and validation data came from same climatic and management conditions, validation of these models with independent data from different geographic, climatic, and management practices could be a potential topic for future research.

5. Conclusion

We calibrated and tested 18 nonlinear models on Cinnamomum tamala trees grown under the natural multilayer tree structure in mid-hills of Nepal. The best model was selected based on multiple performance criteria, error diagnostics, growth theory, and practical application. Results showed that the Gompertz model (M14) best fitted the fitting dataset and performed consistently on the validation dataset and can be used reliably in biomass estimation and management for C. tamala trees in the mid-hills of Nepal.

The existing studies on C. tamala height-diameter (H-D) modeling are limited to even-aged stands or plantation and have not attempted to develop H-D relationship models under natural forests. The methodology presented in this paper utilizes ordinary nonlinear models to generate H-D relationship in the uneven-aged natural forest of C. tamala. Since we considered diameter as the only predictor variable, caution should be applied while using our model to other forests stands that differ in site and stand conditions to the basis of this study. In future research, our model can be extended to other regions and management regimes (e.g., the tree outside forestry), following updates through refitting and validation against independent data from the broadest possible ranges of size, site, and stand conditions, including stand and plot attributes across the distributional ranges of C. tamala.

Data Availability

The data used in this paper came from phytosociological survey of Cinnamomum tamala forests (natural). Of the total trees used in the analysis, 30 trees were sampled destructively which was previously used in bark biomass production model. If required, raw dataset can be made available. R-code used in the analysis is presented at the end of the Supplementary Materials.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


We want to thank The National Trust for Nature Conservation/Annapurna Conservation Area Project (NTNC/ACAP), for providing financial support to this study. This work was also supported by the United State Department of Agriculture (USDA), National Institute of Food and Agriculture (NIFA), McIntire Stennis Project # 1017946. We extend our thanks to Conservation Area Management Committee members of Mijure danda Village Development Committee. Thanks are due to people who rendered assistance during field works, especially, Bishwa Kiran Giri, Hari Prasad Subedi, and Rishi Ranabhat.

Supplementary Materials

Table SA: selected 18 models. Table SB: parameter estimates and fit statistics of M4 and M12. Figure SA: fitted model and histogram for validation dataset of M4 and M12. Figure SB: fitted model and histogram for combined dataset of M4 and M12. R-Code used in the analysis. (Supplementary Materials)