Abstract

We propose a hardwood species identification method based on wood hyperspectral microscopic images. A SOC710VP hyperspectral stereomicroscope was used to acquire microscopic images of a hardwood cross section. In these microscopic images, each part’s spectral features are discussed. We found that the spectral divisibility of wood vessels’ peripheral and central regions in the hyperspectral microscopic images can be used for hardwood species recognition. Mathematical morphological operation and K-L divergence were used to extract spectral features at the wood vessels’ peripheral regions and central regions, respectively. By comparing wood vessels’ spectral similarity across wood species samples, we found that wood vessels’ peripheral spectral divisibility is larger than its central. Finally, the spectral information from randomly selected regions of interest (i.e., ROI) and that of wood vessels’ peripheral and central regions have value as a classification basis. In our hardwood species classification experiments, three dimensionality reduction algorithms, principal component analysis (PCA), kernel principal component analysis (KPCA), and multidimensional scaling (MDS), and the three classifiers, BP neural network, support vector machine (SVM), and Mahalanobis distance (MD), are combined to perform hardwood species classification work. Experimental results indicate that the best recognition effect can be achieved at the peripheral region of wood vessels using PCA or MDS with the MD algorithm.

1. Introduction

Wood is an important renewable natural resource. Different tree species result in many different wood products circulating in the market. In particular, many hardwood species are widely used to make furniture, and they cannot be identified by visual observation or odor discrimination. Hardwood misclassification may result in significant economic losses due to the differences in quality and price of different hardwood species.

With the rapid development of machine vision, it has become possible to identify the processed hardwood species by using intelligent methods. Pan and Kudo [1] and Yusof et al. [2] used microscopic images of wood cross sections to discuss the characteristics of different hardwood species. They showed that different hardwood species have differences in their wood’s vessel structure. Furthermore, some textural or statistical properties for wood vessels were extracted from the 2D wood images to perform hardwood species classification [3, 4]. Macroscopic wood features are also used in wood species recognition. For example, Hu et al. used a deep learning scheme for lumber grade classification in terms of lumber defects, textures, and species [5].

In recent years, spectral analysis schemes have been proposed for wood species classification [68]. These schemes are in essence spectrum-based classifications which mainly use 1D spectral curves. They usually have low computational complexity with fast speed due to the 1D data but may suffer from spectral noise and disturbance from the surrounding site. Therefore, wood species recognition with 3D hyperspectral imaging which includes both visual and spectral information of wood tissues may be a good choice. In fact, 3D hyperspectral imaging schemes have been used in agricultural engineering for food quality evaluation [9] and crop content predictions [10, 11].

In this paper, the microscopic hyperspectral images of hardwood cross sections were recorded using a SOC710VP hyperspectral stereomicroscope to investigate the possibility of hardwood species identifications. The spectral features of different wood tissues such as wood vessel, wood ray, and background are discussed. It was found that spectral information from the peripheral and central regions of wood vessels can be used for hardwood species identification.

2. Materials and Methods

2.1. Wood Samples

Six hardwood species were studied (Table 1). One thousand four hundred and forty wood samples were prepared from six species, and each sample was cut into a cube with length, width, and height of 2 cm, 2 cm, and 3 cm, respectively. Nine hundred samples were randomly selected as the training set, and the other 540 samples are used as the test set. Each sample’s cross sections () were cut using a disk saw to make the two surfaces as smooth as possible, while the radial sections and tangential sections () were cut using an electric plane. The wood sample’s cross sections were used for hyperspectral imaging. In these cross sections, the wood growth rings were as close to parallel with the section’s two edges as possible. Quercus rubra and Fraxinus americana had similar colors, while cross sections for the other four species had different colors.

The intraclass differences of wood samples are caused by the tree’s age, cutting position, and place of growth. For each wood species, our wood blocks are taken from different individual trees at random position in the cross section of the timber to include these intraclass differences. In this way, the reliability of our proposed scheme can be ensured.

2.2. Spectral Data Acquisition

The spectral acquisition instrument used for this research was a SOC710VP hyperspectral stereomicroscope, with 128 spectral bands and a spectral range of 372–1038 nm. To improve spectral acquisition, all wood sample surfaces were flat, without burrs, and the spectral acquisition environment was consistent and stable. The dimension of captured microscopic hyperspectral images was . To eliminate the influence of dark current, it was also necessary to perform black and white correction processing on the detected samples. The wood image’s enlargement factor was 45.

To control costs, we chose the visible/near infrared hyperspectral stereomicroscope as the experimental instrument. Spectral data in the visible region may vary because wood sample color may change with environmental variation. Therefore, we maintained uniform environmental conditions during the spectral experiments and wood sample preservation. In addition, 780–900 nm is the near infrared band, which best characterizes the different wood properties. A stereomicroscope was used to capture wood microscopic images conveniently for each wood sample. The wood slice is not used as its production is complex and time consuming.

2.3. Spectral Analysis of Different Tissues of Hardwood Microscopic Hyperspectral Images

The microscopic hyperspectral images of cross sections mainly include wood vessels, wood rays, and axial parenchyma, as shown in Figure 1. The closed holes are wood vessels which are important features for species recognition [12, 13]. The dense, thin, approximately parallel lines are called wood rays, while the thick lines approximately perpendicular to the wood rays are the axial parenchyma.

Hyperspectral images are 3D cubes which have both spatial information and spectral information, and the spectral curves of different wood tissues for the same hardwood species are different. For example, Figure 2 illustrates the spectral curves of Intsia bijuga in the central and peripheral regions of vessels, in the regions of wood rays, axial parenchyma, and the background. Since vessels are the most important features in hardwood species identification [1, 3, 4, 12], the spectral information in the vessels’ central and peripheral regions is extracted and discussed.

2.4. Spectral Analysis in Vessel Peripheral Regions with Mathematical Morphology

Mathematical morphology is widely used in image processing, which can be used to extract useful image components to enhance ROI and suppress background regions. We denote the original image, background image, and ROI image as , , and , respectively. The structural element with a specified shape is denoted as A. The background image and ROI image can be expressed as follows:where represents the morphological opening operation.

In order to eliminate noise, a corrosion operation is required for the obtained . Then, the binary image is formed by using threshold processing to . The gray-level image for each band of hyperspectral image is processed using the abovementioned procedures to obtain the 128 binary images which form a cube of dimensions (i.e., are the width and height of hyperspectral image).

This cube can be regarded as vectors of 128 dimensions, and the element in every vector is either 0 or 1. The number of element 1 in every vector is registered so as to form a matrix D of dimensions. A threshold k is defined to convert the matrix D into a binary image E as follows, which is illustrated in Figure 3.

For the binary image E, the spectra are extracted whose corresponding pixel value is 1 so as to form a new matrix X of dimensions, while represents the number of element 1 in binary image E.

The abovementioned image and spectral processing operations were performed to a ROI in a hyperspectral image of one sample. In this work, the size of ROI is set as . In a wood hyperspectral image of dimensions, the initial ROI is placed in the upper-left position in the image to obtain the computed matrix X. Then, the ROI moves gradually from left to right and from top to bottom to traverse the whole hyperspectral image to obtain multiple matrices . These matrices are connected by columns to obtain a matrix with a dimension of , and the mean value for each column is computed to form a new vector of 128 dimensions. For a hardwood species with m hyperspectral image specimens, we may obtain multiple vectors and calculate the mean vector . As an example, Figure 4 illustrates the mean spectral vector for the six hardwood species in our measurements in the wood vessels’ peripheral regions. The training samples are used to calculate with the ROI of and the structural element A of disk shape with the size r. In Figure 4, we can see that the mean spectral curves for training sets in wood vessels’ peripheral regions are different for the six hardwood species. Therefore, this spectral difference can be used to identify hardwood species. Since different hardwood species have different vessel sizes, the structural element A with different size r is used for the 6 hardwood species. This is explained in detail in Section 3.4.

2.5. Spectral Analysis in Wood Vessels’ Central Regions with K-L Divergence

K-L divergence is widely used in information theory as a signal similarity measuring tool [14]. We suppose the probability distributions of two discrete random signals as and , respectively. The K-L divergence for Y to X is then defined as follows:

Therefore, the difference between two band images of a hyperspectral image cube can be measured using the K-L divergence equation (3).

Suppose a hyperspectral image cube , where and L, N represent the number of spectral bands and pixels of a 2D band image, respectively. After normalization of data in X, the K-L divergence of the jth band image relative to the ith band image can be defined as follows:

In this way, we can calculate the K-L divergence of two band images to obtain a symmetric matrix D. The ith row of this matrix is defined as , and the mean value of its L elements represents, to some extent, the information entropy of the ith band image. Therefore, we may use the 10 spectral bands with the 10 largest entropies to form an image cube (Figure 5(a)) and its complementary image cube (Figure 5(b)). This cube is classified into 4 parts by use of a K-means clustering algorithm so that the spatial pixels in the original wood hyperspectral image cube are classified into 4 parts, as illustrated in Figures 5(c)5(f).

As shown in Figure 5, the white part in Figure 5(d) is the wood vessels’ central region, which is segmented accurately. Figures 5(c), 5(e) and 5(f) indicate the wood vessels’ peripheral and background regions. In fact, the wood vessels’ peripheral region is clustered separately into different parts, as illustrated in Figures 5(c), 5(e) and 5(f). Therefore, this classification is not suitable for accurately extracting the wood vessels’ peripheral region. In order to more accurately segment the vessels’ central region, it is necessary to perform a morphological corrosion operation on the white sections in Figure 5(d).

The clustering process in Figure 5 has two problems. Firstly, selecting the ROI randomly in a hyperspectral image does not ensure location of the wood vessels’ central region. Secondly, as can be seen from Figure 5, the K-means cluster results in 4 parts (i.e., Figures 5(c)5(f)), but only Figure 5(d) represents the wood vessels’ central region.

To solve these problems, we propose an automatic identification scheme for the wood vessels’ central region. For a ROI W of size , the binary images produced by K-means cluster are . The selected 10 spectral bands obtained from K-L divergence are , and the corresponding binary images for these 10 band images are denoted as . Here represents the occurrence number of p in matrix X, and represents the number of identical elements in the same position of matrices X and W. The 4 measurement parameters are defined as follows:

Here is defined as follows:

We calculated for the binary image with equation (8) and define and a threshold T. When , the corresponding binary image includes the wood vessels’ central region. Using this approach, the ROI moves gradually from left to right and from top to bottom with a fixed step length S to traverse the whole hyperspectral image. In this way, we locate all binary images and include the wood vessels’ central region. These image matrices are connected by columns to obtain a matrix with a dimension of , and the mean value of each column is computed to form a new vector of 128 dimensions. For a hardwood species with m hyperspectral image specimens, we derive multiple vectors and calculate the mean vector . Figure 6 illustrates the mean spectral vector for the six hardwood species in our studies of the wood vessels’ central regions. The training samples are used to calculate with a ROI size of .

3. Results and Discussion

3.1. Spectral Difference Analysis for Wood Vessels’ Peripheral Region, Central Region, and Randomly Selected ROI

We may use spectral properties of a specific wood region or tissue to guide wood species identification using both interclass and intraclass differences. It is obvious that large interclass differences and small intraclass differences will produce more reliable results. These two differences can be represented by the spectral curve’s interclass and intraclass distances. Therefore, here, we use the divergence matrix to indicate the spectral curve’s interclass and intraclass differences. We define the intraclass divergence matrix for the jth wood species as follows:

Here ; represents the kth sample for the jth species, represents the mean value of all samples for the jth species, and represents the number of all samples for the jth species. The overall intraclass divergence matrix can be defined as follows:where represents the prior probability for the jth wood species, and it can be estimated by , where n represents the overall number of all species samples. Similarly, the interclass divergence matrix is defined as follows:where m represents the mean value of all species samples. The overall divergence matrix is then defined as follows:

Finally, the pattern divisibility rule is defined as follows:where represents the trace of matrix X. It is obvious that a higher value will produce a more reliable identification.

In terms of a specific wood region or tissue, we considered the wood vessels’ peripheral region, central region, and randomly selected ROI. Spectral extractions in wood vessels’ peripheral and central regions have been discussed in Sections 2.4 and 2.5. The size of the randomly selected ROI was ; the mean vector of all spectral vectors in this ROI is the final spectral extraction for a hyperspectral cube sample.

The smoothing procedure for the spectral preprocessing and the SNV calibration procedure are usually required before spectral extraction and pattern recognition as they cancel the noise disturbance and waveform offset. This preprocessing process is illustrated in Figure 7.

Table 2 gives the values for the wood vessels’ peripheral region, central region, and randomly selected ROI in the training samples. It indicates that the spectral divisibility for the wood vessels’ peripheral region was most useful, while that for the wood vessels’ central region was least useful. Since the wood vessels’ central region is hollow, with some resin and probably some filling material, its spectral curve is variable and thus has a large intraclass difference with a small value.

Finally, the time required was compared for spectral extractions in the wood vessels’ peripheral region, central region, and random ROI. The computer configuration used was as follows: CPU Intel Core I7-6700, internal memory 8G, hard disk 1T, display card AMD Radeon R7200 Series. In these three spectral extractions, one hyperspectral image was cut into 30 ROI of . Time required for processing one ROI is given in Table 3. We can see that the processing speed of random ROI was the fastest while that for the vessels’ central region was the slowest.

3.2. Comparisons of Identification Accuracy

Before sending the extracted spectral vectors into classifiers, spectral dimensionality reduction was usually required. In this work, three algorithms, PCA, KPCA, and MDS, were used to reduce dimensionality. In spectral classification, three classifiers, BP neural networks, SVM, and MD, were used for identification comparisons.

Firstly, we used the MD classifier to calculate the MD between a test sample and the training sample sets of the 6 hardwood species. The usual nearest neighbor classification was used to judge the test sample’s wood species. The classification results are illustrated in Figure 8. We can see that the classification correct ratio (i.e., CCR) was influenced by spectral dimensionality reduction. For instance, if MDS was used, the CCR stabilized only when the spectral dimensionality was larger than 7. On the other hand, if PCA or KPCA was used, the CCR stabilized only when the spectral dimensionality was larger than 3. Moreover, we can see that the CCR was also influenced by the wood region of the spectral collection. In fact, the CCR of the wood vessels’ peripheral region was the highest, approaching 100%, while the CCR of the wood vessels’ central region was the lowest, and its mean value was approximately 70%–80%. The CCR of random ROI fell between the peripheral and central vessel regions and was approximately 90%.

The classification accuracy was the calculated CCR, which was the overall classification ratio (%) for the 540 test set samples. This classification accuracy is illustrated in Figures 811 and Table 4.

Secondly, we used the BP neural network classifier. Since the threshold and weight for the network initialization were selected randomly, the classification accuracy of every trained network may fluctuate slightly. Figure 9 illustrates the test CCR for the 25 trained networks, while the spectral dimensionality was reduced to 5D by PCA. We observed that the overall CCR was stable, though there were slight differences between each CCR across the 25 networks. Therefore, the mean CCR for 25 networks was calculated and used as the final CCR. Figure 10 illustrates the mean CCR for different spectral dimensionality reductions and different wood regions of spectral collection. We can see that CCRs from vessel central regions were lowest, with mean values of approximately 70%–80%, while those from peripheral regions and random ROI were almost identical, approaching 100%.

Thirdly, the SVM classifier was used. The predictive power of SVM depends on the kernel function used. The radial basis kernel function was usually used. When determining the penalty factor C and value, the grid search method can be used to find the best parameters. Figure 11 illustrates the identification comparisons with the best parameters C and . We can see again that the CCR of central regions was lowest, with a mean of approximately 80%, while those of peripheral regions and ROI were almost identical, approaching 100%. The CCR of random ROI decreased with the application of KPCA.

The best identifications with different classifiers and spectral dimensionality reductions are given in Table 4.

Figures 8, 10, and 11 and Table 4 indicated that the CCR of vessels’ central regions was the lowest in all cases and that the CCR of vessels’ peripheral regions was slightly higher than that of random ROI in the case of BP network and SVM. In the case of MD classifier, the CCR of vessels’ peripheral regions was much higher than that of random ROI. Moreover, we obtained good CCR of peripheral regions with a low spectral dimension. For example, the best CCR (100% of vessels’ peripheral region) was achieved using PCA and MD classifier with a spectral dimension of 8 or by using MDS and MD classifier with a spectral dimension of 9.

Concerning the spectral dimensionality reductions, good results were obtained using PCA and KPCA with a low spectral dimension. However, if MDS was used, the CCR stabilized only when the spectral dimensionality was larger than 7. In general, PCA was the best choice in terms of identification accuracy and spectral dimension.

3.3. Comparisons of Identification Robustness

Wood hyperspectral image acquisition may be influenced by external environmental factors such as temperature, humidity, and illumination. In this section, a simulation experiment on noisy wood hyperspectral image classification was performed. Since the training set of wood images can be acquired in advance in a controlled environment, we only added white noises to the test set. The signal-to-noise ratio (SNR) was used to measure the test set’s image quality.

The robustness of our three spectral extraction schemes (i.e. vessels’ central, vessels’ peripheral, random ROI) was compared using an error rate defined in terms of SNR.where and refer to the classification accuracy for ideal test images and noisy test images, respectively. The falling trend of “error rate” (Figure 12) indicates that the faster the curve falls, the more robust the corresponding spectral extraction scheme. Therefore, we conclude that in the case of BP and MD classifiers, the vessels’ peripheral extraction scheme was the most robust, while for SVM, the vessels’ peripheral scheme was slightly more robust than random ROI and much more robust than the vessels’ central scheme.

3.4. Discussions of Parameters

In the wood vessels’ peripheral region extraction, the two key parameters are the size r of structure element A and the threshold k in equation (2). Here we explain the effect of these two parameters and give the best values of r and k for each species.

Figure 13 illustrates the wood vessels’ peripheral region extraction comparisons in a ROI with and different k for wood species Intsia bijuga. The value interval of k is with a step length of 5. We can see that a small k will make the wood rays or axial parenchyma remain in vessels’ peripheral region, while a large k will make this region contract.

Figure 14 illustrates the extraction comparisons with and different r values. The value interval of r is with a step length of 5. We can see that a small r will make this region disappear and make wood rays appear, while large r will make this region contract again.

In summary, we conclude that small r and k produce more wood rays or axial parenchyma, while large r and k make these tissues disappear while causing vessel peripheral regions to contract. Table 5 gives the best values of r and k for our 6 experimental species, and the corresponding extraction results are illustrated in Figure 15.

3.5. Incremental Wood Species Identification

To test the incremental identification performance of our scheme, two new wood species were added (i.e., Manchurian Ash (Fraxinus mandshurica Rupr.) and Chinese Oak (Quercus mongolica Fisch.)). It was usually difficult to discriminate these two wood species since they have similar external appearance. The identification results with PCA and three classifiers, BP neural network, SVM, and MD, are illustrated in Figure 16. Again, we can see that the CCR of vessels’ central regions was the lowest in all cases, while the CCR of vessels’ peripheral regions was higher than that of random ROI. In fact, the CCR of vessels’ peripheral regions approaches 90% approximately. Therefore, we conclude that the CCR of vessels’ peripheral regions was stable and best when new wood species samples were added.

3.6. Comparisons with Other Works

Finally, the proposed scheme is compared with the two algorithms described in references [15, 16]. Hyperspectral images must be converted to grayscale images before comparison. To this end, two methods can be applied. The first method consists of selecting the waveband of the hyperspectral images using the procedure described in Section 2.5, then performing wavelet-based image fusion on selected images, and finally saving grayscale images after image fusion. This method is detailed below.

The retained images after waveband selection were defined as , where the image size of was . Given that each image was derived from a hyperspectral image, image registration of the above images was not necessary. Each image was processed by a wavelet transform based on Haar wavelets. The number of layers for decomposition was 4. The low-frequency wavelet coefficients and high-frequency wavelet coefficients of each image were then obtained. The high-frequency wavelet coefficients include horizontal, vertical, and diagonal high-frequency coefficients. The new wavelet coefficients and were then calculated by fusion as

These new wavelet coefficients were then used for inverse wavelet transform to obtain the fused digital image X′. The results of image fusion are presented in Figure 17. Note that the fused image includes the texture features of all of the waveband images.

The second aforementioned method consists in applying principal component analysis (PCA) to reduce the dimension of hyperspectral images. The first four principal components are presented in Figure 18, but we only retained the first principal component with clear texture.

The method described in reference [15] is a traditional approach to image texture classification applied for the identification of wood species based on GLCM (Gray-Level Co-Occurrence Matrix), which will not be described here. The method described in reference [16] also aims to image texture classification for wood species but based now on IBGLAM (Improved Basic Gray Level Aura Matrix). The corresponding specific steps are described below.

According to the descriptions in reference [16], an image needs to be first compressed into a 4-bit grayscale image, which is processed with equation (17). refers to an 8-bit grayscale image; represents downward rounding; refers to a 4-bit grayscale image. The result of grayscale image compression is shown in Figure 19. Note that the image textures are still clear despite the compression of the pixel values of the grayscale image.

Then, the feature matrix was calculated with the expressions in equation (19). The grayscale image was defined as , where M and N represent the length and width of the grayscale image, respectively. To make the subscripts of valid, it is necessary to add 1 to each pixel value of before calculation. In this way, we avoid the occurrence of 0, with . The matrix calculated with equation (17) is a square matrix.

To reduce the dimension of the feature vector, the square matrix was calculated with the following expression:

Finally, the square matrix was converted according to equation (19), where represents the elements in series. The converted vector was the feature vector with a length of 136. Figure 20 shows the schematic diagram of features. Table 6 shows the processing time for a hyperspectral image when applying the above method.

Figure 21 shows the CCR of the vessel peripheral spectral extraction method proposed in this paper in comparison with the IBGLAM and GLCM methods based on support vector machines (SVM) for the identification of eight tree species, including the six tree species in Table 1 and the two tree species in Section 3.5.

It can be seen from Figure 21 that the CCR of the two methods described in references [15, 16] were only around 80% for the samples used in this paper, whereas the CCR of our method proposed in this paper was much higher than that of the other two methods. In addition, when PCA was used for generating grayscale images, the accuracy of image classification with GLCM and IBGLAM was only 70.93% and 73.70%, respectively.

4. Conclusions

We have proposed a novel hardwood species identification scheme using hardwood microscopic hyperspectral images. The spectral feature of wood vessels is especially considered to investigate the possibility of tree identification to the species level. Specifically, the spectral features of vessel peripheral regions are acquired using morphological operations and those of central regions by use of K-L divergence and an automatic identification scheme. The spectral divisibility rules in Section 3.1 and experimental comparisons in Section 3.2 indicate uniformly that the CCR of peripheral regions is best in all cases of spectral dimensionality reduction. Moreover, we can also conclude that wood vessels’ peripheral extraction scheme is robust when wood images are disturbed by white noise.

Our work attempted to identify hardwood trees to species level using spectral analysis. Considering the instrument’s cost, we chose the visible/near infrared SOC710VP hyperspectral imager with 128 spectral bands and a spectral range of 372–1038 nm. Spectral data in the visible region may vary due to changes in wood color produced by environmental variation. Therefore, we kept the environment constant during spectral work and wood sample preservation. In addition, 780–900 nm is the near infrared band, which best characterizes different wood properties.

The method described in reference [15] is suitable for wood species with large different textures, but the highest CCR was only 75.92% when using the samples provided in this paper. Zamri et al. [16] classified 52 tropical tree species and obtained the highest CCR of 97.01%. However, the main research objects of reference [16] were tropical tree species. These species tend to have large evenly distributed vessels, and most of them are diffuse-porous species that often have clear texture features. The highest CCR was only 82.03% with the method of reference [16] using wood samples provided in this paper. The research objects in this paper include not only tropical tree species, but also subtropical tree species. Moreover, the vessels of these subtropical tree species present different sizes and they are not evenly distributed. Furthermore, the method proposed in this paper does not require rich textures in the cross sections of wood species. As long as the vessels of tree species are clear, the tree species can be classified according to the spectral information of structures around the vessels.

Data Availability

Access to research data is restricted. The experimental instruments and wood species samples are owned by other academic institutions; therefore, we cannot publicly release the research spectral wood data.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (grant no. 31670717), the Fundamental Research Funds for the Central University (grant no. 2572017EB09), and the Heilongjiang Province Natural Science Foundation (grant no. C2016011).