Abstract

Wild ornamental plants can be both beneficial and dangerous for the environment. Because the introduction of attractive plants that are not suited to the local ecosystem can result in significant environmental damage, a rapid integration strategy for wild ornamental plant resources based on an enhanced clustering algorithm is proposed. The objective function of clustering optimization is established by integrating the k-means distance measurement formula, and the technique is enhanced with density stratification. The cluster termination condition is controlled by the number of clusters k, and the wild plant data categories are continually merged, which allows the method to handle wild plant distribution datasets with uneven density distribution. Remote sensing images were used to obtain the distribution of wild ornamental plants in different regions and to estimate the optimal parameters of wild plant samples; these were combined with maximum likelihood classification to obtain the degree of flora differentiation and to complete the resource integration. Comprehensive survey and systematic sampling were used to survey the protected area. The heat map of the plant size distribution shows a clear negative correlation between the spatial scale difference and the overall density difference of the plant distribution; that is, the distribution appears spatially as high-density small-scale and low-density large-scale agglomeration. Experimental analysis shows that the running delay of the proposed integration method is 1.96 s.

1. Introduction

Wild plant resources have long existed in a natural, self-sustaining state. Many species have distinctive flower types and blossoming times, as well as high stress tolerance (resistance to drought, cold, disease and insects, salt and alkali, heat, and so on) and environmental adaptability. They form a natural gene bank that contains extraordinarily rich genetic information, and they are among mankind’s valuable assets and the foundation of modern garden plants [1]. The protective development and utilization of wild plant resources can not only enhance urban greening and beautification but also increase the urban biodiversity index and reduce the serious losses caused by the introduction of ornamental plants that are not suitable for the local environment [2, 3].

A series of abnormal changes have occurred in the natural world as a result of climate change and ecosystem instability, particularly with the expansion of human activities, changes in production systems, and predatory development. Wild plants face resource loss and endangerment [3, 4]. Approximately 29% of the country’s wild plant populations are in jeopardy, and natural resources of a range of therapeutic materials are scarce, with only a small number of commodities available or none at all. Suggestions and ideas for the rational planning and utilization of plant resources [4] have been put forward to provide a theoretical basis for the future management and protection of wild plant resources in Zhouzhi Nature Reserve [5, 6]. China has a vast territory and abundant resources of wild ornamental plants. There are about 30000 species of higher plants and more than 6000 kinds of ornamental garden plants. Many types of high-quality ornamental plants are available in China, and ornamental plant cultivation has a long history. As early as the Western Zhou Dynasty, from the 11th century B.C. to the 7th century B.C., the working people of the country cultivated flowers and trees in gardens. However, few kinds of plants are used for urban greening in China [5]. For example, Nanjing, Hangzhou, Ningbo, and other cities generally use 200–300 species, while Shanghai uses nearly 400 species. Furthermore, with the exception of areas with unusual living conditions, such as tropical, frigid, or arid regions, there is no discernible difference in the greening of most Chinese cities. The repetitive and similar plant materials make a thousand cities look like one, which does not match the standing of a large nation with rich plant resources.

Flower production is widely acknowledged as a successful branch of agribusiness, even though floriculture crops are not classified among the primary agricultural commodities. Low labour costs, together with ecologically and occupationally unsustainable practices, have contributed to its rise in Kenya, Ethiopia, Colombia, Brazil, and Thailand [6]. The rapid growth of flower production in emerging nations is also aided by seasonal and meteorological advantages [7]. More pesticides are still used in the flower business than on any other agricultural crop. Because the safety guidelines for pesticide usage are less strict than those for other horticultural crops, large volumes of pesticides are regularly used to maintain the attractiveness of flowers and plants [8]. Such heavy pesticide use is hazardous to wild ornamental plant resources. The greenhouse growing technique boosts production and ensures a year-round supply of high-quality ornamental plant produce, because climate variables such as temperature and humidity are controlled to promote crop development in such protected environments [9]. Hence, an enhanced clustering approach to ornamental plant resource integration is proposed in this work.

The rest of the study is organized as follows: Section 2 discusses the improved clustering algorithm with rapid integration of wild ornamental plant resources. In Section 3, experimental results are discussed, followed by conclusion in Section 4.

2. Proposed Method

The work in this study focuses on the enhanced clustering algorithm for quick integration of wild ornamental plant resources. The steps of the proposed work are as follows.
(i) An enhanced clustering approach is proposed for quick integration of wild ornamental plant resources.
(ii) The objective function of clustering optimization is defined using the k-means distance measurement formula, and the technique is enhanced by combining it with density stratification. The clustering termination condition is determined by the number of clusters k.
(iii) The wild plant data classes are continuously merged to deal with the wild plant distribution resource dataset with uneven density distribution. The distribution of wild ornamental plants in different regions is determined using remote sensing images, and the optimal parameters of wild plant samples are computed.
(iv) The degree of differentiation of flora and species is derived in combination with maximum likelihood classification, and resource integration is accomplished.

2.1. Improved Clustering Algorithm

K-means is an unsupervised algorithm for clustering data objects. It separates n data items into k clusters, assigning each item to the cluster with the nearest mean value. The items within a cluster are compact, whereas items in different clusters are well separated [10, 11]. The k-means algorithm uses the sum of squares to construct the groups. The input parameter of the algorithm is the number of cluster centers. The distance between each data element and each cluster center is then calculated and compared, and each data element is assigned to the nearest cluster center. The distance used in k-means clustering is the Euclidean distance. The distance between samples $x_i$ and $x_j$ is defined in the following equation:

$$d(x_i, x_j) = \sqrt{\sum_{t=1}^{m} \left(x_{it} - x_{jt}\right)^2}, \qquad (1)$$

where $m$ is the number of features of each sample.

Equation (1) can be used to calculate the distance between each data element and the cluster center. Data elements are allocated to the cluster center with the minimum distance. The cluster center is the mean value of all data points in the group. Each cluster center with a data element set is called a cluster [10].
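The assignment rule described above can be illustrated with a minimal sketch. The function names and the sample coordinates are illustrative assumptions and do not come from the study; the code only shows the Euclidean distance and the nearest-center assignment.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_center(sample, centers):
    """Index of the cluster center closest to `sample`."""
    return min(range(len(centers)), key=lambda j: euclidean(sample, centers[j]))

# Hypothetical two-feature samples and two cluster centers
samples = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers = [(1.0, 1.5), (8.5, 9.0)]

labels = [nearest_center(s, centers) for s in samples]
print(labels)  # [0, 0, 1, 1]
```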

In cluster analysis, a cluster is defined as the sample set with the minimum dispersion (or maximum compactness), and the dispersion is measured by the distance from the sample to the cluster center. Combined with the k-means distance measurement formula, the objective function of clustering optimization is defined in the following equations:

The more compact the samples in a cluster are, the less dispersed they are from the standpoint of data distribution [11]. The implementation steps of the clustering algorithm are as follows:
Input: the number of clusters k and the sample data, from which k samples are randomly selected as the initial cluster centers.
Output: the objects to be clustered are selected from the data object set as the initial clusters.
Procedure: the dataset is divided randomly, and the arithmetic mean of each cluster is calculated.

According to (1), the distance from the sample to the center of each cluster is calculated as given in the following equation:

Each sample is assigned to the nearest cluster, the cluster center is recalculated as shown in the following equation, and the assignment is repeated until the clusters no longer change.
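The iteration described above can be sketched as follows. This is a plain k-means loop under stated assumptions, not the authors' implementation: the objective is taken to be the within-cluster sum of squared distances, the initial centers are k randomly chosen samples, and the function names, `seed`, and `max_iter` are hypothetical.

```python
import math
import random

def mean(points):
    """Component-wise arithmetic mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(samples, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = rng.sample(samples, k)          # k random samples as initial centers
    labels = None
    for _ in range(max_iter):
        # Assign every sample to its nearest center
        new_labels = [min(range(k), key=lambda j: math.dist(s, centers[j]))
                      for s in samples]
        if new_labels == labels:              # clusters no longer change
            break
        labels = new_labels
        # Recalculate each center as the mean of its members
        for j in range(k):
            members = [s for s, l in zip(samples, labels) if l == j]
            if members:                       # keep the old center if a cluster is empty
                centers[j] = mean(members)
    # Assumed objective: sum of squared distances to the assigned center
    sse = sum(math.dist(s, centers[l]) ** 2 for s, l in zip(samples, labels))
    return labels, centers, sse

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers, sse = kmeans(data, k=2)
print(labels, round(sse, 4))
```

On this toy data the two groups are far apart, so the loop settles on the same partition whichever samples are drawn as initial centers; on real data the result generally depends on initialization.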

The enhanced method is a density-based agglomerative hierarchical clustering technique that requires the number of clusters to be determined in advance as the clustering termination condition [12, 13]. At first, each data object is regarded as a separate class, and then, the classes are merged until the termination condition is satisfied.

In the modified method, roughly the 10% of points in the dataset with the lowest density are removed as deviation points, and the remaining data points are hierarchically clustered on two layers, the maximum-density layer and the minimum-density layer. On the basis of this hierarchical clustering, the whole dataset excluding the deviation points is merged by agglomerative hierarchical clustering, and finally, the deviation points are assigned to the nearest cluster. The number of classes in the dataset, k, and the total number of data points are assumed to be known. The implementation steps of the improved algorithm are as follows:

(A) Calculate the density of each data point in the dataset.

The point density is the number of points within a certain neighborhood and is defined in the following equations:

$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c), \qquad \chi(x) = \begin{cases} 1, & x < 0, \\ 0, & x \geq 0, \end{cases}$$

where $d_{ij}$ represents the distance between point $i$ and point $j$, and $d_c$ is the cutoff distance. The selection principle for $d_c$ is that the average number of adjacent data points is 1–3% of the total number of data points; for a dataset of up to 1000 points, the average number of adjacent data points is generally about 10. The density $\rho_i$ of data point $i$ is therefore the number of points whose distance to point $i$ is less than the cutoff distance $d_c$.
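A minimal sketch of step (A) follows, assuming Euclidean distance between points. The names `local_density` and `d_c` and the sample points are illustrative assumptions; the code simply counts, for each point, how many other points lie closer than the cutoff distance.

```python
import math

def local_density(points, d_c):
    """Density of each point: number of other points closer than the cutoff d_c."""
    rho = []
    for i, p in enumerate(points):
        count = sum(1 for j, q in enumerate(points)
                    if j != i and math.dist(p, q) < d_c)
        rho.append(count)
    return rho

# Hypothetical data: four points forming a unit square and one isolated point
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
rho = local_density(points, d_c=1.5)
print(rho)  # [3, 3, 3, 3, 0]
```

The isolated point receives density 0 and would be the first candidate for a deviation point.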

(B) Take about 10% of the data points with the smallest density in the dataset as deviation points, set them aside as a separate deviation dataset, and let the remaining data points form the retained dataset.

A deviation point is a point whose density is less than the cutoff density, and obtaining the deviation points is a process of continuously adjusting this cutoff density. Initially, a cutoff density is set and the number of points whose density is less than the cutoff density is counted, as given in the following equation:

If the number of points n falls outside the target range (about 10% of the data points), the cutoff (truncation) density is adjusted accordingly and updated, and the number of points is counted again. The adjustment continues until n meets the target, at which point the current value is the final cutoff density. Knowing the final cutoff density, we can determine which points in the dataset have a density less than it, and these deviation points form the deviation dataset. In Figure 1, the red points are the deviation points set aside from the dataset, and the black points are the points of the retained dataset.
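The adjustment of the cutoff density in step (B) can be sketched as a simple bisection. The update rule used here (halving the search interval), the tolerance, and all names are assumptions made for illustration, since the exact adjustment rule is not specified; only the stopping idea, that about 10% of the points fall below the cutoff, follows the text.

```python
def find_cutoff_density(rho, target_fraction=0.10, tolerance=0.02, max_steps=50):
    """Adjust the cutoff density until about `target_fraction` of points lie below it."""
    n_total = len(rho)
    low, high = min(rho), max(rho) + 1
    cutoff = (low + high) / 2
    for _ in range(max_steps):
        n_below = sum(1 for r in rho if r < cutoff)
        fraction = n_below / n_total
        if abs(fraction - target_fraction) <= tolerance:
            break
        if fraction < target_fraction:
            low = cutoff        # too few deviation points: raise the cutoff
        else:
            high = cutoff       # too many deviation points: lower the cutoff
        cutoff = (low + high) / 2
    deviating = [i for i, r in enumerate(rho) if r < cutoff]
    return cutoff, deviating

# Hypothetical densities of 20 points
rho = list(range(1, 21))
cutoff, deviating = find_cutoff_density(rho)
print(cutoff, deviating)  # 2.25 [0, 1]
```

With tied densities an exact 10% split may not exist, which is why the sketch stops after `max_steps` and accepts a small tolerance.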

(C) A high-density dataset is formed from about 25% of the points with the highest density in the retained dataset (the dataset without deviation points), and it is clustered into about 2k categories by agglomerative hierarchical clustering. The process of obtaining the high-density dataset is similar to the process of finding deviation points. The green points are the roughly 25% of points with the highest density in the retained dataset. Agglomerative hierarchical clustering initially treats each point in the high-density dataset as a separate class and then repeatedly merges the two classes with the smallest inter-class distance until only 8 categories remain; the final clustering effect is shown in Figure 2.

For data points belonging to two different clusters, the distance between the two clusters is defined in the following equation:

(D) Finally, the deviation points are assigned to the nearest of the k existing classes, which completes the clustering of the entire dataset. Because all points in the deviation dataset are deviation points and contain no complete new class, there is no need to perform hierarchical clustering on them separately; they are simply assigned to the nearest of the existing k categories. Figure 3 shows the clustering effect on the dataset before and after deviation-point processing. According to the clustering effect, the improved algorithm is suitable for spherical clusters.
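Steps (C) and (D) can be sketched in simplified form. The sketch assumes single linkage (the smallest pairwise distance) as the inter-cluster distance, which is an assumption, and it merges directly down to the requested number of clusters instead of going through the intermediate 2k stage. All function names and data are hypothetical.

```python
import math

def cluster_distance(a, b):
    """Assumed single-linkage distance: smallest pairwise distance between clusters."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, n_clusters):
    """Start with one cluster per point and repeatedly merge the closest pair."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: cluster_distance(clusters[ij[0]],
                                                          clusters[ij[1]]))
        clusters[i].extend(clusters.pop(j))
    return clusters

def assign_deviating(deviating, clusters):
    """Attach each deviation point to the nearest existing cluster."""
    cores = [list(c) for c in clusters]    # distances use the original members only
    for p in deviating:
        nearest = min(range(len(cores)),
                      key=lambda c: cluster_distance([p], cores[c]))
        clusters[nearest].append(p)
    return clusters

core = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = agglomerate(core, n_clusters=2)
clusters = assign_deviating([(3, 3), (8, 7)], clusters)
print(clusters)
```

The pairwise search makes this sketch cubic in the number of points, so it is only meant for small illustrative datasets.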

The parameters cutoff distance and cutoff density have only a minor impact on the enhanced method [14]. The enhanced algorithm is based on a straightforward concept: the entire clustering process relies on calculating distances between data points, with no complicated formula [15, 16]. Excluding the points with the smallest density makes the improved algorithm insensitive to outliers and noise [17, 18]. Because the enhanced technique is stratified by density, hierarchical clustering in the highest and lowest density layers not only improves the algorithm’s performance but also allows it to handle datasets with uneven density distribution.

2.2. Rapid Integration of Wild Ornamental Plant Resources
2.2.1. Location

Hebei Wu’an National Forest Park is located in the Taihang Mountains in the west of Handan City, Hebei Province, within the territory of Wu’an City, bounded by the west edge of Wu’an City in the east and Shexian County in the south. It is adjacent to the Fengfeng Mining Area of Ci County and Zuoquan County in the west. The geographical coordinates are 113°45′–114°22′ east longitude and 36°28′–37°01′ north latitude. The total area of the park is 40500 km². The forest park is full of peaks, crisscross ravines, undulating terrain, and complex and diverse landforms.

2.2.2. Topography

The park is located in the first-order structure of the New Cathaysia, at the contact between the Taihang Mountain uplift and the subsidence zone of the North China Plain. The soil types are mainly leached cinnamon soil and brown soil, which are distributed vertically with altitude. From top to bottom, they are mainly mountain brown soil, grassy brown soil, and cinnamon soil (eluvial cinnamon soil, mountain cinnamon soil, and carbonate cinnamon soil). The landforms are complex, and the terrain slopes from northwest to southeast; the area is known as Sanhe, Wuchuan, and Jiugou. The highest peak in the territory is Qingyazhai at 1898.7 m, and the lowest point is Yonghe village at 87 m, so the terrain is relatively undulating. There are two main mountains, Xiaomotianling and Shibapan.

2.2.3. Climate and Hydrology

The area has a warm temperate continental monsoon climate, characterized by four distinct seasons and by rain and heat in the same season. The annual average temperature is 11°C–13.5°C, the extreme minimum temperature is −19.9°C, and the extreme maximum temperature is 35.5°C; the annual average sunshine duration is 2297 h, the annual average frost-free period is 230 d, and the average annual precipitation is 600–800 mm. Most of the precipitation is concentrated in June to August, giving distinct dry and wet seasons. Figure 4 shows the wild ornamental plants.

2.2.4. Plant Resource Integration

The characteristics and habits of wild ornamental plants differ greatly; their adaptability and resistance also differ, as does their value for development and utilization [19, 20]. Therefore, before wild plant resources are developed and utilized, evaluation standards should be established based on the purpose and main objectives of the development, and targeted screening should be conducted to achieve focused, orderly, and rational utilization and to increase the efficiency of developing and utilizing wild ornamental plants. Table 1 provides the distribution of wild ornamental plants in different regions.

From the perspective of main color distribution, the distribution of the main color affects the combined effect of the plant landscape configuration. When the main color distribution of the plant landscape is relatively sparse, the beauty value is higher, and when it is more concentrated, the beauty value is lower, which indicates that people prefer plant landscape configurations with scattered colors. In terms of the main color ratio, configurations with a higher degree of beauty have a more regular main color ratio, while configurations with a lower degree of beauty have a more irregular one, so the main color ratio affects the beauty value of the plant landscape configuration. In terms of color coordination, color coordination directly affects people’s judgment of the plant landscape configuration, and people like configurations with higher contrast in tone. In terms of hierarchical structure, a strong plant landscape configuration hierarchy has a higher beauty value and a weak one has a lower beauty value, indicating that the strength of the color hierarchy affects the beauty value of the plant landscape configuration. Finally, when cool colors account for a larger proportion, the value of the plant landscape configuration can increase greatly [14, 15].

Based on high-resolution remote sensing images, the basic data and characteristics of the current urban landscape pattern are extracted, and groups of sample observation values are randomly selected from the fused high-resolution remote sensing images. According to the characteristics and values of the samples, the optimal parameters of the wild plant samples are estimated [19, 20]. Assuming that the classification feature index of the fused remote sensing image follows a multivariate normal distribution, the covariance matrix and mean vector of each class of training samples are calculated, and the probability that the wild plant distribution information belongs to a given category is calculated using the following equation:

The equation involves the total mutation matrix and its inverse matrix. The expression of the total mutation matrix is shown as follows: where the terms are the number of feature classes, the covariance of the classes, and the average vector of each class. In addition, the sample feature is represented by a d-dimensional matrix, and its expression is given as follows:

Using the maximum likelihood classification method expressed in the above equation, the classification results of remote sensing images are obtained. Then, normalize the plant distribution index to realize the extraction of wild plant information from remote sensing images. The extraction process is shown in the following equation:
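As a rough illustration of maximum likelihood classification under the multivariate normal assumption, the sketch below estimates a mean vector and covariance matrix per class and assigns a sample to the class with the highest Gaussian log-likelihood. It is restricted to two features so that the matrix inverse can be written out by hand, and it assumes equal prior probabilities. The class names, feature values, and function names are hypothetical and are not taken from the study.

```python
import math

def mean_vector(samples):
    n = len(samples)
    return [sum(s[d] for s in samples) / n for d in range(2)]

def covariance_2x2(samples, mu):
    """Sample covariance matrix of 2-D data, returned as (sxx, sxy, syy)."""
    n = len(samples)
    sxx = sum((s[0] - mu[0]) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - mu[1]) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mu[0]) * (s[1] - mu[1]) for s in samples) / (n - 1)
    return sxx, sxy, syy

def log_likelihood(x, mu, cov):
    """Log density of a bivariate normal distribution at x."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T S^{-1} (x - mu), with the 2x2 inverse written out explicitly
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * maha

def fit(training):
    """Estimate the mean vector and covariance matrix of each class."""
    params = {}
    for label, samples in training.items():
        mu = mean_vector(samples)
        params[label] = (mu, covariance_2x2(samples, mu))
    return params

def classify(x, params):
    """Pick the class with the largest log-likelihood (equal priors assumed)."""
    return max(params, key=lambda label: log_likelihood(x, *params[label]))

# Hypothetical training pixels described by two spectral features
training = {
    "plant": [(0.60, 0.30), (0.70, 0.35), (0.65, 0.45), (0.75, 0.40)],
    "bare":  [(0.10, 0.20), (0.15, 0.10), (0.20, 0.25), (0.05, 0.15)],
}
params = fit(training)
print(classify((0.68, 0.38), params), classify((0.12, 0.18), params))
```

For more than two features the same idea applies, but the inverse and determinant of the covariance matrix would normally be computed with a linear algebra library.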

The massive and diverse wild ornamental plant resources are integrated by classification. The problem that classification solves is to analyze and predict resource data: a classification function is mined from the existing data. This article therefore uses a decision tree to generate the classification function. Its computational cost is within the allowable range, it can handle both continuous and discrete fields, and it clearly indicates which fields are important. Real data, however, may contain incomplete or erroneous fields, and such noise degrades the classification performance of the decision tree. Pruning is therefore needed to simplify the decision tree structure and make it easier to understand. First, a decision tree is established, and the evaluation function is computed at each classification node; the test with the largest function value is taken as the optimal condition to complete the node division. This study uses the information gain rate as the test function, as shown in the following equations: where the terms represent the information gain, the attribute of the decision tree, the number of discrete values of that attribute, and the number of positive examples.
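The information gain rate can be illustrated with a short sketch. It assumes the usual C4.5-style definition, information gain divided by the split information of the attribute; whether this matches the study's exact formula is an assumption, and the attribute values and labels below are made up for the example.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of an attribute divided by its split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical attribute values and class labels for four samples
labels = ["yes", "yes", "no", "no"]
habitat = ["sun", "sun", "shade", "shade"]   # separates the classes perfectly
plot_id = ["a", "b", "a", "b"]               # carries no class information

print(gain_ratio(habitat, labels), gain_ratio(plot_id, labels))
```

At each node, the attribute with the largest ratio would be chosen for the split, which is the selection rule the text describes.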

Floristic species differentiation (SD) is used to reflect the degree of differentiation of different floras, as in the following equation: where SD represents the degree of differentiation of the flora. The larger the value of SD, the higher the degree of differentiation of the flora in the area, and vice versa. The remaining terms represent, respectively, the values for family, genus, and species in a flora.

3. Results

A comprehensive survey of the protected area was carried out using a method combining line survey and systematic sampling. Shrubs were surveyed with 100 m × 10 m sample lines with 4 repetitions, and herbaceous plants with 1 m × 1 m sample plots with 4 repetitions. The survey covered all plant species and their habitats, latitude and longitude, coverage, abundance, frequency, height, and aboveground biomass. At the same time, observations related to ornamental value, biological characteristics, resource potential, economic value, and ecological value were recorded. Plant type, ornamental parts, flower and fruit color, flower diameter, leaf shape and leaf color, ecological habits, and other indicators were noted, photos were taken, and specimens were collected. A total of 235 sample lines and sample plots were investigated over the past two years.

The “hot spot” detection tool is used to generate a heat map of the plant size distribution in the protected area. The red region in Figure 5 represents a hot spot where high values cluster, whereas the blue region represents a cold spot where low values cluster. Spatial hot spot detection (Getis-Ord $G_i^*$) is used to test whether the spatial scale has statistically significant high and low values in a local area. Visualization can then reveal the “hot spot” and “cold spot” areas for the study of the spatial scale of plants. The Getis-Ord $G_i^*$ statistic is calculated as given in the following equation:
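As a rough numerical illustration of hot spot detection, the sketch below computes the Getis-Ord statistic in its commonly used z-score form. It assumes binary distance-band weights (weight 1 for every location within a given radius, including the location itself); the weighting scheme, the radius, the function name, and the sample values are all assumptions made for the example and are not taken from the study.

```python
import math

def getis_ord_gi_star(values, coords, radius):
    """Gi* z-score for each location, using binary distance-band weights."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for i in range(n):
        # Weight 1 for every location within `radius` of i, including i itself
        w = [1.0 if math.dist(coords[i], coords[j]) <= radius else 0.0
             for j in range(n)]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        z_scores.append(numerator / denominator if denominator > 0 else 0.0)
    return z_scores

# Hypothetical plant-size values at six locations along a transect
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
values = [10, 9, 8, 1, 2, 1]
z = getis_ord_gi_star(values, coords, radius=1.0)
print([round(v, 2) for v in z])
```

Large positive scores mark locations surrounded by high values (hot spots), and large negative scores mark locations surrounded by low values (cold spots).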

Figure 5 shows obvious differences in the spatial distribution of wild plant landscapes in this area: values are roughly high in the north and low in the south, and the northern region shows a ring-shaped pattern of “low center and high surroundings.” There is an obvious negative correlation between the difference in the spatial scale of the plant distribution and the difference in density, that is, the spatial distribution is characterized by high-density small-scale and low-density large-scale agglomeration.

The various integration methods are implemented as program code and run on the test platform that was built. The relevant parameters are configured to ensure that each integration method can run in the experimental environment. The corresponding integration result is shown in Figure 6.

The integration runtime delay is calculated as in the following equation: where the terms represent a pixel, an image segmentation area, and a remote sensing image target segmentation area, and the arithmetic symbol represents logical addition.

The running data of the background program in the experimental environment are collected, the data relating to running time are extracted, and these are entered into the Excel data processing software.

As shown in Figure 7, the running delay of the proposed method is compared with those of the integration methods presented in reference [5], cholinesterase inhibition [8], and genic simple sequence repeat markers (eSSRs) [9]. These three methods have running delays of 4.64 s, 3.86 s, and 2.94 s, respectively, whereas the proposed clustering method has a delay of 1.96 s, the lowest among all approaches. The integration based on the improved clustering method therefore runs faster. The main reason is that the improved clustering algorithm reaches its result in fewer iterations and less time. The improved algorithm is not sensitive to outliers and noise. Because it is stratified by density, hierarchical clustering at the maximum and minimum densities not only improves the efficiency of the algorithm but also handles datasets with uneven density distribution.

4. Conclusion

This study proposes a rapid integration method for wild ornamental plant resources based on an enhanced clustering algorithm, with the aim of preventing the substantial losses caused by the introduction of ornamental plants that are not suited to the local environment. To increase the algorithm’s efficiency, density stratification is combined with the continual merging of wild plant data categories. Hierarchical clustering is performed at the highest and lowest density levels, which makes the method suitable for wild plant distribution resource datasets with uneven density distribution. Remote sensing images were used to obtain the distribution of wild ornamental plants in different regions and to estimate the optimal parameters of wild plant samples; these were combined with maximum likelihood classification to obtain the degree of flora differentiation and to complete the resource integration. The heat map of plant size distribution shows that the differences in spatial scale and density of the plant distribution exhibit high-density small-scale and low-density large-scale agglomeration in space. The running delay of the integration is 1.96 s.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

We would like to thank China National Natural Science Foundation, China Social Science Foundation and National Social Science Foundation for their support to complete this research study.