To enable automatic transplantation of plug seedlings and improve identification accuracy, we develop an algorithm that identifies ideal seedling leaf sets based on Fourier descriptors and adopt a classification method based on an expert system to improve the identification rate of plug seedlings. First, images of the plug seedlings are captured by an image acquisition system; K-means clustering is then applied for image segmentation and binarization, and the ideal seedling leaf set is identified using Fourier descriptors. Feature vectors, such as gray scale, hue H, and rectangularity, are then obtained. Next, the knowledge model of the plug seedlings is defined and a knowledge-based inference engine is designed. Finally, a recognition test is carried out. The identification success rate for 10 varieties of plug seedlings from 190 plates is 98.5%. For the same samples, the recognition rate is 85% for a support vector machine (SVM), 87% for a particle-swarm-optimization SVM (PSOSVM), 63% for a back propagation (BP) neural network, and 87% for a Fourier-descriptor SVM (FDSVM). These results show that the proposed recognition method based on an expert system satisfies the requirements of automatic transplanting.

1. Introduction

Plug-seedling cultivation was developed in Europe and America in the 1970s and has advanced rapidly in recent years because of its advantages, including mechanized operation, low cost, high survival rates, and easy transport [1–3]. Automatic transplanting is a necessary process in plug-seedling cultivation. In fully automatic plug-seedling transplantation, different transplanting schemes must be adopted for different varieties of plug seedlings. Therefore, identification of plug-seedling varieties is a technical requirement of automatic transplantation. Currently, there is little work on algorithms for classifying plug seedlings by variety, although increasing effort has been applied to distinguishing healthy seedlings from weak ones. Reference [4] used the 1.8G-1.5R-1.8B color components to convert tomato-seedling images to gray scale and used the maximum between-class variance to segment the image in order to isolate weak seedlings. Using image processing to distinguish seedling characteristics is currently a research hotspot [5]. References [6–8] showed that, depending on seedling characteristics, color features associated with the super green method (2G-R-B) could be applied to image segmentation with good results, although serious overlap between seedlings remains a limitation. Reference [9] used RGB and HSI color to extract characteristic values of cucumber leaves with different lesions, while the area, perimeter, rectangularity, circularity, degree, and complexity of shape features were used to describe the lesion area. Reference [10] applied morphological operations to plug-seedling images for sorting, although with limited accuracy.

Recently, numerous studies have adopted intelligent recognition methods. Reference [11] used multifeature fusion and deep belief networks to identify plant-leaf varieties, achieving a recognition rate of 95.6%. Reference [12] convolved Gabor wavelets with the 2G-R-B and (G+R+B)/3 components and the Q component of YIQ color, using the resulting amplitude as the feature vector of the disease. A support vector machine (SVM) was subsequently used to identify leaf disease in honey pomelo, with a recognition accuracy of 94.16%.

These intelligent-recognition methods mainly extract shape, color, texture, and other information from seedlings by vision and then classify the seedlings using a neural network. Neural network identification methods have two main problems: a large number of training samples are needed, which are difficult to obtain in practice; and it is difficult to obtain ideal seedling leaves for seedlings with uneven layers and overlapping adhesion, which affects the acquisition of seedling feature vectors. In this paper, we use a Fourier descriptor algorithm to cluster the seedling leaves, after which an ideal seedling leaf set and feature vector are obtained. An expert decision algorithm is then used to identify 10 kinds of seedlings. Finally, we confirm the accuracy of the method through practical application.

2. Equipment and Materials

2.1. Image Acquisition Equipment

Image acquisition is realized by an image acquisition system, which includes a camera and a lighting system. The camera is a GRAS-50S5C (FLIR Integrated Imaging Solutions Japan Co., Ltd., Tokyo, Japan) with a Sony ICX625AQ 2448 × 2048 CCD sensor (Sony, Tokyo, Japan), and the interface is IEEE-1394 with a PCIe bus speed of 2.5 GT/s. The lighting is arranged in 4 × 4 light bands (4 layers, with 4 bands on each layer). The camera is connected to an industrial control computer, on which the image acquisition card and the expert identification system are installed.

2.2. Test Materials

A total of 120 pans and 10 varieties of seedlings are collected from Zhejiang University of Science and Technology, Zhejiang University, and Zhejiang Modern Agricultural Park (Figure 1), where S1 is pansy; S2, Salvia splendens; S3, cucumber; S4, tomato; S5, pepper seedlings; S6, Geely Red Star; S7, Cattail melon; S8, eggplant; S9, zucchini; and S10, organic cauliflower.

Specific sample selection is as follows. Ten samples of pansy seedlings (S1) are prepared, numbered from S101 to S110. Likewise, S2 has ten samples (S201–S210). S3 has 70 samples, comprising seedlings at 7 days (S301–S310), 8 days (S311–S320), 9 days (S321–S330), 10 days (S331–S340), 11 days (S341–S350), 12 days (S351–S360), and 14 days (S361–S370). S4 has 30 samples, comprising seedlings at 6 days (S401–S410), 8 days (S411–S420), and 17 days (S421–S430). Additionally, 20 samples of S5 and 10 samples each of S6, S7, S8, S9, and S10 are prepared.

3. Image-Feature Acquisition

3.1. Collection of Ideal Seedling Leaves

The acquired image features mainly include those associated with color and shape. Because the seedling leaves overlap, are irregular in shape, and are numerous, obtaining the ideal seedling-leaf set is key to obtaining the seedling-characteristic parameters. If a single seedling leaf is treated as a closed boundary comprising $N$ pixels, the coordinates of any point on the boundary can be expressed as a complex number as follows [13, 14]:

$$z(k) = x(k) + j\,y(k), \tag{1}$$

where $k = 0, 1, \ldots, N-1$. A one-dimensional Fourier transform is then performed to obtain the following:

$$Z(u) = \frac{1}{N}\sum_{k=0}^{N-1} z(k)\,e^{-j2\pi uk/N}, \quad u = 0, 1, \ldots, N-1. \tag{2}$$

The one-dimensional row vector comprising the Fourier coefficients $Z(u)$ is the Fourier descriptor, which represents the characteristics of the shape boundary of the target and restores the shape boundary as follows:

$$z(k) = \sum_{u=0}^{N-1} Z(u)\,e^{j2\pi uk/N}. \tag{3}$$

To ensure that the feature vectors comprising the Fourier coefficients are invariant to translation, rotation, and scaling of the shape, the descriptor must be normalized. Suppose that the object is shifted by $\Delta$, magnified $R$-fold, and rotated by the angle $\theta$, and that the starting point is moved to $k_0$. Then (2) can be expressed as follows:

$$Z'(u) = R\,e^{j\theta}\,e^{j2\pi u k_0/N}\,Z(u) + \Delta\,\delta(u), \tag{4}$$

where $\Delta\,\delta(u)$ is a DC component that is nonzero only for $u = 0$. Equation (4) can be reformed as shown in

$$d(u) = \frac{|Z'(u)|}{|Z'(1)|} = \frac{|Z(u)|}{|Z(1)|}, \tag{5}$$

where $u = 2, 3, \ldots, N-1$.
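The normalization described above can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function name `fourier_descriptor`, the parameter `n_coeffs`, and the choice to keep only the first few coefficients are assumptions made for illustration. The sketch takes the magnitude of each coefficient (removing rotation and starting-point effects), skips the DC term (removing translation), and divides by the magnitude of the first harmonic (removing scale).

```python
import numpy as np

def fourier_descriptor(boundary_xy, n_coeffs=10):
    """Normalized Fourier descriptor of a closed boundary (illustrative sketch).

    boundary_xy: (N, 2) array of ordered boundary points (x, y).
    n_coeffs: hypothetical parameter, number of coefficients kept.
    """
    pts = np.asarray(boundary_xy, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]      # complex boundary z(k) = x(k) + j*y(k)
    Z = np.fft.fft(z) / len(z)          # Fourier coefficients Z(u)
    mags = np.abs(Z)                    # magnitudes ignore rotation and start point
    # Skip Z(0), which carries the translation, and divide by |Z(1)| for scale.
    return mags[2:2 + n_coeffs] / mags[1]
```

Under these assumptions, a boundary and a shifted, scaled, rotated copy of it traced from a different starting point yield the same descriptor up to floating-point error.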

K-means clustering in the Lab color space is used to separate the leaves, and Otsu's method is used to determine the threshold for obtaining a binary image [15]. The binary image of the plug seedlings is then cleaned by morphological processing, after which it contains large areas of adhering seedling leaves, incomplete seedling leaves, overlapping seedling leaves, and healthy seedling leaves. The proper leaves are recognized by the following stepwise process: the seedling-leaf regions in the binary image are numbered consecutively; the Fourier descriptor of each seedling leaf is calculated; and the distance between any two seedling leaves is calculated from their Fourier descriptors. A threshold value is set, and a two-dimensional array records the distances derived from the leaf comparisons; the distances that are less than the threshold value are recorded in the array, thereby forming the categories of seedling leaf sets. The ideal seedling leaves are then selected on the basis of the total number of seedling leaves and the number of seedling leaves in each category.
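The grouping step can be sketched as follows. Because the distance and selection formulas are not reproduced in this text, the sketch makes two explicit assumptions: the distance between two leaves is the Euclidean distance between their descriptor vectors, and the ideal leaf set is the largest category. The function name `ideal_leaf_set` is hypothetical.

```python
import numpy as np

def ideal_leaf_set(descriptors, threshold):
    """Return indices of the largest group of mutually similar leaves.

    descriptors: (n_leaves, n_coeffs) array of Fourier descriptors.
    threshold: distance below which two leaves count as the same category.
    Assumptions: Euclidean distance; the largest category is the ideal set.
    """
    d = np.asarray(descriptors, dtype=float)
    diff = d[:, None, :] - d[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))   # pairwise distance matrix
    close = dist < threshold                  # 2-D array of leaf-to-leaf matches
    counts = close.sum(axis=1)                # category size for each leaf (incl. itself)
    best = int(np.argmax(counts))             # leaf with the most similar neighbours
    return np.flatnonzero(close[best]).tolist()
```

For example, with three near-identical descriptors and two outliers, the function returns the indices of the three similar leaves, so deformed or overlapping leaves are excluded before the shape features are averaged.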

Figure 2(a) shows the original image of seedling S328. Figure 2(b) shows the binary image resulting from K-means clustering. Figure 2(c) shows the leaves separated by Fourier-descriptor clustering. The mean value of each shape feature is then obtained from the ideal seedling leaves.

As shown in Figure 2, the feature-acquisition algorithm can filter out deformed seedling leaves to obtain the ideal leaves even when the seedling leaves adhere to one another.

3.2. Plug-Seedling Features Acquisition

After the ideal leaf set of the seedlings is obtained, seedling features, including morphological and color features, can be extracted. Morphological features include rectangularity (8), elliptical flatness (9), aspect ratio (10), roundness (11), and the Fourier descriptors (5). These features are computed from the area of the seedling leaf, the width and length of its outer (bounding) rectangle, the long and short axes of its fitted ellipse, and its circumference.
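As a rough illustration, the shape features can be computed from the six measured quantities listed above. The exact formulas (8)–(11) are not reproduced in this text, so the definitions below are commonly used ones and should be read as assumptions: rectangularity as area over bounding-rectangle area, elliptical flatness as short axis over long axis, aspect ratio as width over length, and roundness as 4πA/P². The function name `shape_features` is hypothetical.

```python
import math

def shape_features(area, width, length, long_axis, short_axis, perimeter):
    """Shape features of one leaf from basic measurements (assumed definitions)."""
    return {
        # Fraction of the bounding rectangle covered by the leaf.
        "rectangularity": area / (width * length),
        # Ratio of the fitted ellipse's short axis to its long axis.
        "elliptical_flatness": short_axis / long_axis,
        # Ratio of bounding-rectangle width to length.
        "aspect_ratio": width / length,
        # Equals 1 for a perfect circle and decreases for irregular shapes.
        "roundness": 4.0 * math.pi * area / perimeter ** 2,
    }
```

With these definitions, a circle of radius 1 gives a rectangularity of π/4 and a value of 1 for the other three features, which is a convenient sanity check before applying the function to measured leaves.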

Color features are obtained by combining the color components R, G, B, L, a, b, H, and S. Because stable features are needed for a given variety of plug seedlings, S1 seedlings are used as research objects to identify how the color components change. In the laboratory device, adding light bands gradually increases the light in the visual box, allowing the color components to be measured at different brightness levels (Figure 3). S2 samples are subsequently used as research objects under constant brightness. The image brightness is plotted against the associated feature values (Figure 4), and the features of the different seedling stages are analyzed. Figure 5 shows the variation in the H feature at different seedling stages.

As shown in Figure 3, the "a" feature value increases gradually with increasing brightness. Several other color components also increase gradually with brightness; one of them increases with only a small amplitude, whereas another decreases as brightness increases. We found that the features become more stable when different color components are combined. Therefore, four combined color features are selected for the feature vector.

Figure 4 shows that, under constant brightness, the color characteristics are unaffected. Therefore, the selected color features are combined with the morphological features to form the final feature vector of the plug seedlings.

As shown in Figure 5, the features of seedlings at similar growth stages are broadly similar, with relatively stable feature values and a high degree of separability.

4. Expert System Model

The expert sorting algorithm used for the plug seedlings utilizes semantic transformation of the seedling features, knowledge representation, and inference engine design.

4.1. Expert System

The expert system is a program used to simulate expert decision-making and represents a branch of artificial intelligence [16–18]. It is widely used in fault diagnosis, aided design, and expert consultation [19]. Reference [20] used an expert system to identify rice pests and diseases with a high degree of accuracy. Currently, the rule-based expert system is the most widely used variant, and it mainly comprises rules, an inference engine, and facts.

The expert system used in this method is CLIPS, an efficient production system implemented in the C language. It is widely used because of its high efficiency and portability, and it comprises rules, facts, and an inference engine. The inference engine continuously scans the rules and places the rules that match on the agenda. CLIPS can be programmed independently or embedded into a Visual C program, and the knowledge base can be populated at any time without affecting other program structures. The reasoning cycle of CLIPS can be divided into four stages: pattern matching, conflict resolution, rule activation, and action [21].

4.2. Modeling of Plug Seedlings

The expert system requires the construction of an initial factual model of the plug seedlings that is used for characterization and reasoning. The factual model can fully characterize the color, shape, and other characteristics of the seedlings and is convenient for the inference engine. According to this definition, the structure model of the plug seedlings comprises the following elements: the name of the seedling; a description of the seedling; the feature names of the seedlings on the plate; the feature values of the seedlings on the plate; a weight value for each feature; the objective credibility of each feature; a text description of each feature; and a threshold. The weight value indicates the importance of the feature, and its size can change through the learning process: the weights of features that strongly support seedling identification in trays are strengthened through learning, whereas those with low support are weakened. The objective credibility indicates the objective existence of a feature and is determined by domain experts. The text description facilitates maintenance of the knowledge base.
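One way to picture this structure model is as a pair of record types. The sketch below uses Python dataclasses purely for illustration; the class and field names (`Feature`, `SeedlingFact`, `credibility`, `note`) and the default values are hypothetical and do not correspond to the CLIPS templates used in the actual system.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Feature:
    name: str                  # feature name, e.g. "rectangularity"
    value: float               # feature value measured on the plate
    weight: float = 1.0        # importance of the feature, adjusted by learning
    credibility: float = 1.0   # objective credibility, set by domain experts
    note: str = ""             # text description for knowledge-base maintenance

@dataclass
class SeedlingFact:
    name: str                  # name of the seedling
    description: str           # description of the seedling
    features: List[Feature] = field(default_factory=list)
    threshold: float = 1.0     # threshold (default value is an assumption)

    def feature_by_name(self, name: str) -> Feature:
        """Look up a feature by its name, as the inference engine would."""
        return next(f for f in self.features if f.name == name)
```

A fact for one variety and seedling stage would then hold one `Feature` per entry of the feature vector, and adding a new variety amounts to appending another `SeedlingFact` to the knowledge base.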

4.3. The Knowledge Model and Inference-Engine Design

After the model of the plug-seedling structure is established, it can be used to characterize specific plug seedlings. In the expert system, the characteristic model of a seedling is called a "fact," which is the basis for identifying seedlings. Additionally, the knowledge model associated with seedling reasoning is established so that the inference engine can recognize it and infer the desired result. The knowledge is represented as production rules of the if-then form, indexed by the number of the plug seedling and by the feature number of that plug seedling.

After the knowledge model is constructed, the inference engine is designed as the key component of the expert system. Its main task is to search the knowledge base continuously according to the given "facts" in order to identify matches using the matching algorithm. After the entire knowledge base has been searched, the matching algorithm calculates the result, which is displayed on the interface of the expert system. The decision process of the inference engine involves the following steps.

Step 1. The purpose of training is to build the knowledge base. First, the system acquires images, which are passed through the image-processing module and transferred to the host computer. The feature values of each plug seedling are calculated, after which the staff confirms the seedling type and the expert system writes the name of the seedling and its feature values into the knowledge base. For seedlings of the same variety and the same seedling stage, generally only one sample is needed to complete the training. After the training is complete, the "fact" of the seedling is established in the knowledge base. Once the "facts" (knowledge) of the seedlings are complete, the second step is initiated.

Step 2. A new plug-seedling image is collected and processed by the image-processing module, and each feature and its corresponding feature value are obtained. The expert system starts the inference engine and simultaneously opens the knowledge base.

Step 3. The inference engine searches the knowledge base according to the search strategy. When fact k in the knowledge base is searched, the inference engine obtains the feature values of that fact together with the corresponding weights w and objective credibilities.

Step 4. The similarity is computed. The inference engine retrieves the numeric value of a feature from the knowledge base according to the feature name F and compares it with the corresponding feature value of the collected image. The similarity is indexed by the current feature number and by the fact number k in the knowledge base.

When the similarity is 1, the two seedlings are exactly the same; when it is 0, the two seedlings are completely different.

Step 5. The credibility distance of fact k is calculated from the feature similarities obtained in Step 4, together with the weights and objective credibilities retrieved in Step 3.

Step 6. Candidate results are selected according to the method described in Step 5: the inference engine matches all of the facts in the knowledge base against the collected features to obtain a confidence distance for each fact and then sorts the facts in descending order of confidence distance.

Step 7. The maximum value of each feature similarity is calculated using the matrix operation (17), based on the similarities obtained in Step 4 for all facts in the knowledge base. In this matrix, each row represents the similarities of all of the features for a certain seedling (the row number represents the seedling number), and each column represents one feature across all plug seedlings. The maximum similarity in each column is labeled. Finally, the row number containing the largest number of labeled maxima is recorded; if it matches the seedling number selected in Step 6, the sorting process is deemed successful and that seedling number is taken as the identification result. Otherwise, the result is determined by the confidence distance: the seedling number k with the largest confidence distance is taken as the identification result. Therefore, the reliability of a successfully identified seedling is represented by the largest confidence distance in the whole sample space.
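Steps 4 to 7 can be sketched in a few lines of NumPy. The similarity and credibility-distance formulas are not reproduced in this text, so both are assumptions here: the similarity is taken as one minus the relative difference (clipped to the range 0 to 1), and the credibility distance as the sum of weight × credibility × similarity over the features. The function names `similarity` and `identify` are hypothetical.

```python
import numpy as np

def similarity(x, y):
    """Assumed similarity in [0, 1]: 1 for identical values, 0 for very different ones."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    denom = np.maximum(np.abs(x), np.abs(y))
    denom = np.where(denom == 0, 1.0, denom)          # avoid division by zero
    return np.clip(1.0 - np.abs(x - y) / denom, 0.0, 1.0)

def identify(sample, facts, weights, credibility):
    """Match one feature vector against all facts in the knowledge base.

    sample: (n_features,); facts, weights, credibility: (n_facts, n_features).
    Returns (fact index, whether the vote agrees, credibility distance).
    """
    facts = np.asarray(facts, dtype=float)
    w = np.asarray(weights, dtype=float)
    c = np.asarray(credibility, dtype=float)
    S = similarity(np.asarray(sample, dtype=float)[None, :], facts)  # Step 4
    distance = (w * c * S).sum(axis=1)     # Step 5 (assumed weighted sum)
    ranked = np.argsort(-distance)         # Step 6: descending order
    best = int(ranked[0])
    # Step 7: count, for each fact (row), how many column maxima it holds.
    votes = np.bincount(S.argmax(axis=0), minlength=len(facts))
    agrees = int(votes.argmax()) == best
    # If the vote disagrees, the largest credibility distance still decides.
    return best, agrees, float(distance[best])
```

In this sketch the returned index is always the fact with the largest credibility distance; the boolean only reports whether the column-maximum vote confirms it, mirroring the fallback described in Step 7.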

4.4. Learning Algorithm

The learning process adjusts the weight value w and thereby improves the sorting ability. The goal is to increase the weights of features that exhibit a high degree of similarity by learning from seedlings of the same kind, in order to improve the competitiveness of those features.

For any sample P, the similarity of each feature is estimated first; the learning process is governed by the learning threshold TH and by the weights w of the individual features. The evaluation function is defined in terms of the learning coefficient (0.96) and the actual output.

The learning process involves constant adjustment of the weights using a gradient method: the evaluation function is differentiated with respect to each weight, and the resulting weight adjustment depends on the learning pace and on the similarity. Therefore, a feature with a larger similarity is assigned a larger weight.

When the output function is linear, its derivative is constant; therefore, the weight-adjustment function takes the form shown in (23), and convergence is shown in Figure 6 (learning pace 0.02, learning coefficient 0.96, TH = 1).
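A gradient-based weight update of this kind can be sketched as a delta rule with a linear output. The exact evaluation function is not reproduced in this text, so the form below is an assumption: the output is the weighted sum of the similarities, the target is the learning coefficient times TH, and each weight moves by the learning pace times the error times the corresponding similarity. The function name `learn_weights` and the iteration count are hypothetical.

```python
import numpy as np

def learn_weights(similarities, weights, pace=0.02, coeff=0.96, th=1.0, n_iter=200):
    """Delta-rule sketch: adjust weights so that sum(w * s) approaches coeff * th.

    Assumed linear output and squared-error evaluation function.
    """
    s = np.asarray(similarities, dtype=float)
    w = np.asarray(weights, dtype=float).copy()
    target = coeff * th
    for _ in range(n_iter):
        output = float(w @ s)               # linear output function
        w += pace * (target - output) * s   # gradient step, proportional to similarity
    return w
```

Because the step is proportional to the similarity, features with larger similarity gain weight faster while the output is below the target, which matches the stated goal of strengthening the most competitive features.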

5. Results and Analysis

To test the sorting ability of the expert system, 190 seedling feature vectors were collected from the seedling images, and the expert system was then used to identify and test the seedlings. The software is mainly run in MATLAB (2016a) using image-acquisition software and a CLIPS expert inference system (version 6.22).

5.1. Training Sample Selection and Recognition Rate

First, the 190 plates of 10 types of seedlings are divided into training and test samples, with 10 samples (one per variety) selected as training samples. The remaining 180 plates are used as test data, and the results show a sorting rate of only 68.8% (Figure 7(a)). Recognition errors mainly occur for varieties such as S3, S4, and S5, which comprise seedlings at different stages. The morphological characteristics of the different seedling stages differ considerably, which likely contributes to the decrease in the recognition rate.

One training sample is then chosen for each seedling stage, increasing the number of training samples to 19. The results in Figure 7(b) show a sorting rate of 98.3% and indicate that seedling stage greatly influences the recognition rate. In practical applications, it is generally necessary to establish corresponding training samples for different seedling stages.
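The stage-based selection can be reproduced with a short script. The sketch assumes 10 plates per seedling stage, takes the stage counts for S3 and S4 from the sample list in Section 2.2, and assumes that the 20 S5 plates form two stages of 10 (which is what makes the total 19). The sample-ID format and the function names are hypothetical.

```python
# Number of seedling stages per variety, with 10 plates per stage.
# The two-stage split for S5 is an assumption.
STAGES = {"S1": 1, "S2": 1, "S3": 7, "S4": 3, "S5": 2,
          "S6": 1, "S7": 1, "S8": 1, "S9": 1, "S10": 1}
PLATES_PER_STAGE = 10

def build_samples():
    """Return (sample_id, variety, stage_index) for every plate."""
    samples = []
    for variety, n_stages in STAGES.items():
        for i in range(n_stages * PLATES_PER_STAGE):
            stage = i // PLATES_PER_STAGE
            samples.append((f"{variety}{i + 1:02d}", variety, stage))
    return samples

def split_by_stage(samples):
    """Use the first plate of each (variety, stage) for training, the rest for testing."""
    seen, train, test = set(), [], []
    for sample in samples:
        key = (sample[1], sample[2])
        if key in seen:
            test.append(sample)
        else:
            seen.add(key)
            train.append(sample)
    return train, test
```

Calling `split_by_stage(build_samples())` under these assumptions yields 19 training plates and leaves the remaining 171 of the 190 plates for testing.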

5.2. Plug-Seedling Recognition Rates under Bright Light

Light significantly influences image acquisition. S1 was used to investigate the effect of brightness on recognition rates. With S11 as the training sample, the light was gradually weakened, and the resulting reliability is shown in Figure 8.

To obtain the data shown in Figure 8, S11 was used for training, and seedlings S12−S110 were used for testing. We then evaluated the reliability of S11 training by testing against the whole seedling space (S1−S10). The results showed that the reliability distance of the S1 seedlings decreased gradually as light intensity decreased; however, this distance remained higher than that of the other seedlings, enabling successful identification. For the other seedlings, the differences between seedlings were small, making misjudgment likely. These results suggest that a visual-box design is required to avoid the effects of external light.

5.3. The Effect of Seedling Stage on Recognition Rates

Changes in seedling stage mainly affect seedling-leaf morphology and seedling characteristics. To evaluate the degree of influence, we used S3, which includes seedlings at seven seedling stages, to test the recognition rates obtained with training samples from different seedling stages (Figure 9).

Figure 9 shows that training with the S301 set only identified seedlings at days 7 and 8, whereas training with the S320 set allowed identification of seedlings at days 9 and 10. Moreover, seedlings at early growth stages yielded high recognition rates, because their leaves did not overlap or overlapped only slightly and were therefore easy to distinguish. These results also demonstrate the effectiveness of the inference engine and knowledge base. Additionally, we found that changes in seedling stage decreased recognition rates, so the system must be trained with new seedling sets when the seedling stage changes.

5.4. Comparison with Other Recognition Algorithms

To compare our method with current algorithms, we divided the 190 feature vectors into training (20, 40, 60, 80, and 100) and test (170, 150, 130, 110, and 90) samples and applied a back propagation (BP) neural network, an SVM, a particle-swarm optimization (PSO)-SVM, and a Fourier-descriptor SVM (FDSVM). The BP neural network used a target error of 0.001 and 9000 training iterations and returned an overall recognition rate of 62%. The SVM used parameters c = 1.8 and g = 1.1 with the libsvm toolbox (version 3.12) and returned a recognition rate of 84%. The PSO-SVM used a target error of 0.001 and 200 iterations with a population of 20 and returned a recognition rate of 87%. The FDSVM returned a recognition rate of 87.4%. By comparison, the highest recognition rate obtained using the expert system is 98.5%.

Two methods are used to select the training samples. The first involved random selection without considering seedling stage. Use of this training sample resulted in a recognition rate shown in Figure 10. The second method involved selection according to seedling stage, resulting in a recognition rate shown in Figure 11.

Figures 10 and 11 show that the expert system has the highest recognition rate when the training sample is the same as the test sample. Training sets generated randomly resulted in lower recognition rates. This is because the feature values associated with the plug seedlings differ considerably between seedling stages, so randomly chosen training data may not cover the test samples. However, the recognition rates increased as the number of training samples increased. These results suggest that the algorithms perform better with training sets generated according to seedling stage; however, the neural network requires a larger amount of training data. In contrast, the expert system achieves a higher recognition rate with fewer training samples.

6. Conclusion

Our findings show that the choice of feature vectors significantly influences seedling identification and that an ideal seedling-leaf set, obtained using a Fourier-descriptor clustering algorithm, yields stable morphological features. We used nine features for the seedling feature vector by combining color and shape features, which enabled us to obtain ideal seedling-leaf features even when seedling leaves overlapped slightly. Furthermore, plug-seedling leaf features are closely related to brightness and seedling stage. Within a given seedling stage, the seedling features remained relatively stable; however, different seedlings often had the same feature values, which made classification difficult. The recognition algorithm based on an expert system relies on seedling data stored in a knowledge base and therefore requires less training data. Recognition is performed by an inference engine that calculates the reliability distance and combines it with a voting method to improve the recognition rate. Additionally, the knowledge base of the expert system was easy to manage: for newly added seedlings, the inference engine does not need to change, and the new information needs only to be added to the knowledge base.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

This research is supported by the Zhejiang Public Welfare Technology Research Project (LGG19E050005), the Zhejiang Province Natural Science Foundation of China (Grant nos. LGN18E050002 and Y19F030001), and the Zhejiang Province Visiting Scholars Program.