Abstract

The present work proposes to evaluate, compare, and determine software alternatives that present good detection performance and low computational cost for the plant segmentation operation in computer vision systems. In practical aspects, it aims to enable low-cost and accessible hardware to be used efficiently in real-time embedded systems for detecting seedlings in the agricultural environment. The analyses carried out in the study show that the process of separating and classifying plant seedlings is complex and depends on the capture scene, which becomes a real challenge when exposed to unstable conditions of the external environment without the use of light control or more specific hardware. These restrictions are driven by functionality and market perspective, aimed at low-cost and access to technology, resulting in limitations in processing, hardware, operating practices, and consequently possible solutions. Despite the difficulties and precautions, the experiments showed the most promising solutions for separation, even in situations such as noise and lack of visibility.

1. Introduction

The most widely used approach to image segmentation is to group image components by their color similarity, i.e., color detection. In the agricultural environment, it is widely applied to distinguish pests and seedlings from soil and to identify fruits by their characteristic colors. Color cameras are easily accessible and provide more information about the environment than monochrome cameras, and the comparative simplicity of the color approach is an advantage over approaches that segment by shape and structure [1].

However, this color segmentation technique still faces many challenges, especially when it is used outdoors, due to the difficulties of working in an uncontrolled environment [2]. In particular, the effects of lighting and the presence of shadows are important sources of variation in the visible properties of the object, making detection and identification very unstable and difficult, especially in terms of color. To overcome these challenges, image processing techniques and methods can be developed and improved, removing noise associated with ambient conditions and extracting only useful information from the image [1, 3, 4].

Approaches to removing the effects of a light source on the colors of an object are called color constancy (color consistency) methods. The proposed solutions differ widely and rest on assumptions about the visual system, the optics, the capture device, or combinations of these [5]. However, despite the many mechanisms proposed for color constancy, no single solution has been identified: the range of applications and environmental conditions is so vast and diverse that it prevents a global solution, and thus approaches are highly dependent on the application and its specific issues [6, 7]. Recent advances have shown promising results through convolutional neural networks and deep learning [8–10], but these require training data and relatively high computational cost, which restricts their use.

Because the agricultural environment is a more restricted universe of conditions and applications, there are approaches specific to it for treating light variations. Many color spaces and color indices have been created and used for plant segmentation in color images, based on specific conditions such as the natural difference between the green of plants and the soil in general. However, again, there is no evidence for a single best approach, because the conditions of the experiments are so specific that the methods are difficult to compare, and the accuracy reported in the literature is unclear [11].

Despite the different color indices, traditional and general-purpose color constancy methods are rarely used in the agricultural environment. Another little-explored aspect is the estimation of the processing time and cost of the methods: the hardware used is seldom discussed and processing is rarely carried out on embedded units, which makes it difficult to analyze performance and reliability in a single view.

Considering what has been presented here, the path to technological development in agriculture is very promising because, in addition to the challenge of mitigating environmental impacts, the productivity of most plantations worldwide is below its potential [12]. Similarly, color constancy algorithms may be an interesting alternative for vision systems in outdoor and agricultural environments. Although there are different types of color constancy algorithms, the solutions are very specific to the applications and operating conditions of the vision system. Developments that explore these gaps should contribute to a more stable environment and a more equal global food supply [13].

2. Implementation

The choice of project hardware was driven mainly by accessibility, cost, and robustness, in line with the proposed objectives of the project. The market was searched for general-purpose and basic hardware solutions with acceptable performance and an affordable price. In addition, more than one solution was considered in the study, to broaden compatibility and give greater consistency to the results. To this end, two different capture devices were defined: a Logitech C615 USB webcam and a Raspberry Pi RaspiCam rev 1.3 camera module for embedded systems. Both devices have RGB sensors, and their specifications are summarized in Table 1.

For some analyses, two processing systems with different hardware and operating systems were evaluated; their configurations can be found in Table 2. The two systems, although both very accessible, have very different characteristics. The first, a desktop personal computer (PC) running the Windows operating system, was fully compatible with the webcam capture device but could not communicate directly with the RaspiCam module [14, 15]. The second, a smaller and cheaper module, the Raspberry Pi 3 Model B (Rpi), cost approximately 35 USD at the date of the study and is one of the most popular alternatives for embedded applications: credit card size, 45 g, 5 V and 4 W power, audio output, HDMI video, and a built-in Wi-Fi adapter [16]. In addition to the hardware features, the module runs a free and open operating system based on Debian, and the system is compatible with both its dedicated camera, the RaspiCam, and the webcam.

As for the software, again, two different approaches were used [17]. The high-level programming software Wolfram Mathematica (WM) was selected for developing the logic of the algorithms and for fast performance evaluations. It has a wide range of image processing tools that greatly reduce algorithm development time. In addition, the software was used as a second measurement system to analyze the performance of the approaches under study, i.e., to provide information on the reproducibility of the algorithms, which were evaluated in parallel with the main implementation.

The main software was implemented in the Python programming language, a high-level language with excellent readability, autonomy, and computational power [18], in addition to being widespread and well recognized. The choice of language is not arbitrary: its use aims to ensure the accessibility proposed by the research, guaranteeing the compatibility, portability, and usability of the algorithms on different systems. In this way, the tests can use the same code regardless of the hardware and operating system used [19–21].

The development of the algorithms in Python relies on the OpenCV (Open Source Computer Vision) library of programming functions, which supports computer vision applications and is free and open source for educational and commercial use. Designed for computational efficiency and real-time applications, it provides all the basic tools and structures needed for image capture, manipulation, and processing throughout the study.

3. Experimental Results

The test campaign was divided into two stages: an initial stage for analyzing the test parameters, and a second stage of tests for validating the measurements, evaluating the measurement method, and comparing the segmentation strategies. These two stages of the project share the same capture and acquisition method but use different infrastructures.

Several components need to be defined and adjusted to ensure that the experiments are representative of the application and, consequently, that the study results are valid. The infrastructure covers everything from the implementation to the elements that make up the scene for creating the experimental images. Their characteristics and the reasons for their choice are given below. Information about the configuration of these elements in the experiments is described in Section 3.6.

Following the division of the test campaign discussed above, two different test sites were used. The first is an indoor environment with strict control of lighting conditions. The second approximates the application, with experiments in the external environment under natural light.

The indoor environment consists of a room with white walls and no windows, lit only by artificial lighting. The room was illuminated by two lamps with a color temperature of 6800 K, i.e., a color close to daylight, and a lamp with a color temperature of 3000 K was used to simulate light at the end of the day, all fluorescent and with a high color rendering index (IRC). In this configuration, an approximate illuminance of 161.0 lux with a standard deviation of 5.4 lux was measured at the object of interest throughout the experiments; this variation is within the expected range for the devices.

The external environment, on the other hand, is an open area exposed exclusively to natural light and, therefore, subject to its variations. Captures at this stage took place during the first week of November 2018, and all were carried out between 19:30 and 19:15 to the day level (evening), between 14:30 and 15:30 hours, and between 11:45 and 19:50 in the main light (at dusk).

The object of interest in the images, the plant seedling, is an important aspect of the segmentation process and, consequently, of the experiments. In the preliminary studies, three types of seedlings were selected with deliberately different structure, size, leaf shape, and, mainly, color. The selected seedlings were coriander, mint, and curry leaf (Figure 1).

For the comparative tests, it was decided to use seedlings from crops that are more representative in the national context, to bring more relevance and applicability to the results; soy and corn seedlings were used. These, in addition to their high agricultural importance, have distinctive color and texture, with corn seedlings being smooth and of slightly lighter green shades and soybean seedlings being coarse and dark (Figure 2).

The background of the scene is mostly clay soil, also known as terra roxa, since it is representative of the study area (State of Tamil Nadu) and is one of the most common soils for the seedlings tested.

In some experimental stages, small portions of organic substrate, naturally black, were added to create visual disturbances, as well as coconut fiber, pine bark, spruce, dried leaves, sticks, and small stones. These components were added to simulate the unpredictable and complex conditions of the final application, and these tests are referred to as debris or visual noise tests. To reduce the variance between test conditions, the soil and other debris were placed in a plastic container, and for each capture stage, the seedling was transplanted to the center of this container.

Furthermore, the position of the camera relative to the base is guaranteed by a support with adjustable height and angle, which ensures control over the capture pose. Finally, a reference object was developed to characterize the lighting conditions in the open environment and to implement color constancy through the reference method. The object consists of an image of 4 squares, each in a pure color (white, red, green, and blue), and a 40 mm dark gray sphere, which ensures that light is captured from different angles while minimizing saturated points in the capture. This last component was inspired by the work of [1321]. Figure 3 shows the reference object in our application.
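As an illustration of how a reference object can drive a color correction, the sketch below scales each channel so that the measured white square becomes pure white. This is only one simple reference-based scheme, assumed here for illustration; the function names and the sample values are hypothetical and are not taken from the study.

```python
def channel_gains(reference_white_rgb, target=255.0):
    """Per-channel gains that map the measured white patch to pure white."""
    return tuple(target / max(c, 1e-6) for c in reference_white_rgb)


def correct_pixel(rgb, gains):
    """Apply the gains to one RGB pixel and clip to the 8-bit range."""
    return tuple(min(255, max(0, round(v * g))) for v, g in zip(rgb, gains))


def correct_image(image, gains):
    """Correct an image stored as rows of (R, G, B) tuples."""
    return [[correct_pixel(px, gains) for px in row] for row in image]


# Hypothetical measurement: the white square photographed under warm light.
measured_white = (255.0, 204.0, 102.0)
gains = channel_gains(measured_white)

# One gray surface pixel and the white square itself, both tinted by the light.
image = [[(100, 80, 40), (255, 204, 102)]]
corrected = correct_image(image, gains)
```

After correction, the tinted gray pixel has equal channels again, which is the property a color-based threshold depends on.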

4. Metrics

This section presents the criteria used to evaluate and differentiate the segmentation approaches, i.e., it defines the response variables of the planned experiments. The evaluation focuses on two different aspects: one aimed at segmentation performance, measuring aspects such as sensitivity, errors, and precision; the other looks at the computational cost and latency of the approaches. Both responses were calculated on static images of the scene, not on videos. However, it is important to emphasize that this approach is appropriate for the application: the code built on the OpenCV library processes frames from the cameras in the same way as a sequence of photos.

4.1. Segmentation Performance Metrics

For performance evaluation, techniques related to classification measurements were used. Segmentation is treated as a binary classification task; in other words, as a result of segmentation, each pixel of the image is marked as belonging either to the plant group or to the background group.

Several classification measurements were used to evaluate specific features of the results, but due to the large amount of data in the test campaign, it was decided to use a single measurement. In the first analysis, the total error was used, but after some evaluation of the measurement method, the F1-score was chosen to compare the results. The latter proved to be a good alternative, highly representative of segmentation performance, summarizing the critical precision and sensitivity information of the approaches in a single value.
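To make the per-pixel binary classification concrete, the sketch below labels each pixel as plant (1) or background (0) with a fixed window in HSV space. The thresholds and function names are hypothetical choices for illustration; the study does not state its threshold values here, and in practice an OpenCV-based implementation would operate on whole arrays rather than pixel by pixel.

```python
import colorsys

# Hypothetical thresholds for "green" pixels; real values must be tuned per scene.
HUE_RANGE = (60 / 360, 180 / 360)  # hue window, expressed in [0, 1]
MIN_SATURATION = 0.25
MIN_VALUE = 0.15


def is_plant(rgb):
    """Return True when an 8-bit RGB pixel falls inside the green HSV window."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    in_hue = HUE_RANGE[0] <= h <= HUE_RANGE[1]
    return in_hue and s >= MIN_SATURATION and v >= MIN_VALUE


def segment(image):
    """Binary mask with 1 for plant pixels and 0 for background pixels."""
    return [[1 if is_plant(px) else 0 for px in row] for row in image]


image = [
    [(120, 70, 40), (40, 160, 50)],  # brownish soil, green leaf
    [(30, 30, 30), (90, 200, 80)],   # dark substrate, lighter leaf
]
mask = segment(image)
```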

Also known as the F-score or F-measure, the F1-score is defined as the harmonic mean of precision and sensitivity. It and the other measurements discussed are built from the following pixel counts.

TP (true positive) is the number of correct positive classifications, TN (true negative) is the number of correct negative classifications, FP (false positive) is the number of incorrect positive classifications (type I error), and FN (false negative) is the number of missed classifications of the object of interest (type II error).
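Assuming the standard definitions, which agree with the verbal descriptions in this section, the measurements can be written in terms of these counts as:

```latex
\begin{align}
\text{Total error} &= \frac{FP + FN}{TP + TN + FP + FN} \\
\text{Sensitivity (recall)} &= \frac{TP}{TP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
F_1 &= 2\,\frac{\text{Precision}\cdot\text{Sensitivity}}{\text{Precision}+\text{Sensitivity}}
     = \frac{2\,TP}{2\,TP + FP + FN}
\end{align}
```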

In practical terms, total error is the ratio of misclassified pixels to the total number of pixels, which, although intuitive, depends on the size of the object of interest in the scene, creating an unwanted dependence. Sensitivity or recall, on the other hand, gives the ratio of correct plant classifications to the total number of pixels belonging to the plant; so, the higher the sensitivity of the approach, the more likely the plant is to be properly classified. However, false positive errors are not considered. Precision is the ratio of correct positive classifications to the sum of all positive classifications, i.e., the higher the precision, the more certain it is that a pixel classified as plant actually belongs to this class. It evaluates the relevance of the positive classifications but does not look at the absolute ratio of correct answers.

The F1-score combines the properties of both measurements and mitigates their main limitations; so, only approaches with both high sensitivity and high precision can yield good results. This measurement ranges from 0 to 1, where 1 is a perfect classification and 0 is the absence of true positives. It proved to be very robust, producing uniform performance results and showing variation independent of the distance of the object in the scene.

To calculate the classification measurements, a reference is needed to determine whether each pixel classification is correct; so, for each image evaluated in the study, a ground-truth image was created manually and carefully by experts and used as the standard. The best part of the plant is on display.

Considering that one of the possible purposes of segmenting plant seedlings is to guarantee the recognition and tracking of this object, a second performance indicator was used, which estimates the distance between the center of the object segmented by a given method and the center of the reference. This distance is determined by the difference in pixels between the centers in the x and y directions of the image, normalized by the diagonal size of the image. This yields a percentage error that does not depend on the image size.

The horizontal and vertical offsets are the differences in pixels between the center of the segmented object and the center of the reference on the x and y axes of the image.
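The following sketch computes these measurements from a segmented mask and its reference mask, using the standard definitions. The function names and the small example masks are illustrative assumptions, not the study's code.

```python
def confusion_counts(predicted, reference):
    """Count TP, TN, FP, FN between two binary masks of equal size."""
    tp = tn = fp = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1
            elif p and not r:
                fp += 1
            elif not p and r:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def segmentation_metrics(predicted, reference):
    """Total error, sensitivity, precision and F1-score of a segmentation."""
    tp, tn, fp, fn = confusion_counts(predicted, reference)
    total = tp + tn + fp + fn
    return {
        "total_error": (fp + fn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # F1 is 0 when there are no true positives, 1 for a perfect mask.
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }


reference = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
predicted = [[0, 1, 1, 1],
             [0, 0, 1, 0]]
metrics = segmentation_metrics(predicted, reference)
```

In this example one plant pixel is missed and one background pixel is wrongly marked, so sensitivity, precision, and F1-score are all 0.75 and the total error is 0.25.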
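A minimal sketch of this indicator is given below, assuming the error is the Euclidean distance between the two centroids divided by the image diagonal and expressed as a percentage. The names `dx` and `dy` for the pixel offsets, and the function names, are assumptions introduced for illustration.

```python
import math


def centroid(mask):
    """Centroid (x, y) of the nonzero pixels of a binary mask, or None if empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def centroid_error_percent(segmented, reference):
    """Distance between centroids as a percentage of the image diagonal."""
    height, width = len(reference), len(reference[0])
    c_seg, c_ref = centroid(segmented), centroid(reference)
    if c_seg is None or c_ref is None:
        return None
    dx = c_seg[0] - c_ref[0]  # offset along the x axis, in pixels
    dy = c_seg[1] - c_ref[1]  # offset along the y axis, in pixels
    return 100.0 * math.hypot(dx, dy) / math.hypot(width, height)


# 4 x 3 image (diagonal of 5 pixels); the segmented object is shifted by 1 pixel.
reference = [[0, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 0, 0]]
segmented = [[0, 0, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 0]]
error = centroid_error_percent(segmented, reference)
```

Normalizing by the diagonal keeps the indicator comparable between capture resolutions.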

4.2. Computational Performance Metrics

The computational cost was assessed by measuring the execution times of the algorithms. For this purpose, high-precision timers were implemented around the specific functions associated with each approach, so that the processing common to all approaches, such as capture, output image recording, and interface commands, was not counted.

The timing considers only the operations related to segmentation, including color conversion functions, required data manipulations and transformations, scale adjustment, color rearrangement functions, thresholding functions, classification, and other required processing. Preprocessing steps such as the color constancy approaches, as well as postprocessing, were timed separately from the segmentation process.

To guarantee greater accuracy and reliability in the processing time estimate, the value reported for each approach is the average of 100 consecutive executions, measured during the acquisition process on the processing unit itself, with all data collected in the same context.
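The averaging procedure can be sketched as follows, timing only the segmentation call with a high-resolution counter. The `dummy_segment` function is a hypothetical stand-in for an actual approach, and the helper name is an assumption for illustration.

```python
import statistics
import time


def mean_processing_time_ms(segment_fn, image, repetitions=100):
    """Mean and standard deviation (ms) of segment_fn over consecutive runs."""
    durations = []
    for _ in range(repetitions):
        start = time.perf_counter()  # high-resolution monotonic timer
        segment_fn(image)            # only the segmentation itself is timed
        durations.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(durations), statistics.stdev(durations)


def dummy_segment(image):
    """Hypothetical stand-in: mark pixels whose green channel dominates."""
    return [[1 if g > r and g > b else 0 for (r, g, b) in row] for row in image]


image = [[(30, 120, 40)] * 64 for _ in range(48)]
mean_ms, std_ms = mean_processing_time_ms(dummy_segment, image)
```

Keeping capture, file output, and interface code outside the timed region mirrors the separation described above, and the standard deviation gives a first indication of how stable the timing is.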

5. Result Analysis

Given the factorial structure of the experiment, most of the analyses used tools and techniques common to this type of design. Results are mostly evaluated graphically and occasionally analytically, using statistical tools.

In the graphical analysis, all the data of a test are arranged in a single chart to facilitate a practical analysis of the results: the y axis contains the responses, and the x axis shows the test conditions, grouped under the headings of the corresponding factors. This way of compiling data is known as a variability chart or multi-vari chart.

For the quantitative analyses, the effects of the factors and of the interactions between factors were calculated. An effect is a numerical value indicating how the response changes when the level of a factor changes, i.e., it expresses the cause-and-effect relationship between the factor and the experimental response. Its calculation is simple: it is the difference between the mean of the responses of all test runs in which the factor or interaction is at level one (+) and the mean of the responses at level two (-).

After all the effects of the experiment are calculated, a Pareto chart can be used to facilitate the comparative visualization of the results, highlighting the largest contributors to the change in the response variable of the experiment.
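The sketch below computes such effects for a two-level design, assuming the usual definition as the mean response at level (+) minus the mean response at level (-). The factor names and response values are hypothetical and serve only to show the arithmetic.

```python
def effect(levels, responses):
    """Mean response at level +1 minus mean response at level -1."""
    plus = [y for s, y in zip(levels, responses) if s > 0]
    minus = [y for s, y in zip(levels, responses) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


def interaction_levels(levels_a, levels_b):
    """Level column of a two-factor interaction: product of the two columns."""
    return [a * b for a, b in zip(levels_a, levels_b)]


# Hypothetical 2 x 2 design: resolution (low/high) and processor (PC/embedded).
resolution = [-1, +1, -1, +1]
processor = [-1, -1, +1, +1]
time_ms = [2.0, 8.0, 60.0, 240.0]

effect_resolution = effect(resolution, time_ms)
effect_processor = effect(processor, time_ms)
effect_interaction = effect(interaction_levels(resolution, processor), time_ms)
```

Sorting the absolute values of these effects in decreasing order gives the ordering used in a Pareto chart.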

6. Acquisition System

Despite the different test conditions, the acquisition system and data flow used in the study are the same throughout and form the basis of the test campaign. As described in the infrastructure description, the seedlings are placed on the base and the capture device is attached to the support, which ensures the framing and pose of the images.

The capture system consists of the Raspberry Pi embedded module, a capture device (webcam or RaspiCam), and a battery, which makes the system portable and enables its use in the open environment. Capture commands are issued by remote access to the module via a VNC (Virtual Network Computing) connection over a wireless local network created with a portable router, which allows the camera image to be displayed and the capture time to be controlled (Figure 4). Images are then saved in the device's memory and kept for analysis.

The capture software was developed in Python and follows the acquisition process used in the application, aimed at accessing the images directly for higher capture and processing speed. Therefore, the captured image samples are more representative of those obtained in practice. Additionally, the program allows the parameters of both cameras to be configured manually, making it possible to calibrate and adjust the gain of their channels, as well as to enable or disable the cameras' automatic white balance and brightness adjustment functions.

The captured images were used as test samples. These were submitted to the preprocessing and segmentation approaches, producing binary images, which were exported together with their average processing time. Two versions of this program were created, one in Python and the other in Wolfram Mathematica, and the Python version was run on both processors.

These binary images were then compared against their ground-truth images, and the data were finally compiled into spreadsheets for analysis, generating the final responses for segmentation performance, computed by a program in Mathematica, and the computational performance measurements described in the previous sections.

In other words, this first investigation aims to understand the problem of comparing approaches and to assess the conditions and configurations that may affect this process. Table 3 below provides an overview of the evaluated factors, with the capture system in the indoor environment, 400 mm height, and .

In the results, we looked at the impact of plant characteristics, the impact of capture resolution on the measurements, the relationship between the different processors and software, and how representative both measurements are: total error (%) for segmentation performance and processing time (ms) for computational performance. We also examined the gain produced by the morphological operation in postprocessing.

This first test was carried out indoors under controlled lighting, with a fixed pose and with the cameras' automatic white balance and exposure disabled, which minimizes test variations caused by lighting. The morphological postprocessing, as in the other experiments, consists of a closing followed by an opening with a square structuring element. Furthermore, three sample repetitions were performed for each condition, i.e., three consecutive captures of the scene. In total, 30 different images of the scene were obtained. These were used in 4 segmentation approaches, with two image states on three different processors, a total of 650 processes, each with processing time data and total segmentation error.
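A closing followed by an opening with a square structuring element can be sketched in plain Python as below. The 3 x 3 element size (radius 1) is an assumption, since the size used in the study is not stated here, and pixels outside the image are simply ignored at the borders. With OpenCV, the same sequence would normally be done with `cv2.morphologyEx` using `cv2.MORPH_CLOSE` and then `cv2.MORPH_OPEN`.

```python
def _window(mask, y, x, radius):
    """In-bounds values of the square neighborhood centered on (x, y)."""
    height, width = len(mask), len(mask[0])
    return [mask[j][i]
            for j in range(max(0, y - radius), min(height, y + radius + 1))
            for i in range(max(0, x - radius), min(width, x + radius + 1))]


def dilate(mask, radius=1):
    return [[max(_window(mask, y, x, radius)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask, radius=1):
    return [[min(_window(mask, y, x, radius)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def closing(mask, radius=1):
    """Dilation then erosion: fills small holes inside the object."""
    return erode(dilate(mask, radius), radius)


def opening(mask, radius=1):
    """Erosion then dilation: removes small isolated noise."""
    return dilate(erode(mask, radius), radius)


def postprocess(mask, radius=1):
    """Closing followed by opening with a square structuring element."""
    return opening(closing(mask, radius), radius)


# 7 x 11 mask: a 3 x 3 plant blob with a hole at (3, 3), plus one noise pixel.
mask = [[0] * 11 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        mask[y][x] = 1
mask[3][3] = 0  # hole inside the plant
mask[3][8] = 1  # isolated noise pixel
cleaned = postprocess(mask)
```

In this example the hole is filled by the closing and the isolated pixel is removed by the opening, leaving a solid 3 x 3 blob.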

7. Results

The experiment used a variance component analysis framework, with three replicates, to evaluate the measurement method, together with a graphical analysis of the experiment (Figure 5).

Groups A1 to A4 correspond to segmentation approaches 1 to 4 in Table 4, respectively.

With values ranging from less than 1 ms to more than 1 s, a large variation in processing times was initially observed. Many factors produced changes in the time results, demonstrating the sensitivity of the measurement system, which also showed repeatable values at each test condition, with a variance significantly smaller than the variances between conditions. The estimated standard deviation of the measurement for this test is 0.110 ms, which demonstrates the high precision of the measurement. However, the occurrence of some outliers shows that the processing is not fully uniform, and that increasing the number of repetitions would benefit the metric and, consequently, the results.

In turn, it should be noted that the main contributors to the difference in computational times are the processors, with the OpenCV implementation clearly standing out in performance and the worst results coming from execution on the Raspberry Pi embedded module. Comparing the algorithms also shows that, for a given processor, the object in the scene has no significant impact on time, which indicates that the computational cost is independent of the scene content. These graphical observations can be found in Table 5.

Furthermore, it is noteworthy that other factors also contribute to the variability in results. To better illustrate the contribution of the sources of variation, the diagram shows the time results normalized by the mean value of each processor.

In these charts, the mean values (green horizontal lines) correspond to the approaches (A1 to A4) and show the clear difference the approach makes to processing time; so, the excess green approach takes the longest, followed by the HSV conversion with Otsu, with the direct HSV threshold having the shortest time. In addition, after adjusting the scale, the reproducibility of times between processors is high, and the responses are sufficiently similar that conclusions about the approaches can be drawn on any processor. This makes it reasonable to use more powerful processors for prototyping or for preliminary evaluations of the performance of approaches intended for embedded modules.
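The normalization by processor can be sketched as follows: each time is divided by the mean of all times measured on the same processor, so that the relative cost of the approaches can be compared across processors. The record layout and the numbers are hypothetical.

```python
from collections import defaultdict


def normalize_by_processor(records):
    """Divide each time by the mean time of its processor.

    records: list of (processor, approach, time_ms) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for processor, _, time_ms in records:
        sums[processor] += time_ms
        counts[processor] += 1
    means = {p: sums[p] / counts[p] for p in sums}
    return [(p, a, t / means[p]) for p, a, t in records]


# Hypothetical times: the embedded module is 30 times slower than the PC.
records = [
    ("PC", "A1", 1.0), ("PC", "A2", 3.0),
    ("RPi", "A1", 30.0), ("RPi", "A2", 90.0),
]
normalized = normalize_by_processor(records)
```

When the normalized values of an approach agree between processors, as in this constructed example, the ranking of approaches measured on one processor carries over to the other.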
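Since Otsu's method is named among the approaches, a minimal sketch of it is given here for reference. It selects the threshold that maximizes the between-class variance of an 8-bit histogram. Which channel or index the study thresholds is not stated here, so the input values are hypothetical, and whether the plant is the brighter or darker class depends on that channel.

```python
def otsu_threshold(values, levels=256):
    """Threshold t maximizing between-class variance; class 0 is values <= t."""
    histogram = [0] * levels
    for v in values:
        histogram[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(histogram))

    best_threshold, best_between = 0, -1.0
    weight0 = 0  # number of pixels in class 0
    sum0 = 0     # sum of intensities in class 0
    for t in range(levels):
        weight0 += histogram[t]
        sum0 += t * histogram[t]
        if weight0 == 0:
            continue
        weight1 = total - weight0
        if weight1 == 0:
            break
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_between, best_threshold = between, t
    return best_threshold


# Hypothetical single-channel pixel values: a dark group and a bright group.
values = [10] * 6 + [12] * 4 + [200] * 5 + [210] * 5
threshold = otsu_threshold(values)
mask = [1 if v > threshold else 0 for v in values]
```

Because the histogram has to be built and scanned for every image, an automatic threshold of this kind adds work compared with a fixed threshold.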

However, this assumption does not hold in every case: there are processing differences, and it is worth noting that real execution times can only be obtained on the target processor, which becomes clear when observing the effect of the morphological operations across processors. Their use consistently increases the computational cost for all conditions, but their contribution to the algorithm time is different for each processor, as seen in Table 2.

This difference may be due to architectural differences, different levels of memory, and the way each system processes its variables. As an important observation, this type of function is relatively cheap on the personal computer (14% increase) but presents a significant computational challenge for the embedded module, for which the increase in computational cost reaches 80%. An anomalous behavior is observed for small images in Mathematica, with a disproportionate increase in computational cost, perhaps due to some overhead of the function.

Since it is a postprocessing step, the morphological function does not depend on the approach or the scene, and its cost depends only on the size of the structuring element and the number of pixels in the image, i.e., it is fixed for each resolution. These values are found in Table 4, and the increase in duration is approximately constant.

In addition to the other observations made so far, this chart shows the clear impact of image size on computational cost, due to the larger number of pixels. The resolution levels differ by a factor of 4 in the number of pixels, and this ratio is noticeable in the average processing times, especially in the Raspberry Pi case, but the differences between processors indicate that they handle different amounts of data differently. The differences in these relationships can be seen in Table 6.

This table gives the ratio of the computational times between the processors, which serves as a scale factor for simulating and testing algorithms on different platforms. This speed ratio depends on the resolution of the image, but on average, the times on the embedded module are approximately 30 times longer than on the desktop system and 3.5 times longer than those of the high-level language Wolfram Mathematica on the computer.

It is important to highlight that, despite the great influence of the plant on the segmentation responses, consistency was maintained between the segmentation approaches, i.e., the best and worst methods remained the same regardless of the plant analyzed. This information can be used to extrapolate results to different plants and expand the reach of the results. The graphical results discussed can be found in Table 7.

In terms of morphology, its application demonstrates total error reduction in all applications, making consistent improvements to the results. However, the improvement was not as significant as expected, especially in situations where the error was already reduced by the use of approaches aimed at color separation. The mean reduction of the total error with the morphological application was 0.706%, a difference of 0.059%, which was even smaller when considering color approaches alone, indicating a proportional improvement of approximately 3%.

The resolution, in turn, has the most significant impact, accounting for most of the variation within the same approach. In color-based approaches, the error tends to decrease as the image size increases, with averages from high through medium to low resolution of 1.17%, 1.73%, and 2.19%, respectively. This difference can be explained by two effects. First, transition pixels and incorrect classifications occur in noise and at the boundaries of the object; so, the higher the resolution, the smaller the ratio of these pixels to the total and, consequently, the lower the error. Another possibility is that the quality of the reference image deteriorates: reduced sharpness makes the manual process of determining plant pixels more difficult, and with a small number of pixels any error in the ground-truth image has a greater impact on the measurements.

8. Discussion

Visual systems are already a technology in precision farming, and in its various applications, the image processing problem for the plant segment is the initial and important step in obtaining valuable information from the environment, and the importance of this process is comparable to its difficulty. The study shows that the process of separating and classifying plant seedlings is complex and depends on the captured scene, which becomes a real challenge when exposed to unstable conditions of the external environment without the use of light control or more specific hardware. These restrictions are driven by functionality and market perspective, aimed at low-cost and access to technology, resulting in limitations in processing, hardware, operating practices, and consequently possible solutions.

Under these restrictions, the literature review suggests color-based segmentation as one of the most effective ways of identifying plants. Therefore, common devices such as RGB cameras and modular processing units, together with open and widespread software, were considered. On this basis, several techniques based on color segmentation and color constancy methods were combined, compared, and tested under different and general lighting conditions. Despite the difficulties and precautions, the experiments showed the most promising solutions for segmentation, even in situations such as noise and lack of visibility.

Preliminary tests identified key features of the measurements to support the other comparative efforts. The processing time measurement was precise and very effective in differentiating the approaches, but the processing showed signs of instability. It was found that scene features (lighting conditions, objects in the scene, or the distance of the capture system) did not affect processing times. On the other hand, the resolution directly affects the processing time, which is proportional to the number of pixels. The different processing units analyzed showed the limitations of low-cost modular hardware, but the reproducibility of responses found in both processing time and segmentation performance, once scale factors were adjusted, encouraged the use of high-level software for prototyping and testing approaches.

In terms of segmentation performance, the first tests helped to modify and improve the classification measurements, indicating that the F1-score was a more robust and accurate measure of segmentation than the total error used in the first analysis, and that it can be used in conjunction with the proportional error of the distance between centers for comparative studies. Both measurements show sensitivity to the test methods and factors, and with an appropriate increase in the number of test repetitions, they are suitable response variables for the measurement method.

The factors evaluated in the first experiments show that high-resolution images brought small improvements in segmentation for most approaches, but their influence was not decisive, and high reproducibility was observed between responses. The differences were attributed to the loss of quality of the ground-truth images, to the calculations, and ultimately to the transition between object and background. Capture devices showed significant differences in performance associated with threshold-based approaches, where the intrinsic oscillations of the capture were related to the less separable nature of the system; otherwise, the difference was not observed. Morphological operations for postprocessing and noise removal in binary images showed consistent improvements in results, but the increase in computational time, especially on the embedded module, did not compensate for the minor improvements made, so this function is recommended only when the application can afford its cost.

9. Conclusion

In conclusion, we obtained the following results from this experiment by combining the computational performance and segmentation responses:
(i) The measurements were able to detect differences between the test variables, showing great potential in the proposed measurement system, although the variation observed in the responses of the approaches across consecutive images suggests that more repetitions would give better estimates of the mean value, in addition to providing important information about the stability of the proposed solutions.
(ii) Processors have a great influence on the computational cost, but reproducibility can be observed after adjusting the scale. Features of the scene, subject, etc. do not seem to change the time responses.
(iii) Since processing time is proportional to the number of pixels, the resolution directly and significantly affects it. However, the increase in processing, set against the improvements in segmentation performance, establishes a cost-benefit trade-off.
(iv) The morphological function produced improvements in all responses, but its high computational cost, especially on the embedded module, discourages its use given the small segmentation gains.
(v) The characteristics of the seedlings affect the segmentation response and should be considered in performance analyses.
(vi) The approach has a major impact on segmentation, with approaches based on the HSV color space showing the best results in this experiment; in addition, the responses of the methods are reproducible between different seedlings.
(vii) Because the results on the computer and on the embedded module (Rpi) are closely related once the required adjustments are made, high-level tools such as Mathematica can be used for prototyping and testing approaches.

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding authors on reasonable request.

Conflicts of Interest

There is no conflict of interest.