Abstract

In precision agriculture, 3D vision systems are becoming increasingly important. By applying different optical 3D vision techniques, the acquired 3D data can provide information on the most important phenotype features in each agricultural scenario. However, most of these 3D vision systems are expensive, with the exception of some triangulation techniques. In this study, we focus on estimating accurate shapes using shape from focus (SFF), which is a triangulation technique. Typically, an SFF system incurs significant errors from image noise. To address this problem, most studies apply a simple low-pass filter such as the Gaussian filter. However, when a low-pass filter is applied, the noise is suppressed but the signals are also blurred, which makes the depth map inaccurate. In this study, the noise is suppressed independently without losing the original signals, and the edge components, which play an important role in finding the focused surface, are enhanced using independent component analysis (ICA). The edge signals are amplified with a simple basis vector correction in the IC vector space. Experiments are carried out on simulated and real objects. The experimental results demonstrate that the proposed method improves accuracy compared with existing methods.

1. Introduction

Precision agriculture requires sensing methodologies that provide information about individual crops and animals. During the last decade, 3D vision has become a key technology in precision agriculture because of its extended capabilities compared with 2D vision. As 3D sensors become smaller and smarter, the number of studies and applications related to agricultural 3D vision systems is increasing rapidly [1]. 3D image generation techniques are mainly classified into three categories according to their measurement principle, namely, triangulation, time of flight, and interferometry. However, most of these 3D vision systems are costly owing to the expensive lasers or complicated scanning mirror systems involved, and hence their application is limited, with the exception of some triangulation techniques [2–4]. Many studies in the literature aim to improve the performance of triangulation methods in agricultural applications [4–7].

The cost can be reduced by applying passive optical methods. A shape from focus (SFF) system, a passive optical method that uses image focus, is less expensive than other 3D solutions because of its simple configuration, and it provides high accuracy. No expensive optical elements are needed apart from a single CCD camera. The purpose of SFF is to estimate the shape of objects by finding the exact focused position of each pixel in the scene. In the system, an image sequence of the scene is obtained from a fixed point of view while the focus is changed consecutively with a predefined step. SFF then infers the 3D shape of the object from the image sequence.

Many SFF studies have sought to improve the recovered 3D shape in multiple ways. Research on SFF techniques is mainly divided into three categories: focus measure (FM), approximation, and optimisation techniques. An FM is a local sharpness measure computed for each pixel of the image. The initial depth map is obtained by applying the FM to every frame and finding, for every pixel, the frame with the maximum response. The sum of the modified Laplacian, Tenengrad (TEN), and gray level variance are generally used in 3D shape recovery [8, 9]. After the FM is applied, approximation techniques refine the initial depth map. Various focus curve fitting methods with planar and curved approximations have been developed to search for the exact focused image surface with better accuracy and speed. Different optimisation techniques, such as machine learning, have been applied, and the hardware has been improved to meet the requirements of specific industrial fields and achieve better efficiency. Various algorithms have been used, such as neural networks (SFF.NN), dynamic programming (SFF.DP), fuzzy logic (SFF.FL), and principal component analysis (SFF.PCA) [8–10].
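As a minimal sketch of the FM-based pipeline described above, the following Python example computes a sum-of-modified-Laplacian focus measure for each frame and takes the frame index of the maximum response per pixel as the initial depth map. The function names, the window radius, and the edge-replication padding are illustrative assumptions, not details taken from the cited methods.

```python
import numpy as np


def modified_laplacian(img):
    """Modified Laplacian: |2I - I_left - I_right| + |2I - I_up - I_down|."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    c = p[1:-1, 1:-1]
    ml_x = np.abs(2.0 * c - p[1:-1, :-2] - p[1:-1, 2:])
    ml_y = np.abs(2.0 * c - p[:-2, 1:-1] - p[2:, 1:-1])
    return ml_x + ml_y


def box_sum(a, radius):
    """Sum of `a` over a (2*radius+1) x (2*radius+1) window around each pixel."""
    p = np.pad(a, radius, mode="edge")
    h, w = a.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out


def sff_depth_map(stack, radius=1):
    """stack: (K, H, W) focus sequence. Returns the best-focused frame index per pixel."""
    fm = np.stack([box_sum(modified_laplacian(frame), radius) for frame in stack])
    return np.argmax(fm, axis=0)
```

The returned indices are in frame units; converting them to a physical depth would require the focus step of the acquisition, which is not modelled here.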

Compared with other imaging systems, SFF has some inherent errors due to translation, magnification, and its discrete number of frames. CCD noise is the main cause of error in the SFF system. Therefore, preprocessing to reduce noise is an essential part of most SFF algorithms. Convolving the original image with a Gaussian filter is the typical way to reduce the noise effect [11]. However, this method is not effective on a severely noisy dataset because important signals are averaged along with the noise. Thus, part of the noise effect remains, and the accurate shape of the object cannot be recovered.
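The trade-off described above can be illustrated on a one-dimensional step edge. The sketch below is only an illustration under assumed values (a Gaussian kernel with standard deviation 2 and noise with standard deviation 0.1): smoothing lowers the noise in flat regions, but it also lowers the edge strength on which a focus measure relies.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()


def smooth(signal, sigma):
    """Gaussian low-pass filter with replicated borders (same length as input)."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(signal, r, mode="edge")
    return np.convolve(padded, k, mode="valid")


rng = np.random.default_rng(0)
step = np.repeat([0.0, 1.0], 50)                 # ideal edge
noisy = step + rng.normal(0.0, 0.1, step.size)   # edge with additive noise

smoothed_clean = smooth(step, sigma=2.0)
smoothed_noisy = smooth(noisy, sigma=2.0)

# Edge strength (largest jump between neighbours) before and after filtering.
print("edge strength, original:", np.max(np.diff(step)))
print("edge strength, smoothed:", np.max(np.diff(smoothed_clean)))
# Residual noise in a flat region before and after filtering.
print("flat-region std, noisy:   ", np.std(noisy[10:40]))
print("flat-region std, smoothed:", np.std(smoothed_noisy[10:40]))
```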

2. Materials and Methods

Independent component analysis (ICA) is a useful tool for separating a set of source signals from an observed mixed signal. ICA decorrelates the mixed signal $\mathbf{x}$ under the assumption that it is composed of an independent signal vector $\mathbf{s}$ and a mixing matrix $\mathbf{A}$. It can be expressed by the following equation:

$$\mathbf{x} = \mathbf{A}\mathbf{s}$$

This equation can also be expressed as follows, with the assumption of a square matrix $\mathbf{A}$:

$$\mathbf{s} = \mathbf{W}\mathbf{x}, \qquad \mathbf{W} = \mathbf{A}^{-1}$$

The goal of the ICA algorithm is to find the separation matrix $\mathbf{W}$.

2.1. Preprocessing

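(1) Vector population: the input image sequence $I(i, j, k)$, composed of $K$ frames, is converted into a vector $\mathbf{x}$ for each pixel, where $i = 1, \ldots, X$, $j = 1, \ldots, Y$, and $k = 1, \ldots, K$ index each of the dimensions. The vector population is defined by Equation (3).

Here, $\mathbf{x}$ consists of seven neighbouring pixels. Figure 1 shows $\mathbf{x}$ in Cartesian coordinates.

(2) Zero mean and whitening: $\mathbf{x}$ is normalised considering the first- and second-order statistics. The equations are as follows:

$$\tilde{\mathbf{x}} = \mathbf{x} - \boldsymbol{\mu}, \qquad \mathbf{C} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{T}, \qquad \mathbf{z} = \boldsymbol{\Lambda}^{-1/2}\mathbf{U}^{T}\tilde{\mathbf{x}}$$

Here, $\boldsymbol{\mu}$ is the mean of $\mathbf{x}$, $\mathbf{C}$ is the covariance matrix of $\tilde{\mathbf{x}}$, $\boldsymbol{\Lambda}$ is the eigenvalue matrix, and $\mathbf{U}$ is the orthogonal eigenvector matrix.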
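The zero-mean and whitening step can be sketched as follows. This is a minimal example that assumes the per-pixel vectors are stacked as the columns of a matrix; the function name `whiten` and the small eigenvalue floor are illustrative choices rather than part of the described method.

```python
import numpy as np


def whiten(x):
    """Centre and whiten data.

    x: (n_features, n_samples) array, one observation per column.
    Returns the whitened data, the removed mean, and the whitening matrix.
    """
    mean = x.mean(axis=1, keepdims=True)
    centred = x - mean                                  # first-order statistics
    cov = centred @ centred.T / centred.shape[1]        # second-order statistics
    eigvals, eigvecs = np.linalg.eigh(cov)              # cov = U diag(eigvals) U^T
    eigvals = np.maximum(eigvals, 1e-12)                # guard against division by zero
    whitening = np.diag(eigvals ** -0.5) @ eigvecs.T
    return whitening @ centred, mean, whitening
```

After this step the data have zero mean and an identity covariance matrix, which is the form that fixed-point ICA iterations usually expect.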

2.2. FastICA

To separate the independent components from the original vector, the separation matrix $\mathbf{W}$ is calculated. In this study, fixed-point FastICA is used for its generality. $\mathbf{W}$ is derived by minimising the mutual information. The mutual information is defined using negentropy, which is a measure of non-Gaussianity. The negentropy is defined by the following equation:

$$J(\mathbf{y}) = H(\mathbf{y}_{\mathrm{gauss}}) - H(\mathbf{y})$$

Here, $H(\mathbf{y})$ is the entropy of $\mathbf{y}$, and $\mathbf{y}_{\mathrm{gauss}}$ is a Gaussian variable with unit variance. Minimising the mutual information is equivalent to maximising the negentropy. The FastICA algorithm yields the solution through the following update [12]:

$$\mathbf{w}^{+} = E\{\mathbf{z}\,g(\mathbf{w}^{T}\mathbf{z})\} - E\{g'(\mathbf{w}^{T}\mathbf{z})\}\,\mathbf{w}$$

Here, $\mathbf{w}$ is one row of $\mathbf{W}$, $g$ is the first derivative of a nonquadratic nonlinear function, and $E\{\cdot\}$ denotes averaging over all column vectors $\mathbf{z}$ of the whitened data matrix $\mathbf{Z}$. The calculation of $\mathbf{W}$ is repeated until convergence.
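A minimal sketch of a fixed-point FastICA iteration is given below. It assumes whitened input, uses $g = \tanh$ as the nonlinearity, and extracts the components one at a time with Gram-Schmidt deflation; these choices, the function name, and the stopping tolerance are assumptions made for illustration, since the text does not specify them.

```python
import numpy as np


def fastica_deflation(z, n_iter=200, tol=1e-10, seed=0):
    """Estimate an orthogonal separation matrix W from whitened data.

    z: (n, m) whitened data, one observation per column.
    Returns W of shape (n, n); the rows of W @ z are the estimated components.
    """
    n, _ = z.shape
    rng = np.random.default_rng(seed)
    w_all = np.zeros((n, n))
    for i in range(n):
        w = rng.normal(size=n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            g = np.tanh(w @ z)                          # g(w^T z) for every column
            g_prime = 1.0 - g ** 2                      # g'(w^T z)
            # Fixed-point update: E{z g(w^T z)} - E{g'(w^T z)} w
            w_new = (z * g).mean(axis=1) - g_prime.mean() * w
            # Deflation: remove projections onto the rows already found.
            w_new -= w_all[:i].T @ (w_all[:i] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol  # same direction up to sign
            w = w_new
            if converged:
                break
        w_all[i] = w
    return w_all
```

The recovered components are defined only up to order and sign, so any later step that singles out one component needs its own selection rule.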

2.3. Denoising

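Assuming that the original signal is contaminated by white Gaussian noise, a shrinkage function is used to suppress the noise components. Here, the probability distribution of the components is assumed to be Laplacian [13]. The shrinkage function is given by

$$\hat{s} = \operatorname{sign}(u)\,\max\left(0,\; |u| - \sqrt{2}\,\sigma^{2}\right)$$

where $u$ is a noisy independent component, $\hat{s}$ is its denoised estimate, and $\sigma^{2}$ denotes the variance of the Gaussian noise, that is, the noise level of the image.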
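The shrinkage step can be sketched as a soft-thresholding operation. The example below assumes the commonly used form for a Laplacian prior, in which the threshold is $\sqrt{2}\,\sigma^{2}/d$ with $d$ the standard deviation of the noise-free component; the `scale` parameter (default 1, i.e. unit-variance components) and the function name are assumptions added for illustration.

```python
import numpy as np


def laplacian_shrinkage(u, noise_var, scale=1.0):
    """Soft-threshold noisy components under a Laplacian prior.

    u: array of noisy independent components.
    noise_var: variance of the additive Gaussian noise.
    scale: assumed standard deviation of the noise-free component.
    """
    threshold = np.sqrt(2.0) * noise_var / scale
    return np.sign(u) * np.maximum(np.abs(u) - threshold, 0.0)
```

Values whose magnitude falls below the threshold are set to zero, while larger values are pulled toward zero by the threshold amount, so strong (edge-like) responses survive and weak (noise-like) responses are removed.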

2.4. Edge Enhancement

In SFF, focus is evaluated on the surface of objects, that is, on the edge components in the images. Thus, enhancing the edge components is one strategy for improving the depth map. From the independent signal $\mathbf{s}$, the edge components are enhanced by eliminating the components that maximise the mutual information. Such a component carries common information, such as the average image between two consecutive frames. Here, these components are removed by simply setting the values of the corresponding basis vector to zero.
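The basis vector correction can be sketched as follows, assuming the signal is reconstructed as the product of a mixing matrix (whose columns are the basis vectors) and the independent components. How the common component is identified is not modelled here; the `index` argument is a hypothetical input supplied by the caller.

```python
import numpy as np


def suppress_component(mixing, sources, index):
    """Zero one basis vector and rebuild the signal without that component.

    mixing: (n, n) matrix whose columns are the basis vectors.
    sources: (n, m) independent components, one per row.
    index: column of `mixing` to remove (hypothetical choice of the caller).
    """
    corrected = mixing.copy()      # leave the caller's matrix untouched
    corrected[:, index] = 0.0      # the selected component no longer contributes
    return corrected @ sources
```

With the common (average-like) basis vector zeroed, the reconstructed signal retains only the contributions of the remaining components, which in this setting are the edge-like ones.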

2.5. Transition to Principal Components

Finally, the signal is transformed from the independent components to principal components. After the transition, the final depth map is estimated from the first principal component of the input signal, which has maximal variance. Figure 2 shows the transition stages of the proposed method.
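One possible way to turn the first principal component into a depth estimate is sketched below. It assumes that, for a single pixel, the processed signal is arranged as one feature row per frame and that the depth is taken as the frame whose first-principal-component score has the largest magnitude. This selection rule and the function names are assumptions for illustration and may differ from the exact rule used in this study.

```python
import numpy as np


def first_pc_scores(features):
    """Project per-frame feature rows onto the first principal component.

    features: (K, n) array, one row per frame for a single pixel.
    Returns a length-K vector of scores.
    """
    centred = features - features.mean(axis=0, keepdims=True)
    cov = centred.T @ centred / centred.shape[0]
    _, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                 # direction of maximal variance
    return centred @ pc1


def depth_from_first_pc(features):
    """Frame index with the largest absolute first-PC score (assumed rule)."""
    return int(np.argmax(np.abs(first_pc_scores(features))))
```

The absolute value is used because the sign of an eigenvector is arbitrary.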

3. Results and Discussion

3.1. Experimental Setup

For the experiments, a simulated cone generated by simulation software is used. The simulated cone consists of 97 frames. Figure 3 shows the real objects used for the experiments: a real cone (30 frames), a TFT-LCD cell (60 frames), and the Lincoln head on a US penny (60 frames). Figure 4 shows the simulated cone with 30 and 60 frames. The proposed method has a runtime of only 1 min in MATLAB on a dual-core Intel i3-2100 processor running at 3.1 GHz with 8 GB of RAM.

3.2. Quantitative Analysis

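For the quantitative analysis of the proposed algorithm, the root mean square error (RMSE) and correlation are calculated for each algorithm. The RMSE indicates the amount of error in the resulting image. It is given by

$$\mathrm{RMSE} = \sqrt{\frac{1}{XY}\sum_{i=1}^{X}\sum_{j=1}^{Y}\left(d_{\mathrm{ref}}(i, j) - d_{\mathrm{est}}(i, j)\right)^{2}}$$

where $X$ and $Y$ are the number of pixels in the horizontal and vertical directions in the image, respectively, $d_{\mathrm{ref}}$ is the reference image, and $d_{\mathrm{est}}$ is the resulting image of the algorithm. Correlation measures the similarity of the two images. It is given by

$$\mathrm{Corr} = \frac{\sum_{i=1}^{X}\sum_{j=1}^{Y}\left(d_{\mathrm{ref}}(i, j) - \bar{d}_{\mathrm{ref}}\right)\left(d_{\mathrm{est}}(i, j) - \bar{d}_{\mathrm{est}}\right)}{\sqrt{\sum_{i=1}^{X}\sum_{j=1}^{Y}\left(d_{\mathrm{ref}}(i, j) - \bar{d}_{\mathrm{ref}}\right)^{2}\,\sum_{i=1}^{X}\sum_{j=1}^{Y}\left(d_{\mathrm{est}}(i, j) - \bar{d}_{\mathrm{est}}\right)^{2}}}$$

where $d_{\mathrm{ref}}$ and $d_{\mathrm{est}}$ are the reference image and the resulting image of the algorithm, respectively, and $\bar{d}_{\mathrm{ref}}$ and $\bar{d}_{\mathrm{est}}$ are their means.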
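Both metrics are straightforward to compute. The following sketch assumes the reference and estimated depth maps are arrays of the same shape; the function names are illustrative.

```python
import numpy as np


def rmse(reference, estimate):
    """Root mean square error between two depth maps of equal shape."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((reference - estimate) ** 2)))


def correlation(reference, estimate):
    """Correlation coefficient between two depth maps of equal shape."""
    r = np.asarray(reference, dtype=float)
    e = np.asarray(estimate, dtype=float)
    r = r - r.mean()
    e = e - e.mean()
    return float(np.sum(r * e) / np.sqrt(np.sum(r ** 2) * np.sum(e ** 2)))
```

A lower RMSE and a correlation closer to 1 both indicate a recovered shape that is closer to the reference. Note that the correlation is undefined when either map is constant.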

For the synthetic object, RMSE and correlation can be calculated because the true depth map is known. To evaluate the performance of the proposed method, we experimented with three Gaussian noise situations by changing the noise variance $\sigma^{2}$. The first situation is noise free, i.e., $\sigma^{2} = 0$; the second uses an intermediate noise variance; and the third uses $\sigma^{2} = 0.1$. Figure 5 shows the simulated cone with additive Gaussian noise.
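The noise situations can be reproduced in a sketch such as the one below, which adds zero-mean Gaussian noise of a given variance to an image stack. It assumes intensities scaled to [0, 1]; the function name, the fixed seed, and the optional clipping are illustrative choices and are not taken from the experimental setup.

```python
import numpy as np


def add_gaussian_noise(stack, variance, seed=0, clip=True):
    """Add zero-mean Gaussian noise with the given variance to an image stack.

    stack: array of intensities, assumed to lie in [0, 1].
    variance: noise variance (0 leaves the stack unchanged).
    clip: if True, keep the result inside [0, 1].
    """
    rng = np.random.default_rng(seed)
    stack = np.asarray(stack, dtype=float)
    noisy = stack + rng.normal(0.0, np.sqrt(variance), size=stack.shape)
    return np.clip(noisy, 0.0, 1.0) if clip else noisy
```

Clipping keeps the images in a valid intensity range but slightly reduces the effective noise variance near the limits, so `clip=False` is the better choice when the exact variance matters.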

The traditional SFF, which simply uses a Gaussian filter, SFF.PCA, and the proposed method were each applied to remove these noise effects and recover the 3D shape, and the results were compared with respect to the above two metrics. Table 1 presents the robustness of the proposed algorithm compared with that of the others. The table shows that our approach significantly improved the results in comparison with the other methods. In the severe noise situation, the RMSE is improved by 30% and 38% compared with the traditional SFF and SFF.PCA, respectively, and the correlation is improved by 99% and 139%, respectively.

3.3. Qualitative Analysis

Figure 6 shows the restored 3D shapes of the simulated cone. The first row is the traditional SFF, the second is SFF.PCA, and the third is the proposed method. The first column is the noise-free situation, and the second and third columns are the two noisy situations. In all cases, the results of the proposed algorithm have a smoother surface than those of the other methods. Figure 7 shows the restored 3D shapes of the real objects with a noise variance of 0.05. The first column is the traditional SFF, the second is SFF.PCA, and the third is the proposed method. Here again, the proposed method yields a smoother surface than the others in all cases.

4. Conclusions

For application in precision agriculture, a low-cost SFF system that recovers accurate 3D information has been developed. The conventional SFF system is sensitive to noise. In this study, a novel technique that employs ICA to reduce noise and enhance the edge components has been proposed to improve shape accuracy. First, the raw images pass through vector population and normalisation. Next, the signal is decomposed into independent components using fixed-point FastICA, and the noise components of the signal are reduced by the shrinkage function. Then, the edge components of the signal are enhanced by removing the common component from the basis vectors. Finally, the depth map is restored using the first principal component of the signal. The experiments were conducted on a simulated cone object and three real objects under three different noise situations to assess the robustness of the algorithm. The experimental results showed that the proposed method improves accuracy compared with previous SFF methods.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported in part by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2019-2018-0-01433) supervised by the IITP (Institute for Information & Communications Technology Promotion), and in part by the MOTIE Research Grant of 2019 under Grant 10067764. We thank Moon-Gu Jeon for his assistance with useful discussion.