Abstract

The overall objective of this work was to develop and evaluate computer vision and machine learning techniques for classifying Huanglongbing (HLB)-infected and healthy leaves using fluorescence imaging spectroscopy. The fluorescence images were segmented using normalized graph cuts, and texture features were extracted from the segmented images using a cooccurrence matrix. The extracted features were used as input to a support vector machine (SVM) classifier. The classification results were evaluated based on classification accuracy and the numbers of false positives and false negatives. The results indicated that the SVM could classify HLB-infected leaves from their fluorescence images with up to 90% accuracy. Although the fluorescence intensities of leaves collected in Brazil and the USA differed, the method shows potential for detecting HLB.

1. Introduction

In recent years, machine vision and learning approaches have been increasingly used in agriculture to detect and classify various diseases on trees and fruits [1–3]. Such techniques need to be developed for sustainable agriculture and to prevent great economic losses. Major citrus diseases such as citrus canker (Xanthomonas axonopodis pv. citri) and Huanglongbing (HLB) or citrus greening (Candidatus Liberibacter asiaticus) are a serious threat to citrus production worldwide, including in Brazil and Florida. The citrus industries in both countries, Brazil and the USA, are making continuous efforts to control and contain these diseases. Disease management practices include control of the insect vector (Asian citrus psyllid, Diaphorina citri Kuwayama) through pesticide application and eradication of symptomatic trees. Therefore, a reliable diagnostic tool is needed to identify the disease symptoms so that these trees (the source of inoculum) can be eradicated to prevent further spread of the diseases. However, identifying diseased trees through scouting (frequent field inspection to recognize symptoms) followed by laboratory diagnosis using polymerase chain reaction (PCR) analysis is time-consuming and expensive. A machine vision technique offers an alternative for disease detection under field conditions and thus better control of such diseases. Machine vision can aid the scouting process and complement PCR analysis. Moreover, fewer samples would need to be sent for PCR analysis, thereby reducing disease detection costs.

Pydipati et al. [4] used machine vision to identify citrus diseases such as greasy spot, melanose, and scab in leaves under controlled laboratory conditions. The researchers detected citrus diseases using color cooccurrence matrix (CCM) features, following the features recommended by Haralick et al. [5]. The authors showed that color and texture are important properties for disease detection. Although machine vision has been applied to citrus disease detection under laboratory conditions, there is a need for a sensing technique that can be applied under field conditions. Fluorescence imaging spectroscopy (FIS) has the potential to provide more spatial and spectral (fluorescence) information than regular imaging techniques. With the rapid development and miniaturization of new light sources and detectors in the last decade, its application under field conditions is growing.

In this paper, we present the development and evaluation of a machine vision technique for detection of HLB in citrus leaves using FIS under field conditions. The study also provides a unique comparison of the performance of the FIS system for HLB detection in leaves from Brazil and Florida.

2. Material and Methods

2.1. Leaf Sample Collection

Fluorescence data were collected from HLB-infected citrus leaves in Brazil and the USA to evaluate the applicability of fluorescence imaging spectroscopy under different conditions. In Brazil, twenty citrus trees from six different orchards (four municipalities in São Paulo State) were evaluated. The HLB-infected leaf samples were collected from trees of Citrus sinensis (L.) Osbeck. Images were collected from both symptomatic and nonsymptomatic leaves from each field plant. A total of 100 HLB-symptomatic leaves and 55 nonsymptomatic leaves (confirmed as healthy) were collected. To avoid any detachment time effect, the fluorescence measurements were performed within five minutes after the leaves were collected from each plant [6]. The data were collected between June 18 and July 2, 2009. After the onsite measurements, the leaves were transported in closed styrofoam boxes to the laboratory to confirm the presence of HLB using the laboratory method (PCR analysis). In the USA, leaf samples were collected from eight citrus trees at the Citrus Research and Education Center’s citrus orchard in Lake Alfred, Florida, from July 14 to 17, 2011. The leaf samples were collected from the same citrus cultivar as in Brazil. A total of 68 symptomatic (HLB-infected) and 17 nonsymptomatic (healthy) leaf samples were collected. Once infected with HLB, citrus leaves develop symptoms such as blotchy mottle and yellowing and thickening of the veins or of the entire leaf. The time between HLB infection and the appearance of symptoms varies with age, time of infection, cultivar, and physiological status of the tree.

2.2. Fluorescence Imaging Spectroscopy System

The fluorescence imaging spectroscopy system was a portable custom-designed unit, which comprised a monochrome charge-coupled device (CCD) camera (model mvBlueFOX120a, Matrix Vision, Germany) that uses a USB communication port; a filter wheel (model CFW-1-8, Finger Lakes Instrumentation, USA) that holds up to eight optical filters and uses a USB communication port; four bandpass optical filters (models FB570-10, FB610-10, FB690-10, and FB740-10, Thorlabs, USA; 570, 610, 690, and 740 nm emission wavelengths); an objective lens; high-power light-emitting diodes (LEDs) at different wavelengths (365, 405, 470, and 530 nm) as excitation sources; and a standard laptop computer. Figure 1 shows a schematic diagram of the major components of the portable FIS system, which operates on car batteries. All parts of the FIS system, except the laptop, were placed inside a closed box. The CCD camera and filter wheel were controlled from the computer by a program.
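The excitation and emission settings described above can be enumerated as a simple acquisition plan. The following is only a sketch: the control program is not described in this paper, so the function name and the idea of acquiring one image per excitation/emission pair are assumptions; only the wavelength values come from the system description.

```python
from itertools import product

# Wavelengths in nm, taken from the system description.
EXCITATION_NM = (365, 405, 470, 530)  # high-power LEDs
EMISSION_NM = (570, 610, 690, 740)    # bandpass filters in the filter wheel


def acquisition_plan(excitations=EXCITATION_NM, emissions=EMISSION_NM):
    """List (excitation, emission) pairs with emission at a longer wavelength.

    Assumption: one image per pair. This only enumerates the possible
    settings; it does not talk to any camera or filter wheel.
    """
    return [(ex, em) for ex, em in product(excitations, emissions) if em > ex]


plan = acquisition_plan()
```

With these components every emission band lies above every excitation wavelength, so all sixteen combinations are physically meaningful, including the 530 nm excitation and 690 nm emission pair used later in the analysis.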

2.3. Segmentation

The data analysis procedures were written in MATLAB (ver. 7.6, The MathWorks Inc., Natick, MA). Figure 2 presents a sample fluorescence image at 690 nm emission wavelength obtained with excitation light at 530 nm. It can be observed from Figure 2 that the image texture of healthy and HLB-infected samples differed, with healthy samples having a more uniform texture than the HLB-infected leaves. The first step in data analysis is segmentation, which identifies the leaf areas within an input image. In this step, the pixels are grouped into clusters.

There are several techniques for segmentation. The methods tested in this study included k-means clustering, histogram thresholding, Markov random field (MRF), graph cuts, and 2D histograms. Histogram thresholding and k-means clustering did not produce clean boundaries. MRF, graph cuts, and 2D histograms make use of higher order statistics [7]. Following segmentation, the leaf pixels were separated as rectangular areas, as shown in Figure 3. These rectangular regions were further processed for feature extraction and classification.
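As a minimal sketch of the simplest approach listed above, histogram thresholding followed by cropping the leaf to a rectangular region can be written as follows. This is not the normalized graph cut that the study preferred; Otsu's criterion for choosing the threshold, the function names, and the toy image are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes between-class variance (Otsu)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * h for g, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # everything is background from here on
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def leaf_bounding_box(image, threshold):
    """Return (row_min, row_max, col_min, col_max) of pixels above threshold."""
    rows = [r for r, row in enumerate(image) if any(v > threshold for v in row)]
    cols = [c for row in image for c, v in enumerate(row) if v > threshold]
    if not rows:
        return None  # no pixel exceeded the threshold
    return min(rows), max(rows), min(cols), max(cols)


# Toy 4 x 5 image: a bright 2 x 2 "leaf" on a dark background.
image = [
    [10, 12, 11, 10, 12],
    [11, 200, 210, 12, 10],
    [10, 205, 198, 11, 12],
    [12, 10, 11, 10, 11],
]
threshold = otsu_threshold([v for row in image for v in row])
box = leaf_bounding_box(image, threshold)
```

On real fluorescence images the boundary produced this way is noisy, which is consistent with the observation above that thresholding did not give clean leaf boundaries.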

2.4. Feature Selection

Texture features (such as autocorrelation, contrast, and energy) were extracted from the segmented image based on the cooccurrence matrix [4, 5]. A cooccurrence matrix (or gray-tone spatial dependence matrix) is defined over an image as the distribution of cooccurring gray values at a given offset. From this matrix, four metrics (uniformity, contrast, correlation, and homogeneity) are computed for the image segment. In their definitions, μx, μy, σx, and σy are the means and standard deviations of the marginal distributions associated with P(i, j)/R; P(i, j) is the matrix of relative frequencies with which two neighboring resolution cells separated by distance d occur on the image, one with gray tone i and the other with gray tone j; and R is the number of neighboring resolution cell pairs used in computing a particular gray-tone spatial dependence matrix. The descriptors (uniformity, contrast, correlation, and homogeneity) were computed for each offset of the eight-point neighborhood. The mean value of each descriptor was then calculated over all offsets. Thus, for an eight-point neighborhood, the number of feature dimensions was four.
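The feature extraction described above can be sketched in a few lines. This is an illustrative implementation, not the MATLAB code used in the study: the descriptor formulas are the common Haralick-style definitions, the squared-difference form of homogeneity is an assumption, and all function names are hypothetical. The input is assumed to be at least 2 x 2 pixels with integer gray levels in the range 0 to levels - 1.

```python
import math

# The eight offsets (row step, column step) of an eight-point neighborhood.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def glcm(image, offset, levels):
    """Normalized cooccurrence matrix P(i, j)/R for one offset."""
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    n_rows, n_cols = len(image), len(image[0])
    pairs = 0  # R: number of neighboring pixel pairs
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                counts[image[r][c]][image[r2][c2]] += 1
                pairs += 1
    return [[v / pairs for v in row] for row in counts]


def descriptors(p):
    """Uniformity, contrast, correlation, homogeneity of a normalized matrix."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    uniformity = sum(v * v for _, _, v in cells)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    # Assumed variant: inverse difference moment with squared difference.
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    mu_x = sum(i * v for i, _, v in cells)
    mu_y = sum(j * v for _, j, v in cells)
    var_x = sum((i - mu_x) ** 2 * v for i, _, v in cells)
    var_y = sum((j - mu_y) ** 2 * v for _, j, v in cells)
    denom = math.sqrt(var_x * var_y)
    # Correlation is undefined for a constant image; report 0.0 there.
    correlation = 0.0 if denom == 0 else (
        sum(i * j * v for i, j, v in cells) - mu_x * mu_y) / denom
    return uniformity, contrast, correlation, homogeneity


def texture_features(image, levels):
    """Average each descriptor over the eight offsets, giving four features."""
    per_offset = [descriptors(glcm(image, off, levels)) for off in OFFSETS]
    return tuple(sum(vals) / len(vals) for vals in zip(*per_offset))
```

Averaging over the offsets makes the four features insensitive to leaf orientation, which is why the feature vector stays four-dimensional even though eight matrices are computed.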

2.5. Classification

The extracted texture features were used as input to a support vector machine (SVM) classifier. Classification was performed with 20 samples in the training dataset (10 healthy and 10 diseased). The SVM classifier works on the concept of decision planes that define decision boundaries. A decision plane is a hyperplane that separates sets of objects belonging to different classes. A support vector machine [8] creates a hyperplane in n dimensions to separate n-dimensional points lying in different classes. For a new point, the class is predicted based on which side of the plane the point lies. For data that are not linearly separable, a nonlinear mapping function is used to map the data into a higher-dimensional feature space. The decision function is based on the dot product of the input features with the support vectors, and an explicit mapping of the data into the higher-dimensional feature space is not needed when a kernel is used [9]. Commonly used kernels include the Gaussian radial basis function, polynomial, and hyperbolic tangent kernels. The radial basis function and polynomial kernels were used in this study.
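To make the role of the kernel concrete, the sketch below evaluates an SVM decision function for the two kernels used in this study. It shows only the prediction step, not training. The kernel parameters, the support vectors, and the coefficients are hypothetical values chosen by hand, not a model fitted to the leaf data.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian radial basis function: exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: (x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


def decision_value(x, support_vectors, dual_coefs, bias, kernel):
    """f(x) = sum_i (alpha_i * y_i) * K(s_i, x) + b."""
    return sum(c * kernel(s, x) for s, c in zip(support_vectors, dual_coefs)) + bias


def predict(x, support_vectors, dual_coefs, bias, kernel):
    """Return +1 (for example, HLB-infected) if f(x) >= 0, else -1 (healthy)."""
    value = decision_value(x, support_vectors, dual_coefs, bias, kernel)
    return 1 if value >= 0 else -1


# Hypothetical, hand-picked model (not fitted to any data in this paper).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
dual_coefs = [-1.0, 1.0]  # alpha_i * y_i for each support vector
bias = 0.0

near_positive = predict((1.9, 2.1), support_vectors, dual_coefs, bias, rbf_kernel)
near_negative = predict((0.1, -0.1), support_vectors, dual_coefs, bias, rbf_kernel)
```

The kernel is only ever applied to pairs of points, so the higher-dimensional feature space never has to be constructed explicitly.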

The SVM classifier was applied separately to the fluorescence data collected from Brazil and the USA. Its performance was evaluated using classification accuracies, precision-recall (PR) curves, and receiver operating characteristic (ROC) curves. The ROC curve is a plot of the false positive rate (FPR) on the x-axis against the true positive rate (TPR) on the y-axis. The FPR measures the fraction of negative samples that are misclassified as positive, while the TPR measures the fraction of positive samples that are labeled correctly. The PR curve is a plot of recall on the x-axis against precision on the y-axis. The goal in the ROC curve is to have low false positive rates [10].
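The evaluation measures defined above follow directly from the counts of true and false positives and negatives. The sketch below is illustrative only: the labels and scores are made-up values, not results from this study, and the function names are hypothetical. Label 1 is assumed to mean HLB-infected.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def _ratio(num, den):
    """Safe division; returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def summary(tp, fp, tn, fn):
    """Accuracy plus the rates used by the ROC and PR curves."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "tpr": _ratio(tp, tp + fn),  # true positive rate, also called recall
        "fpr": _ratio(fp, fp + tn),  # false positive rate
        "fnr": _ratio(fn, tp + fn),  # false negative rate = 1 - tpr
        "precision": _ratio(tp, tp + fp),
    }


def roc_points(y_true, scores, positive=1):
    """(FPR, TPR) pairs obtained by sweeping the decision threshold."""
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        y_pred = [positive if s >= thr else None for s in scores]
        tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
        points.append((_ratio(fp, fp + tn), _ratio(tp, tp + fn)))
    return points


# Made-up example: three infected (1) and two healthy (0) leaves.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
stats = summary(*confusion_counts(y_true, y_pred))
roc = roc_points(y_true, scores)
```

Because the false negative rate is one minus the true positive rate, a low false negative rate is the same statement as a high detection rate for infected leaves.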

3. Results and Discussion

3.1. Image Preprocessing

During preprocessing of the fluorescence images, the data were segmented using normalized graph cuts in preference to the other segmentation methods (k-means clustering, histogram thresholding, MRF, and 2D histograms). The normalized cuts algorithm gives smoother, more spatially consistent segments than methods such as k-means clustering in color space and color histogram thresholding, which do not use higher order information about the spatial layout. In normalized graph cuts, image segmentation is treated as a graph partitioning problem. The “cut” criterion measures both the total dissimilarity between the different groups and the total similarity within the groups. Texture features such as uniformity, contrast, correlation, and homogeneity were extracted from the eight-point pixel neighborhood of the segmented image. These features were used in the SVM classifier. Preliminary evaluation of the fluorescence bands indicated higher discriminatory power at 690 nm emission wavelength with 530 nm excitation. The fluorescence measured at 690 nm represents chlorophyll fluorescence. As the disease affects plant physiology, including chlorophyll fluorescence, this wavelength is useful for HLB detection. Therefore, the segmentation and feature selection results from the fluorescence images at 690 nm were used as input to the SVM classifier.
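For reference, the criterion described above can be written out for a two-way partition. The form below is the standard normalized cut formulation of Shi and Malik; it is assumed here, since the exact variant used is not stated. The graph has node set V (the pixels), A and B are the two groups, and w(u, v) is an edge weight encoding pixel similarity; how the weights were defined in this study is not specified.

```latex
\begin{align}
\mathrm{cut}(A,B) &= \sum_{u \in A,\; v \in B} w(u,v), \qquad
\mathrm{assoc}(A,V) = \sum_{u \in A,\; t \in V} w(u,t), \\
\mathrm{Ncut}(A,B) &= \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)}
                    + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}, \\
\mathrm{Nassoc}(A,B) &= \frac{\mathrm{assoc}(A,A)}{\mathrm{assoc}(A,V)}
                      + \frac{\mathrm{assoc}(B,B)}{\mathrm{assoc}(B,V)}
                      = 2 - \mathrm{Ncut}(A,B).
\end{align}
```

Minimizing Ncut (dissimilarity between groups) is therefore equivalent to maximizing Nassoc (similarity within groups), which is why a single criterion captures both properties.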

3.2. Classification

The classification results are presented in Table 1. Increasing the training set beyond 20 samples did not improve the classification accuracies. Therefore, training was performed with 20 samples. The overall accuracies of the SVM were 90% and 61% for the data collected from Brazil and the USA, respectively. The PR and ROC curves for the Brazil data are shown in Figure 4. The classification accuracy was high (90%) for the Brazil leaf samples, with a very low false negative rate (9%). When selecting a suitable classifier, a low false negative rate is more important than a low false positive rate, because HLB-infected samples should not be wrongly classified as healthy. The false positive rate was also low (Table 1).

When the fluorescence intensities of the USA leaf samples were compared with those of the Brazil samples, it was found that the USA samples emitted fluorescence at a much lower intensity under the same illumination and detection conditions, about 20% of the intensity of the leaf samples from Brazil. The cause of this difference is under investigation. One possible reason is variation in plant physiology resulting from soil differences between the two areas, Florida and São Paulo. Because of the very low fluorescence intensity of the USA samples, the SVM classification accuracy was much lower (61%) than that for the Brazil samples. The false positive and false negative rates were also higher. Other possible reasons for the variable classification accuracies include seasonal variation in leaf symptoms, sample size, and variation in growing conditions, among others [11, 12].

Variability of symptoms has also been observed by a number of researchers [11, 13, 14]. The irregularity in the appearance of symptoms can also occur within a tree, due to irregular distribution of the bacterial inoculum [11]. Chamberlain and Irey [13] stated that variation in symptoms can occur periodically during a year, also from grove to grove and tree to tree at a given period of time. This signifies the importance of calibrating the sensing systems under different conditions prior to field application. These findings also suggest the need for such sensing techniques to improve the scouting detection efficiencies.

4. Conclusions

The present study explores the potential of fluorescence sensing as a means for rapid identification of HLB infection in citrus orchards. A custom-designed fluorescence imaging sensor was evaluated with leaf samples collected from citrus trees in Brazil and the USA. During data preprocessing, the data were segmented using normalized graph cuts, and texture features were extracted using a cooccurrence matrix. The extracted features were used as input to a support vector machine classifier. The classification results indicated that although the classifier accuracy was low (61%) for the USA samples, it was good (90%) for the leaf samples from Brazil. The probable reason for this difference is the low fluorescence intensity of the leaf samples from the USA. The false negative rate for the Brazil samples was low, indicating 91% accuracy in classifying HLB-infected leaf samples. Our future studies will involve assessing other fluorescence features for HLB detection in citrus orchards and evaluating the FIS technique on other stress conditions (nutrient deficiencies and other diseases).

Acknowledgments

This work was supported by Fapesp (Fundação de Amparo à Pesquisa do Estado de São Paulo), Citrus Research and Development Foundation (CRDF), and United States Department of Agriculture (USDA) National Institute of Food and Agriculture (NIFA). The present research results from the collaboration between University of Florida, University of São Paulo, and Fundecitrus (Araraquara, SP, Brazil).