Abstract

Crops may be affected by a number of diseases due to many factors, namely changes in climate, variations in environmental conditions, deficiency of urea, etc. Among these factors, the deficiency of natural nutrients is one of the common reasons that may affect the overall production of a crop. Grape is one of the crops whose leaves are affected by the deficiency of nutrients such as Potassium, Magnesium, Nitrogen, and Phosphorus. Furthermore, deficiencies of these nutrients may cause similar disorders on grape leaves, such as tilting of the leaf at the edges, change in color, rusting from the root, etc., so it is hard to identify which nutrient is likely to be deficient in a given grape leaf. To ensure good quality and high production, it is necessary to design an automated system that classifies an affected grape leaf into one of four classes, namely Potassium-deficient (K-deficient), Phosphorus-deficient (P-deficient), Nitrogen-deficient (N-deficient), or Magnesium-deficient (Mg-deficient). To achieve this target, we performed a series of experiments in which we first created a dataset of grape leaves affected by nutrient deficiency, collected from crop fields in a controlled environment. The dataset was also augmented, since the number of data instances was insufficient to achieve acceptable results. After preprocessing, a Convolutional Neural Network (CNN) classifier is used to achieve average individual accuracies of 77.97%, 77.74%, 81.81%, and 78.09% for K-, Mg-, P-, and N-deficient grape leaves, respectively, using a conventional training-testing ratio, while the individual accuracies achieved for the same sequence are 95.95%, 92.70%, 90.91%, and 94.76% using n-fold cross-validation approaches on the original dataset. These accuracies improved when the same approaches were applied to the augmented dataset. The results were also compared with recent studies, showing that our proposed approach outperformed the previous studies. Our approach is equally applicable and beneficial when implemented on mobile devices for obtaining real-time results.

1. Introduction

The economy of many developing countries rests on the agricultural sector. Agriculture and natural food play a vital role in sustaining a country's economy (Aurangzeb et al., 2020). According to different studies, agriculture contributes about 24 percent of Gross Domestic Product (GDP); it not only provides cheap raw material and food items but also supports the country's economy by providing employment opportunities to almost half of the population. In recent years, the grape industry has become a trending and favored industry in Pakistan. According to estimates, grapes are grown on over 13,000 ha in Pakistan with a total production of 49 thousand tons, enabling Pakistani products such as grapes, grape juices, wines, and jams to compete with those of other countries and making Pakistan a significant exporter.

However, diseases and deficiencies in grape plants have hindered the development and growth of the grape industry, resulting in huge losses to the economy of Pakistan. Efficient prevention and treatment of these diseases and deficiencies help improve yield quality and quantity, which is why this field has attracted considerable attention from researchers. Traditionally, human expertise was required for the detection and treatment of diseases, which is a very time-consuming and error-prone process. Therefore, in recent years much research and development has been devoted to automating different agricultural components and processes. Such automation enables faster production and better crop quality. To detect disease accurately at an early stage, an effective automatic detection system for grape plant diseases and deficiencies is very important and is the need of the hour. Once a plant is infected, symptoms start appearing on the surface of its leaves. In this research, a system based on a convolutional neural network is proposed that detects plant disease arising from deficiencies of basic nutritional elements at an early stage, which may enable farmers and agriculturists to prevent and cure the diseases.

In this paper, we developed an automated system to detect the type of disorder in a grape leaf affected by the deficiency of one of the nutrients Magnesium (Mg), Nitrogen (N), Phosphorous (P), and Potassium (K). A series of experiments was performed using a Convolutional Neural Network (CNN) on a dataset generated by collecting images of affected grape leaves under a controlled environment. The achieved results are then evaluated using variants of n-fold cross-validation, showing that the proposed approach, evaluated with n-fold cross-validation, compares favorably with the state-of-the-art approaches.

The paper is divided into the following sections: Section 2 reviews the recent literature and the issues that remain to be solved. In Section 3, we provide a detailed discussion of the dataset we generated for the classification task. The proposed approach is discussed in Section 4, while Section 5 presents the results achieved after performing a series of experiments. Section 6 concludes the paper and outlines future work.

2. Literature Review

In this section, we review the state-of-the-art work published in the field of image processing related to grape leaf disease detection.

A CNN-based model was proposed in [1, 2] to diagnose grape diseases, including leaf blight, black measles, and black rot. The researchers focused on rather small computation models that can be executed on hand-held devices; their method combines a lightweight CNN with a channel-wise attention (CA) mechanism. Among lightweight CNN models, ShuffleNet V1 and V2 are widely recommended, while squeeze-and-excitation (SE) blocks are the most widely used blocks when designing a framework for hand-held devices. The proposed model was verified on an open dataset of 4,062 grape leaf images containing instances from one healthy and three diseased classes. The experiments confirmed the effectiveness of the method: the highest accuracy achieved was 93.14%, and the size of the model was reduced from 227.5 MB to 4.2 MB. Similarly, in [3] the authors proposed a CNN-based solution that works in three steps: (a) extracting the key features using a set of pre-trained models, namely AlexNet and ResNet101; (b) selecting the characteristic features by applying the Yager Entropy embedded with Kurtosis (YEaK) approach; and (c) combining the key features using a Euclidean distance approach and then passing them to a least squares support vector machine (LS-SVM) for classification. The simulations achieved an average accuracy of 99% when applied to leaf images of infected grape plants collected from local crop fields.

A quite different approach is recommended in the work reported in [4, 5], in which the proposed approach estimates the expected date of the disease using the average monthly temperatures and precipitation as the key features. The accuracy of the results was verified using year-wise cross-validation. It was observed that Lasso regression, gradient boosting algorithms, and Random Forest (RF) performed better than generalized linear models. Furthermore, it was also observed that an accurate estimation of the date of disease has a more significant effect on the quality of the results than the weather-forecast-related features. The best performing algorithm among these approaches was then evaluated to observe the impact of weather forecasts. The results conclude that an appropriate use of fungicide treatment could reduce the attacks of disease by 70% in experiments performed on the Bordeaux vineyards. Similarly, the authors of [2] generated a state-of-the-art dataset and suggested an improved CNN-based model for accurate diagnosis of the diseases. Among the related work, the approach proposed in that paper is a rapid-response model based on a region-based CNN (R-IACNN) equipped with the Inception-v1 module, the Inception-ResNet-v2 module, and SE-blocks. The experimental results show that the detection model Faster DR-IACNN achieved a mean average precision (mAP) of 81.1% on the grape leaf disease dataset (GLDD) at a detection speed of 15.01 FPS. This research therefore indicates that the deep-learning-based real-time detector Faster DR-IACNN offers a feasible solution for the diagnosis of grape leaf diseases (GLD) and provides some guidance for the detection of other plant diseases.

In [6], the authors introduced a quite different approach for the same purpose. They also collected 107,366 images of both infected and healthy grape leaves. For feature extraction, an Inception-based structure is used that is capable of extracting multidimensional features. Finally, a Deep Inception-based CNN (DICNN) is trained on these images using all features, achieving an overall accuracy of 84.37%, compared with 76.29% and 78.04% obtained through GoogLeNet and ResNet-34, respectively. This study explores a novel approach to diagnosing plant diseases by bridging the gap between theoretical models and their applications.

In a notable work reported in [7], the whole process is divided into the following stages: image enhancement, regional segmentation, extraction of textural features, and finally classification. For image enhancement, a stretch-based enhancement algorithm was adopted. Segmentation was then performed using k-means clustering. The textural features are extracted from the segments obtained in the previous stage. Finally, the classification is performed using two different classifiers, namely the Multi Support Vector Machine (MSVM) and the Bayesian classifier. This combination of classifiers also enabled a comparative analysis for detecting the foliage disease type. In total, 400 grape foliage images were taken as the dataset, of which 100 were grape leaf Esca (black measles), 100 grape leaf black rot, 100 healthy leaves, and 100 grape blight leaves. The data was divided into 360 samples for training and 40 samples for testing. The empirical results were evaluated with four measures, namely sensitivity, accuracy, precision, and specificity. The results concluded that the proposed approach achieved an average accuracy of 92.35%, with 93.04% sensitivity and 87.64% specificity.

In a quite different work reported in [8], the authors designed a detection system based on image analysis and a backpropagation neural network (BPNN). The disease images were denoised by a Wiener filtering method based on the wavelet transform. In this system, morphological algorithms were used to improve the lesion shape, and then the parameters were extracted. In this work, the researchers use deep learning methodologies to detect plant disease using images of both healthy and diseased plant leaves. The dataset of the system contains more than 87 thousand images of 25 different plants with 58 distinct classes. The success rate of the system was 91.53%, suggesting that this system can be expanded and may be used for the detection of plant disease in real fields.

In the work reported in [9], the researchers extended the system previously proposed by Johannes et al. (2017) with a deep residual neural network-based algorithm to deal with different types of diseases of different plants. They studied three endemic wheat diseases using 8,178 images collected with mobile devices. The accuracy of the previously proposed system was thereby improved from 0.87 to 0.96. In [9], the authors proposed that cucumber disease can be recognized using a three-stage pipeline, namely k-means clustering of diseased leaf images, extraction of lesion features such as shape and color, and sparse representation (SR) for classifying the images of affected cucumber.

In [9], the authors claim that the proposed system reduces the computational cost and improves recognition performance. The accuracy of the proposed system was found to be 85.7% in comparison with other feature-extraction-based models. Similarly, in [10], a quite different approach based on deep learning is proposed that assists in the diagnosis of in-field wheat crops. The authors built a dataset named WDD2017 under wild conditions. Two different architectures, named VGG-FCN-VD16 and VGG-FCN-S, achieved accuracy rates of 97.95% and 95.12%, respectively, which under a 5-fold cross-validation process resulted in 93.27% and 73%.

From the literature it is concluded that a number of research works [11-14] address the detection of various diseases in grape leaves using conventional and advanced approaches, but little effort has been devoted to the detection of nutrient deficiency. Since nutrients play a key role in producing a healthy and flourishing crop, there is a need to design an autonomous system that assists in detecting the type of nutrient deficiency in grape leaves. To perform this task, we generated a state-of-the-art dataset of such images since, to the best of our knowledge, no dataset of this kind is available. The subsequent section discusses the generation of our dataset in detail.

3. Dataset

In data science research, the accuracy of the results depends heavily on the availability of correct and precise data. The more precise and relevant the data, the higher the accuracy and efficacy of the model. This activity eventually leads to developing a benchmark dataset. In this paper, we designed a dataset that is well suited to the proposed classifier used to identify the correct category of a grape leaf. As mentioned earlier, our research concerns the classification of grape leaves based on the type of nutrient deficiency affecting them. Since no state-of-the-art dataset is available for this purpose, we developed our own data to perform the classification task. Our dataset consists of a total of 880 grape leaves, of which 210, 240, 220, and 210 leaves were affected by the deficiency of Potassium, Magnesium, Phosphorous, and Nitrogen, respectively, as shown in Table 1. These leaves were collected from the crop fields of the Islamia University of Bahawalpur, Pakistan, under a controlled environment. Each leaf in the training set was marked and labeled manually with the help of experts from the agricultural field, which enabled classification of the leaves at an expert level. Some samples of the dataset are shown in Figure 1.

3.1. Data Augmentation

The problem of over-fitting can be mitigated using the conventional approach of data augmentation, which helps train the classifier in a relatively smooth way. It is concluded from the literature that over-fitting occurs when a deep network used for classification fits the random noise in the data instances instead of the underlying relationship among them (He, 2016). By generating a large number of images through image augmentation, the classifier is able to learn and explore more patterns and features in the training phase, which is quite useful in avoiding over-fitting. Furthermore, data augmentation helps in many respects, including improving the prediction accuracy of the model by adding more training data, preventing data scarcity, and reducing over-fitting (i.e., a modeling error in which a function is fitted too closely to a limited set of data points). Besides, it also creates variability in the data, increasing the generalization ability of the models and helping to resolve class-imbalance and intra-class issues during classification.

A number of digital image processing approaches have been introduced to date to perform data augmentation. It is worth mentioning that image-intensity factors such as brightness, contrast, and sharpness vary when images are collected under different weather conditions, so perturbing them makes the augmented data more representative of field conditions. The Gaussian blur [15] is used to simulate the effects of bad and hazy weather on the process of image acquisition. Furthermore, to simulate different camera positions, the images of the affected grape leaves are rotated by 90, 180, and 270 degrees and subjected to horizontal and vertical symmetry operations. Gaussian noise and perturbations of contrast and sharpness are also used, along with PCA jittering, to augment the dataset with varied images.
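The geometric and noise-based operations described above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the noise level `sigma`, and the tiny synthetic image are assumptions made for the example.

```python
import numpy as np

def geometric_variants(image):
    """Return rotated (90/180/270 degrees) and mirrored copies of an H x W x C image."""
    return {
        "rot90": np.rot90(image, k=1),
        "rot180": np.rot90(image, k=2),
        "rot270": np.rot90(image, k=3),
        "flip_horizontal": image[:, ::-1],  # mirror left-right
        "flip_vertical": image[::-1, :],    # mirror top-bottom
    }

def add_gaussian_noise(image, sigma=10.0, seed=0):
    """Add zero-mean Gaussian noise (sigma is an assumed value) and clip to [0, 255]."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Tiny synthetic image used only for demonstration (2 rows, 3 columns, 3 channels)
demo = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
variants = geometric_variants(demo)
noisy = add_gaussian_noise(demo)
```

Each source image yields five geometric variants here; blur, contrast, sharpness, and PCA jittering would be added in the same style to reach the full set of augmented images.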

Assuming V0 is the original RGB value at an arbitrary pixel of the grape leaf image, V denotes the adjusted value and d is the brightness transformation factor. The transformed RGB value may be expressed as:
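One common form of this brightness transformation, written with the symbols defined above, is given below. The exact expression is an assumption, chosen so that d = 0 leaves the pixel unchanged, d > 0 brightens it, and d < 0 darkens it.

```latex
V = V_0 \times (1 + d)
```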

The contrast of each image is adjusted around the median value of the brightness by increasing the larger RGB values and, at the same time, lowering the smaller RGB values. The overall transformation of the RGB values is expressed as:
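A minimal sketch of such a contrast adjustment is shown below. It assumes the form V = i + (V0 - i) × (1 + d), where i is the median brightness of the image and d is a contrast factor; both the formula and the function name are assumptions consistent with the description, not a confirmed reproduction of the original equation.

```python
import numpy as np

def adjust_contrast(image, d):
    """Stretch (d > 0) or compress (d < 0) pixel values around the median brightness."""
    pixels = image.astype(np.float64)
    median = np.median(pixels)  # i: median brightness of the image
    adjusted = median + (pixels - median) * (1.0 + d)
    return np.rint(np.clip(adjusted, 0, 255)).astype(np.uint8)

# Demonstration values only; the median of this image is 120
demo = np.array([[40, 100], [140, 200]], dtype=np.uint8)
stretched = adjust_contrast(demo, d=0.5)
```

Values above the median move up and values below it move down, which matches the behaviour described in the text.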

Then the widely used Laplacian template [18, 19] is used to adjust the sharpness of each image instance. For an RGB image pixel, this may be formally represented as:

The formula for the above equation may be represented as:
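Laplacian-based sharpening can be sketched as follows. The example assumes the 4-neighbour Laplacian template and the usual "image minus Laplacian" form; the specific template used by the authors is not stated here, so these choices are illustrative.

```python
import numpy as np

def laplacian_sharpen(channel):
    """Sharpen one image channel with the 4-neighbour Laplacian template (assumed)."""
    x = channel.astype(np.float64)
    p = np.pad(x, 1, mode="edge")  # replicate border pixels
    # Laplacian: sum of the four neighbours minus 4 times the centre pixel
    lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * x
    sharpened = x - lap  # subtracting the Laplacian boosts edges
    return np.clip(sharpened, 0, 255).astype(np.uint8)

# Demonstration inputs: a flat patch and a patch with one vertical edge
flat = np.full((4, 4), 100, dtype=np.uint8)
edge = np.array([[100, 100, 200, 200]] * 4, dtype=np.uint8)
sharp_edge = laplacian_sharpen(edge)
```

For an RGB image the function would be applied to each channel separately. Flat regions are left unchanged, while pixels on either side of an edge are pushed further apart.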

Note that the same rotation transformation is applied to every image, with the center point of the image taken as the pivot. Assuming (x, y) is an arbitrary point in the image, its new coordinate after rotation by an angle θ is (X, Y). The new coordinates of the point can be expressed as:
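The standard rotation of a point about a pivot can be written as below. The symbols are assumptions introduced for illustration: (x, y) is the original point, (X, Y) the rotated point, (x_c, y_c) the image center, and θ the rotation angle (90, 180, or 270 degrees in this work).

```latex
\begin{aligned}
X &= x_c + (x - x_c)\cos\theta - (y - y_c)\sin\theta \\
Y &= y_c + (x - x_c)\sin\theta + (y - y_c)\cos\theta
\end{aligned}
```

When the pivot is taken as the origin, these reduce to X = x cos θ − y sin θ and Y = x sin θ + y cos θ.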

In terms of matrix operations, the vertical symmetry operation reflects the image about its horizontal median line, applying the same symmetrical transformation to every pixel. Assuming the image has height h and (x0, y0) is an arbitrary point in the image, the new coordinates after applying the vertical symmetry are (x0, h − y0). The horizontal symmetry operation is analogous to the vertical symmetry operation.

Using this augmentation approach, we generated a total of 13 new images from each image in the dataset, resulting in 12,204 images, 18,732 images, 13,274 images, and 17,429 images for the N-deficient, K-deficient, Mg-deficient, and P-deficient grape leaf datasets, respectively. The details of the augmented dataset are given in Table 2.

4. Proposed Model

The block diagram of the proposed nutrient-based grape leaf disease detection system is shown in Figure 2a. The proposed system depends on the feature extraction capability of the convolutional neural network model, which is equipped with a feature-map output layer. Our proposed model classifies a given nutrient-deficient grape leaf into one of four classes, namely Potassium-deficient (K-deficient), Nitrogen-deficient (N-deficient), Magnesium-deficient (Mg-deficient), and Phosphorous-deficient (P-deficient), on the two datasets shown in Table 1 and Table 2. The phases of the proposed model, depicted in Figure 2, are discussed in detail below.

4.1. Preprocessing

In this phase, the images of our dataset undergo a set of pre-processing steps such as noise reduction, image size normalization, and gray and binary conversion. The images are resized to 50 × 50 while keeping the aspect ratio locked, in order to retain the significant data and the area of interest, since the grape leaf is typically larger than other fruit leaves.
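The size normalization and gray/binary conversion steps can be sketched as below. This is an illustration under stated assumptions: padding to a square canvas is one way to keep the aspect ratio locked, nearest-neighbour resampling and the threshold of 128 are arbitrary choices, and the noise-reduction step is omitted.

```python
import numpy as np

def to_gray(rgb):
    """Luminance-weighted grayscale conversion of an H x W x 3 uint8 image."""
    weights = np.array([0.299, 0.587, 0.114])
    return (rgb.astype(np.float64) @ weights).round().astype(np.uint8)

def resize_keep_aspect(gray, size=50, fill=0):
    """Pad to a square canvas, then nearest-neighbour resample to size x size."""
    h, w = gray.shape
    side = max(h, w)
    canvas = np.full((side, side), fill, dtype=gray.dtype)
    top, left = (side - h) // 2, (side - w) // 2
    canvas[top:top + h, left:left + w] = gray
    idx = np.arange(size) * side // size  # source index for each target pixel
    return canvas[np.ix_(idx, idx)]

def binarize(gray, threshold=128):
    """Map pixels at or above the (assumed) threshold to 1 and the rest to 0."""
    return (gray >= threshold).astype(np.uint8)

# Random stand-in for a 120 x 80 leaf photograph
rgb = np.random.default_rng(0).integers(0, 256, size=(120, 80, 3), dtype=np.uint8)
gray50 = resize_keep_aspect(to_gray(rgb))
mask50 = binarize(gray50)
```

The result is a 50 × 50 grayscale image and a matching binary mask, which is the input size used by the classifier described later.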

4.2. Feature Extraction

In this important phase, features are extracted, including the aspect ratio of the image (considering the number of lines in both the horizontal and vertical directions), the value of each pixel of the region of interest, etc. These features are then recorded in appropriate data structures labeled with the corresponding class name. The extracted features are then compared with the features of the input image for classification in the next phase.

The architecture of a CNN differs from that of a conventional artificial neural network (ANN) in that the former has a number of hidden computation layers working at different granularity levels between the input and output layers. In a CNN, the convolutional layers are locally connected and share weights, and only the output layer is fully connected with the last hidden layer, whereas in a conventional fully connected ANN each neuron is connected to every neuron in the subsequent layer. One of the many reasons behind the success of CNNs is that these networks are capable of capturing the hidden and key features of the images and automatically detecting the characteristic features without any manual supervision. For instance, when given images of animals such as cats and dogs, a CNN captures the key distinctive features that differentiate the images and ignores the redundant properties by itself. Another promising feature of the CNN is that it is also considered a computationally efficient model for image classification.

As mentioned earlier, we produced a dataset of grape leaves affected by nutrient deficiency and categorized into four general classes; each image has a size of 50 × 50 pixels and is plugged in as input to our proposed CNN model. The initial layer of our proposed model is a 2D convolution layer with a kernel of size 5 × 5. Kernels of a larger size failed to capture the key details of the features while analyzing the image data, whereas smaller kernels perform better owing to their simplicity while computing the features. Therefore, we chose a kernel size that is suitable for extracting the features in a reasonable computation time.
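The spatial size of the feature map produced by this first layer follows from the standard convolution arithmetic, sketched below. The stride and padding are not stated in the text, so the defaults used here (stride 1, no padding) are assumptions.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output width/height of a convolution along one spatial dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# 50 x 50 input with a 5 x 5 kernel, assuming stride 1 and no padding
first_layer = conv_output_size(50, 5)

# Same layer with a padding of 2, which would preserve the input size
first_layer_padded = conv_output_size(50, 5, padding=2)
```

Under these assumptions the first layer maps a 50 × 50 image to a 46 × 46 feature map, or keeps it at 50 × 50 when a padding of 2 is used.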

The output of the convolution component is passed through the ReLU (rectified linear unit) activation function [62], which mitigates the vanishing-gradient problem that arises with the standard “sigmoid” function [63]. Finally, the fully connected (FC) layer at the tail of the model performs the classification on the data and features provided by the preceding layers. The model was tested on the nutrient-deficient grape leaves to classify them into one of the four aforementioned classes. The blueprint of the CNN model applied in our experimental work is shown in Figure 3.
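The advantage of ReLU over the sigmoid with respect to gradients can be illustrated numerically. The sketch below compares the derivative of each activation and the product of derivatives across a chain of layers; the depth of 10 and the pre-activation value of 2.0 are arbitrary choices for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # never exceeds 0.25

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # exactly 1 for active units

# Gradient factor accumulated through 10 stacked activations at pre-activation 2.0
depth = 10
sigmoid_chain = sigmoid_grad(2.0) ** depth
relu_chain = relu_grad(2.0) ** depth
```

The sigmoid factor shrinks geometrically with depth, whereas the ReLU factor stays at 1 for active units, which is why ReLU is less prone to vanishing gradients.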

5. Experimental Setup and Results

To obtain reliable results, we performed a series of experiments on the original and augmented datasets to evaluate the accuracy of our proposed model in two different sets of experiments. In the first set, we classified the grape leaves using 3/4 of each class as training data and the remaining 1/4 as test data. In the second set, we applied variants of k-fold cross-validation, whose computation time remains manageable because the training and evaluation process is repeated only k times (for example, 10 times when k is 10). Cross-validation helps reduce the bias that might appear in the results when a conventional 70-30 training-testing ratio is used.
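The two evaluation protocols can be sketched in plain Python as follows. The helper names and the fixed random seed are assumptions for illustration; the example uses the 240 Mg-deficient leaves as the class size, and in practice the split would be applied to each class separately.

```python
import random

def train_test_split_indices(n, train_fraction=0.75, seed=0):
    """Shuffle indices 0..n-1 and split them into training and test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]

def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

# 3/4 training and 1/4 testing for one class of 240 leaves
train_idx, test_idx = train_test_split_indices(240)

# 10-fold cross-validation over the same 240 leaves
folds = list(k_fold_indices(240, k=10))
```

With k = 10 the model is trained and evaluated ten times, and the reported accuracy is the average over the ten held-out folds.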

While training the CNN model, equipped with four convolutional layers, for both sets of experiments, the number of hidden neurons in the internal layers, the batch size, and the learning rate are treated as tunable parameters. The literature on CNNs and their applications in image processing shows that the performance of the CNN (and its variants) improves significantly as the network scale increases. However, increasing the scale of the network may cause over-fitting and consumes more computation time during the training phase. Tuning the batch size assists in reaching the optimal state of the CNN and helps identify the batch size for which the CNN generates promising results. Experiments reported in the literature concluded that the batch size has a key impact on memory usage; therefore, the batch size should be neither so small that it lengthens the experiments nor so large that it becomes difficult to handle. Figure 4 depicts the effect of both the batch size and the learning rate on the accuracy values. To overcome the aforementioned issues, training was performed in a controlled setup with the momentum value set to 0.8. This setup helped us obtain optimal results.
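The role of the momentum term can be illustrated with a single-parameter update rule. The sketch below uses the momentum of 0.8 mentioned above and the learning rate of 0.0035 reported later in this section; the update convention (velocity accumulates the negative gradient) and the toy objective f(w) = w² are assumptions for the example, and the optimizer actually used is not specified in the text.

```python
def sgd_momentum_step(w, v, grad, lr=0.0035, momentum=0.8):
    """One SGD-with-momentum update for a single scalar parameter."""
    v = momentum * v - lr * grad  # velocity keeps a decaying memory of past gradients
    return w + v, v

# Minimise the toy objective f(w) = w**2, whose gradient is 2 * w
w, v = 1.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, v, grad=2.0 * w)
```

After 200 updates the parameter is close to the minimum at zero; a larger momentum value would speed up progress along consistent gradient directions at the risk of overshooting.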

The literature concludes that increasing the number of convolutional layers in any variant of the CNN model may cause over-fitting; therefore, we increased the layers gradually to reach the optimal state. To approximate the full-batch gradient more closely, a relatively large batch size is used. Furthermore, the learning rate is set to 0.0035 along with a reasonable batch size of 132. Figure 5 depicts the confusion matrices along with the efficiency graphs showing the impact of using different numbers of hidden neurons in our experiments, achieving an average accuracy rate of 96.47%. The values on the diagonal depict the classification accuracy of the individual grape leaf classes, while the box at the bottom right depicts the average of all the accuracy results achieved in each set of experiments. The off-diagonal values represent the observations that were not classified correctly. The number of observations and its percentage are given in each cell.
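The way per-class and average accuracy are read off a confusion matrix can be sketched as follows. The counts in the example are hypothetical and are not taken from Figure 5; rows are assumed to hold the true class and columns the predicted class, in the order K, Mg, P, N.

```python
def per_class_accuracy(cm):
    """Fraction of each true class (row) that lies on the diagonal."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]

def overall_accuracy(cm):
    """Correctly classified samples divided by all samples."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

# Hypothetical counts for illustration only (50 test leaves per class)
cm = [
    [48, 1, 1, 0],
    [2, 45, 2, 1],
    [0, 3, 46, 1],
    [1, 0, 2, 47],
]
class_acc = per_class_accuracy(cm)
avg_acc = overall_accuracy(cm)
```

For these made-up counts the per-class accuracies are 0.96, 0.90, 0.92, and 0.94, and the overall accuracy is 0.93.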

It can be concluded from the results that the higher the number of hidden neurons in the model, the higher the number of epochs needed to achieve the results. Note that the number of hidden neurons does not reflect the number of classes for which we perform the classification. Rather, the number of tunable neurons in each hidden layer corresponds to the number of interconnections made with the subsequent layers in the network, as shown in the structure of the CNN (Figure 3). These hidden neurons play a key role in selecting an appropriate set of characteristic features by analyzing the image at a very deep level. It is worth noting that adding hidden neurons increases the computational complexity, but it helps in achieving a higher accuracy rate.

For validation purposes, we use different variants of the n-fold cross-validation approach to avoid any bias in the results obtained through the conventional ratio of training and testing data. The confusion matrices for the grape leaf classification using 8- and 10-fold cross-validation, along with the performance graphs, are shown in Figure 6 and show an average accuracy of 92.7% and 95.6%, respectively. These results indicate that our proposed CNN model performs consistently under the n-fold cross-validation approach.

A similar set of experiments was also performed on the augmented dataset (see Table 2). The results showed that the augmented dataset provides a significant improvement in accuracy over the experiments performed on the original dataset using the same set of parameters. The results achieved by applying the conventional 70-30 training-testing ratio are shown in Figure 7, and the results generated by applying the variations of n-fold cross-validation are shown in Figure 8. Furthermore, the results are compared with recent state-of-the-art approaches for validation, showing that our proposed model outperformed the approaches used for this purpose.

According to Table 3, our proposed model outperformed the approaches used in the literature while using an optimal number of features and less computation time. Besides, our proposed approach proved quite efficient in providing accurate results when classifying the affected grape leaves.

One limitation of our approach is that, when classifying a rather large number of images of affected grape leaves, the model failed to overcome over-fitting during the training process. To overcome this critical issue, we recommend reducing the number of parameters that are redundant and do not contribute to the classification results, thus improving the generalization performance of the model. Furthermore, a model with fewer parameters trains faster and consumes fewer computing resources. In our proposed CNN model, the depthwise separable convolution consists of a depth-wise convolution and a point-wise convolution, which together have fewer parameters than a standard convolution (Howard, 2017). In the depthwise separable convolution, a single kernel filter is applied to each input channel; the point-wise convolution then applies a 1 × 1 convolution to combine the outputs of the depth-wise convolution into the output channels. This framework is quite handy in reducing the overall model size.
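The parameter saving of a depthwise separable convolution can be checked with a short calculation. The sketch below counts weights only (biases are omitted), and the channel counts in the example are hypothetical rather than taken from the proposed model.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Depth-wise part (one k x k filter per input channel) plus 1 x 1 point-wise part."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

# Hypothetical layer: 5 x 5 kernel, 32 input channels, 64 output channels
standard = standard_conv_params(5, 32, 64)
separable = depthwise_separable_params(5, 32, 64)
reduction = standard / separable
```

For this hypothetical layer the standard convolution needs 51,200 weights while the separable version needs 2,848, roughly an 18-fold reduction.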

6. Conclusion

In this paper, a CNN (convolutional neural network) model is customized for recognizing and classifying grape leaves affected by the deficiency of the nutrients Potassium, Magnesium, Nitrogen, and Phosphorous. Since the deficiency of each nutrient has somewhat similar effects on the health and shape of the leaf, there is a need for an automated system to detect these types of disorders with maximum accuracy. We also generated a novel dataset of grape leaves affected by these deficiencies in a controlled environment under the guidance of experts and grape crop specialists. In addition, we compared the results with other state-of-the-art approaches in order to provide recommendations for parameter tuning in parametric-based tasks. For this purpose, we generated two datasets: one containing the original images of the affected grape leaves, and a second one containing an increased number of instances, augmented through classical approaches to avoid over-fitting and achieve higher accuracy. The use of the CNN model in grape leaf classification gives developers an opportunity to design an application for handheld devices that lets both specialists and farmers analyze the condition of the grape plant at an early stage and alerts them if any disease is about to affect the plant. The limitation of our work lies in the lack of standard data resources for grape leaves affected by the aforementioned disorders, which are needed to generate benchmark results.

It is concluded from the literature that most authors recommend the use of deep CNNs for image-related classification and recognition tasks. Some inherent issues still remain unanswered, such as how to determine the number of levels and the number of hidden neurons that are capable of extracting the features in each layer. Similarly, a dataset with multiple complex features is required to check the reliability and efficacy of deep network models. In addition, extracting the true set of optimal features and parameters to achieve error-free results is one of the major critical research issues. The reliability of the proposed classifier should also be assessed against other variants of deep models, such as the two-dimensional Long Short-Term Memory (LSTM) or the bidirectional LSTM (BLSTM). Reinforcement Learning (RL) is one of the newer areas of machine learning, in which the system learns from the environment to perform the required actions and maximize its efficiency where a human approach is nearly impossible [4, 6-8, 10, 12, 13].

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

No funding was available for this study.