Abstract

Agriculture is critical to human life. Agriculture provides a means of subsistence for a sizable portion of the world’s population. Additionally, it provides a large number of work opportunities for inhabitants. Many farmers prefer traditional farming approaches, which result in low yields. Agriculture and related industries are vital to the economy’s long-term growth and development. The primary issues in agricultural production include decision-making, crop selection, and supporting systems for crop yield enhancement. Agriculture forecasting is influenced by natural variables such as temperature, soil fertility, water volume, water quality, season, and crop prices. Growing advancements in agricultural automation have resulted in a flood of tools and apps for rapid knowledge acquisition. Mobile devices are rapidly being used by everyone, including farmers. This paper presents a framework for smart crop tracking and monitoring. Sensors, Internet of Things cameras, mobile applications, and big data analytics are all covered. The hardware consists of an Arduino Uno, a variety of sensors, and a Wi-Fi module. This strategy would result in the most effective use of energy and the smallest amount of agricultural waste possible.

1. Introduction

Agriculture is essential to a nation’s economy because it feeds the whole population, and it links and interacts with all of the country’s related industries. A country is considered socially and economically prosperous if it has a relatively large agricultural base. In most countries, agriculture is the primary source of employment. Large farms usually require additional laborers to assist with planting and the care of farm animals. Most of these large farms have nearby processing plants where their agricultural products are processed and finished [1]. Throughout human history, significant innovations have been made to boost agricultural productivity with fewer resources and less human effort.

Smart farming refers to an improved approach to farm management that has become popular in modern agriculture. Crop health and production are monitored using agricultural and information technologies, which track field crop condition and associated indicators. Ultimately, the objective of smart farming is to reduce the cost of agricultural inputs while still maintaining the quality of the end product. Fertilizers and insecticides have typically been applied in bulk and at a fixed rate, with the whole field being treated as a single unit.

Despite these innovations, rapid population growth has meant that supply has never kept pace with demand. According to estimates, the global population will reach 9.8 billion in 2050, an increase of around 25% over the current total [1].

Developing countries are expected to account for nearly all of the population growth stated [2]. On the other hand, urbanization is expected to intensify, with 70 percent of the world’s population expected to be urban by 2050 (up from 49 percent today) [3]. Furthermore, income levels will be many times higher than they are now, driving up food demand, particularly in developing countries.

As a result, people in these countries will become more conscious of their diet and food quality, and consumer preferences may shift away from wheat and cereals and toward legumes and, eventually, meat.

Food production should quadruple by 2050 to feed this larger, more urban, and wealthier population [4, 5]. In particular, the current annual wheat production of 2.1 billion tonnes should increase to almost 3 billion tonnes, and yearly meat production should increase by more than 200 million tonnes to meet the demand of 470 million tonnes [6, 7].

Currently, crop yields are low because many farmers prefer to employ conventional methods. Consequently, producers, governments, agricultural scientists, and academia are exploring innovative ways to increase the productivity of agricultural land. Environmental factors such as water, plants, and climate change affect crop production, which depends mostly on soil fertility. As a result, identifying nutrient-deficient soils and improving their nutrient availability is critical for growing quality crops [2].

It is tough to produce quality crops since soil fertility determines most of the crop productivity. In addition, the detection and optimization of inadequate nutrition content are required to achieve the aim.

Disease detection in crop leaves is difficult. If it is diagnosed at the right time, associated pesticides may be used to suppress the disease. Crops are often harmed by a shortage of primary nutrients. As a result, it is essential to use fertilizers of the proper nature. Farmers have a rough time gathering soil nutrient statistics, water nutrient information, groundwater level, environmental conditions, and seasonal crop data concerning their farmland. Furthermore, they are having difficulty making better decisions based on the facts available to them.

Many modern farms and farming-related industries make good use of modern machinery as well as scientific and technological principles. However, many farmers are unaware that soil testing provides useful information about their soil. Soil test reports allow farmers to choose the appropriate fertilizer and learn how to apply it based on soil requirements. Excessive fertilizer application is occasionally one of the most serious issues in the agricultural domain [3].

This necessitates correction through soil fertility analysis. Furthermore, for safe crop development, it is critical to identify the timing, form, and quantity of fertilizer application. Farmers also often do not consider environmental conditions when selecting crops. It is also difficult to diagnose diseases in leaves. All of this results in poor yields. These are the current concerns of farmers and agriculturists.

The use of technology in agriculture can overcome most of the problems related to modern agriculture. In particular, the use of the Internet of Things concept, machine learning, and cloud storage results in providing solutions to most of the problems.

2. Literature Survey

2.1. Decision Support Systems in Agriculture

Decision support systems (DSS) employ data generation methods for farm management tasks such as pesticide control, field management, and crop management [4]. The effectiveness of these systems is weak. Modern IoT-based solutions might improve farmers’ decision-making on crop fertilization. Many researchers have analyzed data mining approaches in a limited number of cases. The Soil Water Balance decision system model takes into consideration several aspects, including the soil condition, temperature, channel network, and cultivation quality.

CERES (Crop Environment Resource Synthesis) Wheat, which forms part of DSSAT, was deployed effectively in the semiarid and subtropical areas of Punjab for five crop seasons from 2000–2001 to 2004–2005 in order to simulate wheat crop growth and development under varying nitrogen, climatic, and water conditions.

In Indian agriculture, the success of DSS-based advisory services is highly crucial. e-Sagu is a DSS service provided by IIIT Hyderabad with Media Lab Asia. It promotes improvement in farm productivity by delivering high-quality agro-expert advice in a timely manner for all farm operations at the farmers’ doorsteps. This advice was given during all stages of cultivation, from sowing to harvesting, and it reduced cultivation costs while increasing farm yield and the quality of agricultural products [6].

2.2. Big Data Analytics in Agriculture

Big data analysis is the process of examining massive volumes of data in order to uncover hidden patterns, unknown correlations, industry dynamics, customer preferences, and other types of useful business intelligence. Analytical findings may lead to successful promotion, new income opportunities, better farming preparation, improved grower efficiency, competitive advantages, and other economic benefits. In this era, the agricultural sector must develop decision-making processes that can take advantage of large increases in data and knowledge from a variety of different sources, including soil, crops, weather, and farm management systems [7].

Thombare et al. [8] sought to gain insight into crop yield forecasting through big data research while also recognizing the socioeconomic problems involved. The analysis of this massive amount of data is focused on K-means clustering methods to determine which farming methods are better for the particular area and estimate yield using Apriori algorithms. This useful knowledge was once again given to farmers in order to improve crop yields and promote organic farming.
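To make the clustering step concrete, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is not the implementation used in [8]; the function name `kmeans`, the two features (soil pH and scaled rainfall), and the six sample plots are hypothetical and chosen only for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate an assignment step and a centroid update."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignments

# Hypothetical records: (soil pH, rainfall in hundreds of mm) for six plots.
plots = [(5.1, 2.0), (5.3, 2.2), (5.0, 1.9), (7.4, 6.1), (7.6, 5.8), (7.5, 6.0)]
centroids, labels = kmeans(plots, centroids=[plots[0], plots[3]])
print(labels)     # cluster index assigned to each plot
print(centroids)  # mean pH and rainfall of each cluster
```

Plots that fall in the same cluster share similar conditions, so farming practices that worked on one plot are candidates for the others in that cluster.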

Victoria et al. [9] offered a parallel and distributed computing paradigm for feature selection in huge data sets using a Hadoop-based neural network. They used an artificial neural network framework on Hadoop YARN to implement five attribute selection algorithms. The Hadoop binary relational memory network combines stability and versatility to identify the optimum attribute selector for quick identification of appropriate attributes from varied, high-dimensional data sets.

Lin [10] implemented the MapReduce representation in Hadoop, thus using the Apriori data mining algorithm. In general, Association Rules algorithms compute rule scarcity when working with a large number of attributes. This problem is solved by the proposed MR-Apriori algorithm.

A crop recommendation system has been developed by Priya et al. [11]. A Naive Bayes MapReduce classifier was used to offer crop recommendations to farmers, especially for agriculture in the Telangana area. The system is adaptable so that different crops may be evaluated. A yield chart can be used to establish the ideal periods for planting, plant growth, and harvesting. Farmers are given an accurate agricultural model that tells them which crops they can grow on their land.

Suryanarayana et al. [12] created a framework that uses historical weather data from an area and analyzes it using the MapReduce and Hadoop methods. Many important sectors that are influenced by the climate, such as agriculture, air travel, water supply, and tourism, will benefit from weather forecasting. Weather forecasting is a branch of meteorology that involves gathering data from different sources about the current state of the weather, such as rainfall, temperature, wind, and fog.

In the Internet of Things age, the meteorological department uses various types of sensors to determine humidity and temperature, among other things. MapReduce technology is used to efficiently analyze weather data using distributed algorithms. The advantage of using MapReduce with Hadoop is that it can speed up data processing in an environment where the amount of data is growing by the day.

Feature selection is a way to determine the important features in a data set. It can improve accuracy and reduce the time needed for analysis. Several scholars have developed feature selection algorithms. Recent research studies have established numerous data mining methods that have been used for the study of agricultural and biological datasets, resulting in valuable classification patterns [13].

Chouhan et al. [14] proposed a new way of extracting features from a dataset using the PSO-SVM methodology and fuzzy decision tree classification. The proposed strategy was applied to the Mushroom and Soybean datasets. The experimental results show that the proposed methodology outperforms existing techniques in accuracy.

The key characteristics contributing to wheat grain production were determined using a supervised feature selection algorithm [15]. A total of 472 Iranian fields with a unique set of 21 features were selected for the feature selection phase. Because of the wide variety of features chosen, the results demonstrated that the systems had greater stability.

Villacampa [16] compared feature selection approaches such as information gain, correlation-based feature selection, ReliefF, wrapper, and hybrid methods for reducing the number of features in data sets. Three general classification algorithms (Decision Trees, K-Nearest Neighbor, and Support Vector Machines) were used as classifiers to evaluate the efficiency of these methods. According to the results, ReliefF outperformed all other feature selection methods.

Ruß and Kruse [17] suggested a novel application of a feature selection approach to agricultural data. The approach combined a comprehensive set of features with a complete selection technique. To relate the features in the various data sets to the yield prediction, two regression models (SVR and RegTree) were used. The two models produced results that differed in detail but were comparable. Both models returned understandable and explainable feature scores while also offering a new understanding of the data sets and their features.

2.3. Soil Classification in Agriculture

Gholap et al. [18] predicted soil fertility using a decision tree algorithm. After collecting a dataset from a private soil testing laboratory in Pune, they used attribute selection and boosting techniques to improve the performance of the J48 decision tree algorithm. J48 is an open-source Java implementation of the C4.5 algorithm. It is a statistical classifier that extends the ID3 algorithm, which is commonly used in machine learning, and it is based on the idea of information entropy.

C4.5 produces a decision tree in which each node splits the samples according to information gain. The attribute with the highest normalized information gain is used as the splitting criterion. They forecasted soil fertility and graded it as extremely poor, low, moderate, relatively high, high, or very high. They also increased the accuracy to 96.73 percent by using attribute selection and boosting.
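The splitting criterion can be illustrated with a small sketch that computes the normalized information gain (gain ratio) of each attribute and picks the best one. This is an illustration of the criterion only, not the J48 code; the attribute names (`ph`, `nitrogen`) and the four soil samples are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def gain_ratio(rows, labels, attribute):
    """Information gain of splitting on `attribute`, normalized by split information."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many small partitions.
    split_info = -sum(
        len(p) / total * math.log2(len(p) / total) for p in partitions.values()
    )
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical soil samples with categorical attributes and a fertility grade.
rows = [
    {"ph": "acidic", "nitrogen": "low"},
    {"ph": "acidic", "nitrogen": "high"},
    {"ph": "neutral", "nitrogen": "low"},
    {"ph": "neutral", "nitrogen": "high"},
]
labels = ["low", "high", "low", "high"]

best = max(["ph", "nitrogen"], key=lambda a: gain_ratio(rows, labels, a))
print(best)  # attribute chosen for the root split
```

In this toy data the nitrogen level separates the fertility grades perfectly, so it receives the highest gain ratio and would be chosen as the root split.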

Ghosh and Koley [19] proposed a new approach to soil property analysis that is centered on neural networks for supervised learning and backpropagation. This research was mainly due to the effects of soil qualities such as organic matter, key plant nutrients, and micronutrients on crop development and to the determination of their percentage using the method above. These parameters are extremely expensive to calculate directly. The findings showed that the proposed approach successfully forecasted the soil parameters.

Dahikar and Rode [20] examined the links between large-scale meteorological conditions and agricultural yield. Artificial neural networks (ANN) were developed as the main simulation and prediction approach to improve accuracy. The crop forecasting methodology employs the ANN to predict the suitable crop by sensing soil and environmental parameters such as soil type, pH, phosphate, potassium, organic carbon, calcium, magnesium, sulphur, manganese, copper, iron, depth, temperature, precipitation, and humidity.

Kaur et al. [21] addressed agricultural data mining technology and practices. Data mining techniques such as K-means, K-Nearest Neighbor, Artificial Neural Networks, and Support Vector Machines are used in recent data mining approaches. They investigate the crop price forecasting problem. This has been a major agricultural concern in recent years, and it can only be resolved with the knowledge that is currently available. They discovered suitable knowledge models that aided in achieving high precision and generality in price prediction.

A robot might be used to keep food and warehouses secure, which could be beneficial. Managing a huge warehouse may be exhausting, and food is often left unattended, which can result in contamination of the product. Robots may also be employed to guard warehouses against intruders, even in the most difficult of circumstances, according to the manufacturer. Utilizing this kind of robot would not only save money but also help to maintain the integrity and quality of a meal, which would help to assure food safety [22].

With computer vision, it is possible to accomplish food safety and security that is automated, nondestructive, and cost-effective. Extensive research has been done to determine how effective it is in the assessment and classification of fruits and vegetables. An overview of the most current innovations in the food business is provided as well as information on the critical components of image processing technology. In this paper, the various components of a computer vision system are discussed in further depth. In order to prevent food-borne disease and ensure food security, rapid and precise detection of hazardous bacteria is essential for public safety biomonitoring. Various ways of detecting microorganisms have been developed throughout time [23].

3. Methodology

A framework for smart crop tracking and monitoring is presented in this section. Sensors, IoT devices, cameras, mobile apps, machine learning, and big data analytics are major parts of this framework. Hardware includes Arduino Uno, multiple sensors, and Wi-Fi devices.

The framework shown in Figure 1 consists of the following components:

Arduino Uno: this board [24] features Wi-Fi, Ethernet, a USB port, a micro-SD card slot, and three reset buttons in addition to an ATmega32U4 MCU with Arduino compatibility. The board may also connect to an Atheros AR9331 to run Linux.

DHT11/DHT22 humidity sensor: this sensor is used to continuously assess the humidity and moisture levels on the land. The sensed data is stored in the cloud via the Arduino Uno [25].

YL-69 soil moisture sensor: this sensor is used to determine the water content of the soil. It is widely used in farming, irrigation systems, greenhouses, and other research applications that demand precise estimates of water levels in the soil. It consists of two parts: an electronic board that holds the circuitry and a probe that measures soil moisture. The sensor operates by producing a potential difference that corresponds directly to the dielectric permittivity of the water in the soil. Voltage variations can be interpreted as changes in dielectric permittivity and hence as changes in water content [26].

Camera: it is used to capture crop images and store them in cloud storage via the IoT Arduino Uno board.

Cloud storage: all crop-related images are stored in the cloud for further analysis using an SVM classifier. Soil-related details are also stored in the cloud for further analysis using K-means clustering.

Disease detection: the SVM classification algorithm is used for disease detection in crops. Images of crops are collected into the cloud using the cameras and the Arduino Uno. Image preprocessing and feature extraction are then performed, and the images are classified using the SVM algorithm. The disease is predicted on the basis of a previously available training data set. Because soil details of a particular farm are also available in the system, the system then suggests, on the basis of the detected disease and the soil quality, the type and quantity of pesticides that can be used to avoid damage to crops.

Mobile application: results are made available to farmers through a mobile application. Farmers register on the mobile application, and they can see all the details related to their land and crops.
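The path from a raw soil moisture reading to a record stored in the cloud can be sketched as follows. This is a minimal illustration under stated assumptions, not the firmware used in this work: it assumes a 10-bit analog reading (0 to 1023), a sensor whose reading falls as the soil gets wetter, and hypothetical calibration constants `ADC_DRY` and `ADC_WET`; the field names in the payload are also illustrative.

```python
import json

ADC_DRY = 1023  # assumed raw reading in completely dry soil (hypothetical calibration)
ADC_WET = 300   # assumed raw reading in saturated soil (hypothetical calibration)

def moisture_percent(raw, dry=ADC_DRY, wet=ADC_WET):
    """Linearly map a raw ADC reading to 0-100 % soil moisture, clamped to that range."""
    percent = (dry - raw) / (dry - wet) * 100.0
    return max(0.0, min(100.0, percent))

def build_payload(plot_id, raw_moisture, temperature_c, humidity_pct):
    """Bundle one set of sensor readings as a JSON string for upload."""
    return json.dumps(
        {
            "plot_id": plot_id,
            "soil_moisture_pct": round(moisture_percent(raw_moisture), 1),
            "temperature_c": temperature_c,
            "humidity_pct": humidity_pct,
        },
        sort_keys=True,
    )

# Example with made-up readings for a hypothetical plot.
payload = build_payload("plot-1", raw_moisture=662, temperature_c=29.5, humidity_pct=61.0)
print(payload)
```

In practice the two calibration constants would be measured for each probe, since the voltage range depends on the soil and on the individual sensor.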

4. Machine Learning Algorithm

4.1. Support Vector Machine Approach

A support vector machine (SVM) is a supervised learning approach that is mostly deployed for solving two-class classification problems, although it can also be used for regression. SVM employs the kernel trick to transform the data and then, based on these transformations, determines an optimal boundary between the possible outcomes. The margin between the two classes should be as wide as possible. SVM builds an optimal boundary that separates the classes so that a new data point can be assigned to the correct category. This optimal boundary is known as the hyperplane [26].

The complexity of SVM is as follows: training takes roughly $O(n^2)$ time and prediction takes $O(Kd)$ time, where n is the number of training examples, K is the number of support vectors, and d is the dimensionality of the data.
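The role of K and d at prediction time can be seen in a sketch of the kernel SVM decision function, which evaluates one kernel per support vector. The support vectors, coefficients, and bias below are hypothetical stand-ins for a trained model, and the RBF kernel is an assumption, since the kernel is not specified here.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """Radial basis function kernel; costs O(d) for d-dimensional inputs."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def svm_decision(x, support_vectors, dual_coefs, bias, kernel=rbf_kernel):
    """f(x) = sum_i (alpha_i * y_i) * k(sv_i, x) + b, one kernel call per support vector."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias

def svm_predict(x, support_vectors, dual_coefs, bias):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if svm_decision(x, support_vectors, dual_coefs, bias) >= 0 else -1

# Hypothetical trained model: two support vectors in a 2-D feature space.
# Here +1 stands for "diseased" and -1 for "healthy" (illustrative labels).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
dual_coefs = [-1.0, 1.0]  # alpha_i * y_i for each support vector
bias = 0.0

print(svm_predict((1.9, 2.1), support_vectors, dual_coefs, bias))   # near the +1 vector
print(svm_predict((0.1, -0.2), support_vectors, dual_coefs, bias))  # near the -1 vector
```

Because the sum runs over the K support vectors and each kernel evaluation touches d features, the cost of classifying one image grows with both quantities.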

4.2. Logistic Regression

Logistic regression is a method used to relate a dependent variable to one or more independent variables. The independent variables are sometimes called predictors. Here, the plant type prediction (c) is the dependent variable, which varies with temperature and humidity, and soil moisture and pH are also treated as independent variables. The standard logistic model is

$$P(c = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_d x_d)}}$$

where $x_1, \ldots, x_d$ are the independent variables and $\beta_0, \ldots, \beta_d$ are the fitted coefficients.

The complexity of logistic regression is as follows: training takes $O(nd)$ time per pass over the data, where n is the number of training examples and d is the dimensionality of the data.
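A minimal sketch of logistic regression trained by batch gradient descent is given below; each epoch visits all n examples and all d features once. The function names, learning rate, and the four scaled (temperature, humidity) samples are hypothetical and serve only to illustrate the method.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the log loss; one epoch costs O(n * d)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errors = [predict_proba(x, w, b) - t for x, t in zip(X, y)]
        for j in range(d):
            w[j] -= lr * sum(e * x[j] for e, x in zip(errors, X)) / n
        b -= lr * sum(errors) / n
    return w, b

# Hypothetical samples: (scaled temperature, scaled humidity) -> class 1 or 0.
X = [(0.2, 0.9), (0.3, 0.8), (0.8, 0.2), (0.9, 0.3)]
y = [1, 1, 0, 0]

w, b = train_logistic(X, y)
print([round(predict_proba(x, w, b), 2) for x in X])
```

Thresholding the estimated probability at 0.5 turns the model into a two-class classifier.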

4.3. Random Forest

Random forests are a collection of tree predictors in which each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error of the forest converges to a limit as the number of trees in the forest becomes large [27].

The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node produces error rates that are more robust with respect to noise. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Such estimates are also used to measure variable importance. These ideas apply to regression as well [28].

The complexity of random forest is as follows: training takes $O(k \, n \log(n) \, d)$ time, where k is the number of decision trees, n is the number of training examples, and d is the dimensionality of the data.
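The two sources of randomness, bootstrap sampling of the training examples and random selection of features, can be sketched with a deliberately simplified ensemble. Each "tree" here is a one-level decision stump built on a single randomly chosen feature, which is a simplification of a full random forest; the function names and the eight two-feature samples are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that minimizes training error."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
        right = [l for r, l in zip(rows, labels) if r[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(l != left_label for l in left) + sum(l != right_label for l in right)
        if best is None or errors < best[0]:
            best = (errors, (feature, threshold, left_label, right_label))
    return best[1]

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train k stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        feature = rng.randrange(d)                  # random feature for this stump
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Majority vote over all stumps in the ensemble."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]

# Hypothetical leaf features (e.g., two scaled color or texture statistics).
rows = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.3, 0.2),
        (0.8, 0.9), (0.9, 0.7), (0.7, 0.8), (0.85, 0.95)]
labels = ["healthy"] * 4 + ["diseased"] * 4

forest = fit_forest(rows, labels)
print(predict(forest, (0.05, 0.05)), predict(forest, (0.8, 0.8)))
```

A full random forest grows deep trees and redraws the candidate features at every node, but the voting step is the same as in this sketch.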

5. Result Analysis

A data set of 500 images was created for the experimental study. This data set contains images of mango leaves: 135 images showed disease, and 365 images were normal. In the experimental analysis, three classification algorithms, namely, support vector machine (SVM), logistic regression, and random forest, were used. To calculate accuracy, the following formula was used:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP denotes True Positive, TN denotes True Negative, FP denotes False Positive, and FN denotes False Negative.
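As a worked example, accuracy, precision, and recall can be computed from the four confusion matrix counts as sketched below. The counts passed to the function are made up for illustration and are not the results of this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of actual positives that are found
    return accuracy, precision, recall

# Illustrative counts only (hypothetical, not taken from the experiments).
accuracy, precision, recall = classification_metrics(tp=40, tn=50, fp=5, fn=5)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Because the data set is imbalanced (135 diseased versus 365 normal images), precision and recall for the diseased class are worth reporting alongside accuracy.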

The results showed that the accuracy of the SVM classifier is better than that of the random forest and logistic regression algorithms. The accuracy values for these classifiers are shown in Table 1 and are represented graphically in Figure 2. A further graphical representation of the accuracy is shown in Figure 3. Other performance measures, namely the precision and recall values for these classifiers, are shown in Figures 4 and 5, respectively.

As the graph shows, SVM exceeds random forest and logistic regression in accuracy. The accuracy of SVM is higher than 95%, whereas the accuracies of random forest and logistic regression are below 75%.

6. Conclusion

The main challenges for agriculture production are decision-making, crop selection, and supporting systems to increase crop yield. Agriculture forecasting is dependent on natural factors such as temperature, soil fertility, water volume, water quality, seasons, and crop prices. Growing developments in agricultural automation have resulted in the massive production of tools and applications for obtaining fast knowledge. Everyone, including growers, is increasingly using mobile devices. A framework for smart crop tracking and monitoring is presented in this paper. Sensors, IoT cameras, mobile apps, and big data analytics are all included. A framework for disease detection in crops is proposed. This is based on SVM classification. It detects disease in crops and suggests suitable pesticides on the basis of previously available soil data of a particular land.

6.1. Future Work

This product is meant to prompt farmers to take action right away. However, a lot of work remains to be done in the near future. The ESP32 NodeMCU has both Wi-Fi and Bluetooth capabilities built in. Budget constraints prevented us from building more prototypes. On large farmlands with many different crops, farmers can install many such prototypes on a local network, connected by Bluetooth, with a single main node that gathers data and sends it to the cloud. Drone technology is also being looked into. By attaching this system to drones, it will be possible to map farmland in 3D, as well as keep an eye on crop production and growing conditions. With the help of the GSM module and IoT SIM card on our laptops, we can connect this whole system to the Soracom Lagoon dashboard for even more in-depth analysis. As a result, smart farming has a bright future ahead of it. This industry has the power to change the world with the help of the right technology and government incentives.

Data Availability

The data shall be made available upon request to the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.