Optical sensors have grown in popularity for estimating plant health, and they form the basis of midseason yield estimations and nitrogen (N) fertilizer recommendations, such as the Oklahoma State University (OSU) nitrogen fertilization optimization algorithm (NFOA). That algorithm uses measurements of normalized difference vegetation index (NDVI), yet not all producers have access to the sensors required to make these measurements. In contrast, most producers have access to smartphones, which can measure fractional green canopy cover (FGCC) using the Canopeo app, but the usefulness of these measurements for midseason yield estimations remains untested. Our objectives were to (1) quantify the relationship between NDVI and FGCC, (2) assess the potential for using FGCC values in place of NDVI values in the current OSU Yield Prediction Model, and (3) compare the performance of NDVI and FGCC-based yield prediction models from the collected dataset. This project, implemented on 13 winter wheat sites over the 2019-2020 growing season, used a range of N rates (0, 34, 67, 101, and 134 kg N ha−1) to provide different levels of yield. Our results indicated that while NDVI and FGCC are highly correlated (r2 = 0.76), FGCC is not suitable for direct insertion into the current yield prediction model. However, a yield prediction model derived from FGCC provided similar estimates of yield compared to NDVI (Nash–Sutcliffe Efficiency = −3.3). This new FGCC-based model will give more producers access to sensor-based yield prediction and N rate recommendations.

1. Introduction

Sensor-based yield prediction technology can be an important decision support tool for producers across the United States, with the resulting near-real-time yield estimates allowing farmers to optimize fertilizer application rates. Yield predictions, and associated fertilizer recommendations, can be made using normalized difference vegetation index (NDVI) measurements collected using instruments mounted to farm equipment or handheld sensors. While such sensors provide valuable information, the costs and availability of NDVI sensors can deter producers from adopting them [1]. On the other hand, fertilizer recommendations based on fractional green canopy cover (FGCC) rather than NDVI may be a more cost-effective option, because the Canopeo smartphone application [2] enables FGCC measurements and requires only a smartphone camera.

Increasing eutrophication in coastal waters and increasing nitrate levels in drinking water motivate steps to reduce nutrient losses from agriculture. One avenue explored is using optical sensors to assess plant health and to guide nutrient recommendations [3]. Lukina et al. [4] noted that NDVI measurements could be useful for in-season yield estimates, which could in turn be used to develop N rate recommendations through the nitrogen fertilization optimization algorithm (NFOA). Since the inception of the NFOA, NDVI has become one of the most often used measures of plant health [5]. NDVI is based on reflectance of near infrared light (NIR) and red light, ranging from 0 to 1, with higher values coming from healthier (i.e., greener) plants [6]:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \tag{1}$$

NDVI can be measured in a variety of ways, such as handheld sensors (Greenseeker, Trimble Agriculture; Crop Circle, Holland Scientific), sensors mounted to farm equipment, drones with cameras (Sentera, DJI, Micasense), and satellites [7]. However, NDVI is not the only option for measuring plant health. Canopeo, a free smartphone application developed by Oklahoma State University, is a tool that measures fractional green canopy cover (FGCC) using downward facing digital images [2]. Rather than quantifying the level of greenness, as is the case with NDVI, Canopeo is designed to measure canopy cover as the presence or absence of green vegetation above the soil surface. In this way, FGCC offers an assessment of crop health in terms of fractional canopy cover, with values ranging from 0 (no cover) to 1 (full cover). Although they are different measurements, NDVI (greenness) and FGCC (canopy cover) are inherently linked [8, 9].

While FGCC has been used to estimate yields of forage crops [10] and above ground biomass of row crops [11], few studies have quantified relationships between FGCC values and winter wheat grain yields. Goodwin et al. [12] predicted winter wheat grain yields using both NDVI and FGCC values, and they found relatively poor yield relationships for NDVI (r2 = 0.28 to 0.49) and FGCC (r2 = 0.14 to 0.45) using direct comparisons between sensor readings and yield. In that study, NDVI and FGCC were positively correlated with correlation coefficients of 0.87 at Feekes 5 and 0.73 at Feekes 6.

However, current yield prediction models used by Oklahoma State University do not directly correlate NDVI and grain yield. Instead, the in-season yield estimation tool (INSEY; Raun et al. [13]) estimates yield based on the ratio of NDVI and the number of growing days (denoted as GDD > 0) since planting (i.e., NDVI per day of growth):

$$\mathrm{INSEY} = \frac{\mathrm{NDVI}}{\mathrm{GDD} > 0} \tag{2}$$

Here, GDD > 0 refers to the number of days with average air temperature above 4.4 °C. This value is a threshold set for small grain crops (including wheat) where growth occurs [13]. The INSEY model is based on over 30 site-years of data, and the yield estimates using INSEY are stronger (r2 = 0.54, [13]) than estimates based directly on NDVI (r2 = 0.28 to 0.49, [12]).
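As a minimal sketch of the INSEY calculation described above, the following Python snippet counts the days with mean air temperature above 4.4 °C and divides a sensor reading by that count. The function names and the temperature series are hypothetical and serve only as an illustration; they are not part of the OSU tools.

```python
BASE_TEMP_C = 4.4  # growth threshold for small grains, as stated in the text


def count_growing_days(daily_mean_temps_c, base_temp_c=BASE_TEMP_C):
    """Count days whose mean air temperature is above the threshold (GDD > 0)."""
    return sum(1 for temp in daily_mean_temps_c if temp > base_temp_c)


def insey(sensor_value, growing_days):
    """In-season estimate of yield: sensor value per day of growth."""
    if growing_days <= 0:
        raise ValueError("growing_days must be positive")
    return sensor_value / growing_days


# Hypothetical daily mean temperatures (deg C) since planting.
temps = [10.2, 3.1, 5.0, 4.4, 8.7, -1.0, 6.3]
days = count_growing_days(temps)  # days strictly above 4.4 deg C
value = insey(0.60, days)         # hypothetical NDVI reading of 0.60
print(days, value)
```

The same function accepts an FGCC reading in place of NDVI, which is the substitution this study evaluates.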

While the correlation of NDVI and FGCC suggests that FGCC may have value for calculating INSEY, this has yet to be determined. Therefore, our objectives were to (1) quantify the relationship between NDVI and FGCC, (2) assess the potential for using FGCC values in place of NDVI values in the current OSU Yield Prediction Model, and (3) compare the performance of NDVI and FGCC-based yield prediction models from the collected dataset. We collected sensor values from one year across 13 locations from winter wheat N rate trials across the state of Oklahoma at recommended sensor timing dates for optimum yield prediction [14, 15]. This work was done to lay a framework by which FGCC can become a viable tool for grain producers to predict yield and make subsequent N management decisions.

2. Materials and Methods

2.1. Study Area

This trial was conducted over 13 sites (6 at research stations, 7 at on-farm locations) during the 2019-2020 growing season that spanned the wheat producing regions of Oklahoma (Figure 1, Table 1). The climate of Oklahoma is diverse, ranging from humid subtropical in the southeast to semiarid in the northwest and panhandle region [16]. These sites have annual rainfall totals ranging from 478 to 932 mm and mean annual temperatures ranging from 13.7 to 16.4 °C [17].

This trial was a randomized complete block design with a 5 N rate treatment structure (0 kg N ha−1, 34 kg N ha−1, 67 kg N ha−1, 101 kg N ha−1, and 134 kg N ha−1), applied by surface broadcast preplant as ammonium nitrate (34-0-0), replicated 4 times.

2.2. Vegetation Image Data Collection and Analysis

Vegetative sensing measurements of each plot were collected within 80–110 accumulated growing degree days of wheat growth (GDD > 0), which ranged from February 27 to March 29, 2020. This time frame was chosen because it is when yield prediction has been found to be the most accurate for NDVI [14, 15]. The NDVI measurements were collected using a GreenSeeker (Trimble Agriculture, Westminster, CO, USA) approximately 0.6 m above the crop canopy surface. Digital images capturing an area of approximately 1.2 × 1.5 m were collected using a Samsung Galaxy S9 smartphone (Samsung Group, Seoul, South Korea). Nadir images were collected by holding the phone out at a 90° angle directly in front of the researcher at arm’s length, approximately 1.4 m off the ground, for each plot. Each image was then analyzed using the Canopeo tool in MATLAB 2020b [18]. This tool estimates canopy coverage by classifying each pixel from an RGB image based on its color values. NDVI values in this experiment ranged from 0.24 to 0.77, and FGCC values ranged from 0.04 to 0.80, which spans most of the range of possible ground cover values (0 to 1).
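The pixel classification step can be illustrated with a short Python sketch. The criteria used here (red-to-green ratio, blue-to-green ratio, and an excess green index) and the default thresholds of 0.95, 0.95, and 20 are assumptions based on the general published description of Canopeo [2]. They are not stated in this study, and the snippet is not the Canopeo source code.

```python
def is_green(r, g, b, p1=0.95, p2=0.95, p3=20):
    """Classify one RGB pixel (0-255 channels) as green canopy or not.

    Assumed criteria: R/G < p1, B/G < p2, and excess green 2G - R - B > p3.
    """
    if g == 0:
        return False  # avoid division by zero; a pixel with no green is not canopy
    return (r / g < p1) and (b / g < p2) and (2 * g - r - b > p3)


def fractional_green_canopy_cover(pixels):
    """Return the fraction of pixels classified as green (0 = no cover, 1 = full cover)."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("image has no pixels")
    green_count = sum(1 for r, g, b in pixels if is_green(r, g, b))
    return green_count / len(pixels)


# Hypothetical four-pixel "image": two leaf-like pixels, one soil-like, one gray.
sample = [(40, 120, 30), (120, 100, 80), (60, 140, 50), (200, 200, 200)]
cover = fractional_green_canopy_cover(sample)
print(cover)
```

Because each pixel is counted as either green or not green, the result is a cover fraction rather than a measure of how green the canopy is, which is the distinction from NDVI drawn in the Introduction.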

2.3. Grain Yield Sampling

At physiological maturity, whole plant samples were collected from a 0.9 m × 0.9 m area in each plot using sickles. Samples were placed in a forced-air oven at 43 °C for at least 24 hours, threshed using a small plot thresher to separate the grain from the chaff, and then weighed.

2.4. Yield Prediction Model

The yield prediction model portion of the nitrogen fertilization optimization algorithm, developed by Raun et al. [13], is derived from six equations. The first is for INSEY, or in-season estimate of yield (see (2)). The INSEY expresses growth as a rate, that is, the NDVI value per GDD > 0, signifying the increase in NDVI per day of growth under current growing conditions. INSEY is to be taken from the farmer practice strip (FPS), or the area to which N will be applied to reach yield potential. This value is then input into the yield prediction model:

The YP0 reflects the yield potential of the crop assuming no factors are changed (i.e., no other nutrients added, no drought stresses). This model also includes a standard deviation shift in the positive direction, to reflect yield potential. It is important to note that grain yield limiting factors occurring after sensing can cause differences between predicted and measured grain yields at harvest. Yield prediction provides a snapshot in time of that crop and does not take into account any postsensing stressors.

To predict yield, assuming an application of N occurs after sensing, at least two NDVI readings are required. One comes from an area where a high rate of N is applied (N-Rich Strip) and the other from an area outside the N-Rich Strip (the FPS), where N is to be applied based on the yield prediction model. The ratio of these two sensor values (N-Rich Strip : FPS) is the response index (RI), which reflects the relationship expected of both sensor values and yield. However, over time, data collection has shown the need to adjust that ratio [19] to reflect the difference between RISensor and RIYield. The calculation for RINDVI and RIAdjust is shown as follows:

Yield (YPN) is then predicted using

$$\mathrm{YP_N} = \mathrm{YP_0} \times \mathrm{RI_{Adjust}}, \tag{5}$$

where YPN is the yield prediction after N application. Fertilizer recommendations can then be calculated based upon the difference between YPN and YP0, seen in the following. While the N rate is a product of the NFOA, evaluating the accuracy of N rates derived from the NFOA is beyond the scope of this manuscript:
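The chain of calculations in this section, from INSEY to YP0, then to the response index and its adjustment, and finally to YPN, can be sketched in Python as follows. The functional forms follow the text (an exponential yield model and a linear RI adjustment), but every coefficient below is a hypothetical placeholder rather than a value from the OSU NFOA. The standard deviation shift and the N rate equation are not reproduced.

```python
import math


def yield_potential_no_n(insey, a, b):
    """YP0: exponential yield model of the form a * exp(b * INSEY)."""
    return a * math.exp(b * insey)


def response_index(n_rich_value, fps_value):
    """RI_Sensor: sensor value in the N-Rich Strip divided by the value in the FPS."""
    if fps_value <= 0:
        raise ValueError("fps_value must be positive")
    return n_rich_value / fps_value


def adjusted_response_index(ri_sensor, slope, intercept):
    """RI_Adjust: linear adjustment relating RI_Sensor to RI_Yield."""
    return slope * ri_sensor + intercept


def yield_potential_with_n(yp0, ri_adjusted):
    """YP_N: YP0 multiplied by the adjusted response index."""
    return yp0 * ri_adjusted


# Placeholder coefficients for illustration only (NOT the OSU values).
A, B = 500.0, 250.0
SLOPE, INTERCEPT = 1.5, -0.5

yp0 = yield_potential_no_n(insey=0.005, a=A, b=B)
ri = response_index(n_rich_value=0.72, fps_value=0.60)
ri_adj = adjusted_response_index(ri, SLOPE, INTERCEPT)
ypn = yield_potential_with_n(yp0, ri_adj)
print(round(yp0, 1), round(ri, 2), round(ri_adj, 2), round(ypn, 1))
```

Keeping the adjustment coefficients as parameters makes it straightforward to swap in a sensor-specific RI adjustment, which is the comparison made later for FGCC.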

2.5. Statistical Analysis

Statistical modeling was conducted using trend analysis software in Microsoft Excel. Linear regression modeling was used to explore relationships of FGCC and NDVI, RISensor and RIYield, and predicted and measured yield. Exponential regression was used to build the yield prediction models from both FGCC and NDVI. Nash–Sutcliffe Efficiency (NSE) was used to assess yield prediction models compared to the achieved yield of the study [20]. The NSE has been used to compare field observed data to predicted values in other studies evaluating hydrologic [21, 22] and forage yield models [23], but we will use this value to assess grain yield prediction models. The NSE values range from −∞ to 1, with a value of 1 indicating a perfectly fitted model, a value of 0 indicating that the model performs only as well as a model that uses mean observed values as the predicted model output, and a negative value indicating the model performs worse than the observed mean. Figures were produced using package ggplot2 in R [24, 25].
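For reference, the NSE is computed as one minus the ratio of the sum of squared prediction errors to the sum of squared deviations of the observations from their mean. The short Python sketch below uses hypothetical yield values to reproduce the three cases described above: a perfect model, a model equal to the observed mean, and a model that performs worse than the mean.

```python
def nash_sutcliffe_efficiency(observed, predicted):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance; NSE is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical achieved yields (kg/ha) and three sets of predictions.
observed = [2000.0, 3000.0, 4000.0]
perfect = nash_sutcliffe_efficiency(observed, observed)
mean_model = nash_sutcliffe_efficiency(observed, [3000.0, 3000.0, 3000.0])
overpredicted = nash_sutcliffe_efficiency(observed, [4000.0, 6000.0, 8000.0])
print(perfect, mean_model, overpredicted)
```

In this example, predictions that are double the achieved yields give a strongly negative NSE. The same mechanism explains how a model can show a positive r2 with measured yield while having a negative NSE.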

3. Results

3.1. Relating Canopeo to NDVI

Wheat producers often rely on optical sensors to gauge the N needs of their crop, but sensors that measure NDVI require special equipment that could be less accessible to many producers. On the other hand, smartphone-based technologies such as Canopeo that measure fractional green canopy cover (FGCC) are accessible to nearly all producers, but the effectiveness of using this technology for yield prediction is largely unknown. While NDVI and FGCC are grounded in inherently different technologies, we found that these measurements are strongly correlated (r2 = 0.76, Figure 2).

3.2. RI Comparison and Adjustment

A primary component of the NFOA is the response index (RI) of both yield and sensor values and the subsequent adjustment. The RISensor is the ratio of the sensor value from the N-Rich Strip to the sensor value from the farmer practice strip (FPS). The RIYield is the ratio of the yield from the N-Rich Strip to the yield from the FPS. Due to the discrepancy between RIYield and RISensor, an adjustment is necessary to accurately predict yield from sensor values. This adjustment, RIAdj, is described by the linear models portrayed in Figure 3 for both sensor types, as well as the current RIAdj used. The linear regression models derived from the FGCC and NDVI data had coefficients of determination of 0.76 and 0.27, respectively. The slope of the FGCC derived model was less than 1 (0.55). The opposite was true for the NDVI derived model, which had a slope of 2.49.

3.3. Yield Predictions with NDVI and FGCC

For both sensor types, yield was predicted using the current NFOA and displayed in Figures 4(a)–4(d). Figure 4 displays the predicted yields from the current yield prediction model using FGCC data (a), FGCC data without the RI adjustment (b), FGCC data with the new RI adjustment found with the RIFGCC comparison (c), and the NDVI from the same dataset (d). It is important to keep in mind that predicted yield is expected to be higher than the actual yield (represented by being below the 1 : 1 line), as yield prediction represents a snapshot in time and assumes no yield limiting factors occur after sensing. That is, yield predictions represent the upper limit of potential yields. The strength of the predicted-achieved yield relationship was low (r2 = 0.34, NSE = −35.5) when yield prediction was made by inputting FGCC into the current NFOA (Figure 4(a)). This produced yield predictions that were not only unsupported by achieved yield but also, in some cases, not possible to reach in Oklahoma environments (e.g., predicted: 21578 kg ha−1, achieved: 2737 kg ha−1). Removing the RIAdjust improved the NSE (Figure 4(b), r2 = 0.33, NSE = −11.4), but the predictions were still inaccurate. Adjusting the RISensor using the RI adjustment for FGCC from Figure 3 improved the NSE further, providing predictions closer to the achieved yield (Figure 4(c), r2 = 0.26, NSE = −3.3). Comparing the two sensor types, FGCC with the new RIAdjust was more accurate at predicting yield than the NDVI data (Figure 4(d), r2 = 0.12, NSE = −4.3).

A yield prediction model was built from FGCC data using the same methods used to build the original NFOA model (Figure 5). For this dataset, the FGCC built yield prediction model had a coefficient of determination (r2 = 0.47) similar to that of the NDVI model (r2 = 0.53).

4. Discussion

While FGCC derived from Canopeo is useful for estimating biomass yield [10, 11], very few studies have used FGCC to develop yield prediction for grain crops rather than just measuring biomass. Goodwin et al. [12] investigated using both NDVI and FGCC to estimate grain yield in winter wheat in Ohio at different growth stages and found that these values could account for the most variability in yield, as long as sensor readings were taken at or prior to Feekes 5 growth stage. While our trial used different methods of developing yield prediction models than Goodwin et al. [12], our results support their findings.

Our results found that there was a significant relationship between FGCC and NDVI. As these sensors measure two distinctly different variables (NDVI-greenness, FGCC-canopy cover), it is not expected to be a 1 : 1 relationship. The deviation from the 1 : 1 line indicates that FGCC should not be directly inserted in the NFOA in place of NDVI and doing so could skew yield predictions. Yet the relationship is strong, supporting the opportunity that FGCC could be utilized similarly to NDVI.

The deviation between FGCC and NDVI values becomes apparent when investigating the RIAdj portion of the NFOA. In Figure 3, we can see that the slopes of the NDVI and FGCC RI lines differ markedly. The NDVI RI adjustment line shows that the RINDVI must be increased to reach the RIYield, whereas RIFGCC would need to be reduced to reach RIYield (Figure 3). The RINDVI values from this trial produce an adjustment that is closer to what OSU currently utilizes in its NFOA than RIFGCC does, which is to be expected, as the OSU RIAdj is derived from NDVI values. The RINDVI has outliers that veer from the regression line. These points come from locations in which there was only a marginal difference in sensor values at sensing, yet which showed a very high response to the addition of N. The RIFGCC had a much stronger coefficient of determination (r2 = 0.76) with RIYield, which suggests that FGCC was more sensitive to differences between plots receiving N and those not. While the high r2 value supports the opportunity that FGCC could be utilized in yield prediction models, due to the differences between FGCC and NDVI, directly using FGCC in the current NFOA would not produce accurate prediction values. As the YPN (see (5)) is calculated by multiplying YP0 by RIAdjust, the drastic differences between the RINDVI and RIFGCC would impact the yield predictions and most likely overestimate the yield response if using FGCC.

This can be seen when FGCC was inserted directly into the current NFOA to predict yield, which was highly inaccurate (Figure 4(a)). Using FGCC and the new RIAdjust in the current NFOA provided the most accurate yield prediction compared to the achieved yield (Figure 4(c), NSE = −3.3, r2 = 0.26). Using NDVI predicted yield less accurately (Figure 4(d), NSE = −4.3, r2 = 0.12). Yet, there were some outliers present in the data. This can occur when there are great differences between the sensor readings coming from the N-Rich Strip and the FPS. Each of these locations had low NO3-N levels in the soil test analysis, which allows for a greater response to preplant N.

Yet, there are still opportunities for yield predictions to achieve higher accuracy. One is to develop the yield prediction model from the dataset itself (Figure 5), using the same methods used to build the original NFOA [13]. This was done by plotting INSEY (sensor value divided by GDD > 0, i.e., growth rate) against actual yield. Figure 5 displays a model built in a very similar way to the original NFOA models. The FGCC model has a moderate coefficient of determination (r2 = 0.47), although not as high as that of the model derived from the NDVI data (r2 = 0.53).
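Building such a model amounts to fitting an exponential curve of yield against INSEY. The Python sketch below fits y = a · exp(b · x) by ordinary least squares on the natural log of yield, which is one common way to estimate an exponential curve. It is not confirmed to be the exact procedure used in this study, and the data are synthetic values generated from known coefficients, not field data.

```python
import math


def fit_exponential(x_values, y_values):
    """Fit y = a * exp(b * x) by least squares on ln(y); returns (a, b)."""
    n = len(x_values)
    if n < 2 or n != len(y_values):
        raise ValueError("need at least two paired observations")
    log_y = [math.log(y) for y in y_values]  # requires all y > 0
    mean_x = sum(x_values) / n
    mean_log_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (ly - mean_log_y) for x, ly in zip(x_values, log_y))
    b = sxy / sxx
    a = math.exp(mean_log_y - b * mean_x)
    return a, b


# Synthetic INSEY values and yields generated from a = 600, b = 200 (illustration only).
insey_values = [0.002, 0.004, 0.006, 0.008]
yields = [600.0 * math.exp(200.0 * x) for x in insey_values]
a, b = fit_exponential(insey_values, yields)
print(round(a, 3), round(b, 3))
```

With noise-free synthetic data the fit recovers the generating coefficients. With field data, scatter around the curve determines the r2 values reported for the FGCC and NDVI models.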

While the coefficients of determination of the two models do not indicate highly accurate models, this is to be expected, considering the wide range of locales and environments in which this study was conducted. Yield prediction models offer a snapshot in time of the highest level of yield attainable with current conditions. Any adverse circumstances that occur after sensing can decrease the yield ceiling. Dhillon et al. [14] reported high coefficient of determination values when investigating optimum sensing timings for yield prediction in Oklahoma, but those values were based on a subset of data that spanned 4 site-years, including the same trials that were used to develop the currently used NFOA. The dataset from our study spans 13 sites, 300 km, and 478–932 mm rainfall over one growing season [17]. Drought stress, freeze damage, weed pressure, and other circumstances could have decreased yield potential after sensing, but due to the number of locations and the distances between them, the researchers were not able to record each event. This must be taken into account when creating an NFOA to serve large or variable areas. Previous work has shown that NDVI can be used to estimate winter wheat yield but requires regional data and equations to provide the most accurate estimations [5, 26]. Future work may consider creating multiple NFOAs to better serve the area, which can increase region-specific accuracy.

It is important to note the magnitude of response reported in this study, where RIYield reached 10.1, or a 10-fold increase in yield from the 0 N check to fully fertilized plots. Many nitrogen response studies have been conducted in Oklahoma over the past two decades and report an average “high” response of 2.0 [27–30]. While the exceptionally high achieved yield response could be an artifact of varying locale or environment in one growing season, it is certainly uncommon and creates challenges when comparing the results to existing literature.

All FGCC readings were conducted using the same camera, by the same researcher, at the same height and orientation for every plot. In using current NFOA with NDVI, producers use handheld sensors that naturally, when held, are level with the ground, and the producer adjusts the height with the height of the crop. Implement mounted sensors are also mounted in level orientation, and also are adjustable to the height of the crop. If collecting FGCC using a smartphone, further work would need to quantify errors associated with handheld use, as well as across multiple setups (smartphone, implement mounted camera, drone imagery, video implementation, etc.).

While this model is derived from only one growing season, it is compiled from 13 sites and multiple blocks per location. Across the dataset, yields ranged from 324 to 5884 kg ha−1, across many different soil types, wheat varieties, and environmental pressures. This suggests that FGCC values have the potential to be just as effective as NDVI in predicting yield. While this dataset is limited in time, it is not limited in space across the state. We acknowledge that while this model is not robust enough to fit growing seasons drastically different from the 2019-2020 growing season, it does provide an avenue for future refinement.

5. Conclusion

Sensor-based nutrient recommendations have become popular in the past couple of decades, but instruments capable of capturing normalized difference vegetation index (NDVI) can be costly. Canopeo, a tool developed in MATLAB and available for free as an app for most smartphones, provides estimates of fractional green canopy cover (FGCC). We found that FGCC was correlated with NDVI, suggesting that it could provide an alternative to NDVI for in-season yield estimates in winter wheat. When FGCC was used with its own RI adjustment in the Oklahoma nitrogen fertilization optimization algorithm (NFOA) in place of NDVI [4], we found that yield predictions based on FGCC (NSE = −3.3, r2 = 0.26) were at least as accurate as those based on NDVI (NSE = −4.3, r2 = 0.12). A model developed using the same methods used to develop the first NFOA was also found to be similar to a model built using the NDVI from this project. This model sets the framework for utilizing FGCC to build N rate recommendations for the future, not just for Oklahoma, but for other areas as well. With cost being a significant barrier to the adoption of current yield prediction methods (NDVI sensors), an NFOA model built to utilize FGCC allows more producers in the state to have affordable access to precision N management technologies and use them in their production practices.

Data Availability

The data are available in the dissertation of Dr. Vaughn Reed and can be found at https://shareok.org/discover.

Conflicts of Interest

The authors declare no known potential conflicts of interest.