Evapotranspiration (ET) is normally considered as the sum of all water that evaporates from the soil and transpires from plants. However, accurately estimating ET from complex landscapes can be difficult because of its high spatial heterogeneity and diversity of driver factors, which make extrapolation of data from a point to a larger area quite inaccurate. In this paper, we hypothesize that MODIS products can be of use to estimate ET in areas of Caatinga vegetation, the hydrology of which has not been adequately studied. The experiment was conducted in a preserved level area of Caatinga in which meteorological and water flux measures were taken throughout 2012 by eddy covariance. Evapotranspiration estimates from eddy covariance were compared with remotely sensed evapotranspiration estimates from MOD16A2 and SAFER products. Correlations were performed at monthly, 8-day, and daily scales; and produced values of monthly scale = 0.92, 8-day scale = 0.88, and daily scale = 0.85 for the SAFER algorithm. Monthly MOD16A2 data produced a value of , and they may be useful because they are free, downloadable, and easy to use by researchers and governments.

1. Introduction

The high equipment and maintenance costs involved in measuring water fluxes in agrosystems and natural ecosystems through field experiments make remote sensing an attractive alternative [1]. Remote sensing has been used worldwide as a low cost, fast, and practical methodology to measure physical and biological parameters at multiple scales from land surfaces [2]. It has been applied in different climatic regions to determine and to map the spatial and temporal variation of components of water balance [3, 4]. It has been coupled with hydrological models, such as the Soil and Water Assessment Tool (SWAT), providing input data normally obtained from agrometeorological stations [5].

Evapotranspiration is normally considered as the sum of all water that evaporates from the soil and transpires from plants [6, 7]. It can be used to estimate the amount of water used by crops and the amount of irrigation required for optimum crop production [8]. It can also be used as a component of models used to estimate other components of the water balance, including surface runoff, lateral flow, base flow, and percolation to aquifers [5, 9]. Such estimates are useful for management of water in soils, reservoirs, and even hydroelectric plants [2]. As human population increases and climate changes, water management is becoming a greater concern for researchers [7], farmers, and other decision makers. The state of Pernambuco has developed policies for climate change (state law number 14,090 of June 17, 2010) and coastal management (state law number 14,258 of December 23, 2010) and to combat desertification and mitigate the effects of drought (state law number 14,091 of June 17, 2010). These policies are designed to help manage the water in the state, and accurate information about temporal and spatial variations in ET could assist in their implementation.

Spatial heterogeneity of ET can be great due to spatial variations in vegetation, weather, soils, and topography. As a result, extrapolation of ET estimates from flux towers to larger areas can be misleading [1, 10]. Also, ET is spatially autocorrelated, so changes at a point can affect its surroundings [11]. Direct measurement of ET in the field is expensive and difficult; therefore, satellite-based products that can be used to remotely estimate ET have become very popular among researchers and government agencies. Several models of landscape energy balance based on remotely sensed data have been developed, for example, SEBAL (Surface Energy Balance Algorithm for Land, [12, 13]), MOD16 [7, 14], and SAFER (Simple Algorithm For Evapotranspiration Retrieving, [8]). Each of these models has advantages and disadvantages depending on where they are being applied [15]. Because models are often based on a global scale [7] or on a model-species approach [16, 17], calibration and validation may be required to parameterize a model to a given set of local conditions [18].

Validation is used to analyze the uncertainties of model outputs, ensuring the accuracy of remotely sensed ET and enhancing its applications. A wide range of methods has been used, but in most cases, deviance measures and statistical tests are recommended ones. The first one evaluates the differences between the simulated and observed values [19]. With squared deviations, Root Mean Square Errors (RMSE) can be useful to derive statistical properties. The second one is better for bias analysis. It is important to use both methods because a dataset could be extremely close in absolute values but spatially removed from the perfect covariance fit (i.e., regression curve) and vice versa, especially when it is a nonlinear relationship like an exponential fit. For that, the coefficient of determination () is usually the measurement to evaluate the proportion of output variation that can be explained by the fit curve.

In this paper, we hypothesize that MODIS products can be used to accurately estimate ET in areas of Caatinga vegetation in northeastern Brazil, which is a neglected Brazilian vegetation type in terms of research and investments [20]. In our laboratory, Machado et al. [21] have previously estimated ET using Landsat imagery, but we believe the MODIS sensor may be better suited for monitoring ET in Caatinga because of its high temporal resolution. In this study, we used eddy covariance data to calibrate MODIS estimates of ET for Caatinga vegetation on monthly, 8-day, and daily time scales.

2. Methodology

2.1. Study Area

The experiment was conducted in a preserved flat area of Caatinga (≈600 ha) located at Embrapa Semiárido research station of the state of Pernambuco, Brazil (40°19′16′′W, 9°02′33′′S; 350 m) (Figure 1). Caatinga is a hyperxerophilic vegetation type that consists of deciduous xerophytic shrubs and trees with an average height of 5 m and over 1,000 vascular plant species [20]. In the study area, dominant plant species include Poincianella microphylla (Mart. ex G. Don), Croton conduplicatus (Kunth), Bauhinia cheilantha (Bong.), Manihot pseudoglaziovii (Pax & Hoffman), and Commiphora leptophloeos (Mart.). The climate type is BSwh (Köopen) or semiarid, with rainy season from January to April. Average annual rainfall is 510 mm, and it is temporally and spatially very heterogeneous. Mean air temperature is 26.2°C [22].

2.2. Eddy Covariance Flux Tower

Meteorological and water flux measurements were taken throughout 2012 with sensors installed in a 16 m tower located in the study area. A net radiometer (model CNR-1, Kipp & Zonen B; V; Delft, Netherlands) installed at 13.3 m above the soil surface was used to determine incoming solar radiation (). Air temperature and relative humidity (HMP45C, Vaisala, Helsinki, Finland) and rainfall (CS700-L Hydrological Services Rain Gauge, Liverpool, Australia) were measured at 15.7 m and 16.3 m height, respectively. Wind speed values were obtained with an ultrasonic anemometer 3D (WindMaster Pro, Gill Instruments Ltd., Lymington, UK) at 16.9 m height, and ET was directly measured by an Open Path H2O analyzer (IRGA; model LI-7500, LI-COR Inc., Lincoln, NE, USA). All sensors were connected to a data logger (model CR1000, Campbell Scientific Inc., Logan, Utah, USA) set up to take measurements every 10 seconds. More information on the study site and monitoring system can be found in a previous paper [22].

2.3. MOD16A2 Products

MODIS MOD16A2 products were downloaded for all 8-day and month periods of the year of 2012 from http://www.ntsg.umt.edu/project/mod16. The 58 images were rescaled from 0.1 mm 8-day−1 or 0.1 mm month−1 to correct units (mm 8-day−1 or mm month−1) by multiplication of all pixels by the 0.1, using the GDAL library (Geospatial Data Abstraction Library). The ET MOD16A2 dataset is composed of two components, (i) meteorological and (ii) remote sensing based data, and is computed using an algorithm, (1); [7, 14], a modification of the equation described by Cleugh et al. [23], which in turn is a Penman-Monteith approach to estimate ET [24].where is radiation budget (J m−2 day−1), is soil heat flux (≈0 J m−2 day−1), ρ is the air density (kg m−3), is the specific heat of air at constant pressure (1013 J kg−1 K−1), is saturation vapor pressure (Pa), is current vapor pressure (Pa), Δ is the slope of the curve pressure versus air temperature (Pa K−1), and γ is the psychometric constant (kPa K−1). The meteorological input data for that equation is always provided by the Global Modelling and Assimilation Office (GMAO) and includes daily total downward radiation (; MJ m−2 day−1), daily average air temperature (T; °C), daytime and nighttime air temperatures (, ; °C), daily minimum air temperature (; °C), and vapor pressure (, ; kPa), all at 1.0°× 1.25° spatial resolution. The land surface inputs are acquired from three MODIS products, with a spatial resolution from 500 to 1,000 m2. These products are MOD12Q1 [25], MOD15A2 [26], and MCD43B2/B3 Collection 5 (albedo) [27]. Further details of MOD16 algorithm are given in Mu et al. [14].

2.4. SAFER Products

To create SAFER products, we used images of level 1B MOD021KM products from sensor Terra/MODIS for the year of 2012. Firstly, all 366 images were downloaded through the Level 1 and Atmosphere Archive and Distribution System (LAADS; https://ladsweb.nascom.nasa.gov/data/search.html), and secondly, 71 images were selected as having clear and high quality data for the pixel regarding Embrapa Semiárido (Table 1). The MOD021KM product has 36 spectral bands, from 0.405 μm to 14,385 μm, with different spatial resolutions, from 250 to 1,000 m2. The SAFER algorithm is basically a simplified and calibrated version of SEBAL [12, 13] for Brazilian semiarid conditions [15]. It was first proposed by Teixeira [8] using Landsat imagery, but MODIS application details can be found in Teixeira et al. [3]. It utilizes only four bands in a five-step process to acquire ET. The first step was to convert the digital numbers (DN) of two bands from the visible spectrum (bands 1 and 2) into reflectance () and DN of two thermal bands (bands 31 and 32) into radiance (; W m−2 sr−1 μm−1). For that, we used following equation:where DN is the digital number and and are calibrated constants retrieved from the metadata of each product. The second step is used to calculate values of albedo, as recommended by Valiente et al. [28]:where , , and are 0.08, 0.41, and 0.14, which are values calibrated to Brazilian semiarid by Teixeira et al. [29]. The third step is to process the Normalized Difference Vegetation Index (NDVI), which can be acquired as follows [30]:where and are the reflectance of band 1 and 2, respectively. The fourth step is used to calculate surface and brightness temperatures:where and are brightness temperatures from band 31 and 32, respectively, and are both coefficients with value of 0.5 [29], is brightness temperature of each band , is spectral radiance, and and are conversion coefficients in W m−2 sr−1 μm−1. and can be calculated with the following equations [31]:where is the Planck constant ( J s−1), is the speed of light ( m s−1), is the Boltzmann constant ( J K−1), and λ is the mean wavelength for each thermal band in meters ( and for bands 31 and 32 resp.). Finally for the ET ratio (ET : ET0), inputs are NDVI, , and (7). The ET ratio is related with soil moisture and vegetation chlorophyll content. It can be used to estimate components of the water balance at the scale of the remote sensing pixels, which may include different agrosystems and natural plant communities.where is the reference evapotranspiration and and are regression coefficients of values , respectively for Brazilian semiarid conditions [8]. All calculations were performed with the GDAL library.

2.5. Reference Evapotranspiration

To estimate the reference evapotranspiration () required by the last step of SAFER algorithm, we used the following equations with daily resampled data from the eddy covariance Flux tower:where is radiation budget (MJ m−2 day−1), is soil heat flux (≈0 MJ m−2 day−1), is air temperature at 2 m (°C), is wind speed at 2 m (m s−1), is saturation vapor pressure (kPa), is current vapor pressure (kPa), Δ is the slope of the curve pressure versus air temperature (kPa °C−1), and is the psychometric constant (kPa °C−1). was calculated following this series of equations: where is net solar or net shortwave radiation (MJ m−2 day−1); α is albedo or canopy reflection coefficient, which was considered 0.175 as indicated by de Souza et al. [22] for the study area; is measured incoming solar radiation from meteorological tower (MJ m−2 day−1); is latitude in degrees; is latitude in radians; is the inverse relative Earth-Sun distance; is the number of the day in the year; δ is the solar declination; ωs is the sunset hour angle (radians); is daily extraterrestrial radiation (MJ m−2 day−1); is the solar constant (0.0820 MJ m−2 min−1); is clear-sky solar radiation (MJ m−2 day−1); is elevation above sea level in meters; is net longwave radiation (MJ m−2 day−1); and σ is the Boltzmann constant ( MJ K−4 m−2 day−1). The parameters , , , Δ, and were acquired with the equations below, respectively, where is wind speed at elevation (m s−1) and RH is relative air humidity (%).

2.6. Statistical Analysis

Covariance between SAFER and MOD16A2 and tower estimates of ET were analyzed using linear and nonlinear regressions. Normality and homoscedasticity of all ET data were tested with Shapiro Wilk test and Brown Forsyth test, respectively [32]. The differences between the simulated and observed absolute values were evaluated using RMSE [19, 32]. All statistical analyses were performed with the package R (v3.2.3; [33]), and results were considered to be significant when . Graphs were plotted with the software Veusz (v1.24; [34]).

3. Results and Discussion

3.1. Climatology and Surface Properties

A first analysis of the weather data showed that, in general, global radiation (), air temperature (), and relative humidity (RH) exhibited normal behavior in 2012, with low values of and and high values of RH in July and August (Figure 2). Exceptions in the patterns were observed only for February, April, and November, when rainfall had a strong direct or indirect influence. Annual precipitation was 90.42 mm, which was only 17% of historical averages for the area (510 mm). Temporal variation in rainfall was large, with peaks in February, April, and November. February was a cloudy month, with reaching values similar to July and decreasing as RH increased. The effects of the February rainfall peak can also be observed in ET0, which decreased to 76.6% of January’s (Figure 3). Global radiation measured in meteorological stations can be drastically decreased during cloudy days, and that might significantly affect the amount of energy available to the process of evapotranspiration. Also, ET0 is a modelled variable and varies positively in function of . For these ground measurements, cloudy days still produce reliable data, since surface data is acquired beneath the clouds, which is not true for satellite-based data that attempts to acquire surface data from above it. The values of NDVI, , and could not be processed to February due to these interferences. April presented a pattern inverse to February, with high and and low values of RH. Although it was a dry month, NDVI were still high because of the existing lag between rainfall and vegetation cover dynamics, and that kept , , and ET0 stable. November was again a rainy month that followed the same pattern as February, but since vegetation cover is lower in November than in February, was proportionally higher than in February.

3.2. Monthly Evaluation

The ET estimates for the MODIS pixel of Caatinga in which the tower was located were compared with the observations in loco performed by equipment installed on that tower. The relationship between observed data and MOD16A2 products presented coefficients of determination () of 0.77 for the linear fit and 0.82 for the exponential one (Figure 4). These values are comparable to that found by Ruhoff et al. [9] () when comparing MOD16A2 ET against eddy covariance measurements in an area of Cerrado, which is a Brazilian ecosystem that consists of a dense vegetation dominated by shrubs and trees with 5–10 m height. When using SAFER data instead of MOD16A2, we obtained of 0.92 for a linear relation and 0.79 for an exponential relation.

The lower obtained with MOD16A2 compared with SAFER suggest that the limitations cited by Mu et al. [7] may be operative. Those limitations include (i) inaccuracy of algorithm inputs, such as MODIS LAI (Leaf Area Index) that tends to be overestimated [35] and may result in overestimates of ET; inaccuracy of MODIS EVI that may lead to miscalculation of vegetation cover fraction; and misclassification of land cover (MOD12Q1) that may result in incorrect VPD and minimum air temperature values in equations [7, 25] and (ii) average parameter values for stomatal dynamics that may not represent well all species within that biome [36, 37]. On the other hand, the SAFER algorithm was created and validated for the Brazilian semiarid. Its greatest advantage is that both the algorithm itself [8] and its inputs (αs and in Teixeira et al. [15]) have been calibrated for the Caatinga conditions.

The exponential model performed better than the linear one for MOD16A2, indicating that the product’s ET estimates saturate and lose sensitivity at high ET rates. However, that is not a disadvantage of the SAFER algorithm. Since SAFER uses an exponential regression. This saturation pattern is based on an intrinsic behavior of the vegetation indexes used in both MOD16A2 and SAFER equations that are often over- or underestimated at low and high values for the Caatinga, respectively, and also on their relationships with LAI that has proven to be always exponential. For example, Costa et al. [38] (15) and Domiciano Galvíncio et al. [39] (16) have found exponential relationships between LAI and NDVI for Caatinga vegetation. In addition, vegetation indexes produce nonlinear relationships with LAI due to their mathematical structure [40].

Linear estimation of ET using MOD16A2 produced a greater average residual (3.98 mm month−1) than exponential one (2.82 mm month−1). These results corroborate the NDVI-LAI relation that is better for nonlinear than linear equations. Also, the linear model violates assumptions of linearity and homoscedasticity. We can observe curvilinearity in the data, and residuals are greater for greater values of predicted ET (Figure 5(a)). Note that the model should only be used within the range of data used to develop the model, which in our case is from 4.31 to 40.15 mm of ET. The exponential model for MOD16A2 was similar to the linear model, but it violates only the assumption of homoscedasticity, presenting the same increasing relation of residual’s variances and regression predicted values (Figure 5(b)). That is, estimates of small values of ET are more precise than large values. For SAFER, neither linear nor exponential models present any specific pattern of errors, suggesting that the assumptions were met (Figures 5(c) and 5(d)). However, the linear model is preferred because its is greater and its residuals are smaller (1.6 mm month−1) when compared with exponential estimates (2.43 mm month−1).

3.3. Eight-Day Evaluation

The relationship between observed ET and MOD16A2 estimates presented values of 0.62 and 0.69 for a linear and exponential fit, respectively (Figure 6). These values are again close to that found by Ruhoff et al. [9] in the Cerrado (), when comparing 8-day MOD16A2 ET with tower flux measurements of ET. Limitations regarding MOD16A2 products and linear models are the same as described for monthly data. Residuals from the linear model using MOD16A2 presented a greater average residual (1.56 mm 8-day−1; Figure 7(a)) than the exponential model (1.2 mm 8-day−1; Figure 7(b)), and we can observe the same pattern as in the monthly analysis, where both show signs of heteroscedasticity, and only the linear model violates the linearity assumption. However, the 8-day exponential model overestimates ET more than the monthly model. The mean positive residuals were 90.7% greater than the mean negative residuals in the 8-day model, while in the monthly model, mean positive residuals were 12.4% larger. That corroborates results of Ruhoff et al. [9] in which MOD16A2 overestimated ET for another biome in Brazil. We obtained values of 0.88 and 0.65 for linear and exponential SAFER models, respectively (Figure 6), with average residuals of 0.81 mm 8-day−1 and 1.15 mm 8-day−1 for linear and exponential models, respectively (Figures 7(c) and 7(d)). For the linear model, symmetry of residuals was quite large with small differences (6.2% for 8-day), but again the monthly model was better with only 0.01% of difference between average negative and positive residuals.

3.4. Daily Evaluation

As monthly and 8-day SAFER linear models gave better results than the exponential ones, we decided to develop a linear model for daily ET. This model gave of 0.85 with residuals varying from −0.4 to 0.6 mm (Figure 8). In comparison, Teixeira [1] found of 0.89 in a linear relation between SAFER data and field measurements of irrigated crops and Caatinga. In our study, the patterns of residuals suggested heteroscedastic behavior. This result was inconsistent with residuals of the linear monthly and 8-day SAFER models, possibly the result of temporal downscaling of weather data. However, daily ET data is essential for a precise environment monitoring, especially in the case of the Brazilian semiarid where there is a high space-time variation of rainfall [41], for example, 20% of the yearly rainfall may occur in a single day or 60% in a month [42]. Because of limitations in quantity and quality of weather station data in Brazil, use of remotely sensed data by researchers and government agencies has increased. Among the various sensors with freely available data that have been used to estimate ET, MODIS may be the most widely used. It provides data on a daily basis (250 to 1,000 m), which not only allows precise monitoring of ET at daily, monthly, and yearly scales, but also can provide daily inputs to hydrological models like SWAT [5].

3.5. Temporal Evaluation

ET estimates of the SAFER algorithm were closer to Flux tower estimates than MOD16A2 estimates (Figure 9), especially from April to December, where the mean monthly difference between the tower and remotely sensed estimates was −0.26 mm month−1 for SAFER in comparison to 11.08 mm month−1 for MOD16A2, and the mean 8-day differences were 0.019 and 2.89 mm 8-day−1 for SAFER and MOD16A2, respectively. However for the period between January and March, the differences were not as great: −0.012 mm month−1 and 0.17 mm 8-day−1 for SAFER compared to 16.36 mm month−1 and 4.31 mm 8-day−1 for MOD16A2. For the daily scale, the differences between SAFER and Flux tower ET estimates were 0.017 mm day−1 and 0.21 mm day−1 for the first and second periods, respectively. The SAFER model gave better results than the MOD16A2 model at all evaluated temporal scales. MOD16A2 tended to overestimate ET, due, we suggest, to the meteorological input data used in MOD16A2 algorithm. It is derived from a 1.00°× 1.25° grid, while meteorological input data for the SAFER algorithm was acquired from the Flux tower itself. Mu et al. [14] have reported that GMAO data produces biases if compared to measures from meteorological stations. The period when all products and the Flux tower were closest was from September to October. As we showed in the residual analysis, low values of ET tend to have smaller variances than high values. In those months, both SAFER and MOD16A2 presented the smallest values of ET because this was the only period that a dry month (≈0 mm of rainfall) was followed by another dry month and no soil water recharge occurred. However, even during these months when no rainfall occurred, we observed levels of ET varying from ≈3 to 10 mm. This ET may result from plant transpiration of water from deep soil layers and soil evapotranspiration of dew. In northern Israel during the dry season, Agam and Berliner [43] found that even when no dew is deposited on the soil surface, soil evaporation can be observed, suggesting that humidity is absorbed by the soil during the night and evaporates during the day.

SAFER linear ET models more closely matched Flux tower estimates of ET than all other models (Table 2), with Root Mean Square Errors (RMSE) of 1.97 and 1.13 mm for monthly and 8-day scales, respectively, while MOD16A2’s errors ranges were 1.99 and 4.91 mm, respectively. In the Cerrado, Ruhoff et al. [9] found RMSEs of 19 and 0.78 mm when comparing monthly and 8-day ET estimates from MOD16A2 to tower flux measurements, respectively, with monthly differences between predicted and measured data ranging from 0 to 40 mm. We also compared the minimum and maximum predicted values from the SAFER and MOD16A2 estimates to the Flux tower dataset. The MOD16A2 linear model underestimated the minimum Flux tower value by 81% for the monthly scale and by 60% smaller for the 8-day scale. In contrast, SAFER linear and MOD16A2 exponential models showed minimum values of only 4% and 7% greater than the observed minimum values from Flux tower dataset for monthly scale, and both were 28% greater for the 8-day scale. For the maximum values, the 8-day SAFER exponential model overestimated Flux tower estimates by 23%. Daily SAFER linear model showed 0.15 mm RMSE, over- and underestimating the minimum and maximum values, respectively, compared to the RMSE of 0.34 mm found by Teixeira [1], when analyzing ET field measurements and SAFER data. The SAFER algorithm was compared with another algorithm by its developers in their report [44], and they argued that SAFER outperformed because it estimates the ratio ET : ET0 instead of ET, which allows spatial extrapolation of ET [44]. However, since the MOD16A2 model estimates the same ratio, we suggest that the regionally calibrated inputs [8] are the reason that SAFER estimates may be more accurate over other methods.

4. Conclusion

In this study, we compared ground-based ET with remotely sensed ET from MOD16A2 and SAFER products, and it produced values of monthly scale = 0.92, 8-day scale = 0.88, and daily scale = 0.85 for the SAFER algorithm. Monthly MOD16A2 data produced a value of , and 8-day value = 0.69. Although, dataset variance increased in temporal downscaling ET data, we showed that MODIS derived products can be of use to model ET for the Caatinga ecosystem, with acceptable values for the SAFER algorithm at all temporal scales. Moreover, we also recommend the use of MOD16A2 monthly products to monitor the Caatinga when if locally observed meteorological data are not available to produce SAFER estimates. MODIS data produced satisfactory estimates of Flux tower observed data and are freely downloadable at http://www.ntsg.umt.edu/project/mod16. We believe this study contributes to the assessment evapotranspiration data via remote sensing techniques, which may provide better understanding of the evapotranspiration dynamics of the Caatinga ecosystem, which might be a world reference in the future for ecological disturbances due to climate changes or simply key information to help federal and municipal governments plan land use changes with rational criteria.

Competing Interests

The authors declare that they have no competing interests.


The authors thank CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazilian Coordination for the Improvement of Higher Level Personnel) for funding this study through the international project PVE A103/2013.