Quantifying the impacts of climate change on human activities depends on reliable weather data series that are free of gaps and long enough to serve as a baseline for future climate projections. This study therefore aimed to evaluate the performance of temperature-based models for estimating global solar radiation and of gridded databases (AgCFSR, AgMERRA, NASA/POWER, and XAVIER) as alternatives for filling gaps in historical weather series (1980–2009) in Brazil, and to project climate change scenarios based on measured and gridded weather data. Projections were performed for mid- and end-of-century periods (2040–2069 and 2070–2099), using seven global climate models from CMIP5 under intermediate (RCP4.5) and high (RCP8.5) emission scenarios. The Bristow–Campbell model gave the best solar radiation estimates, whereas the XAVIER gridded database was the closest to observed weather data. As expected, future climate projections showed warmer conditions over Brazil under both RCP4.5 and RCP8.5. Rainfall projections, in contrast, are more uncertain; even so, rainfall amounts are projected to decrease in the North-Northeast region and to increase in Southern Brazil. No significant differences were observed between projections based on the observed data and on the XAVIER gridded database; this database therefore proved reliable both for filling gaps and for generating climate change scenarios.

1. Introduction

Given the projections of global climate changes, simulation models can be used to estimate the impact of historical and future climates on human activities, mainly in crop growth and yield and food availability [1]. For proper simulations, these models require high-quality and long-term historical daily weather data [2]. However, the major difficulty regarding historical weather data in Brazil is the low density of weather stations, associated with the reduced number of measured variables and the large amount of missing data [3–5].

To overcome the lack of reliable weather data series, missing data can be filled in with estimated or interpolated data. Among the different approaches used to fill gaps in weather data, the main methods are climatic generators, which generate stochastic sequences of daily data, such as the WGEN [6] and SIMMETEO [7] generators; empirical correlations using commonly measured meteorological variables present in the observed data [8–10]; and the use of gridded weather databases, based on satellite and/or surface data [2, 4, 11].

Once the historical data series have been filled, these can be used for generating future climate scenarios, derived from projections of climate models, which can be global (GCMs) or regional (RCMs). Despite the finer resolution of RCMs, considering the continental dimension of Brazil, GCMs (which would provide the RCM boundary conditions) offer insight into the general characteristics of future climate [12, 13].

Due to the uncertainties associated with the GCM projections, different models can indicate different climate responses, and one way to reduce such an uncertainty is by considering an ensemble modeling approach [14], with the projections being obtained from multiple models, resulting in more reliable scenarios than if the models are considered individually [15].

These future changes can be projected based on GCMs generated by the Coupled Model Intercomparison Project Phase 5 (CMIP5 [16]), under different greenhouse gas emissions that follow distinct representative concentration pathways (RCPs) [17–19], assessed in the Fifth Assessment Report (AR5) of the Intergovernmental Panel on Climate Change (IPCC) [20]. For South America and specifically for Brazil, the first projections have indicated an increase in temperatures and an uncertain pattern in the rainfall distribution [12, 13]. Such patterns have been confirmed in the more recent studies of Chou et al., Sánchez et al., and Salviano et al. [21–23].

Given the great importance of historical weather data for assessing the impacts of climate change on human activities, mainly agriculture, in addition to the fact that Brazil has a low weather station density, with a large amount of missing data [3–5], the general objective of this study was to evaluate the performance of different alternatives to fill in weather data gaps and, based on that, to create climate change scenarios for Brazil. More specifically, this study aimed (i) to evaluate the performance of temperature-based models for estimating solar radiation and gridded databases, such as AgCFSR, AgMERRA, NASA/POWER, and XAVIER, as procedures to fill in gaps of weather data (maximum and minimum air temperature, solar radiation, rainfall, wind speed, and relative humidity) for the period of 1980–2009; (ii) to generate, from the complete historical weather data, climate change scenarios, over the medium-term (2040–2069) and long-term (2070–2099) periods, based on seven GCMs of CMIP5, under intermediate (RCP4.5) and high (RCP8.5) emission scenarios; and (iii) to identify patterns of climate change in air temperature and rainfall in different Brazilian regions to define the expected trends in relation to the historical climate.

2. Materials and Methods

The present study was developed according to different steps and in a logical sequence presented in the flow chart of Figure 1 and in the following sections.

2.1. Sites and Weather Data

Historical daily measured weather data of maximum and minimum air temperature, sunshine hours, rainfall, wind speed, and relative humidity, from 1980 to 2009, were obtained from the Brazilian National Institute of Meteorology (INMET). Thirty-one sites well distributed across the country were considered, as presented in Figure 2. A more detailed description of the percentage of missing values for each weather variable is given in Table S1 of Supplementary Materials.

2.2. Filling Gaps in the Meteorological Database

Due to the large percentage of missing data in the historical weather databases, ranging from 1 to 46% (Figure 2 and Table S1 of Supplementary Materials), weather variables were generated by temperature-based models (solar radiation) and gridded databases (all variables), as alternatives to fill these gaps in.

2.2.1. Temperature-Based Solar Radiation Models

As solar radiation is not commonly recorded by conventional weather stations, its values were calculated from sunshine hours (n) data, following the model proposed by Ängström [8] and Prescott [24], with coefficients as suggested by Glover and McCulloch [25], and then admitted as the reference values (Table 1). The temperature-based models for estimating solar radiation use maximum and minimum air temperatures as inputs to estimate atmospheric transmissivity [10], which is affected by cloudiness. Five solar radiation models (Hargreaves (Ha), Hunt (Hu), Annandale (An), Bristow–Campbell (BC), and Donatelli–Campbell (DC)) were assessed as presented in Table 1.
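To make the structure of these models concrete, the sketch below implements the sunshine-based reference model and two of the temperature-based models in their commonly cited forms. Table 1 is not reproduced in this text, so the exact functional forms, the function names, and the simplification of the daily thermal amplitude to Tmax − Tmin are assumptions; the extraterrestrial radiation Qo and the day length N are taken as inputs rather than computed.

```python
import math


def angstrom_prescott(qo, n, N, a, b):
    """Sunshine-based reference estimate: Qg = Qo * (a + b * n / N).

    qo: extraterrestrial radiation (MJ m-2 d-1); n: sunshine hours;
    N: day length (h); a, b: site coefficients (supplied by the caller).
    """
    return qo * (a + b * n / N)


def hargreaves(qo, tmax, tmin, a=0.16):
    """Hargreaves-type model: Qg = a * sqrt(Tmax - Tmin) * Qo.

    a is in degC^-0.5; 0.16 is the original continental value quoted in the text.
    """
    return a * math.sqrt(max(tmax - tmin, 0.0)) * qo


def bristow_campbell(qo, tmax, tmin, e, f, g):
    """Bristow-Campbell-type model: Qg = Qo * e * (1 - exp(-f * dT**g)).

    Assumption: dT is simplified to Tmax - Tmin of the same day.
    """
    dt = max(tmax - tmin, 0.0)
    return qo * e * (1.0 - math.exp(-f * dt ** g))


if __name__ == "__main__":
    qo = 35.0  # hypothetical extraterrestrial radiation, MJ m-2 d-1
    print(hargreaves(qo, tmax=30.0, tmin=18.0))
    print(bristow_campbell(qo, 30.0, 18.0, e=0.75, f=0.03, g=1.63))
```

In both temperature-based forms a larger thermal amplitude implies a clearer sky and therefore a larger fraction of Qo reaching the surface; the BC form additionally saturates at the maximum transmissivity e.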

2.2.2. Daily Gridded Database

Gaps in measured weather data (maximum and minimum air temperature, solar radiation, rainfall, wind speed, and relative humidity) were also filled in with data from the following four gridded databases:

(a) AgCFSR and AgMERRA datasets [11], developed as a part of the Agricultural Model Intercomparison and Improvement Project (AgMIP) [31], to provide consistent, daily time series with global coverage of climate variables. They are the result of a combination of NCEP’s Climate Forecast System Reanalysis (CFSR) [32] and NASA’s Modern-Era Retrospective Analysis for Research and Applications (MERRA) [33] with observed datasets from weather station networks and satellites, available on a daily temporal scale, for the period between 1980 and 2010, at 0.25° × 0.25° horizontal resolution.

(b) National Aeronautics and Space Administration database developed by the Prediction of Worldwide Energy Resource (NASA/POWER) [34], composed of satellite data, radiosondes, surface observations, and numerical modeling from data assimilation. The meteorological variables are available daily with global coverage, but on a coarser grid (1° × 1° horizontal resolution), for the period from 1983 to the near present. For rainfall only, the historical series starts in 1997.

(c) Gridded dataset developed by Xavier et al. [4], referred to as XAVIER, which includes only daily observed data from rain gauges and conventional and automatic weather stations for the period of 1980–2013, available at a spatial resolution of 0.25° × 0.25° only for Brazil.
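The gap-filling step itself can be sketched as follows. This is a minimal illustration that assumes the gridded series has already been extracted for the grid cell containing the station and aligned day by day with the observed series; the function names and the use of `None` to mark missing days are hypothetical choices, not the procedure used in the study.

```python
def missing_percentage(series):
    """Percentage of missing (None) values in a daily series."""
    missing = sum(1 for value in series if value is None)
    return 100.0 * missing / len(series)


def fill_gaps(observed, gridded):
    """Keep every measured value; use the gridded value only where the
    observation is missing. Both series must cover the same days."""
    if len(observed) != len(gridded):
        raise ValueError("observed and gridded series must have equal length")
    return [g if o is None else o for o, g in zip(observed, gridded)]


if __name__ == "__main__":
    tmax_obs = [31.2, None, 30.4, None, 29.8]   # hypothetical station record
    tmax_grid = [31.0, 30.7, 30.5, 30.1, 29.9]  # hypothetical gridded record
    print(missing_percentage(tmax_obs))
    print(fill_gaps(tmax_obs, tmax_grid))
```

Measured values are never overwritten, so the quality of the filled series depends on the gridded source only for the missing days.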

2.2.3. Evaluation of Solar Radiation Models and Daily Gridded Database for Filling in Weather Data Gaps

Concerning the solar radiation models, two independent datasets were considered with two years each, for the calibration and evaluation of the adjusted coefficients. To avoid inconsistencies in the analysis, two consecutive years with less than 2% of missing data (temperature and sunshine hours) were chosen. For the evaluation of the gridded weather data, the entire database was employed for the period between 1980 and 2009.
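As an illustration of the calibration step, the sketch below fits the single coefficient of a Hargreaves-type model by least squares through the origin on a calibration dataset. The fitting procedure is not described in the text, so ordinary least squares, the model form Qg = a · sqrt(Tmax − Tmin) · Qo, and the function name are all assumptions.

```python
import math


def calibrate_hargreaves(qo, tmax, tmin, qg_ref):
    """Least-squares estimate of a in Qg = a * sqrt(Tmax - Tmin) * Qo.

    For a one-parameter model through the origin, the estimate has the
    closed form a = sum(x * y) / sum(x * x), with x = sqrt(dT) * Qo and
    y the reference radiation.
    """
    x = [q * math.sqrt(max(tx - tn, 0.0)) for q, tx, tn in zip(qo, tmax, tmin)]
    sxy = sum(xi * yi for xi, yi in zip(x, qg_ref))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical calibration days (Qo in MJ m-2 d-1, temperatures in degC).
    qo = [30.0, 35.0, 40.0]
    tmax = [30.0, 28.0, 32.0]
    tmin = [14.0, 19.0, 16.0]
    qg_ref = [19.0, 17.2, 25.9]  # hypothetical reference radiation
    print(calibrate_hargreaves(qo, tmax, tmin, qg_ref))
```

The coefficient obtained from the calibration years would then be applied unchanged to the independent evaluation years, so that the performance indices are not computed on the data used for fitting.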

The performance of temperature-based solar radiation models and gridded databases for filling in daily data gaps was assessed by comparing estimated and measured data on a daily basis, using the common model performance evaluation indices, such as the coefficient of determination (r2) as a measure of precision; agreement index (d) [35] as a measure of accuracy; confidence index (c) [36] (being classified as great for values higher than 0.85, very good for values between 0.76 and 0.85, good between 0.66 and 0.75, median between 0.61 and 0.65, suffering between 0.51 and 0.60, bad between 0.41 and 0.50, and terrible for values lower than 0.41); mean error or bias (Bias) that indicates the tendency of error; and mean absolute error (MAE), which gives the magnitude of the errors [37].
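The indices listed above can be computed as in the following sketch. It assumes the usual definitions: Willmott's agreement index for d, the product of the Pearson correlation coefficient and d for the confidence index c, and estimated minus observed for Bias. The class limits follow the ranges quoted in the text, with the boundaries between classes closed as shown; that choice and the function names are assumptions.

```python
import math


def performance_indices(obs, est):
    """Return r2, d, c, Bias, and MAE for paired observed/estimated values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    bias = sum(e - o for o, e in zip(obs, est)) / n
    mae = sum(abs(e - o) for o, e in zip(obs, est)) / n
    sxy = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    sxx = sum((o - mean_o) ** 2 for o in obs)
    syy = sum((e - mean_e) ** 2 for e in est)
    r = sxy / math.sqrt(sxx * syy)
    # Willmott's d: 1 - sum of squared errors / potential error.
    sq_err = sum((e - o) ** 2 for o, e in zip(obs, est))
    pot_err = sum((abs(e - mean_o) + abs(o - mean_o)) ** 2
                  for o, e in zip(obs, est))
    d = 1.0 - sq_err / pot_err
    return {"r2": r * r, "d": d, "c": r * d, "bias": bias, "mae": mae}


def classify_c(c):
    """Map the confidence index to the classes quoted in the text."""
    limits = [(0.85, "great"), (0.75, "very good"), (0.65, "good"),
              (0.60, "median"), (0.50, "suffering"), (0.40, "bad")]
    for lower, label in limits:
        if c > lower:
            return label
    return "terrible"


if __name__ == "__main__":
    obs = [18.0, 20.5, 22.1, 15.3, 19.8]  # hypothetical reference values
    est = [17.2, 21.0, 21.5, 16.4, 19.1]  # hypothetical estimates
    stats = performance_indices(obs, est)
    print(stats, classify_c(stats["c"]))
```

Because c multiplies a precision measure by an accuracy measure, a model can only be rated highly if it is both well correlated with and close to the observations.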

2.3. Climate Change Projections

Climate change scenarios, based on measured weather data gap-filled with the best alternative, were projected with models that are publicly available through CMIP5 [16], under two RCPs [18]: an intermediate emission scenario (RCP4.5) and a high emission scenario (RCP8.5). As suggested by Ward et al. [38], the intermediate scenario appears to be the most likely future for planning purposes, being consistent with observed fossil fuel trajectories, whereas the high emission scenario represents extreme conditions.

The future scenarios were generated based on the delta method [39], in which simulated mean monthly changes are imposed on the baseline for all sites by adding temperature changes and multiplying precipitation by a change factor, without changing the variability within a month (e.g., the number of rainy days), following the procedure described by Hudson and Ruane [40]. All other variables were kept unchanged.
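The delta method described above can be sketched as follows. The record layout, the function name, and the expression of the rainfall change as a percentage per month are illustrative assumptions; the monthly changes would come from the GCM outputs.

```python
def apply_delta(records, d_tmax, d_tmin, rain_pct):
    """Impose monthly mean changes on a daily baseline series.

    records: list of dicts with keys "month", "tmax", "tmin", "rain".
    d_tmax, d_tmin: additive temperature changes (degC) keyed by month.
    rain_pct: rainfall change in percent keyed by month.
    Dry days stay dry, so the number of rainy days is unchanged.
    """
    future = []
    for rec in records:
        month = rec["month"]
        future.append({
            "month": month,
            "tmax": rec["tmax"] + d_tmax[month],
            "tmin": rec["tmin"] + d_tmin[month],
            "rain": rec["rain"] * (1.0 + rain_pct[month] / 100.0),
        })
    return future


if __name__ == "__main__":
    baseline = [  # hypothetical baseline days
        {"month": 1, "tmax": 30.0, "tmin": 20.0, "rain": 10.0},
        {"month": 1, "tmax": 31.0, "tmin": 21.0, "rain": 0.0},
    ]
    print(apply_delta(baseline, {1: 2.0}, {1: 1.5}, {1: -10.0}))
```

Because every day in a month receives the same shift or factor, the day-to-day sequence of the baseline is preserved and only the monthly means change.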

Projections were performed for mid-of-century (2040–2069) and end-of-century (2070–2099) periods, for the following CMIP5 GCMs: CNRM-CM5 [41], CSIRO-Mk3-6-0 [42], GISS-E2-R [43], HadGEM2-ES [44, 45], INMCM4 [46], MIROC-ESM [47], and MPI-ESM_LR [48]. The use of seven different GCMs was adopted since the uncertainties are inherent to the climate system, as a result of nonlinear interactions and the intrinsic complexity of the natural atmospheric phenomena [49]. Therefore, for the same emission scenario, different models produce diverse projections of climate change, and one way to minimize these uncertainties is through a set of global and/or regional models, known as an ensemble approach [15]. In this sense, the climate projection presented here for each variable is an average of the outputs of seven GCMs.
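The ensemble averaging described above amounts to a simple mean over the model outputs, as in the sketch below. The GCM names are those listed in the text, but the numerical changes are hypothetical placeholders rather than results of the study; reporting the range alongside the mean is an added illustration of the spread between models.

```python
from statistics import mean


def ensemble_summary(changes_by_gcm):
    """Ensemble mean and range of a projected change across GCMs."""
    values = list(changes_by_gcm.values())
    return {"mean": mean(values), "min": min(values), "max": max(values)}


if __name__ == "__main__":
    # Placeholder temperature changes (degC) for one site, period, and RCP.
    delta_tmax = {
        "CNRM-CM5": 2.1, "CSIRO-Mk3-6-0": 2.8, "GISS-E2-R": 1.7,
        "HadGEM2-ES": 3.2, "INMCM4": 1.5, "MIROC-ESM": 3.0,
        "MPI-ESM_LR": 2.4,
    }
    print(ensemble_summary(delta_tmax))
```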

To test whether gridded historical climate data can replace measured data as the baseline for future climate projections, projections based on measured weather data were compared with those based on the climatology provided by the best alternative method. Only the nine sites with less than 10% of missing data on air temperature and rainfall were considered, as presented in Table S1 of Supplementary Materials.

3. Results

3.1. Filling Gaps in Measured Weather Data
3.1.1. Solar Radiation Models

Table 2 presents the average daily annual coefficients of the temperature-based solar radiation models for all Brazilian locations assessed. The Ha model displayed adjusted coefficients varying from 0.10°C−0.5 to 0.18°C−0.5, differing from the original values of 0.16°C−0.5 and 0.19°C−0.5 obtained by Hargreaves and Samani [50] for continental and coastal regions, respectively. The adjusted b coefficient for the Hu model ranged from 0.04 to 0.22. However, the c coefficient of this model showed quite distinct values, ranging from −7.70 to 9.98. The coefficients e of the BC model and h of the DC model were similar, ranging from 0.75 to 0.77 in both models; however, the coefficients f and g of the BC model were smaller than the coefficients i and j of the DC model: f and g were, on average, 0.03 and 1.63, whereas i and j were 0.07 and 2.24.

Statistical indices for each temperature-based model assessed are presented in Figure 3. For more detailed results, see Tables S2 and S3 of Supplementary Materials. As presented in Figure 3, r2 for the BC model ranges between 0.32 and 0.79, with a mean value of 0.62. For the DC model, r2 values range from 0.26 to 0.76, with an average value of 0.59.

The estimated solar radiation values presented d between 0.44 and 0.93 for the Ha and Hu models and from 0.55 to 0.92 for the An model, with a mean value of 0.79, for all of them. For the BC and DC models, this index ranged from 0.62 to 0.93 and from 0.60 to 0.93, respectively, with average values of 0.86 and 0.85 (Figure 3; Tables S2 and S3).

The confidence index (c) ranged from 0.31 to 0.81, with an average of 0.61 for the Ha and An models, and from 0.25 to 0.82 for the Hu model, with an average of 0.62 (Figure 3). For the BC model, c ranged from 0.35 to 0.82, while for the DC model, c ranged from 0.32 to 0.80, with an average of 0.68 and 0.66, respectively. Considering the average values for all sites, the models of Ha, Hu, and An presented performances classified as “median,” whereas the performances of BC and DC models were classified as “good,” according to the Camargo and Sentelhas [36] classification.

3.1.2. Gridded Database

Table 3 presents the performance of the different daily gridded databases used to fill the gaps in the historical weather series. All databases showed high accuracy (d ≥ 0.89) for maximum air temperature (Tmax), with XAVIER also showing very high precision (r2 = 1). Except for AgCFSR, all models underestimated Tmax. Among all databases, XAVIER was the best one for estimating Tmax, with MAE = 0.17°C, whereas NASA/POWER presented the highest MAE of 2.46°C.

All databases showed high accuracy (d ≥ 0.93) and good precision (r2 ≥ 0.77) for minimum air temperature (Tmin). As for Tmax, XAVIER showed the best performance, with the lowest Bias (0.06°C) and MAE (0.30°C). On the contrary, NASA/POWER presented the worst performance, with Bias = 0.76°C and MAE = 1.74°C. AgCFSR and AgMERRA presented similar Bias and MAE, as well as similar c indices of 0.84 and 0.86, respectively (Table 3).

For global solar radiation (Qg), NASA/POWER and XAVIER presented the best performance, with the latter presenting the highest accuracy (d = 0.97) and precision (r2 = 0.94), resulting in a c index of 0.94, classified as great [36]. NASA/POWER showed r2 = 0.76 and d = 0.93. All databases underestimated Qg, with Bias ranging from −0.58 to −1.32 MJ·m−2·d−1. In terms of MAE, XAVIER was the database with the best performance, with MAE = 1.57 MJ·m−2·d−1.

For the rainfall (Rain), AgCFSR, AgMERRA, and NASA/POWER showed poor performance with r2 ≤ 0.25, d ≤ 0.67, c ≤ 0.33, and MAE ≥ 4.48 mm·d−1. On the contrary, XAVIER presented good precision (r2 = 0.88) and high accuracy (d = 0.96), resulting in an optimum performance (c = 0.90), with a slight underestimation tendency (Bias = −0.10 mm·d−1) and the lowest error magnitude (MAE = 1.51 mm·d−1).

XAVIER also presented the best performance for estimating relative humidity (RH), with high precision (r2 = 0.90) and accuracy (d = 0.97) and small errors (Bias = 0.18% and MAE = 3.76%), whereas the other systems underestimated RH, with MAE higher than 11%.

Despite the poor performance of all databases for estimating wind speed (WS2m), XAVIER displayed the best statistical indices, with r2 = 0.47, d = 0.79, and c = 0.54, and the smallest error, with MAE = 0.49 m·s−1, which, however, is still classified as suffering according to Camargo and Sentelhas [36].

3.2. Climate Change Projections

Based on the historical measured weather data gap-filled with the XAVIER gridded database, ensemble climate change projections were performed for the RCP4.5 and RCP8.5 emission scenarios at 31 sites, relative to the 1980–2009 baseline, for the mid- and end-of-century periods. Annual maximum and minimum temperatures showed an increasing tendency, while for rainfall, the South region will mostly experience annual increases, and the North and Northeast regions will experience decreases, as presented in Figures 4–6. More details can be found in Tables S4 and S5 of Supplementary Materials.

Annual average changes, for all 31 sites, of maximum temperature showed increases in medium- and long-term projections of 2.01 and 2.52°C for RCP4.5 and 2.70 and 4.61°C for RCP8.5, while for minimum temperature, the increases will be of 1.79 and 2.25°C for RCP4.5 and 2.56 and 4.45°C for RCP8.5 (Table 4). Under the same emission scenarios and future projected periods, higher increases will occur for maximum than for minimum temperatures. As expected, increases under the RCP8.5 scenario will be higher than those under RCP4.5. However, such increases are much more pronounced in the long-term projections, with the mean increase achieved between 2.39 and 4.48°C, under intermediate and high emission scenarios.

Rainfall projections for the 31 sites showed decreases of 6.18 and 6.68% for RCP4.5 and of 4.34 and 8.62% for RCP8.5 for the medium- and long-term projections, respectively (Table 4); however, these changes must be analyzed carefully, since rainfall is a variable of high spatial variability and with distinct distribution patterns over the country.

The monthly climate changes projected for all 31 sites for the RCP8.5 scenario in the long term (2070–2099) are presented in Figure 7. Temperature changes will vary between 2 and 7°C for Tmax (Figure 7(a)) and between 2 and 5.5°C for Tmin (Figure 7(b)). For both variables, the highest temperature increases will occur in the second half of the year, mainly in October. Therefore, as shown before, higher temperatures are expected in future climate projections, with increases that will persist in every month [13, 22]. Rainfall reduction, especially in the North and Northeast regions, will occur mainly from August to October, which coincides with the dry season and the period of higher temperatures.

Analyzing the future climate projections, by comparing the observed and XAVIER gridded database as a reference for climatology, the projected annual average of maximum and minimum temperature and rainfall was similar, with about the same variability for both databases (Figure 8). For air temperature projections, based on the observed and gridded climatology, the differences were not greater than 0.06 and 0.08°C, respectively, for maximum and minimum temperatures, in both emission scenarios and future periods considered. Similarly, for rainfall, the differences between the two databases did not exceed 1%, considering all scenarios and periods.

4. Discussion

4.1. Filling Gaps in Measured Weather Data
4.1.1. Solar Radiation Models

In general, the temperature-based models for estimating Qg presented very similar performance after their calibration for 31 sites in Brazil (Figure 3). However, the models based on three coefficients, BC and DC, performed slightly better, raising the general confidence index c above 0.6 for most simulations. As this is the first attempt to calibrate these models considering several locations around the country, the calibrated coefficients (a for Ha; b and c for Hu; d for An; e, f, and g for BC; and h, i, and j for DC) were quite different from those obtained by other authors for specific locations or locations within the same state, such as those presented by Barbosa et al. [51] for the state of Minas Gerais (MG), by Conceição and Marin [52] in the northwest of the state of São Paulo, and by Massignam [53] in the state of Santa Catarina. Also, the performances of these models when considering several locations spread across the country were slightly worse than those reported for specific locations [51–53], which is mainly caused by the greater Qg variability observed around the country, with the different atmospheric transmissivity caused by diverse cloud types.

Despite the differences in performance reported above, the present study confirmed that BC and DC are the best temperature-based methods for estimating Qg. The performance of these methods, however, can vary according to the region and the season of the year, as reported by Rivington et al. [54]. In this study, the best Qg estimates were found in Southern and Southeastern Brazil, where there seems to be a better correlation between cloudiness and daily thermal amplitude. In these regions, the confidence index was classified between good and very good, as can be seen in Tables S2 and S3 of Supplementary Materials.

4.1.2. Gridded Database

The gridded data provided by different sources presented distinct performances for simulating weather conditions and variability in different parts of Brazil. For Tmax and Tmin, as well as for Qg, the four systems assessed presented good to great performance, according to the classification of Camargo and Sentelhas [36], with r2 ≥ 0.64, d ≥ 0.88, and c index always above 0.71. In general, XAVIER was the system that presented the best performance for these three variables, with c always above 0.90. On the contrary, for Rain, RH, and WS2m, the performances were quite variable, with AgCFSR, AgMERRA, and NASA/POWER presenting the worst estimates, with c equal to or below 0.33, 0.61, and 0.21, respectively, whereas XAVIER presented great performance for Rain (c = 0.90) and RH (c = 0.92). For WS2m, XAVIER also performed better than the other sources, although with lower indices than for the other weather variables (r2 = 0.47, d = 0.79, and c = 0.54).

Similar results were found by Monteiro et al. [55] and by Battisti et al. [5] when using NASA/POWER, XAVIER, and AgMERRA gridded databases in several Brazilian locations. Despite the similar performances observed by these authors regarding the gridded data they used, both of them concluded that the differences between observed and gridded data were not enough to lead to significant differences for estimating the potential yield of sugarcane [55] and soybean [5]. However, when simulating the attainable yield, which depends on the rainfall, Monteiro et al. [55] found that the use of observed data improved the estimates substantially, since NASA/POWER did not represent rainfall spatial and temporal variability very well, as also observed in the present study (Table 3). Following the same strategy, Battisti et al. [5] also observed that the use of rainfall data from AgMERRA did not provide reliable results of the soybean attainable yield, whereas XAVIER data did.

Regarding rainfall data, the major limitation for their spatial interpolation based on satellite data, as done by AgCFSR, AgMERRA, and NASA/POWER, is the low or inadequate resolution of the images which is not good enough to capture extreme events [56, 57] and local spatial variability associated with the topography [58, 59]. Similarly, the poor performance of all databases to estimate WS2m is related to two main aspects: the small magnitude of this variable, which leads to large errors even with small deviations, and its high spatial variability associated with the topography and land cover [60]. Finally, the median to bad AgCFSR, AgMERRA, and NASA/POWER performance to estimate RH is related to the fact that the former two provide RH at the time of maximum daily temperature, which is not the daily average, which resulted in MAE between 14 and 17% in the assessed regions. NASA/POWER estimates RH based on similar procedures employed by AgCFSR and AgMERRA, which resulted in errors of similar magnitude, about 11%, very close to those reported by Stackhouse et al. [34] for several locations in the United States for a historical weather series of 31 years.

From the results presented in Table 3, the XAVIER gridded database was the best one to represent spatial and temporal weather data variability in Brazil, since it is based on data from ground stations from several sources. In addition, its high spatial resolution (0.25°) allows a reasonable characterization of the topography and land cover effects on surface weather variables, which are difficult to capture with satellite estimates, as done by AgCFSR, AgMERRA, and NASA/POWER.

4.2. Climate Change Projections

The temperature increases presented in this study are in line with the projections performed by Chou et al., Sánchez et al., Torres and Marengo, and Reboita et al. [21, 22, 61, 62]. For air temperature, Torres and Marengo [61] projected increases exceeding 2°C by the end of the present century in South America with more than 90% of probability, which was confirmed by our results (Figures 4 and 5; Table 4). For rainfall, decreases are expected in the northern part of the country, whereas in the center-southern part, rainfall increases will prevail; these results are comparable to those obtained by Sánchez et al. and PMBC [22, 49]. The rainfall reduction in Northern Brazil will occur mainly from August to October, which coincides with the dry season and with the period when high temperatures predominate. This combination leads to higher water deficits, increasing the risks for rainfed perennial crops as well as for annual and perennial irrigated crops by increasing the crop water demand and irrigation requirements [63, 64].

Comparing the future climate projections generated from the observed and XAVIER gridded databases, each taken as the historical basis for future climate projections, the results did not show any substantial difference in the projected scenarios of temperature and rainfall, which makes it possible to use the XAVIER database for studying the impacts of climate change on agriculture or any other human activity.

5. Conclusions

This study assessed the potential use of temperature-based solar radiation models and gridded databases as options to fill gaps in weather series and to project climate change scenarios in Brazil. Among the temperature-based solar radiation models, the one with the best performance was the BC model, which presented the lowest errors and highest precision and accuracy. Among the gridded data, the XAVIER database was the best one to represent observed weather series in Brazil, proving reliable both for filling gaps and for use as a reference in agricultural planning and agroclimatic risk studies for the present and future climates. Due to its outstanding performance, the XAVIER database can also be used for studies related to the impact of climate variability and climate change on other human activities in Brazil.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.


Acknowledgments

The authors would like to thank the Brazilian National Council for Scientific and Technological Development (CNPq) for the funds to support this project (Ph.D. scholarship for Fabiani Denise Bender and research fellowship for Paulo Cesar Sentelhas).

Supplementary Materials

The supplementary material gives additional information for this paper, with a detailed description of weather stations considered in this study, and the percentage of missing data for each weather variable (Table S1); a detailed statistical performance for Hargreaves (Ha), Hunt (Hu), Annandale (An), Bristow–Campbell (BC), and Donatelli–Campbell (DC) temperature-based models employed for estimating solar radiation (Tables S2 and S3); and the annual changes of maximum, minimum, and mean air temperature and rainfall projected by seven global climate models (GCMs), for mid-of-century (2040–2069) and end-of-century (2070–2099) periods, under intermediate (RCP4.5) and high (RCP8.5) emission scenarios, when compared to the historical climate (1980–2009), for 31 Brazilian sites (Tables S4 and S5).