Abstract

As a fundamental component in material and energy circulation, precipitation with high resolution and accuracy is of great significance for hydrological, meteorological, and ecological studies. Since satellite measured precipitation is often too coarse for practical applications, it is essential to develop spatial downscaling algorithms. In this study, we investigated two downscaling algorithms based on the Multiple Linear Regression (MLR) and the Geographically Weighted Regression (GWR), respectively. They were employed to downscale annual and monthly precipitation obtained from the Global Precipitation Measurement (GPM) Mission in Hengduan Mountains, Southwestern China, from 10 km × 10 km to 1 km × 1 km. Ground observations were then used to validate the accuracy of downscaled precipitation. The results showed that GWR performed much better than MLR to regress precipitation on Normalized Difference Vegetation Index (NDVI) and Digital Elevation Model (DEM); coefficients of GWR models showed strong spatial nonstationarity, but the spatial mean standardized coefficients were very similar to standardized coefficients of MLR in terms of intra-annual patterns: generally NDVI was positively related to precipitation when monthly precipitation was under 166 mm; DEM was negatively related to precipitation, especially in wet months like July and August; contribution of DEM to precipitation was greater than that of NDVI; residuals’ correction was indispensable for the MLR-based algorithm but should be removed from the GWR-based algorithm; the GWR-based algorithm rather than the MLR-based algorithm produced more accurate precipitation than original GPM precipitation. These results indicated that GWR is a promising method in satellite precipitation downscaling researches and needed to be further studied.

1. Introduction

As a fundamental component in material and energy circulation, precipitation is of great significance for hydrological, meteorological, and ecological studies (e.g., [13]). Traditionally, spatial distribution of precipitation is obtained through rain gauge data interpolation [4]. However, interpolation methods are greatly limited over mountainous regions due to the sparse rain gauge network [5, 6]. Satellite precipitation datasets based on remote sensing technology have developed rapidly since the 1980s, such as the Global Precipitation Climatology Project (GPCP, [7]), the Tropical rainfall Measuring Mission (TRMM, [8, 9]), the Global Satellite Mapping of Precipitation (GSMaP, [10]), and the Global Precipitation Measurement (GPM) mission [1113]. Gridded satellite precipitation datasets provide reliable estimations of precipitation reflecting more spatial distribution than rain gauge data. However, for regional scale applications, they are often too coarse to be used in hydrological, meteorological, or ecological studies [14, 15]. Taking TRMM as an example, its spatial resolution is only 0.25° × 0.25°. Therefore, it is essential to develop downscaling algorithms for satellite datasets to improve their resolutions as well as accuracy.

Relationships between precipitation and other environmental factors such as vegetation and topography have been widely investigated in the literature [1621]. Meanwhile, available resolutions of gridded data of these factors are often higher than those of satellite precipitation products. For instance, the SPOT Normalized Difference Vegetation Index (NDVI) is available at 1 km × 1 km resolution. Therefore, it is a feasible approach to downscale precipitation through establishing statistical models of precipitation and these factors, and this is termed as statistical downscaling algorithms. Their fundamental statistical theories could be classified into regression analysis [14, 15, 2224] and machine learning methods such as Artificial Neural Network [25, 26] and random forests [2729].

In this article, regression-based downscaling algorithms were mainly concerned and investigated. Immerzeel et al. [23] established exponential regression (ER) models between annual TRMM precipitation and NDVI at different spatial resolution from 0.25° to 1.50°, among which who with the optimal coefficient of determination () was eventually selected to downscale the TRMM precipitation from 0.25° to 1 km. Jia et al. [14] introduced both NDVI and Digital Elevation Model (DEM) as the explanatory variables in a Multiple Linear Regression (MLR) model. Independent station validation in the two studies and later studies [15, 22, 24, 30] showed that the downscaled precipitation based on ER or MLR after residual correction was comparable with the original TRMM product but at much improved resolution.

Nevertheless, there are several limitations of the above downscaling algorithms. Firstly, downscaled precipitation would inevitably inherent errors in the original satellite precipitation [14, 23]. This indicates that it is not likely to obtain accurate downscaled precipitation from inaccurate satellite datasets. Therefore, the next generation GPM product with higher quality and resolution was used as the original satellite precipitation to partly overcome this problem. As far as our information goes, most previous studies were based on TRMM, and downscaling algorithms based on GPM product were short of research. Secondly, performance of downscaling these algorithms relies on goodness-of-fit of regression models. The above two algorithms are both globally regression models which assume that the relationships among precipitation, NDVI, and DEM are spatially stationary. Against their assumption, however, precipitation-NDVI and precipitation-DEM relationships were both reported to be spatially nonstationary [17, 18, 21]. Brunsdont et al. [31] put forward a local regression model called Geographically Weighted Regression (GWR) to handle the problem of spatial nonstationary. GWR-based satellite precipitation downscaling algorithm was found to perform better than the two global-regression-based algorithms over various regions on TRMM product, at both annual scale and monthly scale [15, 30, 32].

This study aims to investigate the effectiveness of the MLR-based and GWR-based downscaling algorithms on GPM annual and monthly precipitation. To this end, NDVI and DEM were employed as the explanatory variables in the MLR-based and GWR-based downscaling algorithms to downscale GPM monthly and annual precipitation from 10 km × 10 km to 1 km × 1 km over Hengduan Mountains, Southwestern China. Their performances were validated with observed precipitation from 71 rain gauge stations.

2. Study Area and Datasets

2.1. Hengduan Mountains

Hengduan Mountains (24°40′–34°00′N; 96°20′–104°30′E) locates in the southwestern of China and the southeastern of Tibetan Plateau, with an area of ~500,000 km2. The study region possesses a complex topography with plenty of high mountains and deep-cutting gorges which run north to south. The climate of Hengduan Mountains is mainly controlled by southwest Asian monsoon, the southeast Asian monsoon, and the winter monsoon as well as local circulation of Tibetan Plateau. Meanwhile, due to its wide range of altitude from 300 m to over 7000 m, the climate also varies vertically. Mean annual precipitation of Hengduan Mountains is about 800 mm and more than 75% comes from the summer monsoon (May to October). To weaken the influence of edge effect of the downscaling algorithms, a 100 kilometers wide buffer area was added to the study area during calculation and validation.

2.2. Precipitation Datasets

The Global Precipitation Measurement (GPM) Mission, as a successor of the Tropical rainfall Measuring Mission (TRMM), aims to provide a new generation of global rainfall and snowfall observations. The Core Observatory satellite was launched in February 2014 by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace and Exploration Agency (JAXA) carrying the GPM Microwave Imager (GMI) and the Dual-frequency Precipitation Radar (DPR). This study selected the GPM-3IMERGM product (GPM for short, downloaded from https://pmm.nasa.gov/data-access/downloads/gpm) which estimated global monthly precipitation at resolution of 0.1° × 0.1° by combining GPM with several other satellite precipitation datasets. Since GPM dataset starts from March 2014, the first year with complete months (i.e., 2015) was therefore selected to investigate GPM annual precipitation downscaling algorithms. Annual precipitation (Figure 2(a)) was obtained by summing up monthly GPM precipitation in the year.

The ground observed precipitation of 71 rain gauge stations (Figure 1) located in Hengduan Mountains and the buffer area was provided by the National Meteorological Information Center of China Meteorological Administration. Since the function of ground data was to validate the results of downscaling algorithms independently, it was not used to correct GPM precipitation.

2.3. NDVI and DEM Datasets

The MOD13A3 monthly NDVI product with a spatial resolution of 1 km × 1 km derived from atmospherically corrected reflectance in the red and near-infrared wavebands of the Moderate Resolution Imaging Spectroradiometer (MODIS) on the Terra satellite was used in this study. Annual NDVI (Figure 2(b)) was obtained by averaging monthly NDVI in the year. The Shuttle Radar Topography Mission (STRM) DEM product was provided by the National Geospatial-Intelligence Agency (NGA) and the National Aeronautics and Space Administration (NASA). The original DEM data was at resolution of 90 m × 90 m and resampled to resolution of 1 km × 1 km (Figure 1) and 10 km × 10 km.

3. Method

3.1. Multiple Linear Regression

Jia et al. [14] proposed a downscaling algorithm based on Multiple Linear Regression (MLR), which is expressed aswhere , , and are the GPM annual precipitation, NDVI, and elevation at low resolution (LR). and are slope of precipitation to NDVI and precipitation to DEM, respectively. is the intercept, and is residual of the regression model. The three coefficients are calibrated with the Ordinary Least Square (OLS) method. Standardized coefficients could be calibrated if the variables are standardized with (2) in advance.

3.2. Geographically Weighted Regression

Geographically Weighted Regression (GWR) proposed by Brunsdont et al. [31] is a local regression method that can be used to investigate spatially nonstationary correlational relationships. Contrast to the MLR model, the GWR model assumes the coefficients to vary with geographical locations, and it is expressed aswhere is the geographical location. The coefficients are as same as those in the MLR method, except that they are local rather than global coefficients and can be estimated by where is matrix of the explanatory variable at low resolution composed of , , and vector of constant one and is transposition of . is the weight matrix that put more weights on observations closer to the center point and can be calculated with kernel functions like Gaussian kernel and bisquare kernel. Gaussian kernel (Figure 3) was adopted in this study, and is the diagonal matrix of where is the Euclidean distance between the th observation and the center point and is the kernel bandwidth that can be estimated with the cross validation (CV) technique.

3.3. Procedure of Downscaling Algorithms

Step 1 (data preparation). Resample annual or monthly NDVI and DEM fields to the low (10 km × 10 km) and high (1 km × 1 km) resolution. And resample the GPM annual precipitation field to 10 km × 10 km resolution. It should be noted that all the fields must be projected to the same projection coordinate system (China Albers Equal Area Conic), and fields at the same resolution should share the same raster.

Step 2 (model establishment). Using sample data at low resolution to calibrate the coefficients of regression functions in the downscaling algorithms. For GWR-based algorithm, the bandwidth should be estimated with cross validation (CV) technique before calibration of the coefficients.

Step 3. Prediction of high resolution precipitation () before residuals correction: put high resolution NDVI () and DEM () into the MLR models and GWR models, respectively, to predict . The downscaled results before residuals correction were termed as MLRBRC and GWRBRC, respectively, in this study.

Step 4 (residuals correction). Interpolate low resolution residuals () of the regression methods to high resolution residuals () with simple spline tension interpolator, respectively. The final downscaled precipitation () is calculated asAnd the downscaled results after residuals corrections were termed as MLRARC and GWRARC, respectively.

3.4. Validation Criteria

Observed precipitation from rain gauges was used to validate the precipitation downscaled by the two different algorithms. In the validation, we employed three widely used criteria, namely, the Mean absolute error (MAE), the Root Mean Square Error (RMSE), and the coefficient of determination (), which are calculated aswhere is precipitation predicted by regression models or downscaling algorithms and is observed precipitation from rain gauge stations.

4. Results and Discussion

4.1. Results of Regression Models
4.1.1. Results of MLR Models

Multiple linear regression (MLR) models were established to explore the global relationships of annual and monthly GPM precipitation against NDVI and DEM in Hengduan Mountains at lower resolution of 10 km. The calibrated coefficients including the constant coefficients, slopes of NDVI and DEM, and the corresponding coefficients of determination and values were shown in Table 1. First of all, values of all the MLR models were smaller than 0.001, which meant all the MLR models were extremely significant and therefore to be valid. However, values of February to June were relatively larger, and the corresponding was smaller than 0.06, indicating that no more than 6% of variances of GPM precipitation could be interpreted by MLR models from February to June. Among the other months, July had the highest of 0.39, and annual was 0.35. For annual GPM precipitation, slope of NDVI was 390.93 mm/1, which meant as NDVI increased per unit, annual precipitation would increase 390.93 mm averagely. Similarly, slope of DEM was −124.06 mm/km, which meant as DEM increased per kilometer, annual precipitation would decrease 124.06 mm averagely. For monthly precipitation, slopes of NDVI ranged from −64.37 to 75.79 mm/1 and slopes of DEM ranged from −49.16 to 0.01 mm/km.

We further calculated standardized regression coefficients of the MLR models to investigate the intra-annual variation of standardized coefficients and compare the relative contributions of NDVI and DEM. As shown in Figure 4, for monthly precipitation, standardized slopes of NDVI changed periodically for three times in the year. The first cycle was from January to June, the second was from June to October, and the third was from October to January. The peak values were positive and the valley values were slightly negative, except that the valley value reached −0.12 in August, when monthly precipitation reached its peak of 195.22 mm in the year. The reasons for such the complicated periodicity might be the combined effect of the following facts: Hengduan Mountains is controlled by southwest Asian monsoon, the southeast Asian monsoon, the winter monsoon, and local circulation of Tibetan Plateau in turn and/or simultaneously; there are time lags between vegetation growth withering and precipitation increase/decrease; vegetation is restricted by both water and temperature during dry season, but much less restricted by water than temperature during flood season. Standardized slopes of DEM were negative and smaller than −0.2 all the year except for February, May, and June.

The relative contributions of NDVI and DEM could be estimated through comparing absolute values of their standardized coefficients. For annual precipitation, standardized slopes of NDVI were 0.20, which meant as NDVI increased per standardized unit, annual precipitation would increase 0.20 standardized units averagely. Meanwhile, standardized slopes of DEM were −0.44, which meant as DEM increased per standardized unit, annual precipitation would decrease 0.44 standardized units averagely. The relative contribution to annual precipitation of DEM was about twice of that of NDVI. For monthly precipitation, contribution of DEM was also larger than that of NDVI all the year except for January, February, and June.

Finally, we investigated relationships between standardized coefficients of NDVI and DEM in MLR models against regionally averaged GPM monthly precipitation (Figure 5). Both insignificant (p ~= 0.20) decreasing trends were found for NDVI and DEM. Standardized coefficients of NDVI trended from positive to zero, when monthly precipitation increased below 166 mm, and trended from zero to negative when monthly precipitation increased over 166 mm. To a certain degree, this feature precipitation of 166 mm could be regarded as the water saturation point of Hengduan Mountains, if neglecting time lags between NDVI and precipitation. Standardized coefficients of DEM trended more negative as monthly precipitation increased, indicating that the stronger southwest and southeast monsoons [33] were, the stronger relationships between precipitation and DEM were.

4.1.2. Results of GWR Models

Geographically Weighted Regression (MLR) models were established to explore the local relationships of annual and monthly GPM precipitation against NDVI and DEM in Hengduan Mountains at lower resolution of 10 km. As Figure 6 showed, of GWR models ranged from 0.82 to 0.98, except that of November was 0.65, which were much greater than those of MLR models. Obviously, GWR models were much more effective on explaining the original GPM annual and monthly precipitation than MLR models. Meantime, the intra-annual variation pattern of of GWR was very similar to that of MLR, indicating that the local relationships of precipitation against NDVI and DEM were positively related to the global ones within a year.

Unlike the MLR model, calibrated coefficients in GWR models were not spatially stationary. As an example, Figure 7 showed the spatial distribution of coefficients of the GWR model for annual GPM precipitation in Hengduan Mountains. The intercepts varied from −1666 to 4106 mm, and the spatial pattern was similar to that of annual GPM precipitation with a of 0.56. The coefficients of NDVI varied from −1674 to 1581 mm/1, with no obvious spatial pattern. The coefficients of DEM varied from −657 to 435 mm/km and were slightly positively related to elevation ( = 0.097).

We also calculated standardized coefficients of NDVI and DEM when establishing the GWR model of annual GPM precipitation. Like coefficients of determination, mean standardized coefficients of NDVI and DEM of the GWR models (Figure 8) showed very similar intra-annual pattern with those of the MLR models. It revealed that although coefficients in the GWR models were estimated locally, but their spatial distribution could reflect the global relationships of precipitation against NDVI and DEM. However, it should be noted that the standard deviations of standardized coefficients were quite large, emphasizing the great difference between local coefficients in the GWR model and global coefficients in the MLR model.

4.2. Downscaled Precipitation and Ground Validation
4.2.1. Downscaled Precipitation before/after Residuals Correction

Monthly and annual precipitation at high resolution before residual correction (MLRBRC and GWRBRC) were predicted by putting the calibrated coefficients, high resolution NDVI, and DEM into the MLR and GWR models, respectively. After adding the high resolution residuals interpolated with spline tension method to MLRBRC and GWRBRC, MLRARC and GWRARC were eventually obtained. As an example, Figure 9 showed the downscaled annual precipitation before residuals correction, interpolated residuals, and the downscaled annual precipitation after residuals correction, from which a qualitative comparison of downscaled annual precipitation of the MLR-based and GWR-based algorithms was made as below.

Basically, MLRBRC annual precipitation was only able to capture the global trend that “south wetter than north” rather than the spatial distribution of original GPM, and its absolute residuals were very large, especially in extremely wet regions like the Gaoligong Mountains and middle and lower reaches of the Nu River. While GWRBRC annual precipitation distributed very similar with original GPM spatially, the former was a little smoother. Meanwhile, the range of GWRBRC (269–1703 mm) was a little wider than that of original GPM (394–1625 mm), which was reasonable for the averaging effecting of low resolution precipitation. Absolute residuals of GWRBRC were also much smaller than those of MLRBRC in most regions. As for MLRARC and GWRARC annual precipitation, they were also very similar with the original GPM like GWRBRC, so the differences among them were analyzed based on ground validation in the next two sections.

4.2.2. Is Residuals Correction Helpful?

Observations from 71 rain gauge stations within Hengduan Mountains (including buffer area) were used to validate the accuracy of the MLR-based and GWR-based downscaling algorithms using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Coefficient of determination ().

To investigate whether residuals correction was helpful to MLR-based and GWR-based downscaling algorithms, we compared the accuracy of monthly (Figure 10) and annual (Table 2) precipitation downscaled by MLR-based and GWR-based algorithms between before residuals correction (termed as MLRBRC or GWRBRC) and after residuals correction (termed as MLRARC or GWRARC). Average accuracy of monthly downscaled precipitation was also summarized in Table 2.

Compared to MLRBRC, MLRARC had smaller RMSE and MAE as well as higher in each month of the year. For monthly precipitation, averagely speaking, , , and . And for annual precipitation in 2015, the changes were , , and . Obviously, the step of residuals correction improved the accuracy of precipitation predicted by MLR significantly and hence was indispensable for MLR-based algorithm.

However, GWRARC had slightly greater RMSE and MAE as well as lower compared to GWRBRC in each month except March of the year. For monthly precipitation, averagely speaking, , , and . And for annual precipitation in 2015, the changes were, , , and . This indicated that the step of residuals correction degraded the accuracy of precipitation predicted by the GWR models. Therefore, the step of residuals correction was recommended to be removed from the procedure of the GWR-based downscaling algorithm and as a result, GWRBRC rather than GWRARC was adopted as the final downscaled precipitation.

4.2.3. Accuracy of the MLR-Based and GWR-Based Downscaling Algorithms

In Section 4.2.2, we concluded that MLRARC and GWRBRC should be adopted as the final results of MLR-based and GWR-based algorithms. In this section we compared the accuracy of MLRARC, GWRBRC, and the original GPM monthly and annual precipitation. The accuracy of monthly precipitation was presented in Figure 11 and those on annual precipitation and summary of average accuracy of monthly precipitation were shown in Table 2.

MLRARC performed better than original GPM monthly precipitation in 7 out of 12 months with RMSE as evaluation criterion and 6 out of 12 months with MAE or as evaluation criteria. Averagely speaking, MLRARC had almost the same accuracy as GPM monthly precipitation, with , , and , indicating that it was no better than the Nearest Neighborhood (NN) method, which is the simplest interpolation method. Therefore, the effectiveness of MLR-based downscaling algorithm on monthly precipitation over Hengduan Mountains was unsatisfactory. For annual precipitation, performance of MLRARC was better than that of GPM dataset, with , , and .

GWRBRC performed better than original GPM monthly precipitation in 11 out of 12 months with RMSE or as evaluation criteria and 10 out of 12 months with MAE as evaluation criterion. Averagely speaking, GWRBRC had much better accuracy than GPM monthly precipitation, with , , and . For annual precipitation, performance of MLRARC was also much better than that of GPM dataset, with , , and . The significant improvement should be attributed to that the complicated relationships among precipitation, vegetation and topography were better interpreted by local regression methods considering spatially heterogeneity of regression coefficients than global regression methods.

5. Conclusion

Precipitation dataset with high resolution and accuracy is crucial for regional hydrological researches. Satellite precipitation could be downscaled based on statistical relationships among precipitation, vegetation, and topography. In this study, to downscale GPM annual and monthly precipitation over Hengduan Mountains from 10 km × 10 km to 1 km × 1 km, the Multiple Linear Regression (MLR) based algorithm and the Geographically Weighted Regression (GWR) based algorithm were employed. Downscaled precipitation was validated with ground observations from 71 rain gauge stations. After analyzing the results, it could be concluded that one has the following:

Regression model is a fundamental component of statistical downscaling algorithms; hence it is important to establish an appropriate regression model. Results of this study showed that the GWR model was much more suitable than the MLR model to describe the complicate relationships among precipitation, vegetation, and topography in Hengduan Mountains, with much higher and lower RMSE and MAE against the original GPM annual and monthly precipitation.

After analyzing the coefficients and standardized coefficients of the MLR models, we found that generally NDVI was positively related to precipitation when monthly precipitation was under 160 mm; DEM was negatively related to precipitation, especially in wet months like July and August; contribution of DEM to precipitation was greater than that of NDVI. Spatial mean standardized coefficients of GWR showed very similar intra-annual patterns with those of MLR but had strong spatial nonstationarity.

Residuals correction greatly improved accuracy of the MLR model but reduced those of GWR model. Therefore it is an indispensable step for the MLR-based downscaling algorithm but not recommended for the GWR-based algorithm.

Validation results with ground observations showed that, compared with the original GPM data, downscaled precipitation (before residual correction) based on the GWR-based algorithm not only possessed finer resolution but also higher accuracy, indicating it is a promising algorithm for satellite precipitation downscaling. Meanwhile, accuracy of precipitation downscaled by the MLR-based algorithm was only about the same with the original GPM precipitation.

Satellite precipitation downscaling might be improved through introducing more appropriate explanatory variables like the Enhanced Vegetation Index (EVI) and Land Surface Temperature (LST). Similarly, considering temporally correlation and autocorrelation in regression models might also be effective. Moreover, downscaling of daily and hourly precipitation is more significant for practical application and should be paid more effort.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was financially supported by the Key Research and Development Program of China (2015CB452701) and the National Natural Science Foundation of China (41571019).