#### Abstract

The common approach to quantifying precipitation forecast uncertainty is ensemble simulation, in which a numerical weather prediction (NWP) model is run several times with slightly different initial conditions. In practice, the spread of ensemble members in terms of flood discharge is used as a measure of the forecast uncertainty caused by uncertain precipitation forecasts. This study presents the propagation of rainfall forecast uncertainty into hydrological response as a function of catchment scale, using distributed rainfall-runoff modeling driven by the ensemble rainfall forecasts of an NWP model. First, forecast rainfall error based on the BIAS is compared with flood forecast error to assess error propagation. Second, the variability of flood forecast uncertainty with catchment scale is discussed using ensemble spread. Third, flood forecast uncertainty with catchment scale is assessed using a regression equation estimated between ensemble rainfall BIAS and discharge BIAS. Finally, flood forecast uncertainty is discussed in terms of RMSE using specific discharge at catchment scale. The study is carried out and verified using the largest flood event, caused by Typhoon “Talas” in 2011, over the 33 subcatchments of the Shingu river basin (2,360 km^{2}), which is located in the Kii Peninsula, Japan.

#### 1. Introduction

Recent advances in weather measurement and forecasting have created opportunities to improve streamflow forecasts. High-resolution numerical weather prediction (NWP) data can be fed directly into streamflow forecast systems to obtain an extended lead time. The accuracy of weather forecasts has steadily improved over the years, but recent studies have shown that applying NWP model outputs directly in the hydrological domain can introduce considerable bias and uncertainty, which then propagate through the hydrological simulation [1, 2].

One of the biggest sources of uncertainty in streamflow forecasting is the forecasted rainfall. The grid size in NWP models is often larger than the subcatchment size in hydrological models, so the forecast rainfall data are not at the resolution required for flood forecasting. In addition, even small errors in the location of weather systems in NWP models may make the forecast rainfall for the catchment concerned significantly wrong [3, 4]. These biases and uncertainties in the rainfall forecast may be amplified when cascaded through the hydrological system, so that small uncertainties in the rainfall forecast translate into larger errors in the flood forecast. As an example, Komma et al. [5] showed that an uncertainty range of 70% in terms of NWP rainfall translated into an uncertainty range of 200% in terms of runoff for a lead time of 48 hours. They attributed this to the nonlinearity of the catchment response, but uncertainties in the forecast rainfall and in the parameters and structure of the hydrologic model may all contribute to the amplification of uncertainty in flood forecasting. Xuan et al. [6] also highlighted that although the QPF from an NWP model could generally capture the rainfall pattern, the rainfall uncertainties from the scale of the model grid to that of the catchment were always significant.

It is difficult to understand the full range and interaction of uncertainties in flood forecasting. The different types of uncertainty vary with the lead time of the forecasts, the magnitude of the event, and the catchment characteristics. Vivoni et al. [7] addressed the propagation of radar rainfall nowcasting errors to flood forecasts in the context of distributed hydrological simulations. However, they used radar rainfall measurements to quantify how nowcasting errors in the flood forecast increase with lead time, whereas our approach feeds ensemble NWP rainfall into the flood forecasts to assess error and uncertainty propagation with catchment scale. Moreover, the variability of runoff predictions caused by rainfall uncertainty differs between case studies, and thus no general trend is apparent [8–10]. This study is carried out under the assumption that model parameter and structure errors do not contribute to the uncertainty of flood forecasting, so that the analysis can focus on forecast rainfall error. Under this assumption, a distributed hydrologic model is considered an appropriate tool to assess rainfall forecast quality and to understand how uncertainty in the rainfall forecast field may propagate throughout the watershed. Further, integrating the rainfall forecast into runoff simulation at multiple locations in a catchment allows the effects of catchment scale on the propagation of rainfall forecast uncertainty into streamflow forecasting to be investigated.

The main objective of this study is to assess the propagation of error and uncertainty from NWP rainfall into hydrological response, as a function of catchment scale, through a distributed hydrologic model. In the context of flood forecasts, it is important to assess forecast rainfall uncertainty in terms of its effect on runoff. Uncertainties as a function of spatial scale are also important because they inform real-time flood forecasts and estimates of the possible inflow to a reservoir, including whether it may exceed reservoir capacity, so that the water volume to be released can be optimized. Therefore, the coupled use of NWP rainfall output and hydrologic flood forecasting requires an assessment of uncertainty through hydrological response.

The research question is as follows: How does ensemble NWP rainfall error translate into flood forecasting, and how does flood forecast uncertainty propagate as a function of catchment scale? To our knowledge, previous research has examined the direct propagation of rainfall uncertainty into the hydrological domain, but the spatial scale dependency of the propagation of ensemble NWP rainfall uncertainty into hydrological predictions has not been addressed. First, we compared forecast rainfall error based on the BIAS, which is used to measure error amplification, with the flood forecast error driven by ensemble NWP forecast outputs to assess error propagation. Second, we discussed the variability of flood forecast uncertainty with catchment scale using the ensemble spread obtained by driving a distributed hydrologic model with ensemble NWP rainfall. We also assessed flood forecast uncertainty with catchment scale under the condition that ensemble NWP rainfall has no bias relative to observed radar rainfall, using a regression equation estimated between ensemble NWP rainfall BIAS and discharge BIAS. Finally, we assessed flood forecast uncertainty with RMSE using specific discharge at catchment scale. Note that we focused not only on the quantitative error propagation of the rainfall forecast into the flood forecast but also on the variability of flood forecast uncertainty with catchment scale.

This paper is organized as follows. Section 2 introduces the design of the meteorological experiment for the Typhoon Talas event and describes the target area and the hydrologic model, and Section 3 presents the results on the propagation of NWP rainfall forecast uncertainty into the flood forecast with catchment scale. Finally, we summarize our major conclusions in Section 4.

#### 2. Data and Methodology

##### 2.1. Meteorological Data

In Japan, an operational one-week ensemble prediction model from JMA was developed to provide probabilistic information from 51 ensemble members with a horizontal resolution of 60 km, and it used to be applied for hydrological applications (e.g., prior and optimized release discharge for dam operation) [11]. However, operational short-term (1-2 days) ensemble prediction with much finer resolution has not yet been developed. For that reason, the Meteorological Research Institute (MRI) of JMA has studied ensemble forecast systems for the 2011 Typhoon Talas event that are composed of 11 members (1 unperturbed and 10 perturbed members) with horizontal resolutions of 10 km and 2 km, the latter nested inside the former with a 6-hour lag.

Both the 10 km and 2 km resolution systems used the JMA Nonhydrostatic Model (NHM) as the forecast model [12, 13]. Whereas the 10 km resolution forecast adopted the cloud microphysical process and the Kain-Fritsch convective scheme, the 2 km resolution forecast did not use a convective scheme because of its cloud-resolving resolution. The coarse resolution system of 10 km had a domain of grid points with 50 vertical levels and forecasted up to 36 hours in advance. For initial and lateral boundary conditions, the 10 km system used the analysis from the JMA nonhydrostatic 4DVAR (JNoVA) data assimilation system [14] and the forecasts of JMA’s high-resolution (TL959L60) global spectral model (GSM). The control run (cntl) is the forecast with a nonperturbed analysis, and the 10 perturbed forecasts were generated from JMA’s 1-week global EPS (WEP) for the initial and boundary perturbations. The fine-resolution 2 km system was run as a downscaled forecast of the 10 km resolution system. This system had a domain of 350 × 350 grid points with 60 vertical levels and forecasted up to 30 hours in advance. The domains of the two ensemble systems with 10 km and 2 km horizontal resolution are illustrated in Figure 1(a). The initial and boundary conditions for each member at 2 km were interpolated from the forecasts of the corresponding member at 10 km resolution with a 6-hour lag. The 10 km system started running at 21 JST every day, and the 2 km system began 6 hours later. Figure 1(b) shows a schematic of forecast runs with 10 km and 2 km resolution.

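**Figure 1:** (a) Domains of the 10 km and 2 km ensemble systems; (b) schematic of forecast runs with 10 km and 2 km resolution.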

In this study, we used the results of ensemble prediction with a 2 km horizontal resolution because of its high resolution and better predictability of weather phenomena. We used 4 sets of ensemble prediction outputs with a 30-hour forecast time to assess rainfall forecast uncertainty and to understand how uncertainty in the rainfall forecast may propagate throughout the watershed (Table 1). The ensemble NWP rainfall forecast in this study is verified spatially against the Ministry of Land, Infrastructure, Transport and Tourism (MLIT) C-band composite radar data (radius of quantitative observation range: 120 km, 1 km mesh, and 5 min resolution). Since the first C-band radar was installed in Japan in 1976, radars have gradually been installed throughout the country, and 26 C-band radars now cover and monitor rainfall over all of Japan. It is important to provide river and basin information rapidly to the relevant authorities and to the public in order to protect human life and property from disaster. The MLIT C-band radar provides a wide observation range and is a useful flood-management tool for large rivers when observing seasonal rain fronts or typhoons.

##### 2.2. Target Area and a Hydrologic Model

The Shingu river basin was selected as the target area to assess the propagation of rainfall forecast uncertainty into the streamflow forecast with spatial scale. The Shingu river basin is located in the Kii Peninsula of the Kinki area, Japan, and covers an area of 2,360 km^{2}. The average elevation of the study site is 644.6 m, and the slope is steep; this basin is a mountainous area. Five dams, Futatsuno, Kazeya, Komori, Nanairo, and Ikehara, are located upstream. The left and right sides of the Shingu river basin, the Totsukawa basin and the Kitayamakawa basin, respectively, have completely different characteristics. The elevation of Totsukawa is higher than that of Kitayamakawa, and Kitayamakawa has a lower level in the channel. We divided the Shingu river basin into 33 subcatchments from 54.24 to 2245 km^{2} (Figure 2, Table 2), including 6 gauged (5 dams and 1 gauge station) and 27 ungauged locations, for the assessment of the uncertainty of ensemble NWP rainfall in the flood forecast with catchment scale. First, we divided the basin into 6 subcatchments at the 5 dams and 1 gauge station, which have observed discharge data. Then we obtained the 33 subcatchments by considering the channel junctions of tributaries in the drainage network derived from a digital elevation model (DEM). Segond [9] classified catchments as small (<100 km^{2}), medium (100–2000 km^{2}), and large (>2000 km^{2}). However, the standard of catchment scale differs between case studies, and the Shingu river basin covers an area of only 2,360 km^{2}; thus, we classified the 33 subcatchments into 3 types, small (<200 km^{2}), medium (200–1000 km^{2}), and large (>1000 km^{2}), to evaluate the variability with catchment scale. We also divided catchment characteristics into 2 types, mountainous area (>800 m) and flat area (<800 m), based on the average elevation (800 m) of the 33 subcatchments.
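The classification above can be written as a small helper. This is a minimal sketch, not the authors' code: the function name is hypothetical, and the treatment of values lying exactly on a threshold (200 km^{2}, 1000 km^{2}, 800 m) is an assumption, since the text does not specify it.

```python
def classify_catchment(area_km2, mean_elevation_m):
    """Return (size_class, terrain_class) using the thresholds stated in the text.

    Assumption: boundary values (exactly 200 or 1000 km^2, exactly 800 m)
    are assigned to the medium and flat classes, respectively.
    """
    if area_km2 < 200:
        size = "small"
    elif area_km2 <= 1000:
        size = "medium"
    else:
        size = "large"
    terrain = "mountainous" if mean_elevation_m > 800 else "flat"
    return size, terrain


# Hypothetical elevations; the areas are the smallest and largest subcatchments.
print(classify_catchment(54.24, 900.0))
print(classify_catchment(2245.0, 600.0))
```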

We used a spatially distributed hydrologic model, based on the one-dimensional kinematic wave method for subsurface and surface flow (hereafter, KWMSS) with a conceptual stage-discharge relationship [15]. Figure 3 is a conceptualization of spatial flow movement and flow process in hillslope elements of KWMSS. The rainfall-runoff transformation conducted by KWMSS is based on the assumption that each hillslope element is covered with a permeable soil layer. This soil layer consists of a capillary layer and a noncapillary layer. In these conceptual soil layers, slow and quick flow are simulated as unsaturated Darcy flow and saturated Darcy flow, respectively, and overland flow occurs if water depth, $h$ [m], exceeds soil water capacity:

$$
q = \begin{cases}
v_c d_c \left( h / d_c \right)^{\beta}, & 0 \le h \le d_c, \\
v_c d_c + v_a (h - d_c), & d_c < h \le d_s, \\
v_c d_c + v_a (h - d_c) + \alpha (h - d_s)^{m}, & d_s < h,
\end{cases} \tag{1}
$$

$$
\frac{\partial h}{\partial t} + \frac{\partial q}{\partial x} = r(t), \tag{2}
$$

where $v_c = k_c i$ [m/s], $v_a = k_a i$ [m/s], $k_c = k_a / \beta$ [m/s], $\alpha = \sqrt{i}/n$ [m^{1/3}s^{−1}], $m = 5/3$, $i$ is the slope gradient, $k_c$ [m/s] is the hydraulic conductivity of the capillary soil layer, $k_a$ [m/s] is the hydraulic conductivity of the noncapillary soil layer, $n$ [m^{−1/3}s] is the roughness coefficient, $d_s$ [m] is the water depth corresponding to the water content, $d_c$ [m] is the water depth corresponding to maximum water content in the capillary pore, and $r(t)$ is the rainfall intensity. The flow rate of each hillslope element $q$ [m^{2}/s] is calculated by (1) and combined with the continuity equation for channel routing by (2). Many studies have applied this hydrologic model in a variety of hydrologic applications and have shown that this rainfall-runoff model is effective, robust, and flexible [16–18].
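To make the three flow regimes of the stage-discharge relationship concrete, the following is a minimal sketch under stated assumptions. It assumes the commonly cited KWMSS form, in which capillary flow scales with a power of the relative depth, noncapillary flow is linear in depth, and overland flow follows a Manning-type term with exponent 5/3. The function name, argument names, and parameter values are hypothetical and are not the calibrated values of Table 3.

```python
import math


def kwmss_discharge(h, slope, k_a, beta, n, d_c, d_s, m=5.0 / 3.0):
    """Unit-width discharge q [m^2/s] for water depth h [m].

    Assumed form: three regimes separated by the capillary depth d_c and the
    total soil depth d_s, with k_c = k_a / beta.
    """
    k_c = k_a / beta                 # capillary-layer conductivity [m/s]
    v_c = k_c * slope                # capillary flow velocity [m/s]
    v_a = k_a * slope                # noncapillary flow velocity [m/s]
    alpha = math.sqrt(slope) / n     # overland flow coefficient
    if h <= d_c:
        # Unsaturated flow in the capillary layer.
        return v_c * d_c * (h / d_c) ** beta
    # Saturated flow in the noncapillary layer.
    q = v_c * d_c + v_a * (h - d_c)
    if h > d_s:
        # Overland flow once the soil layer is full.
        q += alpha * (h - d_s) ** m
    return q


# Hypothetical parameter set for illustration only.
params = dict(slope=0.1, k_a=0.01, beta=4.0, n=0.3, d_c=0.2, d_s=0.5)
for depth in (0.1, 0.2, 0.5, 0.6):
    print(depth, kwmss_discharge(depth, **params))
```

The relationship is continuous at both depth thresholds, which is what lets the kinematic wave routing move smoothly between subsurface and surface flow.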

There were no observed discharge data in the subcatchments, except at the 5 dams and 1 gauge station. For that reason, the parameter optimization of the hydrologic model was conducted using the Ministry of Land, Infrastructure, Transport and Tourism (MLIT) C-band composite radar data, whose high spatial and temporal resolution captures the spatial variability of rainfall. However, despite the high resolution of the radar data, the soil parameters of the hydrological model remain uncertain because they cannot be observed directly (i.e., there is a discordance between soil properties and model parameters). Therefore, we assumed that the parameters of the hydrologic model in Table 3 are spatially homogeneous over the 33 subcatchments. The Shuffled Complex Evolution (SCE-UA) global optimization method [19] was used for the parameter optimization of the hydrologic model with MLIT composite radar rainfall to acquire the reference data of the 33 subcatchments. The SCE-UA, one of the computer-based automatic optimization algorithms, is a single-objective optimization method designed to handle the high parameter dimensionality encountered in the calibration of a nonlinear hydrologic simulation model. This scheme is based on three concepts: a combination of the simplex procedure with a controlled random search approach; competitive evolution; and complex shuffling. The integration of these steps makes the SCE-UA effective, robust, and flexible. In this study, the SCE-UA optimization method was modified to minimize the objective function between observed inflows and simulated results for all 5 dams and 1 gauge station at the same time (Equation (3)). The hydrologic model provides the discharge at each outlet of interest as its output variable, and our target is to find near-optimal parameter values for it. We selected the root mean square error (RMSE) as the objective function. Table 3 summarizes the optimized parameter values from the multisite calibration using the SCE-UA optimization method, and Figure 4 shows the calibration results obtained by minimizing the objective function over the 6 observation points. Observed radar data and the discharge simulated from them were used as reference data against which the ensemble NWP rainfall forecast and flood forecast were compared for the assessment of uncertainty propagation in the 33 subcatchments. Although the discharge simulated from observed radar rainfall does not exactly represent the true discharge, it is nevertheless set as the reference for comparison with the discharge from ensemble prediction data.
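The multisite calibration target can be sketched as follows. The text states only that an RMSE-based objective is minimized for all 6 gauged points at the same time; how the per-site values are aggregated is not recoverable here, so the plain sum used below is an assumption, and the function and site names are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between two equal-length series."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def multisite_objective(observed_by_site, simulated_by_site):
    """Aggregate RMSE over all gauged sites.

    Assumption: per-site RMSE values are simply summed. An optimizer such as
    SCE-UA would minimize this value over the model parameters.
    """
    return sum(
        rmse(observed_by_site[site], simulated_by_site[site])
        for site in observed_by_site
    )


# Hypothetical discharge series [m^3/s] at two sites.
obs = {"dam_A": [1.0, 2.0, 3.0], "gauge_B": [2.0, 2.0]}
sim = {"dam_A": [1.0, 2.0, 3.0], "gauge_B": [4.0, 4.0]}
print(multisite_objective(obs, sim))
```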

##### 2.3. Skill Score Descriptions

To evaluate the accuracy of the ensemble forecast in terms of areal rainfall intensity, we calculated two error indexes. The first is the normalized root mean square error (RMSE), which is normalized by the mean value of the observations during each forecast period (30 hours). The second is the log ratio bias, which is a relative error and provides information about the total amount of rainfall. A log ratio bias value of zero indicates a perfect forecast; positive and negative values indicate underestimated and overestimated forecasts, respectively:

$$
\text{Normalized RMSE} = \frac{\sqrt{\dfrac{1}{N} \sum_{t=1}^{N} (F_t - O_t)^2}}{\dfrac{1}{N} \sum_{t=1}^{N} O_t}, \qquad
\text{Log ratio bias} = \log \left( \frac{\sum_{t=1}^{N} O_t}{\sum_{t=1}^{N} F_t} \right), \tag{4}
$$

where $N$ is the forecast time (30 hours) in each period and $O_t$ and $F_t$ are the observed and forecasted rainfall at time $t$.
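The two indexes can be computed as in the sketch below. The sign convention of the log ratio bias is chosen to match the statement that positive values indicate underestimation, that is, the logarithm of total observed over total forecast rainfall. Both this orientation and the use of the natural logarithm are assumptions, and the function names and sample values are hypothetical.

```python
import math


def normalized_rmse(observed, forecast):
    """RMSE divided by the mean of the observations over the period."""
    n = len(observed)
    rmse = math.sqrt(sum((f - o) ** 2 for o, f in zip(observed, forecast)) / n)
    return rmse / (sum(observed) / n)


def log_ratio_bias(observed, forecast):
    """log(total observed / total forecast).

    Assumption: natural logarithm, oriented so that a positive value means
    the forecast underestimates the observed total.
    """
    return math.log(sum(observed) / sum(forecast))


obs = [2.0, 4.0, 6.0]    # hypothetical hourly areal rainfall [mm/h]
fcst = [1.0, 2.0, 3.0]   # forecast that is half of the observation
print(normalized_rmse(obs, fcst), log_ratio_bias(obs, fcst))
```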

For the spatial verification of ensemble NWP rainfall, the rainfall forecasts have been verified spatially against the MLIT C-band composite radar data. The ensemble forecast was expressed as probabilities of exceeding selected rainfall thresholds (1.0 and 5.0 mm/h). A contingency table can be constructed with a spatial comparison, in which each area with more than the selected rainfall threshold is defined as “yes,” and other areas are defined as “no” for both forecasted and observed rainfall fields. In this study, two indexes are considered for spatial verification of the ensemble forecast in the Kinki region (Figure 1). The first index is the critical success index (CSI), which is also called the “threat score”; its range is 0 to 1, with a value of 1 indicating a perfect forecast. It takes into account both false alarms and missed events. The second is the BIAS, which ranges from 0 to $\infty$. CSI and BIAS are given by

$$
\mathrm{CSI} = \frac{\text{hits}}{\text{hits} + \text{misses} + \text{false alarms}}, \qquad
\mathrm{BIAS} = \frac{\text{hits} + \text{false alarms}}{\text{hits} + \text{misses}}, \tag{5}
$$

where hits are the number of correct forecasts over the threshold (i.e., rainfall is forecast and also observed), and misses are the number of times rainfall is not forecast but is observed. False alarms are the number of times rainfall is forecast but is not observed.
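The contingency-table scores described above can be sketched as follows. The fields are flattened to paired lists of grid-cell values, and a cell counts as “yes” when its rainfall is strictly greater than the threshold, which follows the wording “more than the selected rainfall threshold”. The function names and sample values are hypothetical, and the sketch assumes at least one observed “yes” cell so that the denominators are nonzero.

```python
def contingency_counts(forecast, observed, threshold):
    """Count hits, misses, and false alarms over paired grid cells."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_yes = f > threshold
        o_yes = o > threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
    return hits, misses, false_alarms


def csi(hits, misses, false_alarms):
    """Critical success index (threat score), between 0 and 1."""
    return hits / (hits + misses + false_alarms)


def frequency_bias(hits, misses, false_alarms):
    """Ratio of forecast 'yes' cells to observed 'yes' cells."""
    return (hits + false_alarms) / (hits + misses)


fcst = [0.0, 2.0, 6.0, 0.5]   # hypothetical forecast rainfall [mm/h]
radar = [3.0, 2.0, 0.0, 0.0]  # hypothetical radar rainfall [mm/h]
counts = contingency_counts(fcst, radar, threshold=1.0)
print(counts, csi(*counts), frequency_bias(*counts))
```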

The rainfall forecast error of the ensemble outputs from the NWP model is compared with the flood forecast error driven by those rainfall forecasts to assess uncertainty propagation. It is important, however, to quantify uncertainty propagation from the rainfall forecast to the flood forecast using statistical measures that appropriately capture forecast deviations. For this reason, the BIAS was used to compare the mean conditions in the forecast and observation in terms of rainfall and flood forecast and to measure error amplification. Note that the BIAS of the basin-mean rainfall is directly compared with the discharge BIAS, and the BIAS is computed as an average over the 30 hours of forecast time of the rainfall and flood forecast results. Furthermore, the results are classified according to the forecast period of ensemble rainfall from the NWP model:

$$
\mathrm{BIAS}_k = \frac{\dfrac{1}{N} \sum_{t=1}^{N} F_{k,t}}{\dfrac{1}{N} \sum_{t=1}^{N} O_t}, \tag{6}
$$

where $N$ is the forecast time of each forecast period (30 hours); $O_t$ and $F_{k,t}$ are the observed and forecasted rainfall or discharge at time $t$, respectively; and $k$ denotes each ensemble forecast (11 ensemble members).
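A per-member BIAS can be computed as in the following sketch. It assumes that the BIAS is the ratio of the forecast mean to the observed mean over the forecast period, so that a value of 1 means no mean error; this form is inferred from the statement that the BIAS compares mean conditions and is an assumption. The function names and sample values are hypothetical. The same function applies to basin-mean rainfall and to discharge.

```python
def bias(observed, forecast):
    """Ratio of forecast mean to observed mean over the period (1 = unbiased)."""
    return sum(forecast) / sum(observed)


def ensemble_bias(observed, members):
    """BIAS of each ensemble member against the same reference series."""
    return [bias(observed, member) for member in members]


reference = [2.0, 2.0, 4.0]        # hypothetical radar-based series
members = [
    [1.0, 1.0, 2.0],               # underestimating member
    [2.0, 2.0, 4.0],               # unbiased member
    [4.0, 4.0, 8.0],               # overestimating member
]
print(ensemble_bias(reference, members))
```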

For the evaluation of the variability of flood forecast uncertainty according to catchment scale, the mean value of the coefficient of variation (CV), which is a normalized measure of dispersion of a probability distribution or frequency distribution, was used (Equation (7)). It is defined as the ratio of the standard deviation to the mean. The absolute value of the CV is sometimes known as relative standard deviation (RSD), which is expressed as a percentage. The coefficient of variation determines the risk:

$$
\overline{\mathrm{CV}} = \frac{1}{N} \sum_{t=1}^{N} \frac{\sigma_t}{\mu_t}, \tag{7}
$$

where $N$ is the forecast time of each forecast period (30 hours), and $\sigma_t$ and $\mu_t$ are the standard deviation and the mean value of the flood forecast over the ensemble members at time $t$, respectively.
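The mean coefficient of variation of an ensemble flood forecast can be sketched as below: at each time step the standard deviation across members is divided by the ensemble mean, and the ratios are averaged over the forecast period. Whether the population or the sample standard deviation is intended is not stated in the text; the population form is used here as an assumption. The function name and sample values are hypothetical.

```python
import statistics


def mean_cv(members):
    """Time-averaged coefficient of variation across ensemble members.

    members: list of equal-length discharge series, one per ensemble member.
    Assumption: population standard deviation across members at each step.
    """
    cvs = []
    for values_at_t in zip(*members):
        spread = statistics.pstdev(values_at_t)
        center = statistics.mean(values_at_t)
        cvs.append(spread / center)
    return sum(cvs) / len(cvs)


# Two hypothetical members over two time steps.
print(mean_cv([[1.0, 2.0], [3.0, 6.0]]))
```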

#### 3. Results and Discussion

##### 3.1. Rainfall Verification

For the purpose of temporal verification of QPF with ensemble NWP rainfall during the Talas event, the areal rainfall intensity of ensemble forecasts is compared with the Automated Meteorological Data Acquisition System (AMeDAS) over the Shingu river basin. For comparison, the observed rainfall of AMeDAS (18 stations, 10 min step) is interpolated using the Thiessen polygon spatial distribution method.
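The Thiessen polygon weighting used for the AMeDAS areal rainfall can be approximated on a grid, as in the following minimal sketch. Each basin grid cell is assigned to its nearest station, and a station's weight is the fraction of cells it receives. The station coordinates, cell coordinates, and rainfall values are hypothetical, and the grid approximation replaces an exact polygon construction.

```python
def thiessen_weights(stations, cells):
    """Approximate Thiessen weights by nearest-station assignment of grid cells.

    stations: list of (x, y) station coordinates.
    cells: list of (x, y) centers of the grid cells inside the basin.
    """
    counts = [0] * len(stations)
    for cx, cy in cells:
        nearest = min(
            range(len(stations)),
            key=lambda i: (stations[i][0] - cx) ** 2 + (stations[i][1] - cy) ** 2,
        )
        counts[nearest] += 1
    return [count / len(cells) for count in counts]


def areal_rainfall(weights, gauge_values):
    """Weighted mean of station rainfall for one time step."""
    return sum(w * v for w, v in zip(weights, gauge_values))


stations = [(0.0, 0.0), (10.0, 0.0)]                   # hypothetical gauges
cells = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (9.0, 0.0)]
weights = thiessen_weights(stations, cells)
print(weights, areal_rainfall(weights, [4.0, 8.0]))
```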

Figure 5 shows the areal rainfall of the ensemble forecast over the Shingu river basin as box plots for forecast times from 0 to 24 hours, excluding the overlapped forecast time (from 25 to 30 hours), compared with the areal rainfall of AMeDAS. In the 1st and 2nd forecast periods, the control run (unperturbed member) and ensemble (perturbed members) forecasts produced a suitable areal rainfall compared with the AMeDAS rainfall. In the 3rd forecast period, however, the control run forecast was not well matched and did not reproduce the rainfall intensity, because the rain cells moved quickly to the north-eastern part of the Kii peninsula, which the MSM failed to forecast correctly; this is the kind of location error noted in the Introduction. On the other hand, the upper range of the ensemble forecast was able to produce considerable rainfall intensity, and the maximum rainfall intensities are also similar to the AMeDAS rainfall. In the 4th forecast period, rainfall intensities are overestimated because, although the spatial rainfall pattern at the end of the 3rd forecast had moved to the north-eastern part of the Kii peninsula, the 4th forecast started again with the rainfall over the Kii peninsula. For this reason, rainfall intensities were very high in the 4th forecast period compared with AMeDAS.

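**Figure 5:** Areal rainfall of the ensemble forecasts over the Shingu river basin compared with AMeDAS rainfall, panels (a) and (b).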

In terms of normalized RMSE, the control run and ensemble mean have similar values from the 1st to the 3rd forecast period, but the best member of the ensemble spread gives a better value than the deterministic control run. In the 4th forecast period, as mentioned above, the index values of the control run and ensemble spread are relatively large, but the best index of the ensemble is estimated at 0.89 (the control run is 3.85). In terms of the log ratio bias, the ensemble spread covers the zero value (perfect forecast), whereas the control run forecast was underestimated for the 1st, 2nd, and 3rd forecast periods and overestimated for the 4th forecast period.

Figure 6 shows the Critical Success Index (CSI) and BIAS obtained by comparing radar data and ensemble forecasts with the selected rainfall thresholds (1.0 and 5.0 mm/h) during the 1st, 2nd, 3rd, and 4th forecast periods. In the 1st forecast period, for the CSI with the 1.0 mm/h threshold, the ensemble spread provides better results than the deterministic control run after 17 hours of forecast time, whereas the CSI of the control run is close to the ensemble mean value. In the 2nd forecast period, although the CSI of the control run is better than that of the ensemble mean, the best member of the ensemble spread outperformed the control run. In the 3rd forecast period, as stated above, the rain cells moved quickly to the north-eastern part of the Kii peninsula, so the CSI of the control run decreased as lead time increased, whereas the best value of the ensemble spread gives a better result than the control run. In the 4th forecast period, the control run was close to the ensemble mean, and the ensemble spread covered the control run. In the 3rd forecast period with the 1.0 and 5.0 mm/h thresholds, the BIAS decreased quickly as lead time increased. However, the best values of the ensemble spread maintained higher forecast accuracy than the control run forecast. These results show that ensemble forecasts have an advantage in terms of spatial accuracy, although the lower values of the ensemble forecasts in each forecast period decrease as lead time increases.

**Figure 6:** (a) CSI and BIAS with 1 mm/h threshold value; (b) CSI and BIAS with 5 mm/h threshold value.

##### 3.2. Uncertainty Propagation of NWP Rainfall Forecast to Flood Forecast

We conducted ensemble flood forecasts for the 33 subcatchments in the Shingu river basin to assess the ensemble flood forecast driven by ensemble NWP rainfall. Simulated discharges from the observed radar rainfall were used as the initial condition for the ensemble flood forecast in each forecast period. Figure 7 shows the results of the 30-hour ensemble flood forecast from the first to the fourth forecast periods over the 33 subcatchments for the Typhoon Talas event. Figure 7 illustrates a complete set of the forecasted discharge for the ensemble range (grey curve), the ensemble mean (red curve), and the observed radar discharge data at the 33 subcatchment outlet points (bold black curve). As Figure 7 shows, the ensemble rainfall from the NWP model produced a suitable discharge from the first to the fourth forecasts, but the ensemble mean values were lower than the observed radar discharge in the 2nd forecast period over the 33 subcatchments because the rainfall forecast was underestimated. In the 3rd forecast period, which contains the peak discharge, the ensemble mean discharge was typically lower than the observed radar discharge because the ensemble NWP rainfall was shifted from the correct spatial position. The majority of ensemble members were also lower than the observed radar discharge, but a few ensemble members exceeded it and were close to the observed discharge. In the 4th forecast period, the ensemble forecasts broadly followed the observed radar discharge but overestimated it, because the overestimation in the rainfall forecast triggered a runoff overestimation. Overall, the flood forecasts driven by ensemble outputs over the 33 subcatchments produced suitable results, but in general they show a large proportion of under- and overpredictions at short lead times and exhibit a negative bias at longer lead times.

Figure 8 presents a comparison of rainfall and flood forecast errors from the first to the fourth forecast periods, with linear regression equations based on the BIAS, for the 33 subcatchments of the Shingu river basin shown in Figure 2. As Figure 8 shows, rainfall forecast errors lead to proportional flood forecast errors that follow the linear regression equations. The discharge BIAS varies for the same rainfall BIAS, which indicates that the discharge BIAS depends on catchment scale. For small catchments, rainfall errors caused by forecast location error are particularly sensitive because the rainfall pixels of the NWP model do not cover the small catchment exactly. For larger catchments, many rainfall pixels contribute to the propagation of rainfall forecast error into the flood forecast. Therefore, the variability of flood forecast uncertainty according to catchment scale should be investigated.

##### 3.3. Flood Forecast Uncertainty with Catchment Scale

As mentioned above, the Shingu river basin is divided into 33 subcatchments from 54.24 to 2245 km^{2}, including 6 gauged and 27 ungauged locations, for the assessment of uncertainty of ensemble NWP rainfall into flood forecast with catchment scale. The Shingu river basin has 3 types (small, medium, and large catchments) and 2 characteristics (mountainous and flat area) for evaluation of the variability with catchment scale.

Figure 9 shows the flood forecast variability expressed by coefficient of variation using ensemble spread of the flood forecasting with catchment scale and characteristic. Each CV value refers to the average value from the first to the fourth forecast periods and shows CV values for 3 types of the small (red point), medium (blue point), and large (grey point) catchments and 2 characteristics of mountainous (large point) and flat (small point) area for evaluation of the variability with catchment scale. It is evident from Figure 9 that the coefficient of variation in medium and large catchments is close to 0.25, and this is maintained as the catchment increases. For small catchments, however, there is a larger variability than for medium and large catchments, and small catchments have a high coefficient of variation (>0.3). This result suggests that uncertainty variability occurs sensitively and diversely at the same time in different catchments, and small catchments have more sensitive variability in uncertainty. Therefore, flood forecasting in small catchment requires care due to the large variability of uncertainty. On the other hand, in medium and large catchments, there is less uncertainty than with small catchments, and the coefficient of variation converges into a uniform value.

To focus on discharge uncertainty with catchment scale, flood forecast uncertainty was assessed for the case in which the rainfall BIAS is 1, using a linear regression equation estimated between the rainfall BIAS and discharge BIAS of the ensemble members in each of the 33 subcatchments. Figure 10 compares the rainfall BIAS of the ensemble members with the discharge BIAS driven by those rainfall forecasts in each subcatchment, together with the linear regression equation. From Figure 8, the relationship between rainfall forecast errors and flood forecast errors across ensemble members is proportional, following a linear regression equation, and differs with catchment scale. Separating the forecast BIAS by subcatchment, we obtain 132 linear regression equations for 33 subcatchments and 4 forecast periods. We then calculate the discharge BIAS at a rainfall BIAS of 1 from the linear regression equation of each subcatchment to focus on the discharge BIAS with catchment scale.
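The regression step can be sketched as follows: an ordinary least-squares line is fitted to the (rainfall BIAS, discharge BIAS) pairs of the ensemble members for one subcatchment and one forecast period, and the line is then evaluated at a rainfall BIAS of 1. The use of ordinary least squares, the function names, and the sample values are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def discharge_bias_at_unit_rainfall_bias(rain_bias, discharge_bias):
    """Discharge BIAS predicted by the fitted line when rainfall BIAS equals 1."""
    slope, intercept = fit_line(rain_bias, discharge_bias)
    return slope * 1.0 + intercept


# Hypothetical member-wise BIAS values for one subcatchment and one period.
rain = [0.5, 1.0, 1.5]
flow = [0.4, 0.9, 1.4]
print(discharge_bias_at_unit_rainfall_bias(rain, flow))
```

A result different from 1 indicates a discharge bias that remains even when the basin-mean rainfall is unbiased.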

Figure 11 shows the discharge BIAS with catchment scale and characteristic under the assumption that the rainfall forecast has no error compared to observed radar rainfall (i.e., the discharge BIAS obtained from the linear regression equation at a rainfall BIAS of 1). Figure 11 shows that there is a discharge BIAS in small, medium, and large catchments alike, even though the rainfall forecast has no error compared to observed radar rainfall. This is due to the spatial variability of rainfall, which remains even when the basin-mean rainfall is similar to the observed radar rainfall. As an example, Lee et al. [20] showed that input uncertainty arising from the spatial variability of rainfall affects catchment responses in rainfall-runoff modeling. As stated above, however, we focused not only on the quantitative error propagation of the rainfall forecast into the flood forecast but also on the variability of flood forecast uncertainty with catchment scale. The discharge BIAS in medium and large catchments has properties similar to those of the coefficient of variation in Figure 9. The small catchments show large variability of discharge BIAS.

Figure 12 shows the flood forecast uncertainty with catchment scale in terms of the root mean square error (RMSE) of specific discharge (discharge divided by catchment area) at the outlets. Figure 12 demonstrates properties similar to those of the coefficient of variation and BIAS in Figures 9 and 11, respectively. In medium catchments, however, there are two types of characteristics in forecast uncertainty variability. In mountainous areas, discharge RMSE is less than that in flat areas, and this characteristic is also seen in Totsukawa and Kitayamakawa, the left and right sides of the Shingu river basin, respectively.
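Dividing discharge by catchment area makes the RMSE comparable between subcatchments of very different size. The following is a minimal sketch of that computation; the function names, the units (m^{3}/s and km^{2}), and the sample values are assumptions for illustration.

```python
import math


def specific_discharge(discharge_m3s, area_km2):
    """Discharge per unit catchment area [m^3/s/km^2]."""
    return [q / area_km2 for q in discharge_m3s]


def specific_discharge_rmse(reference_m3s, forecast_m3s, area_km2):
    """RMSE between reference and forecast specific discharge."""
    ref = specific_discharge(reference_m3s, area_km2)
    fcst = specific_discharge(forecast_m3s, area_km2)
    n = len(ref)
    return math.sqrt(sum((f - r) ** 2 for r, f in zip(ref, fcst)) / n)


# Hypothetical outlet series for a 50 km^2 subcatchment.
print(specific_discharge_rmse([100.0, 200.0], [150.0, 250.0], 50.0))
```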

#### 4. Concluding Remarks

The forecast uncertainty of NWP models is usually assumed to be the largest source of uncertainty in flood forecasts. However, many other sources of uncertainty in flood forecasts can also be significant, for example, the corrections and downscaling mentioned above and the spatial and temporal uncertainties of the inputs to the hydrological simulations, including data assimilation. The different types of uncertainty also vary with the lead time of the forecasts, the magnitude of the event, and the catchment characteristics. Ensemble flood forecasting driven by ensemble NWP rainfall is specifically designed to capture this uncertainty by representing a set of possible future states of the atmosphere. This uncertainty can then be cascaded through flood forecasting systems to produce an uncertain or probabilistic prediction of flooding. In many cases, the potential of flood forecasting is described alongside cautionary notes regarding variability, uncertainty, communication of ensemble information, the need for decision support, and the problems of using short time series [9]. Therefore, it is important to assess forecast rainfall uncertainty in terms of its effect on runoff, and uncertainty as a function of spatial scale is also important information for real-time flood forecasting.

The main objective of this study is to investigate how the error and uncertainty of NWP rainfall propagate into the hydrological response, depending on catchment scale, through a distributed hydrologic model. First, we conducted ensemble flood forecasts driven by ensemble NWP rainfall for 33 subcatchments in the Shingu river basin. To assess error propagation, we compared the forecast rainfall error with the resulting flood forecast error, both based on the BIAS, which is used to measure error amplification. Second, we discussed the variability of flood forecast uncertainty with catchment scale using the ensemble spread obtained by driving a distributed hydrologic model with ensemble NWP rainfall. Finally, we assessed the flood forecast uncertainty using an estimated regression equation between the BIAS of ensemble NWP rainfall and the BIAS of discharge, and also assessed it with the RMSE of specific discharge by catchment scale.

The ensemble flood forecasts over the 33 subcatchments, driven by the ensemble outputs, produced suitable results, but in general they showed a large proportion of under- and overpredictions at short lead times and a negative bias at longer lead times. This study also demonstrates that the variability of uncertainty differs markedly among catchments at the same time and that small catchments are particularly sensitive. A general finding of this study is that smaller catchments exhibit larger uncertainty in the flood forecast. Therefore, flood forecasts for small catchments should be treated with care because of the large variability of uncertainty. In medium and large catchments, on the other hand, uncertainty is smaller than in small catchments, as would be expected from the smoothing effect of modeling a larger catchment. Ensemble forecasts are specifically designed to capture the uncertainty in NWPs by representing a set of possible future states of the atmosphere. This uncertainty can then be cascaded through flood forecasting systems to produce an uncertain or probabilistic prediction of flooding. To use ensemble forecasts of NWP models effectively for flood forecasts, it is important to establish methodologies for analyzing ensemble flood forecasts. To reduce the uncertainty of rainfall and flood forecasts, bias correction and/or hybrid products with radar-based prediction are required to achieve more reliable hydrologic predictions; a bias correction and blending method for accuracy improvement was addressed in Yu et al. [21]. In further research, we need to verify the applicability of the approach through a number of case studies, and we expect it to be used in hydrological applications such as real-time flood forecasting for warning systems and optimized release discharge for dam operation.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This study was partly supported by the subproject of the field 3 on Next Generation Supercomputer Project, “Prediction of Heavy Rainfalls by a Cloud-Resolving NWP System,” and was based on data from the Ministry of Land, Infrastructure, Transport and Tourism (MLIT) and J-POWER Co. Ltd., Japan. The authors are grateful for their support.