Abstract

For the first time ever, convection-resolving forecasts at 1 km grid spacing were produced in realtime in spring 2009 by the Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma. The forecasts assimilated both radial velocity and reflectivity data from all operational WSR-88D radars within a domain covering most of the continental United States. In preparation for the realtime forecasts, 1 km forecast tests were carried out using a case from spring 2008, and the forecasts with and without radar data assimilation are compared with the corresponding 4 km forecasts produced in realtime. A significant positive impact of radar data assimilation is found to last at least 24 hours. The 1 km grid produced a more accurate forecast of organized convection, especially in structure and intensity details. It successfully predicted an isolated severe-weather-producing storm nearly 24 hours into the forecast, which all ten members of the 4 km realtime ensemble forecasts failed to predict. This case, together with all available forecasts from the 2009 CAPS realtime forecasts, provides evidence of the value of both a convection-resolving 1 km grid and radar data assimilation for severe weather prediction for up to 24 hours.

1. Introduction

Accurate prediction of convective-scale hazardous weather continues to be a major challenge. Efforts to explicitly predict convective storms using numerical models date back to Lilly [1] and began with the establishment in 1989 of an NSF Science and Technology Center, the Center for Analysis and Prediction of Storms (CAPS) at the University of Oklahoma. Over the past two decades, steady progress has been made, aided by steady increases in available computing power. Still, the resolutions of the current-generation operational numerical weather prediction (NWP) models remain too low to explicitly resolve convection, limiting the accuracy of quantitative precipitation forecasts.

For over a decade, the research community has been producing experimental real time forecasts at 3-4 km convection-allowing resolutions (e.g., [2-4]). Roberts and Lean [5] documented that convection forecasts of up to 6 hours are more skillful when run on a 1 km grid than on a 12 km grid, and more so than on a 4 km grid. On the other hand, Kain et al. [2] found no appreciable improvement with 2 km forecasts compared to 4 km forecasts beyond 12 hours.

In the spring seasons of 2007 and 2008, CAPS conducted more systematic real-time experiments. Daily forecasts of 30 h or more were produced for 10-member 4 km ensembles and 2 km deterministic forecasts ([6, 7], X07 and X08 hereafter). In 2008, radial velocity and reflectivity data from all operational radars in a domain covering most of the CONUS (continental US) were assimilated [7] using a combined 3DVAR-cloud analysis method [8, 9]. Standard precipitation verification scores show that significant positive impact of radar data lasts up to 9 hours but the difference in scores between the 4 and 2 km forecasts is relatively small [7, 10].

Recognizing that producing better convective forecasts requires accurately resolving the internal structures of convective storms, the CAPS team carried out real-time 1 km resolution forecasts assimilating radar data from mid-April through early June 2009 [11]. Daily 30-hour forecasts used 9600 processor cores of a Cray XT5 supercomputer at the National Institute for Computational Sciences, University of Tennessee. Each forecast took about 5.5 hours to complete. In preparation for such forecasts, tests were made using cases from the spring of 2008; they represented the first time that forecasts at a 1 km resolution were produced for a large domain covering the entire CONUS, assimilating all available data from the operational weather radars in the domain (see Figure 1). In this paper, we document the results of one of the 1 km tests, as produced in early 2009 in preparation for the 2009 CAPS spring forecast experiment, and compare them with the forecasts at 4 km grid spacing, with and without radar data assimilation, that were produced in realtime in 2008. We also briefly present the mean precipitation skill scores from the spring 2009 forecasts, produced at 1 km and 4 km grid spacing with radar data assimilation and at 4 km without radar data, together with their comparisons with the reference NAM forecasts.

The rest of this paper is organized as follows. Section 2 describes the forecast model configurations, and Sections 3 and 4 present and discuss the results. A summary is given in Section 5.

2. Forecast Configurations

The 26 May 2008 test case is a more weakly forced case highlighted in X08. The 4 km realtime forecasts correspond to the control members of the 4 km storm-scale ensemble forecasts (SSEF, X08, [12]), with and without radar data (named CN4 and C04, resp.). In 2008, the CAPS forecasts used version 2.2 of the Advanced Research Weather Research and Forecasting (WRF-ARW) [13] model, while in 2009 version 3.0 of WRF-ARW was used. For this reason, the 4 km and 1 km forecasts presented in this paper used versions 2.2 and 3.0 of WRF, respectively, but with the same set of physics parameterization options, which correspond to those of the control member of the CAPS SSEFs of the two years [12, 14]. (The physics options used by the control forecasts of the two years were the same. Furthermore, version 3.0 differs from 2.2 mainly in the addition of new physics parameterization schemes, while the dynamic core remains the same. For the configurations used, version 3.0 produced essentially the same forecast results as version 2.2 for the 4 km forecasts, based on later tests.) Specifically, the RRTM long-wave and NASA GSFC short-wave radiation schemes, the NOAH land surface model, the Thompson microphysics scheme, and the Mellor-Yamada-Janjic (MYJ) PBL scheme were used (see X08 for references), together with monotonic advection for water variables [15]. No cumulus parameterization scheme was used, since 4 km and 1 km grid spacings are generally considered convection permitting and convection resolving, respectively, while cumulus parameterization schemes are usually designed for grid spacings larger than 10 km [16].

All forecasts were initialized at 0000 UTC of 26 May 2008 for the test case. Forecasts C04 and C01 are, respectively, 4 and 1 km forecasts without radar data assimilation and were initialized by interpolation from the operational National Centers for Environmental Prediction (NCEP) North American Mesoscale (NAM) model 0000 UTC analysis on a 12 km grid. The 4 and 1 km forecasts with radar data assimilation, that is, CN4 and CN1, started from the analyses produced on the native model grid by the Advanced Regional Prediction System (ARPS) [8] three-dimensional variational (3DVAR) system [17] and its complex cloud analysis package [9, 18], using the same NAM analysis as the background. Full-volume level-2 radial velocity data from 57 WSR-88D radars running in precipitation mode (63 additional radars ran in clear mode) were analyzed by the 3DVAR. The reflectivity data entered the system through the ARPS complex cloud analysis package, which analyzes cloud and hydrometeor fields and then adjusts in-cloud temperature and moisture based on a 1D parcel model with entrainment in areas of diagnosed cloud and rising motion [18]. The radar data were first automatically quality controlled, including velocity dealiasing, and then "remapped" to the model grid through a least-squares fitting procedure [19] before being analyzed. Hence, the data were essentially super-obbed to the model grid first. Additionally, wind profiler and standard surface observations, including the Oklahoma (OK) Mesonet data, were analyzed. The lateral boundary conditions came from the NAM forecasts. Both grids had 50 vertical layers with a near-surface vertical resolution of 20 m.

3. Forecast Results and Subjective Evaluation

3.1. The 26 May 2008 Case

At 0000 UTC, 26 May 2008 (not shown), a low was centered over Minnesota (MN), and a weak, quasistationary cold front extended from the low center southwestward to the western Kansas (KS) border, where it intersected a dryline that extended southward along the eastern New Mexico (NM) border into northern Mexico (the point where a dryline intercepts a front is often referred to as the front-dryline triple point, e.g., [20]). Fully developed quasi-linear convection existed through central KS about 100 km ahead of the cold front. Another SSW-NNE-oriented quasi-linear convective line existed in the Texas (TX) panhandle area, about 150 km east of the dryline at the TX-NM border. Over the next three hours, these lines evolved into a long connected line that was further linked with the convection in the Great Lakes (GL) region (Figure 1(a)). This squall line propagated eastward and maintained its identity until 0000 UTC, May 27 (not shown), when it was found over eastern Mississippi (MS), northern Alabama (AL), and eastern Tennessee (TN). During the entire period, the cold front was nearly stationary; the squall line was therefore mostly self-propagating, driven by the progression of its own cold pool. The initial convection-initiating forcing along the front and dryline was lost during this stage. This line quickly dissipated after 0000 UTC, May 27.

During this 24 hour period, there were other regions of convection that interacted with each other. As documented by X08, the evolution of convection during this period was rather complex and the morphology of many of the convective storms was modulated by their own cold pools and gust fronts and interactions with those of other storms. Such a situation is more difficult to predict than cases where strong propagating synoptic-scale features, such as a strong cold front, play more controlling roles. We demonstrate here that in the absence of strong large-scale control, the impact of radar data can be long-lasting.

3.2. Prediction Results

At the initial time (not shown), the composite (vertical column maximum) reflectivity fields in CN4 and CN1 look very similar to the observed field, owing to the direct assimilation of the radar data. C04 and C01, however, had no reflectivity in the initial condition (not shown). In addition to the quasi-linear convection ahead of the dryline and cold front, there was a large bow-shaped echo extending from central Missouri (MO) to central Arkansas (AR) at this time. There was also a line of cells in far southwestern TX, also east of the dryline.
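As a minimal illustration of the composite (vertical column maximum) reflectivity field referred to above, the sketch below takes the maximum over model levels at each horizontal grid point. The function name, the `volume[k][j][i]` index order, and the sample values are assumptions made for this example only; they are not taken from the ARPS or WRF code.

```python
def composite_reflectivity(volume):
    """Return the column-maximum reflectivity of a 3D field.

    `volume` is a nested list indexed as volume[k][j][i], with k the
    vertical level and (j, i) the horizontal grid point (assumed layout).
    """
    n_rows, n_cols = len(volume[0]), len(volume[0][0])
    return [
        [max(level[j][i] for level in volume) for i in range(n_cols)]
        for j in range(n_rows)
    ]


# Hypothetical reflectivity volume (dBZ): 3 levels on a 2 x 2 grid.
volume = [
    [[10.0, 20.0], [0.0, 5.0]],
    [[35.0, 15.0], [0.0, 45.0]],
    [[25.0, 30.0], [0.0, 40.0]],
]
composite = composite_reflectivity(volume)
```

Each entry of `composite` is the largest reflectivity found anywhere in the corresponding model column, which is why such a field is convenient for comparing predicted and observed storm locations on a single map.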

Being properly initialized in CN1 and CN4, these groups of convection were accurately predicted over the first three hours (Figures 1(b) and 1(c)). The characteristics and pattern of convection predicted by CN1 (Figure 1(b)) in the TX panhandle, northwest OK, and KS regions at 0300 UTC compare very well with those of observation (Figure 1(a)). The associated narrow-line structures in CN1 agree particularly well with the observations. The forecast did miss the development of a new line segment in eastern Iowa (IA) at this time, which developed in the model later, at 0500 UTC. The model also predicted the bow-echo in the MO-AR region well, moving it from the initial central-MO-central-AR location to the Mississippi River at 0300 UTC (Figure 1(b)). The broad pattern of CN4-predicted convection is similar, but many fine-scale details are missing. The line segments in the TX, OK, and KS regions are not as well organized. This shows the noticeable advantage of the 1 km grid in resolving storm-scale structures.

The 4 km forecast without assimilating radar or additional surface Mesonet data (C04) is clearly inferior at 3 hours (Figure 1(d)). Essentially all of the line segments in TX, OK, and KS are missing. Instead, the model was trying to initiate new convection along the dryline at the TX-NM and KS-Colorado (CO) borders and along the cold front now located at the KS-Nebraska (NE) border and intersecting the dryline at the northwest corner of KS. In C04, the bow in MO-AR region is mostly missing, and the convection in the GL region is too weak. In this case, the convection that developed in the first few hours of forecast near the cold front and dryline was at wrong locations; as we will see later, this has long-term consequences.

At 9 hours, a time when the direct impact of radar data measured by standard skill scores for the season average starts to diminish (X08), the positive impact of radar data is still very clear in this case in both CN1 and CN4 (Figure 2). Figure 2(b) shows that CN1 predicted the strong, narrow squall line extending from central OK through eastern-central MO very well, including the structure of embedded intense convection. Its southern end advanced too fast, though, placing it about 150 km ahead of the observed line in southeast Oklahoma. One possible reason for the overly fast advancement of the line is a cold pool that may be too strong. Cold pool intensity has been found to be rather sensitive to the microphysics, especially the drop/particle size distributions of rain and graupel, which affect cold pool intensity through evaporative and melting cooling [21, 22].

Along the Mississippi River is another narrow line of cells that was observed and also predicted accurately in CN1. An examination of radar data and satellite imagery indicates that these cells developed along the back edge of the cold pool left behind by the northeastward propagating bow-shaped convection, which is at this time barely identifiable in northwestern Kentucky (KY, Figure 2(a)). This line intersected with the main squall line northwest of St. Louis, MO, forming a -shaped echo. The CN1 forecast of this pattern matches the observation very well. In addition, there is indication that the 1 km forecast is producing stratiform precipitation trailing the leading convective line at the northern portion (Figure 2(b)), while the observation shows a clear secondary precipitation maximum behind the convective line somewhat near the southern end of the line. Such secondary precipitation maxima are prominent features within mature squall lines (e.g., [23]) but are notoriously difficult to predict in numerical models, and lack of model resolution and deficiency in the microphysics had been suspected to be the cause [24-26]. The fact that the 1 km forecast shows a somewhat better ability in producing the trailing stratiform precipitation is encouraging. The evolution of convection in other parts of the domain not shown, including those in southwest TX, the northern US Rockies, and near the GL, generally agrees with observations also.

The general pattern of predicted convection in CN4 (Figure 2(c)) is similar to that in CN1 (Figure 2(b)), although significant differences exist in detail. CN4 also captured the general -shaped echo, but the embedded cells are clearly weaker. The southern portion of the main line also propagated too fast. In general, the 1 km forecast is noticeably superior to the 4 km forecast; it provides a much clearer indication of the intensity of the strongest embedded convective cells.

The forecast of C04 at this time is much poorer (Figure 2(d)). This forecast never managed to "spin up" the prefront and predryline convection. It simply evolved the convection that was incorrectly initiated along the front and dryline during the first few hours of the forecast, missing the most significant areas of convection. As discussed in X08, this failure continued to affect the subsequent evolution of a complex sequence of convective activities for the remainder of the forecast.

By noon of 26 May (1800 UTC), all of the convective systems from the previous evening and night had moved out of the central Plains. The quasistationary front remained in place across central KS, intersecting the dryline that extended north from the TX panhandle near the CO border (not shown). In the afternoon, convection was initiated along the dryline and, to a lesser extent, along the front. These processes were captured well in both CN1 and CN4 (Figure 3).

In the late afternoon hours, many hail events associated with the above convective storms were reported. Two brief tornadoes were reported near Dodge City, KS, between 2300 UTC, 26 May, and 0000 UTC, 27 May, emerging from storms that developed near the dryline-cold front triple point. At 2300 UTC, the observed composite reflectivity map of the OK-KS region shows three groups of convective cells (labeled A, B, and C in Figure 3(a)), one near the western OK border (A), one in southwestern KS near Dodge City (B), and one in the form of more isolated cells at the central OK-KS border (C). Groups A and B were initiated along the dryline, with B near the front-dryline triple point (the east-west frontal location can be inferred from the surface wind field in, e.g., Figure 3(b), while the north-south dryline is located near the west edge of the plotting domain), and they were captured in both CN4 and CN1 (Figures 3(b) and 3(c)) but not in C04 (Figure 3(d)). In C04, the convection that was incorrectly initiated along the front over 20 hours earlier had organized into an east-west oriented line and moved to northern OK by this time (Figure 3(d)); it dissipated over the next couple of hours. This line evidently interfered with the conditions that produced the actual dryline convective initiation in the afternoon of the second day. In fact, in C04 no initiation occurred at all along the dryline, except for an isolated cell near the triple point (Figure 3(d)).

Group C, consisting of more isolated cells, formed in the warm sector south of the front and east of the dryline near KS-OK border (Figure 3(a)). It is interesting that the main cell with this group is successfully predicted in CN1 (Figure 3(b)), but not in CN4, C04, nor in any other member of the 4 km ensemble produced in real time (X08). The observed cell became fully developed at 1900 UTC, while in CN1 it reached maturity at 2100 UTC. The observed storm propagated slowly south-southeastward and maintained its identity until 0300 UTC, 27 May. It generated many hail reports and a high-wind report of over 40 m s−1 at 2340 UTC.

The corresponding storm in the CN1 prediction maintained its full intensity until after 0100 UTC. It gained some supercell characteristics in terms of the shape of the reflectivity by 2300 UTC (Figure 3(b)), consistent with severe weather reports. Despite some difference in the exact timing and longevity between the observed and predicted storms, the ability of a 1 km model to predict, about 20 hours into the forecast, an isolated severe storm that developed in the absence of obvious mesoscale forcing is remarkable. None of the ten 4 km ensemble forecasts, which included initial and boundary condition perturbations as well as variations in physics schemes, captured this storm. In fact, the 4 km member without radar data assimilation completely missed the initiation along the dryline on the second day. Finally, the 1 km forecast without radar data assimilation, C01, is as poor as C04, as can be seen from the precipitation forecast scores presented in the next section.

4. Precipitation Verifications

To complement the earlier subjective evaluation of the forecasts for the 26 May 2008 test case, we calculate the equitable threat scores (ETSs) verified against hourly radar-estimated precipitation produced on a 1 km grid by the National Severe Storms Laboratory in real time [27]. Such data were first interpolated to the forecast model grid before the ETSs were calculated. Figure 4 shows the ETSs for hourly accumulated precipitation, at the 0.1 and 0.5 inch per hour thresholds, for the entire model domain. It is clearly evident that the radar-assimilating CN1 and CN4 start with much higher ETSs, while the scores of C01 and C04 are around zero before 12 hours. For the 0.1 inch per hour threshold (Figure 4(a)), the ETS for the first hour is about 0.45 for CN1 and 0.3 for CN4, indicating a large difference between the short-range precipitation forecasts of the 1 and 4 km grids. For the higher 0.5 inch per hour threshold (Figure 4(b)), the scores for the first hour are 0.29 versus 0.14, respectively. In general, the ETSs decrease quickly during the first 5 hours, and the decrease is fastest during the first two hours, especially for the higher threshold. Such behavior is expected and is consistent with the shorter range of predictability for more intense, smaller-scale convection, since errors associated with smaller-scale, unstable motion grow the fastest (e.g., [28]). As errors associated with the very short spatial scales present in the radar-assimilated initial condition grow quickly, predictability associated with such scales is quickly lost, causing an initially rapid decrease of the precipitation forecast skill scores. Another possible cause for the initially rapid decrease in the skill score is insufficient dynamic and thermodynamic consistency among the model state variables within clouds when initialized by the single-time 3DVAR/cloud analysis. More advanced, four-dimensional data assimilation methods that are closely coupled with the prediction model are expected to slow down the initial error growth to some degree. Forecast model error is another source, although such error tends to have larger impacts on longer forecasts.
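For readers unfamiliar with the score, the sketch below shows how an ETS can be computed for one threshold from the standard 2 x 2 contingency table, using the usual definition ETS = (H - H_r)/(H + M + F - H_r), where H, M, and F are the numbers of hits, misses, and false alarms and H_r = (H + M)(H + F)/N is the number of hits expected by chance over N grid points. The function name and the nested-list data layout are assumptions made for illustration; this is not the verification code used to produce Figure 4.

```python
def equitable_threat_score(forecast, observed, threshold):
    """ETS of a gridded forecast for precipitation >= threshold.

    `forecast` and `observed` are nested lists on the same grid
    (assumed layout); both must use the same units as `threshold`.
    """
    hits = misses = false_alarms = total = 0
    for f_row, o_row in zip(forecast, observed):
        for f_val, o_val in zip(f_row, o_row):
            f_yes = f_val >= threshold
            o_yes = o_val >= threshold
            hits += f_yes and o_yes
            misses += (not f_yes) and o_yes
            false_alarms += f_yes and (not o_yes)
            total += 1
    # Hits expected from a random forecast with the same event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    if denominator == 0:
        return float("nan")  # undefined, e.g. no event forecast or observed
    return (hits - hits_random) / denominator


# Hypothetical hourly accumulations (inches) on a 2 x 2 grid.
forecast = [[0.2, 0.2], [0.0, 0.0]]
observed = [[0.2, 0.0], [0.2, 0.0]]
ets_low = equitable_threat_score(forecast, observed, 0.1)
```

A perfect forecast gives an ETS of 1, while a forecast with no more hits than expected by chance gives 0, which is the situation in the small example above.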

The scores of C04 and C01 remain very low throughout the 30-hour-long forecasts and never exceed 0.03 (0.02 for the higher threshold). Between 2 and 19 hours, the scores of CN1 are up to 0.05 higher than those of CN4 for the lower threshold (Figure 4(a)). After 19 hours, the scores are comparable. For the higher threshold (Figure 4(b)), the differences between CN1 and CN4 become small after three hours. For grid point-based skill scores such as the ETS, position errors in small scale features can significantly impact the skill scores. In general, beyond the life cycle of the initial convective storms present in the initial condition, it is difficult for an NWP model to predict accurately the timing and location of new storm cells, especially when they are not forced by fixed features such as local terrain. Therefore skill scores that would allow for a certain degree of position error are often more useful (e.g., [5]).
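To illustrate how a score can tolerate a certain degree of position error, the sketch below implements one widely used neighborhood measure, the fractions skill score (FSS), which compares the fraction of threshold-exceeding points within a window around each grid point rather than point-by-point matches. It is offered only as an example of this class of scores; the window treatment (zero outside the domain), the function names, and the sample fields are assumptions, and no such score was computed in this study.

```python
def neighborhood_fractions(binary, half_width):
    """Fraction of event points in a square window around each grid point.

    Points outside the domain are counted as non-events (assumed choice).
    """
    rows, cols = len(binary), len(binary[0])
    window_size = (2 * half_width + 1) ** 2
    fractions = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            count = 0
            for ii in range(max(0, i - half_width), min(rows, i + half_width + 1)):
                for jj in range(max(0, j - half_width), min(cols, j + half_width + 1)):
                    count += binary[ii][jj]
            fractions[i][j] = count / window_size
    return fractions


def fractions_skill_score(forecast, observed, threshold, half_width):
    """FSS = 1 - sum((Pf - Po)^2) / (sum(Pf^2) + sum(Po^2))."""
    f_bin = [[1 if v >= threshold else 0 for v in row] for row in forecast]
    o_bin = [[1 if v >= threshold else 0 for v in row] for row in observed]
    pf = neighborhood_fractions(f_bin, half_width)
    po = neighborhood_fractions(o_bin, half_width)
    squared_error = sum(
        (a - b) ** 2 for f_row, o_row in zip(pf, po) for a, b in zip(f_row, o_row)
    )
    reference = sum(a * a for row in pf for a in row) + sum(
        b * b for row in po for b in row
    )
    if reference == 0:
        return float("nan")  # no events in either field
    return 1.0 - squared_error / reference


# Hypothetical 5 x 5 fields: one observed cell and the same cell forecast
# one grid point too far east.
observed = [[0.0] * 5 for _ in range(5)]
forecast = [[0.0] * 5 for _ in range(5)]
observed[2][2] = 1.0
forecast[2][3] = 1.0

fss_point = fractions_skill_score(forecast, observed, 0.5, half_width=0)
fss_neighborhood = fractions_skill_score(forecast, observed, 0.5, half_width=1)
```

With a window of a single grid point the displaced cell receives no credit, whereas a 3 x 3 window gives partial credit for the near miss; a grid point-based score such as the ETS behaves like the former.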

To examine the precipitation forecast skill scores for the 4 and 1 km grids and the impact of radar data on the 4 km grid beyond the single test case presented above, we briefly discuss here the ETSs for forecasts from the 23 days of the 2009 CAPS spring forecast experiment on which all three forecasts are available; they are presented in Figure 5 for three-hour accumulated precipitation and for the 0.1 and 0.5 inch thresholds. For the ETS calculations, the 1 km precipitation fields were averaged to the 4 km grid.
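The averaging of the 1 km precipitation fields to the 4 km grid can be pictured as a block mean. The sketch below assumes that each 4 km cell exactly covers a 4 x 4 block of 1 km cells and that the field dimensions are multiples of the block size; the actual grid alignment and averaging procedure used in the experiment are not described here, so this is an illustration rather than the method used.

```python
def block_average(field, factor=4):
    """Average a 2D field over non-overlapping factor x factor blocks.

    Assumes the coarse cells align exactly with blocks of fine cells.
    """
    rows, cols = len(field), len(field[0])
    if rows % factor or cols % factor:
        raise ValueError("field dimensions must be multiples of the block size")
    coarse = []
    for i in range(0, rows, factor):
        coarse_row = []
        for j in range(0, cols, factor):
            block = [
                field[i + di][j + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 4 x 4 fine-grid field averaged over 2 x 2 blocks.
fine = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 9.0, 10.0, 11.0],
    [12.0, 13.0, 14.0, 15.0],
]
coarse = block_average(fine, factor=2)
```

Averaging in this way preserves the area-mean precipitation while removing detail below the coarse grid scale, so the 1 km and 4 km forecasts are scored at a common resolution.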

Figure 5(a) shows that for the lower threshold, the mean ETSs for CN1 are slightly higher than those of CN4 before 21 hours, except for hour 12, when the score of CN1 dips slightly below that of CN4. For later hours, the scores are similar. The same comparison holds for the higher threshold (Figure 5(b)), although the relative difference is larger. This suggests that more intense convection, typically associated with smaller, more localized storms, benefits more from the increased spatial resolution on average. For the 26 May 2008 test case, the difference between CN1 and CN4 is larger for the lower threshold, but it should be pointed out that the threshold for Figure 4(a) is 0.1 inch per hour rather than 0.1 inch per three hours; it therefore corresponds to a higher precipitation intensity. In general, the ETSs for all forecasts of spring 2009 are consistent with those of the 26 May 2008 test case.

The ETSs for the operational 12 km NAM forecasts are consistently lower than those of all high-resolution forecasts for the lower threshold shown (Figure 5(a)), except for the first three hours when compared to the no-radar 4 km run (C04). For the initial hours, the NAM might have benefited from the consistency of its own analysis with its prediction model. Still, with the assimilation of radar data on either the 4 or 1 km grid, the precipitation scores are much higher even during the initial hours (Figure 5(a)).

5. Summary

In this paper, we report on the results of the first ever test forecasts at 1 km grid spacing in a domain covering almost the entire continental U.S., performed for a case from May 2008, and on the comparison of such forecasts with similarly configured forecasts produced at 4 km grid spacing in real time. These forecasts were 30 hours long; one pair of forecasts assimilated both radial velocity and reflectivity data from all operational U.S. WSR-88D radars within the model domain, while another pair did not assimilate radar data. These 1 and 4 km forecasts with and without radar data assimilation are compared. Based on subjective evaluations, a significant positive impact of radar data assimilation is found to last at least 24 hours for the test case. The 1 km forecast with radar data assimilation more accurately reproduced the observed convection than the corresponding 4 km forecast, especially in structure and intensity. It successfully predicted an isolated severe storm nearly 24 hours into the forecast, while the corresponding 4 km forecast, as well as all other 4 km members from the CAPS realtime storm-scale ensemble forecasts, failed to do so. The positive impact of radar assimilation on the precipitation forecast is even larger on both 4 and 1 km grids. Similar conclusions hold for precipitation forecasts based on mean equitable threat scores for 23 forecast days from spring 2009. This study provides evidence of the value of both convection-resolving resolution and radar data assimilation for severe weather prediction for up to 24 hours. We note that the equitable threat score examined in this paper has many limitations when applied to high-resolution precipitation forecasts, owing to the large penalty associated with position errors. Object-based verification methods (e.g., [29]) and methods that account for position errors (e.g., [5]) will be explored in the future. In fact, an initial effort has been made to compare the number and size characteristics of storm cells predicted on the 4 and 1 km grids during the CAPS realtime forecasts [30].

Acknowledgments

This research was supported by a NOAA Collaborative Science and Technology Applied Research (CSTAR) Grant NA17RJ1227 and by the National Science Foundation Grants AGS-0738370, AGS-0802888, and EEC-0313747. The forecasts were produced at the National Institute for Computational Sciences, University of Tennessee, as part of the national TeraGrid (currently XSEDE) supercomputing allocation.