Abstract

It is well documented that soil moisture has a strong impact on precipitation forecasts of numerical weather prediction models. Several microwave satellite soil moisture retrieval data products have also become available for applications. However, these observational data products have not been employed in any operational numerical weather or climate prediction models. In this study, a preliminary test of assimilating satellite soil moisture data products from the NOAA-NESDIS Soil Moisture Operational Product System (SMOPS) into the NOAA-NCEP Global Forecast System (GFS) is conducted. Using the ensemble Kalman filter (EnKF) introduced in recent publications and implemented in the GFS, the multiple-satellite blended daily global soil moisture data from SMOPS for the month of April 2012 are assimilated into the GFS. The forecasts of surface variables, anomaly correlations of isobar heights, and precipitation forecast skills of the GFS with and without the soil moisture data assimilation are assessed. The surface and deep layer soil moisture estimates of the GFS after the satellite soil moisture assimilation are found to have slightly better agreement with the ground soil moisture measurements at dozens of sites across the continental United States (CONUS). Forecasts of surface humidity and air temperature, 500 hPa height anomaly correlations, and the precipitation forecast skill show a certain level of improvement with the soil moisture assimilation relative to those without it. However, the methodology for assimilating soil moisture data into operational GFS runs still requires further development and testing.

1. Introduction

Soil moisture is a critical hydrospheric state variable that often limits the exchanges of water and energy between the atmosphere and land surface, controls the partitioning of rainfall among evaporation, infiltration, and runoff, and thus may have significant impacts on numerical weather, climate, and hydrologic predictions. The Global Forecast System (GFS) of the National Centers for Environmental Prediction (NCEP) of NOAA is the primary weather forecast model that provides up to 16-day weather forecasts for users around the world. Each GFS run requires a set of initial values of system state variables, including soil moisture. For soil moisture, current GFS operational runs use estimates of the Noah land surface model (LSM) based on precipitation simulations from previous GFS runs. Because of the uncertainties associated with the precipitation simulations and other meteorological forcing data for the Noah LSM, the initial values used for the GFS runs may not represent the real-world soil moisture level, which may contribute to errors in the GFS forecasts.

In the past decade, several sets of global soil moisture data products have been generated from satellite observations and have become available for various applications [1, 2]. The NASA Soil Moisture Active/Passive (SMAP) mission (previously called HYDROS mission) was launched solely for observing global soil moisture [3] in January 2015. A Soil Moisture Operational Product System (SMOPS) has been developed at the National Environmental Satellite, Data, and Information Service (NESDIS) of NOAA to provide global satellite soil moisture data products primarily for numerical weather prediction applications at NOAA, NCEP, and for other users [4]. With those soil moisture data becoming conveniently available, it is a natural next step to apply them in the numerical weather prediction models and examine their impacts on numerical weather predictions.

In past years, many studies have explored approaches to assimilating satellite soil moisture observations into land surface models to improve land surface process simulations and in turn to improve numerical weather predictions [5–13]. The EUMETSAT (European Organization for the Exploitation of Meteorological Satellites) ASCAT surface soil moisture product has been assimilated into the ECMWF numerical weather prediction system [6, 14]. Those studies found that the ASCAT soil moisture nudging scheme improves the model soil moisture and screen-level parameters but has a slightly negative impact on the atmospheric forecasts. They also demonstrated a neutral impact on both soil moisture and screen-level parameter forecasts when ASCAT soil moisture data were assimilated via an extended Kalman filter approach. At the United Kingdom Meteorological Office (UKMO), a simple nudging scheme for ASCAT soil moisture data was implemented in operations in July 2010, and it showed positive evaluation results for soil moisture analysis and forecast scores [15]. Using an ensemble Kalman filter, Draper et al. [16] confirmed the potential of satellite-based soil moisture data for NWP applications by assimilating data from both the active microwave ASCAT and the passive microwave AMSR-E satellite instruments.

In this study, we attempt to implement the well-documented soil moisture data assimilation approach in the NCEP GFS as we prepare for the future use of the global soil moisture data products from NASA’s SMAP mission. The preparation for applying SMAP data products includes the following steps: (1) develop and make operational a soil moisture product system that meets all requirements of soil moisture data needs for NOAA applications, especially in the GFS; (2) implement the Kalman filter data assimilation algorithm to ingest satellite soil moisture data products into the GFS; (3) examine the impact of assimilating satellite soil moisture data products on GFS forecasts; and (4) make the soil moisture data assimilation utility in GFS operational. The global Soil Moisture Operational Product System (SMOPS) has been developed and made operational to provide global soil moisture data products ready for use in the GFS from observations of the advanced scatterometer (ASCAT) on the operational MetOp-A and MetOp-B satellites of EUMETSAT, the WindSat of the Naval Research Lab (NRL), and the Soil Moisture and Ocean Salinity (SMOS) satellite of the European Space Agency (ESA). An ensemble Kalman filter (EnKF) has been implemented in the GFS to assimilate the SMOPS data products. Section 2 describes the satellite soil moisture data products provided by the NESDIS SMOPS. Section 3 introduces the implementation of the EnKF soil moisture data assimilation algorithm in the GFS. Section 4 examines the impact of assimilating the SMOPS data product on GFS forecasts by validating the GFS model simulations against ground measurements of major system state variables. Section 5 compares the anomaly correlations, bias, and root-mean-square errors of isobar heights, and Section 6 presents the precipitation forecast skill scores. Finally, the methodology and findings of this study are summarized and discussed in Section 7.

2. Satellite Soil Moisture Data Products

The NCEP GFS, North American Mesoscale System (NAM), and their associated assimilation systems include a land surface model (LSM) component that requires soil moisture data as an input for accurate weather and seasonal climate predictions. Currently, soil moisture in the NCEP models is estimated via the background simulation of the LSM of the assimilation system. These soil moisture estimates contain considerable biases and uncertainties. In the past decades, several low-frequency microwave satellite sensors have been used to retrieve surface soil moisture with a certain level of success [1]. Satellite-based global soil moisture observational data products are believed to provide a substantial constraint on the model estimate uncertainties and therefore to improve the accuracy of global and mesoscale weather forecasts. However, these satellite soil moisture data products have not been used by the NCEP numerical weather prediction (NWP) models, either because their quality or their availability and formats do not meet the NWP model operational requirements, or because algorithms for ingesting the soil moisture data products into the NWP models have not been implemented or tested. To meet the NCEP soil moisture data needs, NOAA-NESDIS has developed a Soil Moisture Operational Product System (SMOPS) to either ingest or retrieve near-real-time soil moisture data from available satellite observations and merge them into a single data layer for better spatial and temporal coverage [4]. SMOPS data used in this research are soil moisture retrievals from ASCAT on EUMETSAT’s MetOp-A satellite and the European Space Agency (ESA) Soil Moisture and Ocean Salinity (SMOS) satellite. Processing of these data products consists mainly of converting their data files from their original format (BUFR or HDF) to the SMOPS internal binary format and resampling them to 0.25 degree latitude-longitude grids. More details of those data products, for both the individual satellite sensors and their blend, and their quality assessments are presented by Zhan et al. [4].

3. Soil Moisture Data Assimilation Method in Global Forecast System

The NCEP GFS is the operational NCEP global spectral numerical forecast model (and its associated ensemble Kalman filter (EnKF) hybrid data assimilation system providing the initial states) based on the primitive dynamical equations for fluid dynamics and a suite of parameterizations for atmospheric physics. This model had substantial upgrades in recent years (http://www.emc.ncep.noaa.gov/GFS). In particular, the Noah land surface model (LSM) (Version 2.7.1) replaced the Oregon State University (OSU) LSM to describe the land surface processes [17–20]. The Noah LSM has four soil layers (10, 30, 60, and 100 cm thick) and includes updated treatments of frozen soil physics, infiltration and runoff, snowpack, canopy resistance, ground heat flux, soil thermal conductivity, direct surface evaporation, and green vegetation cover. The land surface skin temperature (LST) is derived from the surface energy budget. Momentum roughness lengths over land are prescribed for each month based on calculations from the vegetation and land use dataset of Dorman and Sellers [21], but a new formula is used for the thermal roughness lengths [22, 23], which can substantially reduce the daytime cold bias in land surface skin temperature and the low-level warm bias over arid land areas during warm seasons. A lookup table used in the land surface scheme to control minimum canopy resistance and root depth number was updated to reduce excessive evaporation and thereby reduce the cool and moist bias in the near-surface air temperature and moisture fields during the warm season. In terms of land surface characteristics, 9 soil texture classes [24, 25] and 13 vegetation types [21] are used. Green vegetation fraction (GVF) is obtained from the NESDIS 5-year (from April 1985 to March 1991 with the year 1988 excluded) Normalized Difference Vegetation Index (NDVI) monthly climatology [26]. Monthly variation of snow-free surface albedo is derived in reference to Staylor and Wilbur [27], and for snow cases, the albedo is calculated in the Noah LSM. Longwave emissivity is prescribed to be unity (black-body emission) for all surface categories.

A new hybrid EnKF, three-dimensional variational (3DVAR) data assimilation system, GSI, was implemented into the analysis system of GFS called the Global Data Assimilation System (GDAS). In this system, the background error used to project the information in the observations into the analysis is created by a combination of a static background error and a new background error produced from a lower resolution (a horizontal resolution of T254) ensemble Kalman filter [28]. The atmospheric analysis is generated every 6 hours by the GSI with the GFS previous forecast as the background. This analysis is then used as the initial conditions for GFS subsequent forecasts, and the cycle continues.

To assimilate soil moisture observations into the GFS, the ensemble Kalman filter (EnKF) is selected and implemented. The EnKF is a Monte Carlo variant of the Kalman filter [29] and works sequentially by performing in turn a model forecast and a data assimilation update [30]. The EnKF was demonstrated for land data assimilation in synthetic studies where it compared well to the weak constraint variational “representer” method [8] and favorably to the extended Kalman filter [9]. Overall, the EnKF is flexible in its treatment of errors in model dynamics and parameters and is very suitable for the modestly nonlinear and intermittent character of land surface processes.
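The forecast-then-update cycle described above can be illustrated with a minimal sketch of the update step for one grid cell. This is not the GFS implementation: it assumes a scalar state (surface soil moisture) that is observed directly, uses the perturbed-observation form of the EnKF, and the function name `enkf_update`, the ensemble size, and all numerical values are hypothetical.

```python
import random
import statistics


def enkf_update(ensemble, obs, obs_err_std, rng):
    """Perturbed-observation EnKF update for a scalar, directly observed state.

    Assumption: the observation operator is the identity (H = 1), so the
    Kalman gain reduces to P_f / (P_f + R).
    """
    # Forecast error variance is estimated from the ensemble spread.
    p_f = statistics.variance(ensemble)
    r = obs_err_std ** 2
    gain = p_f / (p_f + r)
    # Each member assimilates its own randomly perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_err_std) - x) for x in ensemble]


# Hypothetical example: a wet model ensemble and a drier satellite retrieval.
rng = random.Random(0)
forecast = [0.30 + rng.gauss(0.0, 0.04) for _ in range(20)]
analysis = enkf_update(forecast, obs=0.22, obs_err_std=0.04, rng=rng)
```

In this toy case the analysis ensemble mean is pulled from the forecast mean toward the retrieval, by an amount set by the ratio of ensemble spread to assumed observation error.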

In this study, considering that soil moisture variation within nonraining days is small and that the blended soil moisture data from the SMOPS represent only the daily soil moisture level, the soil moisture data assimilation is carried out at the 00, 06, 12, and 18 UTC cycles in the GFS-GSI system. To save computational resources, the GFS week one forecast (0–192 hrs) is run only at the 00 UTC cycle. Before blending, the SMOPS already scales the surface layer soil moisture retrievals from the individual satellite sensors using Noah LSM multiple-year grid-wise means and standard deviations [4], so the blended soil moisture data are assumed to have the same climatology as the model simulations of the Noah LSM used in the GFS.
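The mean and standard deviation scaling mentioned above can be sketched as a simple standardize-and-rescale step for one grid cell. The function name and the climatological values below are hypothetical, and the actual SMOPS procedure is documented in [4]; this only illustrates the general idea.

```python
def rescale_to_model_climatology(retrieval, sat_mean, sat_std, model_mean, model_std):
    """Map a satellite retrieval onto the model climatology by matching mean and std."""
    if sat_std <= 0.0:
        # No variability in the satellite record: fall back to the model mean.
        return model_mean
    # Standardize against the satellite climatology, then rescale to the model's.
    standardized = (retrieval - sat_mean) / sat_std
    return model_mean + standardized * model_std


# Hypothetical grid cell: the satellite record is drier and more variable than the model.
scaled = rescale_to_model_climatology(
    0.15, sat_mean=0.18, sat_std=0.05, model_mean=0.28, model_std=0.04
)
```

Here a retrieval 0.6 satellite standard deviations below the satellite mean becomes a value 0.6 model standard deviations below the model mean.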

4. Impact of Soil Moisture Assimilation on GFS Surface State Variables

Using the EnKF implemented in the GFS, the global satellite soil moisture data products from the NESDIS SMOPS have been assimilated for the whole month of April and early May 2012. The GFS-GSI system was run from 1 April 2012 to 5 May 2012 with (analysis run) and without (control run) the EnKF assimilation of the SMOPS blended soil moisture data products. The GFS week one forecast (0–192 hrs) was carried out only at the 00 UTC cycle. The impact of the soil moisture data assimilation on the NWP is then assessed by comparing the surface state variable forecasts, the anomaly correlation of the pressure-level height forecasts, and the precipitation forecast skill scores with and without the assimilation over this experiment of slightly more than one month.

For the surface state variable forecasts, we first check the soil moisture field. Figure 1 compares soil moisture over the CONUS between the SMOPS data product and the GFS simulations at the first soil layer, averaged over April 2012. The time average for the SMOPS used in the assimilation is computed from 1 April to 5 May 2012 based on the daily soil moisture product. For the GFS simulations, the first layer soil moisture at 18:00 UTC from 1 April to 5 May 2012 is used. It is evident that the difference between the SMOPS and the GFS control run is quite large. As mentioned before, the SMOPS data have been scaled before blending according to the GFS annual climatology [4]. The soil moisture from the SMOPS is around 0.2–0.3 g/kg in the east CONUS and below 0.2 g/kg in the dry west CONUS, except for Washington and Oregon states where soil moisture is around 0.2–0.3 g/kg. The southwest CONUS and northern Mexico are particularly dry, with surface soil moisture below 0.1 g/kg (Figure 1(a)). The GFS control run shows much higher moisture over the whole CONUS (Figure 1(b)) in this month. It is about 0.1 g/kg higher in the eastern and southern regions and about 0.2 g/kg higher in the northern regions, that is, mountain regions. In the northeast regions and the adjacent parts of eastern Canada, the simulated soil moisture is close to the SMOPS.

The soil moisture data assimilation can substantially adjust the soil moisture in the GFS model. The difference in top layer soil moisture between the sensitivity and control runs shows that the largest impact occurs around the Mississippi River Basin and in a large part of the western US, particularly over the mountain areas where the soil moisture is reduced by up to 0.2–0.25 g/kg (Figure 1(c)). As expected, there is little difference in the northeast US and the adjacent regions of eastern Canada.

To examine the improvement of soil moisture simulations, we use the ground measurements from the U.S. Department of Agriculture (USDA) Soil Climate Analysis Network (SCAN) [31], which provide independent observations of soil moisture, to validate the soil moisture estimates. After quality control steps, 26 and 25 sites of the SCAN network are selected over the eastern and the western CONUS, respectively. The corresponding GFS estimates with and without the data assimilation, as well as the SMOPS retrievals, are compared with the ground measurements at these sites. The comparison statistics are listed in Table 1. The SMOPS biases are negative over both the east and west CONUS, indicating that the SMOPS is drier than the USDA-SCAN observations, particularly over the west CONUS. The GFS control run is too wet over the whole CONUS. The GFS-EnKF run corrects the wet bias of the GFS control run but is slightly drier than the USDA-SCAN ground measurements. The table also indicates that the GFS-EnKF run reduces the RMSE and increases the correlation coefficient between the model and the ground measurements over both the east and west CONUS. It even performs better than the SMOPS data, except for the RMSE over the east CONUS, where the GFS-EnKF run has a slightly higher RMSE than the SMOPS data. It should be noted that the comparison statistics do not account for the scale differences between the ground measurements and the model estimates or satellite retrievals.
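For reference, the three comparison statistics discussed above (bias, RMSE, and correlation coefficient) can be computed as in the following minimal sketch. The function name and the sample values are hypothetical and are not taken from Table 1; the sketch assumes paired, gap-free time series at a single site.

```python
import math


def comparison_stats(estimates, observations):
    """Return (bias, RMSE, Pearson correlation) of estimates against observations."""
    n = len(observations)
    diffs = [e - o for e, o in zip(estimates, observations)]
    bias = sum(diffs) / n  # positive bias means the estimate is wetter
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observations)
    corr = cov / math.sqrt(var_e * var_o)
    return bias, rmse, corr


# Hypothetical site: the model tracks the observations but is uniformly too wet.
site_obs = [0.20, 0.25, 0.30, 0.22]
site_model = [0.25, 0.30, 0.35, 0.27]
bias, rmse, corr = comparison_stats(site_model, site_obs)
```

The example shows why the three statistics are reported together: a constant wet offset gives a nonzero bias and RMSE even though the correlation is perfect.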

The improvement of soil moisture can directly affect the near-surface forecast of atmospheric humidity and temperature. This can be seen from the one-month mean forecast (from 2 April to 5 May 2012) of the surface relative humidity and temperature at 2 meters, averaged over the west CONUS and east CONUS, respectively (Figure 2). Daytime surface relative humidity in the west CONUS in both GFS runs is lower than observed throughout the 7-day forecast and close to the observations at nighttime, though the GFS sensitivity run is slightly drier. The surface temperature over the west CONUS is in good agreement with the observations, and the GFS sensitivity run does not have a large impact (Figure 2(c)). Over the east CONUS, surface humidity in the GFS control run clearly shows a positive bias in both daytime and nighttime, which is consistent with the horizontal surface moisture distribution shown in Figure 1. This bias is reduced in the GFS sensitivity run, but the forecast nighttime surface humidity is still higher than observed. The surface temperature in the GFS control run shows a daytime cold bias after 4 days of forecast, and the cold bias grows with forecast time. This cold bias is reduced in the sensitivity run (Figure 2(d)), indicating that the soil moisture data assimilation can improve the surface temperature forecast.

5. Impact of Soil Moisture Assimilation on GFS hPa Height Forecasts

With the improvement of surface field simulations from the soil moisture data assimilation, there should be impacts on the planetary boundary layer (PBL) and lower troposphere. To examine these impacts, we use sounding data to validate the specific humidity and temperature simulations. Figure 3 compares the vertical profiles of specific humidity bias and RMSE over the CONUS. The specific humidity bias in the GFS-EnKF run is much reduced in the lower troposphere, from 500 hPa to the surface, where the RMSE is also somewhat reduced. The moisture in the GFS is too high near the surface, from 850 hPa to the surface, though the GFS-EnKF run reduces this somewhat. The impacts of this high moisture bias on other aspects such as the land surface model and the PBL scheme should be investigated.

Figure 4 presents a comparison of the vertical profiles of temperature bias and RMSE over the CONUS, validated against sounding data. The GFS control run shows a cold bias near the top of the troposphere, a warm bias in the middle and lower troposphere, and a slight cold bias near the surface. The GFS-EnKF run does not change the temperature bias at high levels, but there are some impacts from 500 hPa to the surface: it increases the warm bias but shows an improvement near the surface. Throughout the troposphere, the GFS-EnKF run reduces the RMSE, as shown in Figure 4(b).

Having examined the model performance for surface fields and vertical profiles against observations in the above sections, we further check the model forecasts by calculating the global anomaly correlation (AC) of geopotential heights at 500 hPa for day 5, as well as the bias and RMSE against the model analyses (GDAS). Figure 5 compares the global AC scores for both runs; the GFS-EnKF run shows a positive impact on AC, with an increase of 0.003 for the global region. This improvement becomes clear after three weeks.
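The anomaly correlation used above can be sketched as follows. This is a minimal illustration that assumes the centered form of the AC, in which the mean anomaly is removed before correlating, with optional area weights such as the cosine of latitude. The function name and the sample heights are hypothetical, and the exact formulation in the NCEP verification package may differ.

```python
import math


def anomaly_correlation(forecast, analysis, climatology, weights=None):
    """Centered anomaly correlation between forecast and verifying analysis."""
    n = len(forecast)
    w = weights if weights is not None else [1.0] * n
    wsum = sum(w)
    # Anomalies are departures from climatology at each grid point.
    fa = [f - c for f, c in zip(forecast, climatology)]
    aa = [a - c for a, c in zip(analysis, climatology)]
    fbar = sum(wi * x for wi, x in zip(w, fa)) / wsum
    abar = sum(wi * y for wi, y in zip(w, aa)) / wsum
    cov = sum(wi * (x - fbar) * (y - abar) for wi, x, y in zip(w, fa, aa))
    var_f = sum(wi * (x - fbar) ** 2 for wi, x in zip(w, fa))
    var_a = sum(wi * (y - abar) ** 2 for wi, y in zip(w, aa))
    return cov / math.sqrt(var_f * var_a)


# Hypothetical 500 hPa heights (m) at four grid points.
clim = [5500.0, 5600.0, 5700.0, 5800.0]
anl = [5520.0, 5590.0, 5730.0, 5790.0]
fcst = [5510.0, 5595.0, 5715.0, 5795.0]  # same anomaly pattern, half the amplitude
ac = anomaly_correlation(fcst, anl, clim)
```

Because the AC measures pattern agreement rather than amplitude, the forecast in this example scores a perfect correlation even though its anomalies are only half as strong as those of the analysis.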

The reduction in global mean bias and RMSE for geopotential heights at 850 hPa and 500 hPa illustrates the improvement of the forecasts (Figure 6). The GFS control run shows a negative bias that grows during the seven-day forecast. The GFS-EnKF run reduces this bias, and the improvement is significant at the 95% confidence level. The RMSE analysis shows some improvement from the GFS-EnKF run, but it does not reach statistical significance.

Thus, the assimilation of soil moisture can reduce errors of surface temperature and surface humidity and errors of the vertical temperature and humidity profiles, modify the boundary layer structure and atmospheric stability, and finally have significant impacts on the high-level heights as well as the precipitation processes as discussed in Section 6.

6. Quantitative Precipitation Forecasts

The impact of soil moisture data assimilation can be investigated further through precipitation forecasting in the model. A quantitative precipitation forecast (QPF) verification is used to evaluate the GFS model performance. The precipitation observation estimates come from the Climate Prediction Center’s (CPC) gauge observation over the CONUS, which is usually used in the NCEP global NWP Model Deterministic Forecast Verification Package (http://www.emc.ncep.noaa.gov/gmb/STATS_vsdb/) [32]. This package applies a Monte Carlo significance test rather than the Student’s t-test applied in Figures 2 and 6, because conventional significance tests such as Student’s t-test are not applicable to precipitation skill scores.
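As a rough illustration of the idea behind a Monte Carlo significance test, and not the procedure implemented in the NCEP verification package, the sketch below randomly swaps the labels of two paired experiments to build a null distribution for the mean score difference. The function name, the per-case scores, and the number of trials are all hypothetical.

```python
import random


def paired_monte_carlo_pvalue(scores_a, scores_b, n_trials=10000, seed=0):
    """Two-sided p-value for the mean paired difference via random label swaps."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    count = 0
    for _ in range(n_trials):
        # Swapping the two experiments on a case flips the sign of its difference.
        shuffled = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(shuffled) >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_trials + 1)


# Hypothetical per-case scores: experiment A is consistently better than B.
scores_enkf = [0.35] * 10
scores_ctl = [0.25] * 10
p_value = paired_monte_carlo_pvalue(scores_enkf, scores_ctl, n_trials=2000)
```

A small p-value indicates that a difference this large rarely arises when the experiment labels are assigned at random, which is the sense in which a score difference is called significant at a given confidence level.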

The equitable threat scores (ETS) and bias scores [33] of precipitation over the CONUS for the period from 2 April to 5 May 2012 are calculated with the CPC observation data. Figures 7 and 8 illustrate the GFS forecasts from 36 h to 60 h (day 2) and from 60 h to 84 h (day 3), respectively. For day 2, the ETS from the GFS-EnKF is slightly higher than that from the GFS-CTL for the light or heavy precipitation amounts, and the bias reduction in GFS-EnKF versus GFS-CTL is quite consistent from light to heavy precipitation and significant at the 95% confidence level for light and medium precipitation amounts. These results indicate that the soil moisture data assimilation has a positive impact on precipitation forecasting.
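For readers unfamiliar with these scores, the sketch below computes the ETS and the bias score from a 2x2 contingency table at one precipitation threshold, using the standard definitions. The function name, the threshold, and the sample amounts are hypothetical.

```python
def ets_and_bias(forecast, observed, threshold):
    """Equitable threat score and frequency bias for amounts >= threshold."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
    n = len(observed)
    # Hits expected by chance given the forecast and observed event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denom = hits + misses + false_alarms - hits_random
    ets = (hits - hits_random) / denom if denom else float("nan")
    n_obs = hits + misses
    # Bias score > 1 means the event is forecast more often than it is observed.
    bias = (hits + false_alarms) / n_obs if n_obs else float("nan")
    return ets, bias


# Hypothetical 24 h amounts (mm) at six stations, scored at a 5 mm threshold.
fcst_mm = [0.0, 3.0, 12.0, 0.5, 7.0, 0.0]
obs_mm = [0.0, 6.0, 10.0, 0.0, 2.0, 8.0]
ets, bias_score = ets_and_bias(fcst_mm, obs_mm, threshold=5.0)
```

In this toy case there is one hit, two misses, and one false alarm, so the single hit is no better than chance (ETS of 0) and the event is underforecast (bias score of 2/3).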

In terms of day 3 precipitation forecasting, Figure 8 indicates that the GFS-EnKF yields higher equitable threat scores than the CTL, and this skill difference attains the 95% confidence level for the majority of the light and medium precipitation amounts, but there is also a large drop in ETS for the heavy precipitation range. Because fewer stations observe heavy precipitation, the scores calculated for that range may be less accurate than those for light or medium precipitation. Similar to day 2, the bias score comparison between the two runs shows that the GFS-EnKF gives a substantial reduction of precipitation bias over the whole range of precipitation amounts, with high confidence.

7. Conclusion and Discussion

It is well documented that soil moisture has a strong impact on precipitation forecasts of numerical weather prediction models [34]. Several microwave satellite soil moisture retrieval data products have also become available for applications [2]. However, it has not been demonstrated how these satellite soil moisture data products could improve numerical weather or seasonal climate predictions. A preliminary test of assimilating NOAA-NESDIS SMOPS soil moisture data products into the NOAA-NCEP Global Forecast System is conducted in this study. From the above analysis of the results, it may be concluded that assimilating satellite soil moisture data products may have certain positive impacts: it improves the estimates of surface and deeper layer soil moisture, surface humidity, and air temperature and increases the anomaly correlations of isobar heights. The statistically significant impacts on the skill of GFS forecasts for lower precipitation amounts over the CONUS are also notable, especially the reduction of precipitation bias.

However, several issues need to be addressed before the GFS model could operationally ingest satellite soil moisture observations. Firstly, microwave satellite soil moisture retrievals represent the soil moisture levels of various soil depths while the GFS model surface layer is always 10 cm. C/X-band sensors (such as ASCAT, AMSR-E, WindSat, and the Advanced Microwave Scanning Radiometer on JAXA’s GCOM-W1 satellite, AMSR2) have a typical sensing depth of 1-2 cm. The sensing depth varies with top layer soil moisture level too: the depth is shallower when the top layer soil is wetter. The sensing depth of L-band sensors (e.g., SMOS of ESA and future SMAP of NASA) could be 5 cm. In this study, the satellite observations are assumed to represent the top 10 cm soil moisture layer of GFS. The validity of this assumption is still unknown.

Secondly, the EnKF data assimilation algorithm requires that both the observational soil moisture data and the GFS soil moisture simulations have a Gaussian distribution and no bias from each other. This study used their multiyear means and standard deviations to make the satellite retrieval climatology match the GFS simulations. The impact study was carried out for only about one month (April 2012) because of the GFS computing resource limitation. Results in Figure 1 indicate that satellite soil moisture retrievals are drier than GFS simulations for the month. Whether the result of this impact study is influenced by the seasonality difference between the satellite retrievals and the GFS model simulations needs further investigation.

Thirdly, the optimal EnKF data assimilation result requires the model and observation error covariances to be determined correctly. We managed to make the normalized EnKF innovation statistics meet the optimal requirements by empirically adjusting the model and observation error covariance level according to Crow and Van Loon [35]. But how to routinely or operationally carry out this process in operational GFS runs still requires further developments and tests.
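The innovation check mentioned above can be sketched as follows. The sketch assumes the common diagnostic in which each innovation (observation minus forecast ensemble mean) is normalized by the square root of the sum of forecast and observation error variances, so that a well-tuned filter gives normalized innovations with a mean near 0 and a variance near 1. The function name and all values are hypothetical, and the tuning procedure used in this study follows [35].

```python
import math
import statistics


def normalized_innovations(obs, forecast_means, forecast_vars, obs_err_var):
    """Innovations scaled by their expected standard deviation sqrt(P_f + R)."""
    return [
        (y - m) / math.sqrt(p + obs_err_var)
        for y, m, p in zip(obs, forecast_means, forecast_vars)
    ]


# Hypothetical sequence of four assimilation times at one grid cell.
obs_seq = [0.22, 0.30, 0.18, 0.27]
fc_means = [0.25, 0.26, 0.21, 0.24]
fc_vars = [0.0009] * 4
nu = normalized_innovations(obs_seq, fc_means, fc_vars, obs_err_var=0.0007)
nu_mean = statistics.mean(nu)
nu_var = statistics.variance(nu)
```

A variance well above 1 would suggest that the assumed model or observation errors are too small, and a variance well below 1 that they are too large, which is the kind of signal used when adjusting the error levels empirically.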

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank our many collaborators or partners at NCEP/EMC, NESDIS, and JCSDA for their useful suggestions and beneficial comments. Internal reviews from Youlong Xia and Jiarui Dong at NCEP/EMC are acknowledged.