Abstract

Observing System Simulation Experiments (OSSEs) have been conducted to evaluate the effect of Argo data assimilation on ocean reanalysis in the Pacific region. The “truth” is obtained from a 5-year model integration from 2003 to 2007 based on the MIT general circulation model with real, time-varying atmospheric forcing. The “observations” are projections of the truth onto the observational network, which includes ocean station data, CTD, various BTs, and Argo, with white noise added to simulate observational errors. The data assimilation method employed is a sequential three-dimensional variational (3D-Var) scheme within a multigrid framework. Results show that the interannual variability of temperature, salinity, and current fields can be reconstructed fairly well. The spread of temperature anomalies in the tropical Pacific region is also reproduced accurately when Argo data is assimilated, which may provide a reliable initial field for forecasting subsurface temperature and currents in the tropical Pacific region. Adjusting salinity using the T-S relationship is vital in the tropical Pacific region. However, this adjustment has almost no effect in the northwest Pacific if Argo data is included in the reanalysis.

1. Introduction

An ocean reanalysis system of the global ocean has been established recently by the National Marine Data and Information Service (NMDIS) of China for the purpose of understanding monthly, annual, and interannual changes of sea surface height (SSH), as well as three-dimensional (3D) temperature, salinity, and currents. MITgcm (MIT general circulation model), a state-of-the-art ocean model that is also employed in the Estimating the Circulation and Climate of the Ocean (ECCO) reanalysis project, serves as the ocean dynamical model in the reanalysis system [1]. The ocean data assimilation scheme used is a sequential 3D variational (3D-Var) analysis scheme designed to assimilate temperature and salinity using a multigrid framework [2]. This sequential 3D-Var analysis scheme operates in 3D space and can retrieve resolvable information from longer to shorter wavelengths for a given observation network to yield a multiscale analysis. The historic observational data assimilated in the reanalysis system include temperature and salinity profiles from ocean stations, conductivity-temperature-depth (CTD) casts, various bathythermographs (BTs), and Argo floats, as well as sea surface height anomaly (SSHA) from altimetry and sea surface temperature (SST) from satellite remote sensing. The other purpose of developing the global ocean reanalysis is to provide better real-time (daily or hourly resolution) lateral boundary conditions for the ocean dynamic model used for the China Ocean ReAnalysis (CORA; [3]) developed by the NMDIS, from which reanalysis products of SSH, 3D temperature, salinity, and currents from 1986 to 2008 in the China coastal waters and adjacent seas have been produced (http://www.cora.net.cn).

The 21st century Argo (Array for Real-time Geostrophic Oceanography) observing network is very important for global ocean climate studies. In particular, the salinity observations provided by the Argo network give significantly more information than the 20th century XBT observing network. Cooper [4] pointed out that assimilating temperature alone will deteriorate the density field, which can result in a worse analysis of the current field than with no data assimilation at all. The increasing number of salinity observations from Argo is essential for improving the structure of the density field during data assimilation. To evaluate the impact of Argo on ocean data assimilation, many studies have been carried out by various institutes (e.g., [5–8]). However, the concrete effect of Argo data assimilation on ocean reanalysis in the Pacific region remains unclear, especially in the subsurface layers of the tropical Pacific region and the northwest Pacific region. In addition, it is also necessary to know the role of the T-S relationship in bivariate data assimilation now that salinity data has increased dramatically thanks to Argo.

The Observing System Simulation Experiment (OSSE) is a useful approach for evaluating the impact of an ocean observing system [9]. Within the OSSE framework, simulated rather than real observations serve as the input to a specified data assimilation system [10]. In this study, simulated observational values are drawn from a “truth” model. In addition, at every model grid point, time series of the “truth” values of the state variables, such as temperature, salinity, and currents, can be obtained from the “truth” model integration. Here we intend to evaluate the effect of Argo data assimilation on the ocean reanalysis in the Pacific region using the above-mentioned reanalysis system, especially in the tropical Pacific region and the northwest Pacific region, which can serve as an essential step toward a deeper understanding of the effect of Argo data assimilation on the World Ocean. This study is organized as follows: Sections 2 and 3 briefly describe the numerical model and the ocean data assimilation scheme, respectively. Section 4 gives the sensitivity experiment design. The impact of Argo on the ocean reanalysis in the Pacific region and the conclusions are presented in Sections 5 and 6, respectively.

2. Numerical Model

MITgcm was developed by Marshall et al. [11]. According to the MITgcm manual, one hydrodynamical kernel is used to drive forward both atmospheric and oceanic models. It has a nonhydrostatic capability and can be used to study both small-scale and large-scale processes. Finite volume techniques are employed, yielding an intuitive discretization and supporting the treatment of irregular geometries using orthogonal curvilinear grids and shaved cells. In addition to these characteristics, MITgcm is designed to perform efficiently on a wide variety of computational platforms (http://Mitgcm.org).

The model domain in this study spans 74.25°S–84.75°N and 0.25°E–359.75°E. The KPP [12, 13] vertical mixing scheme is adopted. The horizontal C-grid has 1/2° × 1/2° resolution telescoping to 1/4° meridional spacing near the equator, and the horizontal grid has 720 × 348 points. The standard z-level vertical grid is used, with a total of 35 vertical levels. ETOPO5 bottom topography [14] is used in the model, and the minimum and maximum water depths are 5 m and 5000 m, respectively. The time step is 600 s. The atmospheric forcing is from the National Centers for Environmental Prediction (NCEP) reanalysis, which includes daily wind speed at 10 m, net heat flux, and net freshwater flux. Wind speed is converted to wind stress using the formula of Yelland and Taylor (1994). The surface temperature and salinity are relaxed to monthly climatologies, and the relaxation time scale is set to 100 days.
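The surface relaxation described above can be illustrated with a minimal sketch. It assumes a simple Newtonian form, d(value)/dt = -(value - climatology)/tau, integrated with an explicit step; the source gives only the 600 s time step and the 100-day time scale, so the functional form, the function names, and the example temperatures are assumptions made for illustration.

```python
# Sketch of Newtonian surface relaxation (assumed form, not taken from the source).
DT_SECONDS = 600.0              # model time step stated in the text
TAU_SECONDS = 100.0 * 86400.0   # 100-day relaxation time scale stated in the text


def relax_step(value, climatology, dt=DT_SECONDS, tau=TAU_SECONDS):
    """Advance one explicit step of d(value)/dt = -(value - climatology) / tau."""
    return value + dt * (climatology - value) / tau


def relax(value, climatology, n_steps):
    """Apply relax_step n_steps times toward a fixed climatological target."""
    for _ in range(n_steps):
        value = relax_step(value, climatology)
    return value


if __name__ == "__main__":
    steps_per_day = int(86400 / DT_SECONDS)
    # Hypothetical surface temperature of 22 degC relaxed toward 20 degC for 100 days.
    sst = relax(22.0, 20.0, 100 * steps_per_day)
    print(f"SST after 100 days: {sst:.3f} degC")
```

With a 100-day time scale, an initial departure from climatology decays by a factor of about 1/e over 100 days, so the constraint is weak compared with the daily forcing.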

3. Data Assimilation Scheme

The multigrid 3D-Var data assimilation scheme developed by Li et al. [2] is used in the reanalysis system. The scheme is able to retrieve resolvable information in 3D space from longer to shorter wavelengths for a given observation network and yield a multiscale analysis. The multigrid technique is introduced into the 3D-Var data assimilation to obtain longwave information of the observations over data-sparse regions and shortwave information over data-dense regions. The cost function can be written as

$$J^{(n)} = \frac{1}{2} X^{(n)T} X^{(n)} + \frac{1}{2} \left(H^{(n)} X^{(n)} - Y^{(n)}\right)^{T} \left(O^{(n)}\right)^{-1} \left(H^{(n)} X^{(n)} - Y^{(n)}\right), \quad n = 1, 2, \ldots, N, \qquad (1)$$

where $X$ is the correction of the state variable relative to the background, $Y$ is the difference between the available observations and the background field interpolated to the observation locations, $O$ is the observation error covariance matrix, and $H$ is the interpolation operator from the model space to the observation space. The superscripts $T$ and $(n)$ denote the transpose and the $n$th grid level, respectively, and $N$ denotes the final level. The background error covariance matrix does not appear in (1) because it is represented implicitly by the grid levels. Compared with the traditional 3D-Var scheme, the multigrid 3D-Var scheme has higher forecast accuracy and lower root-mean-square errors. More details can be found in Li et al. [15].
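To make the structure of the cost function concrete, the sketch below evaluates it on a single grid level. It assumes the form J = (1/2) XᵀX + (1/2) (HX − Y)ᵀ O⁻¹ (HX − Y), with a diagonal observation error covariance; the function name, argument names, and the tiny two-point example are illustrative assumptions and are not taken from the scheme of Li et al.

```python
def cost(x, y, h_rows, obs_var):
    """Evaluate J = 0.5 x^T x + 0.5 (Hx - y)^T O^{-1} (Hx - y) for diagonal O.

    x       : correction on the current grid level (list of floats)
    y       : innovations, i.e. observation minus interpolated background
    h_rows  : rows of the interpolation operator H, one row per observation
    obs_var : observation error variances (the diagonal of O)
    """
    # Penalty on the size of the correction (no explicit background covariance).
    correction_term = 0.5 * sum(xi * xi for xi in x)
    # Misfit between the interpolated correction and the innovations.
    misfit_term = 0.0
    for row, yi, var in zip(h_rows, y, obs_var):
        residual = sum(hij * xj for hij, xj in zip(row, x)) - yi
        misfit_term += 0.5 * residual * residual / var
    return correction_term + misfit_term


if __name__ == "__main__":
    h = [[0.5, 0.5]]   # one observation midway between two grid points
    y = [1.0]          # hypothetical innovation of 1.0 degC
    var = [0.04]       # variance for a 0.2 degC error standard deviation
    print(cost([0.0, 0.0], y, h, var))  # no correction: the misfit dominates
    print(cost([1.0, 1.0], y, h, var))  # correction that fits the observation
```

In a multigrid setting the same function would be minimized on successively finer grids, with the innovations updated after each level; that outer loop is omitted here.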

Figure 1 shows the flowchart of the temperature and salinity data assimilation scheme. First, using polynomial fitting, the T-S relationship is calculated from the simulated temperature and salinity fields. Second, observed temperature data is assimilated into the numerical model using the multigrid 3D-Var data assimilation scheme. Third, the background field of salinity is adjusted according to the assimilated temperature field using the derived T-S relationship. Here we have assumed that the T-S relationship remains unchanged after the temperature is assimilated. Finally, the available salinity observations are assimilated into the model. Following Troccoli et al. [16], a latitudinal filter has been applied to the salinity and temperature increments so that the whole salinity increment is applied only within 30° of the equator. Outside this region, the weight given to the salinity analysis diminishes linearly to zero at 60°N and 60°S. This is done to avoid implementing the salinity correction scheme in areas where stratification is weak.
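The latitudinal filter described above can be sketched as a simple weight function: full weight within 30° of the equator, a linear decrease to zero at 60°, and zero poleward of that. The function and parameter names are hypothetical; only the 30° and 60° limits and the linear taper come from the text.

```python
def salinity_increment_weight(lat_deg, full_lat=30.0, zero_lat=60.0):
    """Weight applied to the salinity increment at latitude lat_deg (degrees north).

    Returns 1 within full_lat of the equator, 0 poleward of zero_lat,
    and a linear taper in between, symmetric about the equator.
    """
    abs_lat = abs(lat_deg)
    if abs_lat <= full_lat:
        return 1.0
    if abs_lat >= zero_lat:
        return 0.0
    return (zero_lat - abs_lat) / (zero_lat - full_lat)


if __name__ == "__main__":
    for lat in (0.0, 30.0, 45.0, -45.0, 60.0, 75.0):
        print(lat, salinity_increment_weight(lat))
```

An increment computed from the T-S relationship would then be multiplied by this weight before being added to the background salinity.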

4. Experiment Setup within the OSSE Framework

4.1. Construction of the “Truth” and “Observed” Data

Velocity, temperature, and salinity in January 2002, derived from a fully coupled data assimilation system of the Geophysical Fluid Dynamics Laboratory (GFDL) developed by Zhang et al. [8], serve as the initial fields for model integration. The model is spun up for 10 years, using looped daily wind stress and net heat flux from NCEP for 2002. Wind stress and net heat flux from 2003 to 2007 are then used to drive the model for 5 years. The resulting simulation is used as the “truth” against which the reanalysis results of the sensitivity experiments are compared to evaluate the impact of Argo data assimilation.

The “observed” data used in the reanalysis sensitivity experiments are constructed by projecting the truth onto a real observational network (limited to the top 1000 m in this study). Data types in the real observational network include XBT, CTD, DRB, OSD, UOR, MRB, and Argo from 2003 to 2007, and the positions of observational profiles come from the World Ocean Database (WOD2009) and the China Argo Real-time Data Center (http://agro.org.cn), respectively. The projection from the model space onto the observational space uses bilinear interpolation in the horizontal direction and Akima interpolation in the vertical direction. Gaussian white noise with a mean of 0.0°C (0.0 psu) and a standard deviation of 0.2°C (0.05 psu) is added to each temperature (salinity) “observation” to simulate random error. For simplicity, “observations” that carry the temporal and spatial information of XBT, CTD, DRB, OSD, UOR, and MRB are called “conventional observations,” and those that carry Argo temporal and spatial information are called “Argo observations.” It should be noted that CTD and Argo profiles have both temperature and salinity observations, while the other profiles may have only temperature observations. Distributions of temperature and salinity of conventional observations and Argo observations in 2006 are shown in Figures 2 and 3, respectively. The Argo observations are much denser than the conventional observations in the model domain, especially in the south Pacific, where conventional observations are sparse. Conventional salinity observations are very limited compared with conventional temperature observations, especially south of 50°S, where conventional salinity observations are almost absent. However, the numbers of temperature and salinity observations from Argo are equivalent in the Pacific region.
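The construction of a simulated observation can be sketched as horizontal bilinear interpolation of the truth followed by the addition of Gaussian noise. The sketch below assumes a regular, ascending longitude-latitude grid and an observation inside it, and it omits the vertical Akima interpolation; the function names and the small example field are hypothetical, while the noise standard deviations are those stated in the text.

```python
import random

TEMP_NOISE_STD = 0.2    # degC, from the text
SALT_NOISE_STD = 0.05   # psu, from the text


def bilinear(field, lons, lats, lon, lat):
    """Bilinearly interpolate field[j][i] (j: latitude, i: longitude) to (lon, lat).

    Assumes regular ascending axes and a target point inside the grid.
    """
    i = min(int((lon - lons[0]) / (lons[1] - lons[0])), len(lons) - 2)
    j = min(int((lat - lats[0]) / (lats[1] - lats[0])), len(lats) - 2)
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    south = (1 - tx) * field[j][i] + tx * field[j][i + 1]
    north = (1 - tx) * field[j + 1][i] + tx * field[j + 1][i + 1]
    return (1 - ty) * south + ty * north


def simulate_observation(truth_value, noise_std, rng):
    """Add zero-mean Gaussian white noise to a truth value."""
    return truth_value + rng.gauss(0.0, noise_std)


if __name__ == "__main__":
    lons, lats = [0.0, 0.5], [0.0, 0.5]
    truth_temp = [[10.0, 12.0], [14.0, 16.0]]   # hypothetical truth field, degC
    rng = random.Random(0)                      # fixed seed for reproducibility
    t_true = bilinear(truth_temp, lons, lats, 0.25, 0.25)
    print(t_true, simulate_observation(t_true, TEMP_NOISE_STD, rng))
```

Using a seeded generator keeps the simulated errors reproducible across the sensitivity experiments, which is one reasonable way to ensure that all runs see the same “observations.”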

4.2. Experiment Setup

Five experiments are presented in Table 1. All these experiments employ the same model setup described in Section 2 and the data assimilation scheme described in Section 3. EXP_1 is the control run with no “observations” assimilated, where the climatological January temperature/salinity field derived from SODA (Simple Ocean Data Assimilation) [17, 18] and climatological monthly wind and net heat flux derived from NCEP serve as its initial condition and driving force, respectively. EXP_1 is spun up for 20 years to provide the initial fields for the five experiments. Starting from these initial fields, the model runs for another five years in each experiment using climatological monthly wind and net heat flux derived from NCEP. During this five-year period, EXP_2 assimilates “conventional observations” and “Argo observations,” while EXP_3 assimilates only “conventional observations.” In EXP_2 and EXP_3, the T-S relationship is used to adjust the background fields of salinity after temperature is assimilated into the numerical model. EXP_4 and EXP_5 are the same as EXP_2 and EXP_3, respectively, except that the T-S relationship is ignored.
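The experiment design described above can be summarized as a small configuration table. The sketch below encodes only what the text states about each experiment; the dictionary layout, key names, and helper function are hypothetical and are not part of the reanalysis system.

```python
# Summary of the five experiments as described in the text (layout is illustrative).
EXPERIMENTS = {
    # Control run: nothing is assimilated, so the T-S adjustment does not apply.
    "EXP_1": {"conventional": False, "argo": False, "ts_adjustment": False},
    "EXP_2": {"conventional": True, "argo": True, "ts_adjustment": True},
    "EXP_3": {"conventional": True, "argo": False, "ts_adjustment": True},
    "EXP_4": {"conventional": True, "argo": True, "ts_adjustment": False},
    "EXP_5": {"conventional": True, "argo": False, "ts_adjustment": False},
}


def experiments_with(**flags):
    """Return the sorted names of experiments whose settings match all given flags."""
    return sorted(
        name
        for name, cfg in EXPERIMENTS.items()
        if all(cfg[key] == value for key, value in flags.items())
    )


if __name__ == "__main__":
    print(experiments_with(argo=True))            # runs that assimilate Argo
    print(experiments_with(ts_adjustment=True))   # runs that use the T-S relationship
```

Read this way, comparing EXP_2 with EXP_3 (or EXP_4 with EXP_5) isolates the effect of Argo, while comparing EXP_2 with EXP_4 (or EXP_3 with EXP_5) isolates the effect of the T-S adjustment.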

5. Impact of Argo Data on the Ocean Reanalysis in the Pacific Region

Figures 4(a) and 4(b) show the temperature and salinity RMS errors of the five experiments in the top 1000 m of the Pacific region, respectively. The RMS errors in EXP_2 (black line) and EXP_4 (pink line), for both temperature and salinity, are much smaller than those in the other experiments. The RMS errors of salinity in EXP_2 and EXP_4 decrease gradually with time and stabilize at about 0.05 psu after 1000 days, while those in EXP_3 (blue line) and EXP_5 (green line) are much larger, increase gradually with time, and exceed 0.15 psu after 1600 days. In addition, the improvement of temperature in EXP_2 and EXP_4 is also obvious compared with that of EXP_3 and EXP_5. This means that Argo data plays a very important role in improving the reanalysis fields of temperature and salinity in the Pacific region. Figures 4(c) and 4(d) show the temperature and salinity RMS errors of the five experiments in the top 100 m of the whole Pacific region, respectively. Figure 4(d) shows that the RMS errors of salinity in EXP_3 are slightly larger than those in EXP_5, which indicates that the upper ocean may not hold an appropriate T-S relationship. Temperature and salinity in the upper ocean are more turbulent and can be affected easily by many factors, such as waves and rainfall. Therefore, the empirical T-S relationship is not representative of the turbulent upper ocean, and the assimilated results adjusted by the T-S relationship are slightly worse than those that are not adjusted. Figures 4(e) and 4(f) show the RMS errors of temperature and salinity in the five experiments between 100 m and 1000 m. The RMS errors of subsurface salinity in EXP_3 are clearly lower than those in EXP_5. In particular, the RMS errors of salinity in EXP_5 are larger than those in EXP_1 throughout, which indicates that the subsurface salinity analysis is somewhat degraded if only the conventional data is assimilated into the numerical model and the T-S relationship is ignored.
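The RMS error used throughout this section compares an analysis with the “truth” at matching grid points. A minimal sketch follows, assuming an unweighted average over the points in the region and depth range of interest; the source does not state whether area or volume weighting is applied, and the function name and sample values are hypothetical.

```python
import math


def rms_error(analysis, truth):
    """Unweighted root-mean-square difference between two equally sized sequences."""
    if len(analysis) != len(truth) or not analysis:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - t) ** 2 for a, t in zip(analysis, truth))
    return math.sqrt(squared / len(analysis))


if __name__ == "__main__":
    truth_temp = [20.0, 15.0, 10.0]      # hypothetical truth values, degC
    analysis_temp = [20.3, 15.0, 9.6]    # hypothetical analysis values, degC
    print(f"RMS error: {rms_error(analysis_temp, truth_temp):.4f} degC")
```

Evaluating this quantity at each output time for a fixed region yields the kind of time series plotted in Figure 4.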

Figure 5 presents the RMS errors of temperature and salinity in the five experiments for the top 1000 m in the tropical Pacific region (5°S–5°N). The RMS errors of salinity in EXP_2 are lower than those in EXP_4. The RMS errors of salinity in EXP_4 fluctuate sharply, with a maximum of 0.16 psu on the 700th day. The large improvement in the salinity analysis yields a better analysis of the density field, which in turn makes the RMS errors of temperature in EXP_2 slightly lower than those of EXP_4. The RMS errors of salinity in EXP_2 are also much lower than those in the other three experiments (EXP_1, EXP_3, and EXP_5). This shows that the T-S relationship is necessary in the tropical Pacific region even if Argo data is assimilated. However, Argo data is also indispensable in the tropical Pacific region. Without Argo data, as in EXP_3 and EXP_5, the RMS errors of salinity become worse than those in EXP_1. The results indicate that the number of conventional salinity observations is too small to improve the salinity analysis in the tropical Pacific region if the oceanic initial fields and atmospheric forcing are inaccurate.

Figure 6 is the same as Figure 5, except for the northwest Pacific region (120°E–150°E, 10°S–52°N). The T-S relationship is important when only the conventional data is assimilated in this region (compare the results of EXP_3 and EXP_5). However, the effect of the T-S relationship is not obvious if Argo data is assimilated (compare the results of EXP_2 and EXP_4), which indicates that the T-S relationship is less important for the salinity analysis in the northwest Pacific than in the tropical region.

The accuracy of the temperature and salinity analysis can be improved greatly if Argo data is assimilated into the ocean model over the whole Pacific region. Assimilation errors are reduced by 28% for temperature and 37% for salinity. However, in the northwest Pacific region, where the temporal and spatial coverage of Argo data is not dominant compared with the conventional data, the improvement in the analysis of temperature and salinity is smaller than in the tropical Pacific region. Assimilation errors are reduced by 11% for temperature and 16% for salinity in the northwest Pacific region.
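A percentage error reduction of this kind is conventionally computed relative to a reference run. The sketch below assumes the reduction is defined as the relative decrease in RMS error between a reference experiment (for example, one without Argo) and an experiment with Argo; the source does not state which pair of experiments or which averaging period is used, and the numbers in the example are hypothetical rather than values from the paper.

```python
def percent_reduction(reference_rmse, experiment_rmse):
    """Relative decrease of experiment_rmse with respect to reference_rmse, in percent."""
    if reference_rmse <= 0.0:
        raise ValueError("reference RMS error must be positive")
    return 100.0 * (reference_rmse - experiment_rmse) / reference_rmse


if __name__ == "__main__":
    # Hypothetical time-mean RMS errors (degC) for a run without and with Argo.
    print(f"{percent_reduction(0.8, 0.6):.1f}% reduction")
```

A negative result would indicate that the experiment is worse than the reference, as happens for salinity in the runs that perform worse than the control.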

Comparing the results of Figure 4 with those of Figures 5 and 6, it can be noted that the RMS errors decrease gradually over time when Argo observations are assimilated (EXP_2 and EXP_4) in the whole Pacific region. One can see from Figure 3 that Argo observations are sparse in the subpolar and polar regions, especially in the ACC region in the Southern Ocean, where Argo observations are almost absent. Therefore, the RMS errors in the data-sparse regions decrease gradually through the model dynamical constraint rather than the direct observational constraint. In contrast, the numbers of observations in both the tropical Pacific and the northwest Pacific are sufficient to constrain the dynamic model, so the RMS errors there can be reduced rapidly.

The “true” velocity field can be used as an independent variable for verifying the effect of Argo data. Figures 7(a) and 7(b) present the RMS errors of the U (eastward) and V (northward) components in the five experiments for the top 1000 m, respectively. The RMS errors are not reduced noticeably after Argo data is assimilated (compare EXP_2 with EXP_3), for either the U or the V component. The results of EXP_5 are the worst of all five experiments, which suggests that the density field deteriorates owing to insufficient observations and the neglect of the T-S relationship. Figures 7(c) and 7(d) show the RMS errors of the U and V components in the five experiments between 100 m and 1000 m, respectively. The analysis of the U and V components in both EXP_2 and EXP_4 is improved below 100 m when Argo temperature and salinity are assimilated, where the effect of the atmosphere on the ocean state is smaller than near the ocean surface. In contrast, the analyses of EXP_3 and EXP_5 are much worse than that of EXP_1. The conventional data is very unevenly distributed in the subsurface, and Argo data is able to remedy this disadvantage by adjusting the density field and thereby improving the accuracy of the current analysis.

Figure 8 shows time series of temperature anomalies of the “truth,” EXP_1, EXP_2, and EXP_3 in the top 500 m in the Niño 3.4 region. The results of EXP_1 (Figure 8(b)) display a strong annual cycle, which is induced by the periodic driving of climatological wind and net heat flux. The phase shifts and intensity of the temperature anomalies in EXP_2 (Figure 8(c)) are consistent with those of the “truth.” The variability below 300 m in EXP_3 (Figure 8(d)) is inconsistent with that of the “truth,” which indicates that the number of conventional temperature observations is insufficient to improve the accuracy of the temperature analysis in the subsurface in the tropics. Such sparse observations are unable to rectify the errors induced by the initial fields or the atmospheric forcing. In contrast, the assimilation of Argo data can improve the accuracy of the temperature analysis in the subsurface.

Figures 9–13 show the distributions of 5-year averaged RMS errors of temperature and salinity in the five experiments in the Pacific region, respectively. Large RMS errors of temperature in the control run (Figure 9) lie in the northwest Pacific region and the south Pacific region (south of 60°S), and large RMS errors of salinity lie in the northwest Pacific region, the east subtropical Pacific region, and the south Pacific region (south of 60°S). After all data is assimilated (Figures 10 and 12), the analyses of temperature and salinity are both improved greatly in the whole north Pacific region. There is also a visible improvement in the tropical Pacific region and the south Pacific region compared with the control run. However, the improvement of temperature is not distinct in the south Pacific region when Argo data is excluded, as in EXP_3 and EXP_5 (Figures 11 and 13). The improvements of salinity in EXP_3 and EXP_5 relative to EXP_1 are both small except in the northwest Pacific region. Further, the analysis of salinity in EXP_5 deteriorates markedly in the tropical Pacific region when the T-S relationship is not considered, which can ruin the structure of the density field and result in improper dynamic fields in this region.

Figures 14 and 15 show time series of temperature anomalies of the “truth,” EXP_1, EXP_2, and EXP_3 at 50 m and 500 m in the tropical Pacific, respectively. It can be seen that the results of both EXP_2 (Figure 14(c)) and EXP_3 (Figure 14(d)) can reflect the spread of temperature anomaly accurately at 50 m (comparing them with Figure 14(a)). However, the results of EXP_2 (Figure 15(c)) are superior to those of EXP_3 (Figure 15(d)) at 500 m. This also confirms that Argo data is quite effective for improving the analysis of temperature in the subsurface in the tropical Pacific region.

6. Conclusions

The results of the five experiments within the OSSE framework confirm the key role of Argo data in improving the reanalysis fields of temperature and salinity in the subsurface in the Pacific. Further, the reanalysis fields of currents can be improved by assimilating Argo data. The spread of temperature anomalies in the tropical Pacific region is reproduced accurately when Argo data is assimilated, which may provide a reliable initial field for forecasting subsurface temperature and currents in the tropical Pacific region.

In the northwest Pacific region, the T-S relationship is useful for inhibiting the deterioration of the salinity field when only the conventional data is assimilated. In contrast, when Argo data is included in that region, the adjustment of salinity has almost no effect. In the tropical Pacific region, however, the T-S relationship remains essential for adjusting the background field of salinity even if Argo data is included in addition to the conventional data. At the same time, Argo data plays a key role in further correcting the density field in the tropical subsurface. In addition, the results indicate that analyses of hydrographic and dynamic fields based on the 20th century XBT observing network, before the Argo project was fully implemented, tend to deviate from the truth whether the T-S relationship is used or not.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was jointly supported by grants from the National Basic Research Program of China (no. 2016YFC1401701), National Natural Science Foundation of China (nos. 41676088, 41606039), and National Programme on Global Change and Air-Sea Interaction (GASI-01-01-12 and GASI-IPOVAI-04) of China.