As a basic part of the atmosphere, the stratosphere plays an important role in the tropospheric climate and weather systems, especially during the winter, when the stratosphere and troposphere have their strongest interactions. This study assesses the abilities of the Fifth Phase of the Coupled Model Intercomparison Project (CMIP5) and CMIP3 models to simulate the boreal winter stratospheric polar vortex. Analysis indicates that the models with well-resolved stratospheres, that is, with a high model top (HTOP) covering the whole stratosphere, a high vertical resolution (HVer) of the stratosphere, and nonorographic gravity wave drag (NOG), rank higher in both the temporal scoring system and the spatial scoring system. The extreme cold polar vortex bias, which was found in the CMIP3 models, vanishes in the CMIP5 models with HTOP, HVer, and NOG but persists in the other CMIP5 models. A dynamical analysis shows that the heat flux propagating into the stratosphere is stronger in models with HTOP, HVer, and NOG, but these propagations are still weaker than those in the ERA40 reanalysis, indicating the lack of variability in the current CMIP5 models.

1. Introduction

Climate models are very effective tools for improving our mechanistic understanding of climate variability and for performing future climate projections. In recent decades, climate models have progressed considerably, gaining higher horizontal and vertical resolutions, more complete physical processes, and more sophisticated parameterization schemes. This has greatly improved model performance and applicability for climate change projections on global to regional scales [1–3].

Most of the published climate change studies are primarily focused on the surface and lower troposphere. Accordingly, most of the model improvements and tunings that have been made are related to the troposphere. However, in recent decades, studies have revealed that the stratosphere can play an important role in near-surface climate variability and change. Stratospheric forcing, resulting from internal climate variability or from external factors such as massive injections of volcanic aerosols into the stratosphere, can be an important driver of tropospheric climate. Among others, Perlwitz and Graf [4, 5] and Perlwitz and Harnik [6] found that planetary waves propagating from the troposphere can be reflected by the stratospheric layers back into the troposphere when the stratospheric background flow is strongly westerly [4, 7] or when there is negative wind shear in the westerly wind regime near the stratopause [6]. This can further affect the tropospheric weather systems [8–13] and circulation patterns [14]. Nath et al. [15] showed this link for the 500 hPa height anomalies over the Northern Pacific, which lead to cold weather over North America. Black [16] and Scaife et al. [17] showed that variations of the winter stratospheric polar vortex could have impacts on the surface climate via the impacts of these variations on the Arctic Oscillation/North Atlantic Oscillation. Baldwin and Dunkerton [8, 18] reported that both the strong and the weak polar vortex anomalies could propagate within the stratosphere down to the tropopause. These anomalies can persist for up to 60 days, favor tropospheric anomalies with the same signs as those in the stratosphere, and influence surface storm tracks, surface pressure [19], and North Atlantic sea surface temperatures [20]. Furthermore, Manzini et al. [21] showed that the sea level pressure, surface temperature, and sea ice coverage anomalies in the Northern Hemispheric mid- and high latitudes can be traced back to the long-lasting stratospheric vortex anomalies on interdecadal time scales. Improved representation of the stratosphere was shown to create a first-order change in climate projections of winter sea level pressure in multiple climate models [22] and in a single model [23]. For future climate projections, the spread in climate sensitivity may be reduced by reducing the spread of stratospheric changes [24].

Stratosphere-resolving models are also crucial for the simulation of the Brewer-Dobson circulation (BDC), which is likely to strengthen in the future with a rising greenhouse gas concentration in the atmosphere [25–28]. Since the BDC transports ozone and other trace gases from the tropics to the polar regions by means of the stratospheric overturning circulation, model performance in the stratosphere is important for understanding the formation of the ozone hole and its recovery [29], as well as the variations of the radiation balance [1, 30].

However, previous studies on the stratosphere circulation have shown some common model biases. For example, the “cold pole” problem has prevailed in most stratosphere-resolving coupled general climate models (GCMs) for a long time [31]. Stronger and colder stratospheric polar vortices are closely associated with fewer perturbations of the stratosphere and lower frequencies of sudden stratospheric warming events [32]. Cordero and Forster [33] evaluated the performance of 19 CMIP3 models, which were used for the Intergovernmental Panel on Climate Change Fourth Assessment Report (AR4). Their analysis indicated that the 19 GCMs exhibited a colder stratosphere than the NCEP reanalysis data. Although the models with higher altitudes and more stratospheric levels had better skills when simulating the polar temperature, they still had a cold bias of approximately 4–7 K at 10 hPa. Ren et al. [34] showed that this “cold pole” problem prevailed in their climate model, with an overestimated polar night jet speed, less zonal asymmetry, a much lower SSW frequency (only 3 events in the 60 perpetual-January model months), and a much weaker variability across the polar stratospheric region.

Until recently, this cold pole bias remained in most of the models. An analysis of CMIP5 [35] model simulations showed that the polar stratospheric cold bias in the spring temperatures still persists in most models, which leads to a delay in the final stratospheric warming of both hemispheres [36]. Another bias is the persistent annular mode pattern in the troposphere, which is common and found both in the CMIP3 coupled climate models [37] and in the CCMVal (Chemistry-Climate Model Validation Activity) chemistry-climate models [38]. This bias was considered to be related to the bias of the underestimated intraseasonal variability in the stratosphere [38].

In this study, we utilized a ranking system to study the individual model performances in simulating the stratospheric circulation during the boreal winter, and the question of how to simulate a better stratospheric polar vortex in the northern hemisphere is discussed. We compared several crucial processes, which might contribute to the increase in model performance in the stratosphere, including increases in the model top, the introduction of nonorographic gravity wave drag in the model, increases in the horizontal/vertical resolution of the stratosphere of the model, upgrading the model from climate model to earth system model (ESM), and the inclusion of chemical processes in the stratosphere.

In this study, 42 CMIP5 climate models that were used in AR5 are evaluated for their performance in simulating the Northern Hemispheric stratospheric polar vortex (SPV). The results from 22 CMIP3 models, which were used for the AR4 evaluation, are also utilized for comparison in order to examine possible improvements in the simulated SPV climatology. The paper is structured as follows. In Section 2, we discuss the models, observation-based data, and the methodology used in this study. In Section 3, the CMIP5 SPV climatology is presented, a two-sample Kolmogorov-Smirnov test (K-S test) is performed, the models are ranked according to their values from the K-S test, and the dynamic reasons for the differences between the subensembles of the models are discussed. Finally, a summary and discussion are presented in Section 4.

2. Models and Data

2.1. Model Output Availability

Model intercomparison projects (MIPs) provide the best opportunities to evaluate how the current state-of-the-art climate models characterize the variability of the climate system. The most prominent and successful MIP is probably the Coupled Model Intercomparison Project (CMIP), which compares simulations from global coupled climate models and provides a standard set of model simulations aiming to evaluate how well the models simulate and project future climate change and to understand some of the factors responsible for the differences in model projections (http://cmip-pcmdi.llnl.gov/cmip5/). These simulation results were used in the climate change assessment report of the Intergovernmental Panel on Climate Change (IPCC). Since its inception in 1995, CMIP has evolved to its fifth phase (CMIP5), which was widely used in the fifth assessment report of the IPCC.

According to the IPCC AR5 report, there are a total of 50 CMIP5 models from 25 institutions, which is twice as many as those used in AR4 and CMIP3 (24 models). The CMIP5 models distinguish themselves from the CMIP3 models in many ways. About two-thirds of the CMIP5 models are Earth System Models (ESMs) that include the global carbon cycle [39]. In contrast, all the CMIP3 models use a prescribed globally averaged CO2 concentration. Hence, these models were called atmosphere–ocean general circulation models (AOGCMs) or climate system models (CSMs). Hereafter, AOGCMs and CSMs are called GCMs for short. The horizontal resolutions for both the atmospheric and the oceanic components of most of the CMIP5 models are higher than those in the CMIP3 models. Moreover, about half of the CMIP5 models can fully resolve the stratosphere, and some of the models (CESM-WACCM, CMCC-CMS, HadGEM2-CC, MIROC-ESM, MPI-ESM-MR, and MIROC-ESM-CHEM) have vertical resolutions fine enough to simulate the quasi-biennial oscillation (QBO).

In this study, we adopted the outputs of 42 coupled climate models from the Program for Climate Model Diagnosis and Intercomparison (PCMDI, http://cmip-pcmdi.llnl.gov/). The names of the analyzed models are listed in Table 1, along with other model information. Table 1 is compiled using information that the participating modeling centers provided to the PCMDI (see http://www-pcmdi.llnl.gov/ipcc/about_ipcc.php for more information about the models).

To compare with previous versions of CMIP and confirm model progress, the model outputs from 24 CGCMs, released by CMIP3 and used for the AR4 evaluation, are used in this study. The model names of the CMIP3 models are listed in Table 2. These model results were made available by the IPCC Data Distribution Center (https://esgf.llnl.gov/) and the Hadley Centre for Climate Prediction and Research.

Since our study is mainly focused on the boreal winter, when the stratosphere and troposphere have their strongest interactions [40], we use the monthly mean model outputs from the midwinter months (December–February). To compare these results with the reanalysis data and CMIP3 results, we have chosen a period of 30 years from 1969/70 to 1998/99. For our analysis, we use the zonal wind, meridional wind, and air temperature data from the r1i1p1 ensemble member of the historical runs [35], where r denotes the realization (that is, the point of the control run from which the historical run was initialized), i the initialization method, and p the perturbed-physics version.
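The selection of the 30 midwinter seasons described above can be sketched as follows. This is a minimal illustration only: the function name `djf_winter_means`, the dictionary layout, and the synthetic series are assumptions made for the example and are not taken from the study.

```python
from collections import defaultdict

def djf_winter_means(monthly, first_winter=1969, last_winter=1998):
    """Average December-February monthly values into one value per winter.

    `monthly` maps (year, month) -> value. A winter is labelled by the year
    of its December, so winter 1969 covers Dec 1969, Jan 1970 and Feb 1970.
    """
    buckets = defaultdict(list)
    for (year, month), value in monthly.items():
        if month == 12:
            winter = year
        elif month in (1, 2):
            winter = year - 1
        else:
            continue  # skip months outside December-February
        if first_winter <= winter <= last_winter:
            buckets[winter].append(value)
    # Keep only winters for which all three months are available.
    return {w: sum(v) / 3.0 for w, v in sorted(buckets.items()) if len(v) == 3}

# Synthetic monthly series for 1969-1999, purely for illustration.
series = {(y, m): float(m) for y in range(1969, 2000) for m in range(1, 13)}
winters = djf_winter_means(series)
print(len(winters), "complete winters from", min(winters), "to", max(winters))
```

Labelling each winter by the year of its December keeps December and the following January and February in the same season, which is what the 1969/70 to 1998/99 notation implies.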

2.2. Observation-Based Data

We utilize the European Centre for Medium-Range Weather Forecasts 40-Year Reanalysis (ERA40, [41]) as our reference data, which covers the period from September 1957 to August 2002. The main reason we have chosen the ERA40 data is that it covers the period of 1969–1999 and reaches the level of 1 hPa. In addition to ERA40, there are other reanalysis datasets; for example, the Japanese 55-year reanalysis (JRA55, [42]) satisfies the necessary criteria. We have validated the JRA55 variables against those from ERA40 to demonstrate reanalysis uncertainties, and the results show that the differences between JRA55 and ERA40 are negligible under the circumstances of the present study. Thus, the results of comparing the model results with the JRA55 data are not shown in this paper.

2.3. Methodology
2.3.1. Basic Treatments

CMIP5 models have a variety of horizontal and vertical resolutions (Table 1), ranging from MIROC4h at T213 (approximately 0.5625° × 0.5625°) to HadCM3 at N48 (3.75° × 2.5°). Hence, we regridded all the model outputs onto a common horizontal grid with 23 vertical levels to match the resolution of ERA40. However, the highest level of some models is at 10 hPa, and thus there are only 18 levels for these models after the regridding.
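The vertical part of this regridding can be illustrated with a short sketch. The list of 23 pressure levels below is an assumption (the standard ERA40 pressure-level set as commonly distributed), and linear interpolation in log-pressure is one plausible choice rather than the method confirmed by the study; the function names are hypothetical.

```python
import math

# Assumed ERA40 standard pressure levels in hPa (23 levels, 1000 to 1 hPa).
ERA40_LEVELS = [1000, 925, 850, 775, 700, 600, 500, 400, 300, 250, 200,
                150, 100, 70, 50, 30, 20, 10, 7, 5, 3, 2, 1]

def usable_levels(model_top_hpa, levels=ERA40_LEVELS):
    """Target levels that lie at or below the model top (pressure >= top)."""
    return [p for p in levels if p >= model_top_hpa]

def interp_log_p(src_p, src_val, target_p):
    """Linear interpolation in log-pressure; None outside the source range."""
    pairs = sorted(zip(src_p, src_val), reverse=True)  # highest pressure first
    out = []
    for p in target_p:
        value = None
        for (p0, v0), (p1, v1) in zip(pairs, pairs[1:]):
            if p1 <= p <= p0:
                w = (math.log(p0) - math.log(p)) / (math.log(p0) - math.log(p1))
                value = v0 + w * (v1 - v0)
                break
        out.append(value)
    return out

print(len(usable_levels(10.0)))   # levels available to a model topping at 10 hPa
print(len(usable_levels(1.0)))    # levels available to a model reaching 1 hPa
```

Under the assumed level set, a model whose top is at 10 hPa keeps 18 of the 23 target levels, which matches the count quoted in the text.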

The 42 CMIP5 models and 22 CMIP3 models used in the present study are listed in Tables 1 and 2, respectively. To compare the model performances with or without one specific process, the CMIP5 models are classified into the following groups according to their schemes:

HTOP: high-top model, that is, with a model top above the 1 hPa level, 20 models
LTOP: low-top model, that is, with a model top below the 1 hPa level, 22 models
NOG: model with nonorographic gravity (NOG) wave parameterization, 20 models
nNOG: model without nonorographic gravity (NOG) wave parameterization, 20 models
CHEM: model with stratospheric chemistry processes, 14 models
nCHEM: model without stratospheric chemistry processes, 28 models
ESM: earth system model that includes the global carbon cycle, 23 models
GCM: model without the global carbon cycle, 19 models
HHor: horizontal high-resolution (HHor) model, with a resolution finer than ~T63, 22 models
LHor: horizontal low-resolution (LHor) model, with a resolution coarser than ~T63, 20 models
HVer: vertical high-resolution (HVer) model of the stratosphere, with more than 19 levels above 200 hPa, 18 models
LVer: vertical low-resolution (LVer) model of the stratosphere, with fewer than 19 levels above 200 hPa, 24 models

The HTOP models are usually associated with an NOG scheme. Of the 20 high top models, 17 models have nonorographic wave parameterization schemes, while only 5/22 of the low top models have this parameterization. Meanwhile, of the 20 HTOP models, 18 models are classified as HVer models, while the LTOP models are all LVer models. Therefore, with the increase of the model top, the vertical resolution of the stratosphere is also increased, and the nonorographic wave parameterization scheme is implemented.
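The grouping criteria can be expressed as a small classification routine. The sketch below is illustrative: the `ModelConfig` fields and the two example configurations are hypothetical and are not entries from Table 1, and the treatment of a model top exactly at 1 hPa or of exactly 19 levels is an assumption, since the text does not specify those edge cases.

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    name: str                  # hypothetical model label
    top_hpa: float             # pressure at the model top, in hPa
    levels_above_200hpa: int   # number of model levels above 200 hPa
    has_nog: bool              # nonorographic gravity wave drag included?

def classify(model):
    """Return the (top, vertical resolution, gravity wave) group labels."""
    top = "HTOP" if model.top_hpa < 1.0 else "LTOP"        # top above 1 hPa
    ver = "HVer" if model.levels_above_200hpa > 19 else "LVer"
    nog = "NOG" if model.has_nog else "nNOG"
    return top, ver, nog

# Hypothetical configurations, not taken from Table 1.
examples = [
    ModelConfig("high_top_demo", 0.01, 40, True),
    ModelConfig("low_top_demo", 3.0, 12, False),
]
for m in examples:
    print(m.name, classify(m))
```

Keeping the three labels as independent outputs makes it easy to count overlaps between groups, such as how many HTOP models are also HVer or NOG models.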

To demonstrate the differences in the vertical resolutions and vertical coverages of the atmospheric heights between the HTOP and LTOP models, the vertical layer profiles of the individual models are shown in Figure 1, which can be viewed as an updated version of Figure 1 of Cordero and Forster [33]. Though the CMIP5 models show an overall improvement over the CMIP3 models, the CMIP5 intermodel differences remain. The lowest model top is 10 hPa (~32 km, e.g., INM-CM4, CMCC-CM, and CNRM-CM5), while the highest is 5.1 × 10⁻⁴ Pa (~140 km, CESM-WACCM), and the rest of the models lie in between. Most of the LTOP models have a top at approximately 3 hPa, and 13 out of the 20 HTOP models have a top at approximately 80 km, which is the altitude of the mesopause. Overall, in both the troposphere and the stratosphere, the number of vertical layers of the CMIP5 models is greater than that of the CMIP3 models (Figure 1, [33]). However, it is worth mentioning that these 42 models are not fully independent of each other. Many of them originated from the CAMs (e.g., BCC-ESM, FGOALS, FIO-ESM, and NorESM1-M), the Hadley Center Models (e.g., the ACCESS models), and the ECHAM models (e.g., MPI-ESMs, CMCC-ESM). More specific information is listed in Table 1.

2.3.2. Ranking System

A temporal ranking system is adopted to study the individual model performances in simulating the stratospheric circulation, based on the work of Brands et al. [43]. This ranking system employs the two-sample Kolmogorov-Smirnov test (K-S test, [44]). The K-S test is a nonparametric test that compares the empirical cumulative distribution functions of two samples (e.g., the outputs of a CMIP5 model and ERA40) to determine whether the two samples have the same distribution. The null hypothesis is that the two samples come from the same theoretical probability distribution [43]. The test statistic is the maximum absolute difference between the empirical cumulative distribution functions of a given model and of ERA40, evaluated over the values of the sample, with n being the sample size. The p value of the test is bounded between 0 and 1. As the p value approaches 1, the distributions of the two samples are more similar, and vice versa. If the p value is lower than 0.05, the null hypothesis is rejected and the two samples are assumed to be from different distributions at a confidence level of >95%.
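The two-sample test can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the implementation used in the study: the p value uses the asymptotic Kolmogorov series with a common small-sample correction factor, and the function name and sample data are hypothetical.

```python
import math
from bisect import bisect_right

def ks_two_sample(a, b):
    """Two-sample K-S statistic D and an approximate (asymptotic) p value."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    # D is the largest gap between the two empirical CDFs, which can only
    # change at observed values, so it suffices to check those points.
    d = max(abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
            for v in xs + ys)
    ne = n * m / (n + m)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:
        return d, 1.0  # the series converges poorly here; p is ~1 anyway
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

# Hypothetical samples: 30 winter values from a "model" and a "reanalysis".
model = [200.0 + 0.5 * i for i in range(30)]
reference = [300.0 + 0.5 * i for i in range(30)]
print(ks_two_sample(model, model))      # identical samples
print(ks_two_sample(model, reference))  # fully separated samples
```

A higher p value means the model distribution is harder to distinguish from the reference one, which is the quantity the ranking relies on.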

For a variable, such as the zonal-mean temperature, the ranking is performed at each grid point of the 42 models according to their respective values from the K-S test (without the removal of climatology), with higher values indicating a better rank. If a model is ranked number 1 at some grid point (latitude and height), this means that this model performs best at this grid point. Then, we average the ranks at all the grid points of a model to get the model rank. Finally, the 42 models are ranked according to their model ranks.
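The grid-point ranking and averaging described above can be sketched as follows. The function name, the input layout (one list of K-S p values per model, in a fixed grid-point order), and the three-model example are assumptions for illustration; ties are broken by input order here, which the text does not specify.

```python
def rank_models(pvalues):
    """Rank models by their mean grid-point rank (best first).

    `pvalues` maps model name -> list of K-S p values, one per grid point.
    """
    names = list(pvalues)
    n_points = len(next(iter(pvalues.values())))
    totals = {name: 0.0 for name in names}
    for j in range(n_points):
        # A higher p value earns a better (smaller) rank at this grid point.
        ordered = sorted(names, key=lambda nm: pvalues[nm][j], reverse=True)
        for position, nm in enumerate(ordered, start=1):
            totals[nm] += position
    mean_rank = {nm: totals[nm] / n_points for nm in names}
    return sorted(mean_rank.items(), key=lambda item: item[1])

# Hypothetical p values for three models at three grid points.
demo = {"A": [0.9, 0.8, 0.1], "B": [0.5, 0.9, 0.6], "C": [0.1, 0.2, 0.3]}
for name, score in rank_models(demo):
    print(name, round(score, 2))
```

Averaging ranks rather than p values gives every grid point equal weight, regardless of how widely the p values spread at that point.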

We also used the spatial score system, adopted from Douglass et al. [45], Waugh and Eyring [46], and Gettelman et al. [47]. It is based on three statistical metrics, which are computed from a model variable at each time step and the corresponding ERA40 value, using the standard deviation and the correlation coefficient (Cor). Here n represents the total number of time steps and is set to 1 (DJF), and the scaling factor is set to 3, as in Gettelman et al. [47]. The three metrics are then synthesized into a single term.
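A grading scheme of this kind can be sketched as below. The exact expressions used in this study are not reproduced in the text, so every formula in the sketch is an assumption: the mean and variance grades penalize the model-minus-reference difference in units of the reference standard deviation times the scaling factor, the correlation grade maps Cor from [−1, 1] to [0, 1], and the combined term is a plain average. All function names are hypothetical.

```python
def grade_mean(mu_mod, mu_obs, sigma_obs, n_g=3.0):
    """Assumed mean grade: 1 for a perfect match, 0 beyond n_g sigma."""
    return max(0.0, 1.0 - abs(mu_mod - mu_obs) / (n_g * sigma_obs))

def grade_variance(sigma_mod, sigma_obs, n_g=3.0):
    """Assumed variance grade, built the same way from standard deviations."""
    return max(0.0, 1.0 - abs(sigma_mod - sigma_obs) / (n_g * sigma_obs))

def grade_correlation(cor):
    """Assumed correlation grade: maps Cor in [-1, 1] onto [0, 1]."""
    return (cor + 1.0) / 2.0

def combined_grade(g_mean, g_var, g_cor):
    """Assumed synthesis: unweighted average of the three grades."""
    return (g_mean + g_var + g_cor) / 3.0

# Hypothetical numbers: a model 3 K colder than a 205 K reference value.
g_m = grade_mean(202.0, 205.0, 2.0)
g_v = grade_variance(1.5, 2.0)
g_c = grade_correlation(0.8)
print(g_m, g_v, g_c, combined_grade(g_m, g_v, g_c))
```

With a scaling factor of 3, a model receives a zero grade only when its bias exceeds three reference standard deviations, so moderate biases still separate models from one another.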

The method was proved to be robust and to have some advantages by Grewe and Sausen [48] and Gettelman et al. [47].

3. Results

The 30-year zonal-mean climatology of the temperature and zonal winds of the boreal winter (DJF) is shown in Figure 2. The ERA40 temperature is well characterized, with a cold center located in the polar mid stratosphere, called the polar vortex, and another centered in the tropical tropopause. In the troposphere, the temperature drops with increasing altitudes and latitudes. The isotherm line of 205 K constrains the cold core of the polar vortex between the altitudes of 70 hPa and 20 hPa. In the zonal-mean zonal wind, two jets can be easily distinguished in ERA40 (Figure 2(a)), CMIP3 (Figure 2(b)), and CMIP5 (Figure 2(c)), with one jet being the polar night jet, centered at approximately 65°N, 10 hPa, and the other being the subtropical jet, centered at approximately 30°N, 200 hPa. Easterlies prevail in both the tropical stratosphere and the tropical lower troposphere. The CMIP3 multimodel ensemble (MME) mean shows a vortex of similar strength to that seen in ERA40, but the cold core moves upwards, stretching into the upper stratosphere. The polar vortex cold bias between CMIP3 MME and ERA40 is greater than −8 K (Figure 2(c)) in the mid and upper stratosphere. The displaced polar vortex leaves a slight warm bias in the lower stratosphere. Compared to CMIP3 MME, the isotherm line of 205 K of the CMIP5 ensemble is closer to that of the reanalysis, thus constraining the structure of the polar vortex to the mid stratosphere, though it still stretches into the higher levels. The temperature anomalies of CMIP5 at 10 hPa (Figure 2(e)) are around −4 K, much smaller than those of the CMIP3 models (Figure 2(d)). The polar night jet simulated in the CMIP3 models extends equatorward, causing large westerly biases in the midlatitudes and subtropical regions (Figure 2(d)). The extension of this jet in the CMIP5 MME is smaller (Figure 2(e)) because of the more moderate polar vortex in this ensemble. In the tropical stratosphere, both ensembles show easterly biases. This was probably induced by the lack of the QBO in the model simulations. Only 6 of the 64 employed models (42 CMIP5 and 22 CMIP3) are able to simulate the QBO or QBO-like oscillations. The tropical stratosphere is dominated by easterlies all the year round in the simulations of the other 58 models. Without this interannual variability, the easterlies in the models are stronger than those observed.

To derive quantitative information about the performances of the models, the CMIP3 and CMIP5 ensembles are ranked according to the spatial and temporal ranking systems. The results are shown in Figures 3 and 4, respectively. The upper and lower values of the box plots indicate the uncertainty range of the mean value. For the spatial ranking system (Figure 3), both ensembles exhibit similar distributions, but the medians and averages of the CMIP5 models are slightly better than those of the CMIP3 models for both the zonal-mean zonal wind and the zonal-mean temperature. However, the differences in the temporal variabilities of the two ensembles (Figure 4) are significant. For both variables, the rankings of the CMIP5 ensemble are skewed towards small values, yet they leave a long tail towards large values. This indicates that, compared to the CMIP3 ensemble, some of the CMIP5 models have improved simulations while others have not.

To further determine the processes that improve model simulations of the stratosphere, the CMIP5 models are sorted into groups, as discussed in Section 2.3.1. The temporal rankings of the zonal-mean temperatures for each of the subensembles are shown in Figure 5 and Table 3. Using a t-test, the ranking differences between HTOP and LTOP, NOG and nNOG, HVer and LVer, and ESM and GCM are significant at the 97.68%, 99.95%, 99.90%, and 86.03% levels, respectively, while the ranking differences between CHEM and nCHEM and between HHor and LHor are only significant at the 15.11% and 13.88% levels, respectively. The median ranking of the HTOP models is 15.1, which is smaller (better) than that of the LTOP models (22.6). In the CMIP3 (AR4) archive, only 5 of the 23 models have tops above 1 hPa [33], but that ratio increases to 22 of the 50 models in the CMIP5 suite [36]. In addition to improving the stratospheric polar vortex climatology, raising the model top can also increase the stratospheric variability on daily and interannual time scales and increase the frequency of major sudden stratospheric warming events [36].

Gravity waves (GWs) are generated by convection, topography, instability, tropical cyclones, and various adjustment processes in the atmosphere [49, 50]. It is now believed that vertically propagating gravity waves can provide the strong mechanical forcing that keeps the middle atmosphere away from radiative equilibrium and helps maintain the residual circulation between the equator and poles [51, 52]. The incorporation of nonorographic gravity wave drag could alleviate the cold pole problem in the stratosphere [53]. However, even at the CMIP5 stage, almost half of the models do not include nonorographic gravity wave parameterization. The NOG models rank higher than the nNOG models, with a median ranking of 14.4 for the 21 NOG models and 23.9 for the 21 nNOG models. The introduction of nonorographic gravity wave schemes usually requires a high model top [54]. Among the 21 NOG models, 17 models are HTOP models, which is likely one critical reason why the HTOP models have more realistic polar vortices.

Moreover, the vertical resolutions of the stratosphere and the incorporation of carbon cycles can help to improve simulations of the polar vortex. The median ranking of the 18 HVer models is 14.7, which is much better than the median value of 24.2 of the 22 LVer models. Higher vertical resolutions allow subgrid waves to propagate vertically instead of being damped. For example, the successful simulation of the QBO requires an 800 m vertical resolution [55, 56]. For the ESM and GCM groups, the median ranking of the 23 ESMs is 17.3, which is much better than the 23.9 median ranking of the 19 GCMs. The ESMs are reported to have better simulations of the polar vortex than those of GCMs not only in their climatology but also for the intraseasonal variations of the vortex [57]. The inclusion of carbon cycle, especially the ocean biogeochemistry, has an influence on the global oceanic radiation flux, oceanic chlorophyll, and energy balance, which will help to simulate a better ocean mean state and better SST. This may influence the stratospheric circulation via large-scale planetary waves. However, this should be confirmed by further study.

Meanwhile, the treatment of stratospheric chemistry processes is not vital to the rankings of the models, indicating that, in current models, the temperature of the polar vortex does not strongly rely on interactive chemistry processes. The ranking differences between CHEM and nCHEM models are minor (18.3 for the 14 CHEMs versus 20.1 for the 28 nCHEMs). Therefore, using interactive chemistry or semioffline calculated chemistry in CHEM models leads to only a very small change in the model ranking compared to using the prescribed ozone in the nCHEM models. This is consistent with the results of Eyring et al. [58], who noted that both the CHEM and the nCHEM multimodel means agree well with observations in terms of their absolute values and the trends of both temperature and zonal wind.

It is worth noting that improvements of the horizontal resolution barely improve the polar vortex simulation. The median ranking for the 22 high-resolution models (those with a model resolution finer than ~T63) is 18.7, while the median ranking for the other models is 19.4, suggesting that refining the horizontal resolution results in a very limited improvement of the polar vortex temperature simulation.

Raising the model top can be the first step in improving the simulations of stratospheric circulation. For the HTOP models, the inclusions of the nonorographic gravity waves (NOGs), interactive chemistry (CHEM), and carbon cycle (ESM), as well as an increased horizontal resolution, can all lead to model ranking improvements (Table 4). The median ranking improves from 29.8 (nNOG) to 14.4 (NOG), 17.05 (nCHEM) to 13.35 (CHEM), 19.55 (GCM) to 14.25 (ESM), and 16.55 (LHor) to 13.1 (HHor) in the HTOP models. For the LTOP models, however, the introduction of CHEM and finer resolutions leads to setbacks in the model rankings. The median ranking worsens from 22 (nCHEM) to 23.65 (CHEM) and from 20.65 (LHor) to 26.9 (HHor) in the LTOP models. Therefore, the development of models to include CHEM and finer horizontal resolution must also be accompanied by an increase of the model top to achieve a better simulation of the NH stratospheric polar vortex. The benefit of including NOG remains robust in both the LTOP and the HTOP models, with both subensembles ranking better than the corresponding nNOG models.

The combination of the high model tops and high horizontal resolutions leads to better performances, with a median model ranking of 13.1 for the 10 models. We noted that 8 of the 10 models are models with high vertical resolutions in the stratosphere. The median rank of the 8 models with high model tops, high horizontal resolutions, and high vertical resolutions is 11.5, indicating that improvements of the dynamical model configuration can lead to considerable improvements in the simulation of the stratospheric polar vortex. Furthermore, 7 of the above 8 models are models with nonorographic wave parameterization. The median value of the 7 models is 11.2, which is the best overall performance of all the combinations.

We compare these 7 best models (GFDL-CM3, CMCC-CMS, HadGEM2-CC, MPI-ESM-LR, MPI-ESM-MR, MPI-ESM-P, and MRI-CGCM3), which all have high model tops, high horizontal resolutions, high vertical resolutions, and nonorographic wave parameterization, to the other models and ERA40, as shown in Figures 6 and 7. The polar vortex of the 7-model ensemble (7ME) is well structured and confined to the proper altitudes. The cold pole problem is almost absent. The maximum temperature anomalies between the 7ME and the ERA40 are approximately 2 K at 10 hPa in the polar region and are only −0.5 K at approximately 20 hPa, while the temperature differences between the other models’ ensemble (OME) and ERA40 are above −5 K at approximately 10 hPa over the polar cap, indicating a severe cold bias in these models. Accordingly, the zonal-mean zonal wind of the 7ME is very close to that of the reanalysis (Figure 6(d)), with very small departures; meanwhile, OME has stronger meridional temperature gradients around the polar vortex, leading to stronger zonal winds from the polar stratosphere to the subtropical tropopause in the models (Figure 6(e)). Positive temperature anomalies between 7ME and OME dominate the stratospheric polar region and are centered at 10 hPa (Figure 6(f)); the zonal wind differences are largest within the stratosphere, indicating that improvements of the dynamical model configuration and the inclusion of nonorographic waves can influence the stratospheric circulation simulation.

Another zonal wind anomaly center in both 7ME and OME is located in the tropical stratosphere. The maximum easterly in this area of 7ME is ~4 m/s and is ~16 m/s in the other model ensemble. The decrease of the easterly in the tropical stratosphere is caused by the successful simulation of the QBO in 3 of the 7 models (CMCC-CMS, HadGEM2-CC, and MPI-ESM-MR).

The EP-flux diagnosis was used to explore the dynamic differences between the 7ME and the OME. As shown in Figures 7(d)-7(e), both the 7ME and the OME exhibit suppressed heat fluxes when compared with that of ERA40. This indicates that the planetary wave activities in both ensembles are relatively weak. The 7ME has a weaker tropospheric wave guide (Figure 7(f)) than that of the OME, yet its stratospheric wave guide is slightly stronger than that of the OME. According to Chen et al. [59], this kind of dipole structure in EP-flux anomalies could also lead to the strengthening of the subtropical jet and the weakening of the polar night jet. Compared to that of ERA40, the convergence area of the 7ME models in the mid stratosphere has expanded into higher latitude regions, and thus the divergence center in the polar stratosphere found in ERA40 has almost vanished in the 7ME. In Figure 7(d), an anomalous convergence center is shown to exist in the polar vortex area, adjacent to the warm anomaly centers. Though the simulated heat fluxes in the 7ME are weaker than those in ERA40, they are stronger than those in the OME (Figure 7(f)). Moreover, there are poleward propagating momentum flux anomalies in both Figures 7(d) and 7(f). As a result, the EP-flux anomalies and the convergences in the polar vortex regions lead to relatively warmer temperatures in the 7ME models.
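For reference, the link between the heat flux, the momentum flux, and the EP-flux convergence discussed above can be written out. The text does not state which form of the EP flux was used, so the commonly used quasi-geostrophic form in spherical, pressure coordinates is assumed here, with a the Earth's radius, φ the latitude, f the Coriolis parameter, θ the potential temperature, overbars denoting zonal means, and primes denoting deviations from the zonal mean.

```latex
% Quasi-geostrophic EP flux in spherical, pressure coordinates (assumed form)
\begin{align}
F_{\phi} &= -a\cos\phi\,\overline{u'v'}, \\
F_{p}    &= a f \cos\phi\,\frac{\overline{v'\theta'}}{\partial\overline{\theta}/\partial p}, \\
\nabla\cdot\mathbf{F} &= \frac{1}{a\cos\phi}\,\frac{\partial}{\partial\phi}\left(F_{\phi}\cos\phi\right)
                         + \frac{\partial F_{p}}{\partial p}.
\end{align}
```

In this form the vertical component is proportional to the eddy heat flux and the meridional component to the eddy momentum flux, and a convergence of the flux (negative divergence) acts to decelerate the westerly zonal-mean flow.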

4. Summary and Discussion

In this study, the abilities of the CMIP3 and CMIP5 models to simulate the stratospheric polar vortex of the boreal winter are assessed. Considerable improvements were shown in the CMIP5 models, with decreased cold biases in the polar region and more realistic wind configurations. The grouped ranking analysis shows that at least 3 aspects contribute to these improvements: higher model tops, the inclusion of nonorographic gravity wave drag, and higher vertical resolutions in the stratosphere. Nonorographic gravity wave drag is vital for simulating a more realistic polar vortex. The forcing of the nonorographic wave drag together with the heat flux will warm the polar stratosphere and lead to a moderate polar vortex in the models. In fact, the introduction of nonorographic wave drag requires a high model top and a high vertical resolution in the stratosphere. A high vertical resolution of the stratosphere helps to portray the dynamics of this region. In addition, the incorporation of the global carbon cycle can also help to improve the simulations of the polar vortex.

Raising the model top can be the first step in improving the simulations of stratospheric circulation. Analysis shows that including the nonorographic waves (NOGs), interactive chemistry (CHEM), carbon cycle (ESM), and improved horizontal resolution can lead to better simulations of the polar vortex, under the condition that the model top is increased to include the whole stratosphere. However, for low-top models, the modifications of the above processes lead to very limited improvements and even setbacks of the model rankings.

The ensemble of the 7 models with high model tops, nonorographic waves, and finer horizontal and vertical resolutions almost overcomes the cold-pole problem. Therefore, the temperature gradient between the high and midlatitudes is weakened, leading to a more realistic polar night jet. Although the EP flux is still weaker than that observed, it is stronger than that modeled by the CMIP3 models and the other CMIP5 models. The inclusion of finer vertical and horizontal resolutions resolves more small-scale waves in the model. Meanwhile, the NOG parameterization provides extra wave forcing in the stratosphere. All these help to damp the polar night jet and produce a better simulation. Compared to the HTOP models, the stratosphere in LTOP models is more exposed to outer space and the outgoing long wave radiation may be higher than that of the HTOP models. On the other hand, GHGs and ozone in the upper stratosphere of the HTOP models can absorb the long wave radiation from the lower atmosphere and thus keep the stratosphere warmer.

It is worth noting that the complexity of the current coupled models makes it very difficult to determine which process is responsible for the improvement of a specific system. The comparisons of each model group with others are “ensembles of opportunity” rather than “clean” comparisons, because any of the comparisons made will certainly contain other important differences. However, the comparison of the stratospheric circulation between CMIP5 models can help model users to understand the model uncertainties in stratospheric processes and can also provide useful information for future improvements in the ability of models to describe the stratospheric circulation.

The improvements made to the simulation of the polar vortex will further influence the simulation of the troposphere climate since the stratosphere has significant influences on the tropospheric climate, as introduced in Section 1. The climate change projections based on models with cold biases in the boreal polar vortex need to be reconsidered. For example, Wei et al. [60] have shown that the discrepancies in the winter surface air temperature in East Asia between the HTOP and LTOP models can be up to 1.3 K by the end of the 21st century under different RCP scenarios. Hence, global and regional climate projections need to be reevaluated based on models that are better able to characterize the stratospheric circulation; otherwise, adaptation and mitigation policies might produce insufficient responses to the effects of anthropogenic global warming over the coming decades.

Conflicts of Interest

The authors declare that there are no conflicts of interest.


Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant no. 41375046), the National Key Research and Development Project of China (Grant no. 2016YFA0600600), and the Youth Innovation Promotion Association of Chinese Academy of Sciences (no. 2014064). In this study, the ERA40 data were provided by the ECMWF and are available online at http://apps.ecmwf.int/datasets/data/.