Abstract

We propose to apply Piecewise Parabolic Method (PPM), a high order and conservative interpolation, for the parameters estimation in a PM2.5 transport adjoint model. Numerical experiments are taken to show the accuracy of PPM in space and its ability to increase the well-posedness of the inverse problem. Based on the obtained results, the PPM provides better interpolation quality by employing much fewer independent points. Meanwhile, this method is still well-behaved in the case of insufficient observations. In twin experiments, two prescribed parameters, including the initial condition (IC) and the source and sink (SS), are successfully estimated by the PPM with lower interpolation errors than the Cressman interpolation. In practical experiments, simulation results show good agreement with the observations of the period when the 21th APEC summit took place.

1. Introduction

PM2.5 pollution, particulate matter with aerodynamic diameters less than 2.5 μm, has gained widespread concern due to its adverse influence on human and ecosystem health. It has direct effect on climate by scattering and absorbing sunlight and has indirect effect by serving as cloud condensation nuclei and thereby affects the optical properties, lifetime of clouds, and precipitation. Therefore it is essential to estimate PM2.5 emissions to understand the changes to the atmospheric environment around the world. The research and prediction of PM2.5 pollution should be paid more attention, through which local air quality agencies can alert the public far from the unhealthy air in time.

For more in-depth understanding of the physical, chemical, and dynamical processes concerned with PM2.5, a lot of atmospheric numerical models have been conducted and are publicly available in various studies. A box model was used to simulate the atmospheric chemistry and gas/particle partition of inorganic compounds by Pun and Seigneur [1]. And the concentration of PM nitrate was found to be sensitive to reductions in VOC emissions. A Lagrangian air quality model was developed which represents the airborne particle complex as a source-oriented external mixture by Kleeman et al. [2]. Hudischewskyj and Seigneur [3] described the formulation and evaluation of a mathematical model that calculates the gas-phase and aerosol-phase pollutant concentrations in a plume that undergoes transport, dispersion, and dry deposition in the atmosphere. The performance of the Eta-Community Multiscale Air Quality (CMAQ) modeling system in forecasting PM2.5 and chemical species is assessed over the eastern United States by Yu et al. [4]. Seigneur [5] reviewed 3-dimensional (3D) Eulerian air quality models for PM in terms of their formulation and past applications. Basically, analysis of the PM2.5 on a continental scale depends on the sophisticated knowledge of the distribution. However, with incomplete observations in the spatial or temporal range, model estimates demonstrate significant uncertainties. Therefore, both further utilization of PM2.5 measurements and improvement of model estimates are identified as the important components for the continued analysis of PM2.5 sources.

Data assimilation methods provide a configuration for combining observations and models to form an optimal estimate of the PM2.5 sources. In this method, observations are used to constrain estimates of model parameters that are both influential and uncertain. Among all data assimilation methods, four-dimensional variational (4D-Var) data assimilation is regarded as one of the most effective and powerful approaches developed over the past two decades (e.g., [69]). Efforts have been made with regard to the applications of the 4D-Var data assimilation in the study of PM2.5 emissions. The adjoint of GEOS-Chem, a four-dimensional variational data assimilation method, is used to map US air quality influences of inorganic PM2.5 precursor emissions by Henze et al. [10]. Zhang et al. [11] used the adjoint method for the examination of the source attribution of PM2.5 pollution over North China. Capps et al. [12] revealed the relative contributions of global emissions to PM2.5 air quality attainment in the US based on the GEOS-Chem adjoint model. Wang et al. [13] established a PM2.5 transport adjoint model which is applied for the estimation of the parameters successfully.

The ill-posedness of the inversion problem is the key part needed to be solved. The ill-posedness is caused by the incompleteness of the observation data and excessive control parameters in practice, and it is generally characterized by the nonuniqueness and instability of the parameters in the identification process (e.g., [14, 15]). Many efforts have been made towards overcoming ill-posedness. And, it is justified that independent point scheme (IPS) is an effective approach for overcoming ill-posedness (e.g., [1113]). In detail, we select several grids uniformly as independent points (IPs) in the space domain. Values of the control parameters at the IPs can be optimized, while those at other grids can be calculated through linear interpolating with the values at the IPs.

Referring to previous studies, Cressman interpolation [16] is used as the first-choice for the IPS owning to its simple operation. However, when applied to the datasets with large distances between grid points, it should be hard for Cressman interpolation to obtain satisfactory interpolation errors. As we know, mass conservation is an important issue for climate and atmospheric chemistry models. Thus, the Piecewise Parabolic Method (PPM) developed in [17], which has been proved to be a high order accurate interpolation in the past (e.g., [1821]), is used in the same PM2.5 adjoint model shown in [13] to ensure the local mass conservation for the advection problems.

The application of the PPM in the PM2.5 adjoint model is presented in the following structure. Section 2 provides the detailed descriptions of the PPM and the PM2.5 transport adjoint model. Twin experiments are carried out, and the results are analyzed in Section 3. Based on the conclusion above, practical experiments are implemented in Section 4. Finally, some key conclusions drawn from the work are presented.

2. The Piecewise Parabolic Method and the PM2.5 Transport Model

2.1. The Piecewise Parabolic Method

For preserving mass, we define a particular parabolic interpolation distribution by the Piecewise Parabolic Method (PPM) [17]. The foundation of PPM is a piecewise parabolic interpolation scheme which generates a parabola to describe the internal structure of a computational cell, or zone, of the grid. A parabola is generated for any desired variable based upon knowledge of the zone-averaged values of that variable. In this section, the 2D PPM is established for solving the problems in two-dimensional PM2.5 transport adjoint model.

We divide the two-dimensional computing domain into regular Eulerian cells , where cells and cells centers are defined byLet be the independent point and be the value of parameters at the independent point. For the two-dimensional space, we first perform a computation along -direction and then compute along -direction. In this part, we just show the computation along -direction by using the conservative interpolation. The computation along -direction is similar. Then the values on at time can be obtained.

We apply a particular parabolic interpolation distribution [23], which can preserve mass conservation of the IPS. The interpolation distribution is written as

The interpolation distribution is defined on and is the value at the point of .

For obtaining second-order approximation scheme of PPM (PPM for short), is given as follows:

Then we can get all values at the point on by

For obtaining fourth-order approximation Scheme of the PPM (PPM for short), is given as follows:

Then we can get all values at the point on by

For the boundary region, we use Taylor expansion to get the lacked values

2.2. PM2.5 Transport Model Description

Generally speaking, the simulation and prediction of the PM2.5 are difficult due to the fact that PM2.5 is generally not directly emitted; instead, PM2.5 varies due to interactions among many processes including emissions, transport, photochemical transformation, and deposition, with meteorology playing an overarching role. The secondary PM2.5 is formed via chemical and thermodynamic transformations of gas-phase precursors that may potentially emanate far from nonattainment regions. Just to investigate the estimation ability of adjoint method, the emitted and deposited primary PM2.5 and secondary PM2.5 are taken as a whole, called “source and sink” (SS), without considering the specific details.

We use the same adjoint tidal model as in [13]. Considering that the observations are ground-level, the vertical meteorology environment is indeterminate. A two-dimensional PM2.5 transport model is established in rectangular coordinates as follows:where denotes the PM2.5 concentration, and are the horizontal wind velocity in -coordinate and -coordinate, respectively, is the horizontal diffusivity coefficient, is the initial condition (IC), and is the value of SS.

The boundary conditions are set as constant at the inflow boundary and as no gradient boundary conditions at the outflow boundary .

The finite difference scheme of (2) is as follows:

Here, the upwind scheme is used in the advection term, that is,

And it is similar for the term in -coordinate.

With IC, SS, and the background value which is the constant in (12), the PM2.5 transport model could be solved by a sequence of time steps of length in the discrete schemes (i.e., (13)).

As a powerful tool for parameter estimation, the adjoint model is defined by an algorithm and its independent variables including initial conditions, boundary conditions, and empirical parameters. It allows for calculations of the gradients of the cost function with respect to various input parameters, which incorporate all physical processes included in the governing model, to obtain the minimization of the cost function. Based on the governing equations (10)–(13), its adjoint model can be constructed as follows.

First, the cost function is defined aswhere denotes the set of the observations and and are the simulated and observed PM2.5 concentrations, respectively. And is the weighting matrix and should be the inverse of observation error covariance matrix theoretically. can be fixed simply, assuming that the errors in the data are uncorrelated and equally weighted. In the present model, is 1 when the observations are available and otherwise.

Then the Lagrangian function is constructed based on the theory of Lagrangian multiplier method and can be expressed as where is the adjoint variable of .

According to the typical theory of Lagrangian multiplier method, we have the following first-order derivatives of Lagrangian function with respect to all the variables and parameters:

Equation (18) gives the governing equation (10). The adjoint equation can be developed from (20), which is given as follows:

The finite difference scheme of (21) is similar to (13).

By running the adjoint model, the optimized gradients of the control variables and parameters including the horizontal diffusivity coefficient, SS, and IC can be obtained from (17) and can be described as follows:

3. Twin Experiments and Analysis

We use the PM2.5 transport model shown in Section 3. Initial condition (IC) and “source and sink” (SS) are the major parameters in this model. As the key for the simulation of PM2.5, inversions of SS and IC are the vital part of the whole model. For better model results, IC and SS will be inverted together in all the experiments.

The modeling domain covers China, from 70°E to 140°E and from 15°N to 55°N, with a grid resolution of 0.5° latitude by 0.5° longitude (see Figure 1). There are 141 × 71 grids totally in the area. The simulation period is 168 hours and the inverse integral time step is 10 minutes. The background value is fixed as 35.0 μg/m3 based on the actual PM2.5 concentration and the air quality standards in China. Inflow boundary values are fixed equal to the background values. The winds with spatial resolution 2.5° latitude by 2.5° longitude and 24-hour temporal resolution derived from the National Centers for Environmental Prediction (NCEP) are used in this work.

For each experiment, the iteration process is designed as follows.

Step 1. Run the governing model with the prescribed parameters. The model-generated PM2.5 concentrations at grid points of the observation positions are regarded as the “observations.”

Step 2. An initial guess value of control parameters, SS (or IC), is taken as zero in this work (or the background value for IC). Run the governing model with the initial values; the simulations are obtained.

Step 3. The adjoint model is driven by the discrepancy between simulations and observations. The gradients with respect to control parameters are calculated by the adjoint variables obtained by backward integration of the adjoint equations. The control parameters at the independent grids should be updated with a certain optimization algorithm until the convergence criterion is met. The parameters at other points are determined by the interpolation of the values at the independent points.

Step 4. With the procedure repeated, the parameters will be optimized increasingly. The differences between simulated results and “observations” will decrease. Meanwhile, differences between the prescribed parameters and the inverted ones will be decreased as well.

During the iterative minimization of the cost function, the parameters are optimized with the steepest descent method. The iteration will be terminated once a convergence criterion is reached. In this work, the criterion is that the number of iteration steps is equal to 300 exactly in the twin experiments and the practice experiments.

We pick out seventy-four major cities in China (see Figure 1) as the observation positions. With the sign of “assimilated cities,” the observations of these cities are assimilated in the optimization procedure. The PM2.5 concentrations records of these cities are selected as the observations at two-hour intervals. Considering the influence of the noises on in situ observations, we add the artificial random error to the “observations.” The maximum percentage error is 5%.

In this section, three versions of the IPS are considered and they are different in the method of interpolation. Cressman interpolation, PPM , and PPM are fixed in the three versions, respectively. A concise overview of Cressman interpolation can be found in [14]. For the purpose of making a computational investigation on the performances of the PPM, we carry out two groups of twin experiments. In Group One, the influence of distribution schemes is studied. In Group Two, two sets of observation data are assimilated, respectively. All the experiments are carried out following Steps 14. For model performance evaluation, the values of normalized cost function and the mean absolute errors (MAEs) of the inversion results are selected as the main reference. The optimization algorithm used in this work is the steepest decent method. To facilitate the narrative, PPM simulations mean the results obtained from IPS with PPM, so does the Cressman simulations.

3.1. Results and Discussions of Group One

The important factors that affect the result of IPS are the selection of independent points and interpolation format. Studies, such as [24, 25], have already coupled IPs distribution scheme with Cressman interpolation aiming at capturing more accurate results in the solutions and parameter inversions. Obviously, the optimal IPS distribution scheme matched with different interpolation should be not necessarily the same, due to the different interpolation effect. To find a relatively efficient and high-precision way of the interpolation in the PM2.5 transport adjoint model, five IPs distribution scheme are presented. The distribution schemes are described as follows:(A)We choose one from each 2 points as the independent one; the total number of the independent points is 71 × 36 and each is from a 1°  ×  1° area.(B)We choose one from each 4 points as the independent one; the total number of the independent points is 36 × 19 and each is from a 2°  ×  2° area.(C)We choose one from each 6 points as the independent one; the total number of the independent points is 25 × 13 and each is from a 3°  ×  3° area.(D)We choose one from each 8 points as the independent one; the total number of the independent points is 19 × 10 and each is from a 4°  ×  4° area.(E)We choose one from each 10 points as the independent one; the total number of the independent points is 15 × 8 and each is from a 5°  ×  5° area.

With different IPs distribution schemes, the MAEs of the inversion parameters and the values of the normalized cost function obtained from different interpolations are presented in Figures 2 and 3, respectively. As we can see clearly, the assimilation errors change along with the variation of the IPs distribution scheme. It is noteworthy that PPM and PPM obtain more accurate results than Cressman interpolation, smaller values of the normalized cost function, and smaller MAEs of the inversion parameters, regardless of the choice of independent point distribution scheme.

For Cressman interpolation, when too many (Scheme (A)) independent points are used, the simulation errors after assimilation are relatively large due to ill-posedness caused by excessive control parameters. When too few independent points (Schemes (D), (E)) are used, it leads to poor interpolation effect, which can fail to give satisfactory results. Just in this case, in Schemes (D) and (E), PPM simulations have obvious advantages that the cost function can reach 10−3 or 10−4 of their initial value, over an order of magnitude higher than that with Cressman interpolation. Our direct comparison of the normalized cost function by using PPM with Cressman interpolation indicates that PPM can accurately simulate PM2.5 concentrations in a wide range of spatial scales.

We take the distribution scheme, in which the experiment has the smallest normalized cost function and the minimum MAEs of the inversion parameters, as the optimum scheme for the interpolations. It can be readily seen that the optimum scheme is Scheme (C) for Cressman interpolation, while those for PPM and PPM are both Scheme (D). We focus on the optimal results of the three interpolations. The normalized cost function of Cressman interpolation and PPM can reach the minimum with the values of 5.51 × 10−4 and 1.20 × 10−4, while that of PPM is found to be one magnitude lower, with the values of 7.00 × 10−5. The PPM enables the MAE between the prescribed and inverted IC reduced by 82.62%, which is 3.95% better than the results obtained from Cressman interpolation. And the MAE between the prescribed and inverted SS is reduced by 74.37%, which is 3.85% better than the results obtained from Cressman interpolation. In fact, it is a statement that PPM provides the better interpolation quality by employing much fewer independent points, which accounted for 58.4% of Scheme (C). Figure 3(b) demonstrates that, as expected, the normalized cost function in the case of the PPM declines rapidly compared to the other two methods. Further evidence to support the conclusion can be seen through the inversion parameters showed in Figure 4.

As shown in Figure 4 and Table 1, the parameters can be inverted successfully by three versions of IPS, which demonstrates the reasonability and feasibility of the model. Compared against the inversion results, patterns from the PPM are closer to the prescribed ones (the smallest values of MAE). The main features of all the parameters can be recovered very well. In conclusion, to some extent, PPM is effective in remedying the blight of the ill-posedness caused by excessive control parameters.

3.2. Results and Discussions of Group Two

As we know, insufficient observations may cause the ill-posedness of inversion problem. For further investigation of the performance of PPM , we scale down the number of the observation stations according to the geographic distribution of the 74 observation stations. We deduce the observation stations by a quarter, a third, and a half of the original. 55, 49, and 37 observation stations are assimilated in the experiments, respectively. Cressman interpolation, PPM , and PPM are used in the PM2.5 adjoint model, respectively. The error statistics are shown in Tables 24.

We compare the error statistics from Tables 14, in the case of a reduced observation points, the MAEs of the inversion parameters are proved to be much worse than that obtained from sufficient observations, which indicates that the assimilation can be more efficient if more observations are used. This is consistent with the conclusion shown in [26] that accurate values of the parameters can be estimated if the provided data from sufficient number of locations are available for assimilation.

PPM remains much more accurate than Cressman interpolation for all error measures in this group. The MAEs of the inversion parameters after the assimilation demonstrate scenarios where Cressman interpolation can fail to give acceptable solutions with inadequate observations. Compared with the experiment results from Group 1, it is crucial to stress that the inversion parameters obtained from the PPM with 34 observation points are in the same level with that from Cressman interpolation with 74 observation points. All the experiments demonstrate that the application of high order accurate PPM in the adjoint model can effectively overcome the ill-posedness caused by inadequate observations.

4. Practical Experiments

In this part, practical experiments are carried out during the 21th APEC summit taking place in Beijing. During the assimilation procedure, both the position and the values of observations are used in the practical experiments. The simulation and data assimilation were conducted for the period of one week from 5 November to 11 November 2014. SS in the practical experiments has temporal variation.

The seventy-four major Chinese cities in Figure 1 are treated as the observation locations. For the verification of the model’s inversion ability, Jinan (JNa), Zhengzhou (ZhZ), Shenyang (SY), Quanzhou (QZ), Hangzhou (HaZ), Kunming (KM), Chengdu (CD), and Beijing (BeJ) are selected as the “checked cities” in which the observations are not assimilated. The hourly PM2.5 observations obtained from the Data Center of Ministry of Environmental Protection of the People’s Republic of China (http://datacenter.mep.gov.cn/) are used in the model. These observations were measured based on the tapered element oscillating microbalance (TEOM) method [13, 27]. But there was a lack of observations in a few hours. Each version of the IPS that matched the optimal IPs distribution scheme is implemented in the practical experiments. We compare the simulation results at the 300th step. Model-produced PM2.5 concentrations are researched as the main target.

Error statistics of the practical experiments are listed in Table 5. With the PPM , the MAE between simulated values and observations in “assimilation cities” is 9.47 μg/m3, and the MAE between simulated values and observations in all the cities at the initial time is 17.05 μg/m3. Meanwhile, the MAEs between simulated values and observations in “checked cities” also reach an ideal level, which is the minimum in all the experiments.

Measurements at four representative stations, Fuzhou (FZ), Quanzhou (QZ), Jinan (JNa), and Beijing (BeJ), are selected to evaluate the forward model and inverse modeling, as well as the assimilated results. In the practical experiment, Fuzhou is the city with the smallest MAEs between simulated results and observations in all the “assimilated cities.” Among all the “checked cities,” MAEs between simulated results and observations in Quanzhou are the smallest, while in Jinan it is the largest. Beijing is a major city of the “checked cities.” We will focus on the comparison and analysis of the simulations from PPM and Cressman interpolation.

In detail, from Table 6, the best simulations come from Fuzhou, which is the “assimilated city” where the observations are used in the whole assimilation procedure. The MAE from PPM , between simulated results and observations after the assimilation, is only 2.04 μg/m3, while that from Cressman interpolation is 2.27 μg/m3. Obviously, it is persuasive to evaluate the simulation results with observations that were not used in the assimilation. In Quanzhou and Jinan, PPM gets the MAEs to 8.18 μg/m3 and 24.76 μg/m3, while Cressman interpolation gets the MAE to 8.43 μg/m3 and 27.79 μg/m3. As the major concern, the simulations in Beijing from PPM is almost the same with the observations, the MAE after the assimilation is 8.22 μg/m3, while that from Cressman interpolation is 12.21 μg/m3, respectively. From Figure 5, the time-varying PM2.5 concentrations of Beijing clearly show that simulations from PPM fit the observations better than Cressman interpolation. All the analyses prove that PPM has obtained more accurate results, and the following evaluation for PM2.5 simulations is carried out based on the results from the PPM .

The scatterplot of Figure 6(a) indicates that the model captures a majority of observed PM2.5 within a factor of 2, though the observations of the high PM2.5 concentration range display an underestimation. The reason of the underestimation might be that the TEOM measurements for PM2.5 should be considered as lower limits because of volatilization of soluble organic carbon species in the drying stages of the measurement [13]. Accordingly, the simulations are more reasonable and believable. For the investigation of the model performance over time, the values of mean observations, Normalized Mean Bias (NMB), Normalized Mean Error (NME), and correlation coefficients are calculated as the statistical measures (e.g., [28, 29]) (domain wide averages) and plotted as an hourly time series in Figure 6(b). A closer inspection of the results indicates that the domain mean simulations present the same time-varying trend with the domain mean observations. The NMB values range from −20.3% (4:00 Beijing time, 5 November) to 22.4% (18:00 Beijing time, 10 November), most of which are between −10% and 10%. And the results in Figure 6(b) clearly show that the NMB values are with the underestimations of PM2.5 at the beginning of the period (from 0:00 to 12:00 Beijing time, 5 November). This may be caused by the lack of the observations in the first few hours. Nevertheless, the domain correlation coefficients between the domain mean simulations and mean observations are as high as 0.76 at least, most between 0.85 and 0.95. These results show that the scheme can effectively assimilate the PM2.5 observations.

By way of illustration, Figure 7 shows the spatial distribution of simulated averaged PM2.5 concentrations during assimilation window across China. Significantly high levels of PM2.5 concentrations are shown in Northeast China. The mean values are about 100–160 μg/m3, which present a typical seasonal variation with high values mainly caused by field biomass burning and the domestic use of coal and biomass as a household fuel in wintertime [30]. And the mean values are smaller in southeast China where precipitation is rich. Meanwhile, the concentrations are slightly large in middle and east coast of China, which is in conformity with industrial development. As shown in Figure 7, the mean PM2.5 concentrations of Beijing are less than 60 μg/m3, which is half of that in the same period of previous years (e.g., [27, 31]). Meanwhile, the environment of the areas around Beijing has been effectively improved that the mean concentrations are about 60–90 μg/m3.

5. Summary and Conclusions

We have presented a general high order, conservative method for the parameter inversions of a PM2.5 transport adjoint model. It is verified that the application of the PPM into the independent point scheme can solve the ill-posedness of the inversion problem much effectively. A series of test cases have demonstrated that the PPM is superior to Cressman interpolation for the simulations and the inversion parameters in this model.

We believe the evidence presented here clearly indicates that PPM produces better approximations than can be obtained by using Cressman interpolation. With the availability of the PPM, the ill-posedness caused by insufficient observations is effectively improved. Regarding the interpolation technique, PPM is favourite in this work. This is mostly due to its ability to achieve high order accuracy in space. In the practical experiments, with the PPM , the MAE between simulated values and observations in “assimilation cities” is reduced by 75.00% and that of the “checked cities” is reduced by 56.65%. In the real case simulation of the 21th APEC in Beijing, the results from the adjoint model with the PPM are in good agreement with the observation data. Furthermore, the results could also display the pollutant concentration distributions features of selected area.

The numerical results shown here are performed in two-dimensional adjoint model only. However, the PPM has a full three-dimensional capability. Additional future work will extend to the verification of the PPM in three-dimensional adjoint model simulations.

Disclosure

The last two authors made equal contributions to this work and are equally considered to be corresponding authors.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was partly supported by the National Key Research and Development Plan of China (Grants nos. 2017YFC1404000, 2017YFA0604100, and 2016YFC1402304), the Fundamental Research Funds for the Central Universities of China (Grant no. 201513059), the Natural Science Foundation of Zhejiang Province (no. LY15D060001), the National Natural Science Foundation of China (no. 41606006 and no. 41371496), the Education Department Science Foundation of Liaoning Province of China (no. JDL2016029), and the Education Department Science Foundation of Liaoning Province of China (no. JDL2017019).