Abstract

In this study, we focus on forecasting daily PM2.5 concentrations. Following the principle of “divide and conquer,” we propose a novel decomposition ensemble learning approach that integrates ensemble empirical mode decomposition (EEMD), artificial neural networks (ANNs), and adaptive particle swarm optimization (APSO). The approach is designed to cope with the difficulty of quantifying meteorological information, which is characterized by high volatility, irregularity, and complexity. It consists of three steps. First, EEMD decomposes the original PM2.5 concentration time series into a number of independent intrinsic mode functions (IMFs) and a residual term. Second, an ANN whose connection parameters are optimized by the APSO algorithm models each IMF and the residual term separately. Finally, another APSO-ANN aggregates the forecasts of the IMFs and the residual term into the final forecasting result. The empirical results show that the proposed approach outperforms the benchmark models in terms of both level accuracy and directional accuracy.

1. Introduction

With technological development and the improvement of people’s living standards, environmental pollution has become more and more serious, especially in developing countries. PM2.5 refers to particles with a diameter of 2.5 micrometers or smaller, which can travel directly to the alveoli of the lungs. Compared with PM10 (10 microns or less) and TSP (100 microns or less), PM2.5 is more likely to absorb hazardous and noxious substances and thus acts as a carrier of all sorts of toxic substances in the air. Some research shows that nitrogen oxide and sulfur dioxide emissions may be transformed into PM2.5 nitrate and sulfate ions, respectively, under particular environmental conditions. Human exposure to PM2.5 can lead to a variety of adverse health impacts, such as cardiovascular and respiratory problems [1–3]. Based on its effect on the environment and human health, PM2.5 pollution has been divided into six grades, from excellent to serious pollution, which are described in Table 1. During the past two decades, epidemiological studies have demonstrated that the major air pollutant impacting human health is particulate matter [4]. The adverse health impacts of particulate matter have become a well-known problem in daily life. Besides the accumulation of dust and the reduction of visibility, the direct effect on human health via inhalation is a severe problem [5, 6].

Because of the serious harm it causes, PM2.5 has in recent years become a principal object of study and a chief pollutant subject to rigorous control around the world, especially in developed countries. Air quality monitoring systems have gathered large amounts of hourly or daily pollutant concentration data, which need to be analyzed with appropriate methods [7, 8]. These systems show that many areas do not conform to the standards, which may lead to serious health problems as well as ecological and economic effects. However, very few countries have a real-time air quality forecasting (RT-AQF) platform. In the United States, the public can learn about the future air quality index (AQI), including air pollutant concentrations and the associated health risks, through television, newspapers, radio, the Internet, and other media [9]. PM2.5 concentration forecasting is therefore necessary: it can provide early pollution warnings so that prevention and control measures can start as early as possible. Recently, considerable effort has been devoted to research on PM2.5 concentration forecasting.

Many mathematical models have been applied to forecasting PM2.5 concentrations. According to their fundamental principles and mathematical representation, these models fall mainly into two types: empirical models and deterministic models [10, 11]. Empirical models use statistics or big data technology to quantify the relationship between the AQI observed by air quality monitoring systems and meteorological parameters. Deterministic models estimate the air quality index by simulating physical and chemical reactions: they use mathematical models to describe how chemical processes occur during transport and transformation in the atmosphere and then test whether these models reproduce the desired results [11]. However, because meteorological parameters are complex and difficult to estimate quantitatively, there is a great deal of uncertainty, which causes PM2.5 concentration forecasts to differ from reality. Therefore, compared with deterministic models, empirical models offer higher forecasting precision and better adaptability. Many empirical models, such as autoregressive integrated moving average (ARIMA), multilinear regression (MLR), and artificial neural network (ANN) models, have been applied to PM2.5 concentration forecasting [12–15]. As a traditional statistical model, ARIMA requires continuous historical data and is better at capturing the linear pattern of a time series, especially seasonal patterns. Similarly, MLR is more suitable for linear patterns but has difficulty capturing extreme values. By contrast, ANNs, a highly versatile machine learning technique, can recognize noise and nonlinear patterns, including extremes, in the original data [16]. Moreover, some researchers have found that, compared with single models, hybrid empirical models can better capture both the linear and nonlinear patterns of a time series and deal with extreme values effectively, ultimately improving the forecast accuracy [17–19].

Because of its computational efficiency and forecasting accuracy, the ANN model has been widely used [20, 21]. For example, three types of artificial neural network models and a linear model have been used to forecast daily PM2.5 concentrations in El Paso (USA) and Ciudad Juarez (Mexico) [22]. Zhu et al. put forward a hybrid model optimized by the particle swarm optimization (PSO) algorithm and obtained good performance in PM2.5 concentration forecasting [23]. However, even when meteorological and geographical data are considered, a combination of linear and nonlinear models cannot fully capture the complexity of air quality data [24].

Fortunately, some of the problems mentioned above can be partially solved by the principle of “divide and conquer” [25]. The purpose of “divide” is to simplify a difficult forecasting task by decomposing it into relatively easy subtasks, while the overall goal is to formulate a consensus forecasting result for the original data [26]. Based on this principle, hybrid ensemble approaches have recently been put forward to solve difficult forecasting problems, such as forecasting the international crude oil price, and empirical results show that they outperform individual forecasting models [27, 28]. Previous research has also demonstrated the advantage of the “divide and conquer” principle. For instance, while integral models may ignore some value properties and thus lead to evaluation errors, Fischer showed that the decomposition method can analyze problems and their intrinsic properties more comprehensively and clearly [28]. Likewise, Kleinmuntz argued that individuals have a bounded ability to process information, which may break down in the face of a large and complex system [29].

The main contribution of this study is to establish a more accurate approach to forecasting PM2.5 concentrations and to evaluate its forecasting performance. PM2.5 concentrations are influenced by many factors, but how these factors act is uncertain, so we consider only the PM2.5 concentration series itself. Because this series is complex and irregular, we adopt the principle of “divide and conquer” and propose a novel decomposition ensemble learning approach that integrates EEMD, ANN, and the APSO optimization algorithm to forecast PM2.5 concentrations at Lanzhou city in China. In the proposed approach, a difficult forecasting task is divided into several relatively simple subtasks; adding this decomposition step makes the forecasting problem easier to solve and thus improves the forecasting performance. Lanzhou city was selected as the research area mainly because it has distinctive characteristics in terms of climate, topography, and population. In addition, the study verifies how well the proposed approach performs in different circumstances.

The remaining parts of this article are organized as follows. Section 2 will illustrate research data collection and preprocessing. Then, Section 3 will briefly introduce the related methods used in this study. The accuracy of forecasting results and validity of the proposed approach are discussed in Section 4. Finally, the paper is concluded in Section 5.

2. Data Collection and Preprocessing

The research, analysis, and results of this paper are all based on PM2.5 concentration data for Lanzhou, the capital city of Gansu province, which has a distinctive location and climate. Lanzhou is located on the upper reaches of the Yellow River, at the geometric center of China’s continental territory. The Yellow River runs through the city, which is sandwiched between mountains on the northern and southern banks. The average altitude of Lanzhou is 1520 m; the city lies at 36°03′ N, 103°40′ E, in the temperate zone with a semiarid climate.

The PM2.5 concentration data used in this study are obtained from the Ministry of Ecology and Environment of China (http://www.mee.gov.cn/). The PM2.5 concentration daily data covers the period from January 1, 2017, to October 31, 2019, with a total of 1004 observations.

3. Methodology

It is well known that PM2.5 concentrations exhibit high volatility, nonlinearity, and irregularity. In this study, we propose a new decomposition ensemble learning approach to forecast PM2.5 concentrations based on the principle of “divide and conquer”. The general framework of the approach comprises three stages: decomposition, single forecasting, and ensemble forecasting. First, a decomposition method is used to decompose the original PM2.5 concentration data into several meaningful component sequences. Then, optimized forecasting methods are employed to forecast each component separately. Finally, the forecasts of the components are aggregated into the final forecasting result by means of an ensemble approach [30].

In summary, different data decomposition methods, intelligent optimization algorithms, forecasting models, and ensemble approaches can be combined into different decomposition ensemble learning approaches. In this study, we first utilize ensemble empirical mode decomposition (EEMD) to decompose the original PM2.5 concentration data into a number of independent intrinsic mode functions (IMFs) and a residual term. Second, an artificial neural network (ANN) optimized by adaptive particle swarm optimization (APSO) is applied to forecast each IMF and the residual term separately. Finally, another APSO-ANN is employed to aggregate the forecasts of the IMFs and the residual term into the final forecasting result. We call this the EEMD-based APSO-ANN ensemble learning approach. Its components and overall formulation are described below.

3.1. Ensemble Empirical Mode Decomposition

Empirical mode decomposition (EMD) was initially proposed by Huang et al. [31]. To overcome the mode mixing problem in EMD, Wu and Huang presented ensemble empirical mode decomposition (EEMD) [32]. Unlike traditional decomposition methods such as wavelet decomposition and Fourier decomposition, EMD and EEMD are self-adaptive algorithms. They identify modes from the local features of the signal and hence decompose it into several intrinsic mode functions according to its characteristic time scales.

In recent years, EMD and EEMD have been widely applied to the decomposition of complex time series and to complex system modeling [31, 33–35]. This study chooses EEMD as the data decomposition method. The EMD and EEMD methods are introduced as follows.

EMD is an adaptive time series decomposition technique for processing nonlinear and nonstationary signals and is a component of the Hilbert-Huang transform (HHT) [36]. Because of the complexity of the data, the method assumes that the data may contain different oscillation modes simultaneously. The signal is decomposed into a number of intrinsic mode functions (IMFs) by a local wave (sifting) method, and the time-frequency spectra of the IMFs are obtained by means of the Hilbert transform. Each IMF must meet the following requirements: (1) over the entire time domain, the number of local extrema and the number of zero crossings must be equal or differ by at most one; (2) at any time point, the mean of the envelope defined by the local maxima and the envelope defined by the local minima must be zero.

These two conditions define meaningful IMFs. Accordingly, any complicated data series $x(t)$ can be decomposed as follows.
(1) Find all the local extrema of the original data series $x(t)$.
(2) Use cubic spline interpolation to create the upper and lower envelopes $e_{\max}(t)$ and $e_{\min}(t)$, respectively, and calculate their average: $m(t) = (e_{\max}(t) + e_{\min}(t))/2$.
(3) Subtract the envelope mean from the original time series and define the difference as $h(t)$: $h(t) = x(t) - m(t)$. Check whether $h(t)$ meets the two basic conditions of an IMF; if $h(t)$ is not an IMF, replace $x(t)$ with $h(t)$ and repeat the above steps.
(4) Extract $h(t)$ as an IMF $c(t)$ and replace $x(t)$ with the residual $r(t) = x(t) - c(t)$. Repeat Steps 1–3 until the stop criterion is satisfied.

Using this sifting process, the original data series $x(t)$ can finally be decomposed into a sum of IMFs and a residue term:

$$x(t) = \sum_{j=1}^{n} c_j(t) + r_n(t),$$

where $n$ is the number of IMFs, $r_n(t)$ is the final residue term, and $c_j(t)$ is the $j$th IMF.
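The sifting procedure above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the implementation used in the study: the envelopes are piecewise linear rather than cubic splines, each IMF is obtained with a fixed number of sifting passes (`n_sift`) instead of checking the two IMF conditions, and all function names are hypothetical.

```python
import math

def local_extrema(x):
    """Return index lists of the interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation stands in for the cubic spline of Step 2,
    and the two end points of the series are used as extra knots.
    """
    knots = [0] + idx + [len(x) - 1]
    env = [0.0] * len(x)
    for a, b in zip(knots[:-1], knots[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * x[a] + w * x[b]
    return env

def sift(x, n_sift=10):
    """Steps 1-3: repeatedly subtract the envelope mean from the candidate."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if not maxima or not minima:
            break
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        h = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_imfs=8):
    """Step 4: peel off IMFs until the residue has fewer than three extrema."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 3:
            break
        c = sift(residue)
        imfs.append(c)
        residue = [r - v for r, v in zip(residue, c)]
    return imfs, residue

# Demo on a synthetic series: two oscillations plus a slow trend.
signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.1 * t) + 0.01 * t for t in range(200)]
imfs, residue = emd(signal)
# By construction, the IMFs and the final residue add back up to the input.
rebuilt = [sum(c[t] for c in imfs) + residue[t] for t in range(len(signal))]
```

Because each residue is defined as the previous series minus the extracted IMF, the reconstruction identity holds regardless of how the envelopes are interpolated; the interpolation choice only affects how cleanly the modes are separated.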

Even though EMD is a fully data-driven and self-adaptive data decomposition method, it has an obvious disadvantage, namely mode mixing. To address this problem, the EEMD technique was proposed by Wu and Huang [32]. EEMD builds on EMD and solves the mode mixing problem caused by intermittent noise by adding white noise to the original time series before decomposition. EEMD can thus not only preserve the information in the original data but also overcome the drawback of mode mixing. The sifting steps of EEMD are as follows:
(1) Add a white noise series $w_1(t)$ to the original data series $x(t)$ to obtain $x_1(t)$: $x_1(t) = x(t) + w_1(t)$.
(2) Apply the EMD method to decompose $x_1(t)$ and obtain a series of IMFs: $c_{1j}(t)$, $j = 1, 2, \ldots, n$.
(3) Add different white noise series $w_i(t)$ to the original data and repeat the above steps: $x_i(t) = x(t) + w_i(t)$, $i = 2, 3, \ldots, N$, obtaining the corresponding IMF components $c_{ij}(t)$.
(4) The final results are the ensemble averages of the corresponding IMFs: $c_j(t) = \frac{1}{N}\sum_{i=1}^{N} c_{ij}(t)$.

Wu and Huang demonstrated that the effect of the added noise is strictly controlled via the following statistical rule [32]:

$$\varepsilon_n = \frac{\varepsilon}{\sqrt{N}},$$

where $N$ is the number of ensemble members, $\varepsilon$ is the amplitude of the added noise, and $\varepsilon_n$ is the final standard deviation of error, i.e., of the difference between the original data series and the corresponding IMFs. In practice, $N$ is often set to 100 and the amplitude $\varepsilon$ of the white noise series is set to 0.1 or 0.2 [33].
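The noise rule can be checked numerically. The sketch below is illustrative only and assumes Gaussian white noise; the function names, the series length, and the seed are hypothetical choices. It evaluates $\varepsilon/\sqrt{N}$ directly and compares it with the standard deviation that remains after averaging $N$ independent noise series.

```python
import math
import random
import statistics

def residual_noise_std(amplitude, n_ensemble):
    """Expected std of the noise left after averaging: eps_n = eps / sqrt(N)."""
    return amplitude / math.sqrt(n_ensemble)

def empirical_residual_std(amplitude, n_ensemble, length=2000, seed=0):
    """Average n_ensemble independent Gaussian noise series and measure the std."""
    rng = random.Random(seed)
    mean_noise = [0.0] * length
    for _ in range(n_ensemble):
        for t in range(length):
            mean_noise[t] += rng.gauss(0.0, amplitude) / n_ensemble
    return statistics.pstdev(mean_noise)

# Settings quoted in the text: N = 100 ensemble members, noise amplitude 0.2.
theoretical = residual_noise_std(0.2, 100)
empirical = empirical_residual_std(0.2, 100)
```

With 100 ensemble members and an amplitude of 0.2, the rule gives a residual noise level of 0.02, which is why a moderately large ensemble is enough to make the added noise negligible in the averaged IMFs.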

3.2. Artificial Neural Networks

Artificial neural networks (ANNs), which can build flexible models for various nonlinear problems, are widely applied in air pollution forecasting. Relative to other types of nonlinear models, ANNs are universal approximators that can estimate a large class of functions with high reliability and accuracy. Additionally, ANN models are largely determined by the characteristics of the data during model building; hence these techniques do not require prior assumptions. The neural network architecture usually consists of an input layer, a hidden layer, and an output layer [37]. The input layer accepts the data imported into the network, and the output layer produces the evaluation results. The hidden layer, which lies between the input and output layers, consists of a number of neurons (hidden units) placed parallel to each other. Mathematically, hidden neuron $j$ is described as [38]

$$h_j = f\Big(\sum_{i=1}^{n} w_{ij} x_i + b_j\Big),$$

where $f$ is the activation function, usually chosen as the logistic sigmoid function $f(x) = 1/(1 + e^{-x})$, $w_{ij}$ is the weight of input $x_i$ at neuron $j$, and $b_j$ is the bias of neuron $j$. The relationship between the output $y$ and the inputs $x_1, \ldots, x_n$ is

$$y = F(x_1, x_2, \ldots, x_n; \mathbf{w}) = \sum_{j=1}^{q} v_j h_j + b_0,$$

where $\mathbf{w}$ denotes the connection weights (including the output weights $v_j$ and the output bias $b_0$), $q$ is the number of hidden nodes, and $F$ is a function determined by the network structure and the connection weights. In this study, the ANN architecture is the backpropagation neural network (BPNN), one of the most popular and effective forecasting techniques. BPNN is a three-layered feedforward architecture trained with the backpropagation (BP) algorithm. The details of BPNN can be found in [38].
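To make the two expressions concrete, the following sketch computes the forward pass of a three-layer network with sigmoid hidden units. It assumes a single linear output unit, which the text does not state explicitly, and the function and parameter names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic sigmoid activation, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a network with one hidden layer.

    hidden_weights[j][i] is the weight of input i at hidden neuron j.
    Assumption: the output unit is linear (no activation on the output).
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights_j, inputs)) + b_j)
        for weights_j, b_j in zip(hidden_weights, hidden_biases)
    ]
    return sum(v * h for v, h in zip(output_weights, hidden)) + output_bias

# With all hidden weights and biases at zero, every hidden unit outputs 0.5.
y = forward([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [1.0, 1.0], 0.5)
```

In the approach described in this paper, the weights and biases passed to such a function would be the quantities tuned by the optimizer; training itself is outside the scope of this sketch.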

3.3. Adaptive Particle Swarm Optimization

Particle swarm optimization (PSO) is a heuristic search algorithm based on swarm intelligence and has been widely used to solve various problems. The principle of PSO is to simulate how birds update their positions while searching for food. It first initializes a group of particles in the solution space, each of which represents a candidate solution. Each particle is characterized by three indicators: position, velocity, and fitness. Particles update their positions over time by tracking the individual extremum (Pbest) and the global extremum (Gbest). The basic PSO algorithm is prone to premature convergence and low precision; to address these problems, an adaptive particle swarm optimization (APSO) algorithm has been proposed [39].

Suppose that a population $X = (X_1, X_2, \ldots, X_n)$ of $n$ particles lies in a $D$-dimensional search space. Here $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})^T$ denotes the position of the $i$th particle in the search space and also represents a potential solution of the problem. The fitness value of each particle is calculated from the objective function. The velocity of particle $i$ is $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})^T$, $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})^T$ denotes the best previous position of the $i$th particle (the one that gives its best fitness value), and $P_g = (p_{g1}, p_{g2}, \ldots, p_{gD})^T$ denotes the best position among all the particles of the population. In iteration $k$, the velocity and position of each particle are updated as

$$v_{id}^{k+1} = \omega v_{id}^{k} + c_1 r_1 \big(p_{id}^{k} - x_{id}^{k}\big) + c_2 r_2 \big(p_{gd}^{k} - x_{id}^{k}\big), \qquad x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1},$$

where $\omega$ is the adaptive inertia weight, $c_1$ and $c_2$ are nonnegative constants called acceleration factors, and $r_1$ and $r_2$ are random numbers distributed in [0, 1]. The adaptive adjustment of $\omega$ [39] depends on two constraint factors in the range [0, 1], the minimum inertia weight $\omega_{\min}$, and the fitness function $f$, which in this study is computed from the actual value $y_t$ and the forecast value $\hat{y}_t$ of the PM2.5 concentration [37].
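The particle update can be illustrated with a small, self-contained PSO loop. This sketch is not the APSO variant of [39]: the inertia weight is held constant as a placeholder for the adaptive rule, and the parameter values, search bounds, and test function are hypothetical choices made only for the example.

```python
import random

def pso_minimize(fitness, dim, n_particles=20, n_iter=50,
                 inertia=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Basic PSO with the standard velocity and position update.

    Assumption: `inertia` is constant here; APSO would adapt it per iteration.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # individual best positions
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best position
    history = [gbest_fit]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history

# Demo: minimize the sphere function sum(v^2) in three dimensions.
best, best_fit, history = pso_minimize(lambda p: sum(v * v for v in p), dim=3)
```

When PSO is used to tune a neural network, each particle position would hold the flattened weights and thresholds of the network, and `fitness` would return the forecasting error of the network built from that position.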

3.4. The Framework of Our Proposed Approach

Given a time series $x_t$ ($t = 1, 2, \ldots, T$), the aim is to make $m$-step-ahead forecasts, i.e., to forecast $x_{t+m}$. In this study, we apply the iterative forecasting method, which can be represented as

$$\hat{x}_{t+m} = f\big(x_t, x_{t-1}, \ldots, x_{t-(l-1)}\big),$$

where $\hat{x}_{t+m}$ is the forecast value, $x_t$ is the actual value, and $l$ denotes the lag order. In an ANN, the initial weights and thresholds play an important role in learning and optimizing the network [40]. However, these parameters are generated randomly at the beginning and then adjusted throughout the training process. Hence, the APSO algorithm is applied to determine the thresholds and weights of the artificial neural network, as shown in Figure 1. Meanwhile, time series data inevitably contain noise or otherwise uninformative components, which motivates decomposing the series before forecasting. Therefore, our proposed EEMD-based APSO-ANN ensemble learning approach has been established to forecast PM2.5 concentrations at Lanzhou city in China.
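The lagged input structure can be sketched as follows. The helper names are hypothetical, and the second function reflects one common reading of iterative forecasting, in which each forecast is fed back as an input for the next step; the paper does not spell out this detail.

```python
def make_lagged(series, lag):
    """Build (inputs, targets) pairs: the previous `lag` values predict the next one."""
    inputs, targets = [], []
    for t in range(lag, len(series)):
        inputs.append(series[t - lag:t])
        targets.append(series[t])
    return inputs, targets

def iterative_forecast(model, history, lag, steps):
    """Forecast `steps` values ahead, feeding each forecast back into the window.

    `model` is any callable that maps a list of `lag` values to one forecast.
    """
    window = list(history[-lag:])
    forecasts = []
    for _ in range(steps):
        y_hat = model(window)
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]
    return forecasts

# Demo with lag order 3 and a stand-in "model" that averages its inputs.
X, y = make_lagged([1, 2, 3, 4, 5], lag=3)
preds = iterative_forecast(lambda w: sum(w) / len(w), [1.0, 2.0, 3.0], lag=3, steps=2)
```

Here `make_lagged` produces the training pairs for a given lag order, so comparing lag orders 1 to 5 amounts to calling it with different `lag` values and keeping the one with the smallest forecasting error.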

According to the framework in Figure 2 and previous research, this study establishes a novel decomposition ensemble learning approach by integrating EEMD and APSO-ANN for PM2.5 concentration forecasting. As shown in Figure 2, our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach is composed of the following three main steps:
(1) The original PM2.5 concentration time series is decomposed into $n$ IMFs and one residual component by the EEMD method.
(2) APSO-ANN is used as the single forecasting technique to model each decomposed IMF component and the residue component separately. As a result, a forecast is obtained for every component.
(3) Finally, the forecast for the original PM2.5 concentration time series is obtained by integrating the forecasts of the IMFs and the residue component, using another APSO-ANN as the ensemble approach.

In short, following the principle of “divide and conquer”, our proposed EEMD-based APSO-ANN ensemble learning approach can be described by the general framework “EEMD (decomposition)–APSO-ANN (single forecasting)–APSO-ANN (ensemble forecasting)”. To verify its effectiveness, PM2.5 concentration data collected from Lanzhou city are used as the test target. The details are discussed in the next section.

4. Empirical Study

In this study, the sample data are divided into two subsets: a training subset and a testing subset. Data from January 1, 2017, to September 30, 2019, form the training subset, with 973 observations used for model training. Data from October 1, 2019, to October 31, 2019, with 31 observations, form the testing subset used to evaluate the forecasting performance of the model. Additionally, data of the past 1 day (lag order 1), 2 days (lag order 2), 3 days (lag order 3), 4 days (lag order 4), and 5 days (lag order 5) are each used as the input form to forecast the following day’s PM2.5 concentration, and the input form with the minimum forecasting error is chosen as the optimal input structure.

4.1. Evaluation Criteria of Forecasting Performance

In this study, two criteria are used to evaluate the level forecasting performance of our proposed decomposition ensemble learning approach: mean square error (MSE) and mean absolute percent error (MAPE). The smaller the index value is, the better the forecasting performance will be. The criteria are defined as follows [41]:

$$\mathrm{MSE} = \frac{1}{N}\sum_{t=1}^{N} \big(y_t - \hat{y}_t\big)^2, \qquad \mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N} \left|\frac{y_t - \hat{y}_t}{y_t}\right| \times 100\%,$$

where $N$ is the number of observation points, $y_t$ represents the actual PM2.5 concentration for time period $t$, and $\hat{y}_t$ is the forecast value for the same period.
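Both level-accuracy criteria are straightforward to compute. The sketch below uses hypothetical function names and assumes that no actual value is zero, since MAPE divides by the actual values.

```python
def mse(actual, forecast):
    """Mean square error between actual and forecast values."""
    return sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percent error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, forecast)) / len(actual)

# Small illustrative inputs (not data from the study).
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_mape = mape([100.0, 200.0], [110.0, 180.0])
```

MSE depends on the scale of the series, while MAPE is scale free, so the two are usually reported together.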

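Additionally, we also consider the directional forecasting accuracy, which can be expressed by

$$D_{\text{stat}} = \frac{1}{N}\sum_{t=1}^{N} a_t \times 100\%,$$

where $a_t = 1$ if $(y_{t+1} - y_t)(\hat{y}_{t+1} - y_t) \ge 0$ and $a_t = 0$ otherwise. Unlike MSE and MAPE, a larger $D_{\text{stat}}$ indicates better performance.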
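The directional criterion can be sketched in the same way. The function name is hypothetical, and as a simplifying assumption the percentage is taken over the transitions available inside the supplied arrays; a study could instead include the last training value so that every test observation contributes one term.

```python
def directional_accuracy(actual, forecast):
    """Percentage of steps where the forecast moves in the same direction as the data.

    A step counts as a hit when (y[t+1] - y[t]) * (y_hat[t+1] - y[t]) >= 0.
    """
    hits = 0
    for t in range(len(actual) - 1):
        if (actual[t + 1] - actual[t]) * (forecast[t + 1] - actual[t]) >= 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)

# Illustrative inputs: two of the three transitions are forecast correctly.
d_stat = directional_accuracy([1.0, 2.0, 1.0, 3.0], [1.0, 3.0, 0.0, 0.0])
```

Note that the forecast direction is measured from the previous actual value, not from the previous forecast, which matches the indicator used in the definition.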

4.2. Empirical Results

In our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach, the first step is to apply the EEMD method to decompose the original PM2.5 concentration data series into several independent IMF components and one residue term. In this study, the number of ensemble members is set to 100, and the standard deviation of the added white noise in each ensemble member is 0.2. The IMF components are sorted from the highest frequency to the lowest, and the last component is the residue term. The decomposition results for the original PM2.5 concentrations at Lanzhou city in China are shown in Figure 3, where the original time series is decomposed into nine independent components.

For comparison, we choose some other popular forecasting models as benchmarks to be compared with our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach. According to previous literature, five single forecasting models, ANN, GA-ANN, PSO-ANN, APSO-ANN, and ARIMA, and three groups of decomposition ensemble learning approaches are chosen as benchmark models. For the purpose of consistency, the parameters of the decomposition ensemble learning approaches are the same as single forecasting models.

For clarity, the empirical results are presented in two parts. In the first part, we compare the results of the five single forecasting models and choose the best one as the single forecasting and ensemble model for the decomposition ensemble learning approach. In the second part, the forecasting performance of our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach is compared with that of the other three groups of decomposition ensemble learning approaches.

4.2.1. Performance Comparison of Single Models

In this subsection, we compare five single forecasting models: ANN, GA-ANN, PSO-ANN, APSO-ANN, and ARIMA. For the ANN techniques, the numbers of inputs and hidden layer nodes are determined by trial and error, and the activation function of the hidden layer is the sigmoid function. Table 2 shows the forecasting errors in terms of MSE, MAPE, and the directional statistic $D_{\text{stat}}$. The forecasting results indicate that APSO-ANN has the highest forecasting accuracy, followed by PSO-ANN.

From Table 2, it is clearly seen that all of the ANN techniques are superior to the traditional ARIMA model, and the optimal lag order of inputs is 3. APSO-ANN performs best among the five models, with MAPE, MSE, and directional statistic $D_{\text{stat}}$ of 25.22%, 0.1491, and 61.29%, respectively. By contrast, the MAPE, MSE, and $D_{\text{stat}}$ of ARIMA are 36.27%, 0.2231, and 51.61%. An ANN whose parameters are optimized by the PSO or APSO algorithm is clearly better than an ANN without any optimization scheme, so the optimized ANN technique is adopted as the single forecasting and ensemble forecasting method in our proposed decomposition ensemble learning approach.
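As a quick arithmetic check, and assuming the directional accuracy statistic is evaluated over the 31 daily observations of the testing subset, the two reported percentages correspond to whole numbers of correctly forecast directions:

$$\frac{19}{31} \approx 61.29\%, \qquad \frac{16}{31} \approx 51.61\%.$$

Under this assumption, APSO-ANN forecasts the direction correctly on 19 of the 31 test days and ARIMA on 16, so the gap between the two models amounts to three days.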

4.2.2. Performance Comparison of Decomposition Ensemble Approaches

This subsection focuses on the forecasting performance comparison of three groups of decomposition ensemble learning approaches. Some variants of decomposition ensemble learning approaches with other decomposition methods (e.g., EMD method) and other ensemble approaches (e.g., simple addition (ADD)) are also employed as decomposition ensemble learning benchmarks to be compared with our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach. Therefore, we select three groups of decomposition ensemble learning approaches, i.e., [EMD-APSO-ANN-ADD, EMD-APSO-ANN-PSOANN, EMD-APSO-ANN-APSO-ANN], [EEMD-PSOANN-ADD, EEMD-PSOANN-PSOANN, EEMD-PSOANN-APSO-ANN], and [EEMD-APSO-ANN-ADD, EEMD-APSO-ANN-PSOANN, EEMD-APSO-ANN-APSO-ANN]. Table 3 provides the forecasting results of different decomposition ensemble learning approaches.
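For reference, the simple addition (ADD) ensemble used in some of the benchmarks just sums the component forecasts at each time step. The sketch below uses a hypothetical function name and made-up numbers; the learned ensembles (PSO-ANN and APSO-ANN) replace this fixed sum with a trained mapping from the component forecasts to the final forecast.

```python
def add_ensemble(component_forecasts):
    """Simple addition (ADD): sum the forecasts of all components per time step.

    component_forecasts is a list of equally long lists, one per IMF or residue.
    """
    return [sum(values) for values in zip(*component_forecasts)]

# Two IMF forecasts and one residue forecast over two time steps (illustrative).
final = add_ensemble([[1.0, 2.0], [0.5, -0.5], [10.0, 10.0]])
```

ADD is the natural baseline because an exact decomposition reconstructs the original series as the sum of its IMFs and residue.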

For these decomposition ensemble learning approaches, we first compare the forecasting performance obtained with different decomposition methods. The EEMD-based approaches clearly obtain better forecasting accuracy than the corresponding EMD-based approaches; that is, EEMD is more effective than EMD in decomposing PM2.5 concentration data. Second, the forecasting performance of the approaches that use APSO-ANN as the ensemble method is mostly better than that of the ADD-based approaches in terms of the MSE, MAPE, and $D_{\text{stat}}$ (directional accuracy) criteria, which indicates that APSO-ANN is a powerful ensemble learning method. Third, comparing the single forecasting models, the forecasting accuracy of APSO-ANN is better than that of PSO-ANN and GA-ANN.

In general, through the analysis above, we can obtain some interesting findings as follows: (1) Decomposition ensemble learning approaches are significantly better than other single models, such as ARIMA, ANN, GA-ANN, PSO-ANN, and APSO-ANN. The main reason is that the strategy of “divide and conquer” can effectively improve the performance of PM2.5 concentrations forecasting. (2) It is clearly seen that the EEMD method performs much better than the counterpart method with EMD in terms of both level forecasting accuracy and directional forecasting accuracy. (3) After decomposition, the second and third steps of decomposition ensemble learning approach are individual forecasting and ensemble forecasting by means of APSO-ANN and PSO-ANN with optimal weights and threshold values; empirical results show that these decomposition ensemble learning approaches are better than the other AI benchmark models. (4) Our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach is superior to all the other benchmark models in terms of both level forecasting accuracy and directional forecasting accuracy. Therefore, our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach can be used as an effective forecasting framework for forecasting PM2.5 concentrations.

Additionally, we have set the input length of the ANN to lag orders 1, 2, 3, 4, and 5 in turn; with lag order 5, our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach attains the highest forecasting accuracy. Figure 4 shows the best forecasting results of PM2.5 concentrations at Lanzhou city in China from October 1, 2019, to October 31, 2019.

5. Conclusions

A rise in PM2.5 concentration leads to serious health, climate, and environmental problems, including respiratory and cardiovascular diseases. It is therefore important and urgent to establish an early warning system based on accurate PM2.5 concentration forecasting. To address this difficult issue, and based on the principle of “divide and conquer”, this study proposes a new decomposition ensemble learning approach that integrates ensemble empirical mode decomposition (EEMD), artificial neural networks (ANNs), and adaptive particle swarm optimization (APSO) to improve the performance of PM2.5 concentration forecasting. The PM2.5 concentration data used in this study cover the period from January 1, 2017, to October 31, 2019, at Lanzhou city in China. The approach takes advantage of multiple methods, such as the effective self-adaptive data decomposition of EEMD and the end-to-end parameter optimization of APSO. To verify its performance, three groups of decomposition ensemble learning approaches were chosen as benchmarks for comparison with our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach. Empirical results show that the proposed approach significantly improves the forecasting performance and outperforms the other benchmarks in terms of both level forecasting accuracy and directional forecasting accuracy. This indicates that a decomposition ensemble learning approach with effective decomposition, as well as nonlinear single and ensemble forecasting, is a very promising framework for other complex time series forecasting problems, especially for data characterized by high volatility and irregularity.

Additionally, our proposed EEMD-APSO-ANN-APSO-ANN decomposition ensemble learning approach can be applied to other domains such as financial forecasting and energy forecasting. This study mainly considers univariate time series forecasting; other factors affecting PM2.5 concentrations were not taken into consideration. If those factors were incorporated into the approach, the forecasting performance might improve further. These limitations will hopefully be addressed in future research.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported in part by the National Natural Science Foundation of China under Grant no. 71904153 and the project funded by China Postdoctoral Science Foundation (No. 2018M53598).