Abstract

This study proposes an improved metabolism grey model [IMGM] to predict small samples with a singular datum, which is a common phenomenon in daily economic data. This new model combines the fitting advantage of the conventional GM in small samples and the additional advantages of the MGM in new real-time data, while overcoming the limitations of both the conventional GM and MGM when the predicted results are vulnerable to any singular datum. Thus, this model can be classified as an improved grey prediction model. Its improvements are illustrated through a case study of sulfur dioxide emissions in China from 2007 to 2013 with a singular datum in 2011. Some features of this model are presented based on the error analysis in the case study. Results suggest that if action is not taken immediately, sulfur dioxide emissions in 2016 will surpass the standard level required by the Twelfth Five-Year Plan proposed by the China State Council.

1. Introduction

In response to technological advances and the progress of human society, many scholars have proposed and investigated various theories and methods that analyze uncertain information from different angles and perspectives. The grey system theory, proposed by Chinese scholar Julong Deng in 1982, is a novel uncertainty theory that has received increasing attention and has recently become a popular subject of research. The grey system is an uncertainty system in which information is only partially known; it is also suitable for modeling small samples and poor information.

The grey prediction models, especially the GM, may be superior to other prediction techniques in the context of small samples and poor information. Thus, grey prediction models have received much attention since their introduction and successful application in the small sample prediction fields. They have been applied in many fields. Wang [1] predicted the stock price using a fuzzy grey prediction system. Sun et al. [2] proposed a small sample prediction technology for early warning of financial crises and recognition of crisis patterns of large-scale enterprises based on the GM. Vishnu and Syamala [3] used the basic GM to predict dynamic changes in water supply systems. Chang and Lu [4] used the GM predictor to forecast the flicker severity. Li et al. [5] used the adaptive GM to forecast short-term electricity consumption. Tang and Yin [6] forecasted the education expenditure and school enrollment by the GM. Yin and Tang [7] fitted and forecasted China’s labor formation based on the prediction GM. Li et al. [8] applied the GM and GRA to evaluate the financial burden of patients at hospitals in China by PPP model. Wu et al. [9] applied an improved GM to analyze the trends of the plum virus incidence in China. Tabaszewskin and Cempel [10] developed a grey prediction method for the fan vibration in a cogeneration plant (CHP) by combining the GM and the optimal estimated approaches. Pao et al. [11] forecasted CO2 emissions, energy consumption, and economic growth in China using a prediction entirely improved GM. Recently, Huang et al. [12] constructed a hybrid nonlinear grey prediction model to analyze the carbon reduction at dual levels in China. Wang et al. [13] provided an approach to forecast the coal mine water inflow based on the GM prediction theory. Ma and Fu [14] studied the accidents due to human factor in coal mine based on the GM. 
These applications of the grey prediction models show the feasibility and superiority of the grey system theory in dealing with small samples, which is also a reason why we choose this method to predict sulfur dioxide emissions.

To address the limitations of the basic grey prediction models, many scholars have proposed optimized GMs. Xie and Liu [15, 16] proposed and studied a discrete GM and verified its unbiasedness to fit an exponential sequence. Yao and Liu [17] and Wang et al. [18], respectively, developed an improved and optimal discrete GM based on Xie’s discrete GM. Shih et al. [19] improved the GM by changing the background and initial values. Li et al. [20, 21] provided an overall definition of the grey prediction model and proposed some basic GM. Truong and Ahn [22] proposed the SAGM to overcome the instability and improve the prediction accuracy of the GM. Cui et al. [23] provided the NGM based on the GM, which belongs to an entirely optimized GM. Dai and Huang [24] developed the GM by designing a calculation program and selecting the initial value. Zhou and He [25] proposed the GGM, which is a generalized GM that includes the DGM, SAGM, and NGM. Lin and Lian [26] designed a grey prediction self-organizing model, which is an extended GM. Xiong et al. [27] proposed the nonequidistant GM based on the optimization initial condition. The main objectives of the above methods are to improve the accuracy of grey prediction models and, thereby, perfect them. However, these methods have a common data background referred to as small samples, which demonstrate an increasing or decreasing trend. Such a sample sequence does not include a small sample sequence with several singular data or a nonmonotone sequence, which generally constitute a practical phenomenon in the periodicity of economic and environmental uncertainty. For this reason, this study investigated the applicability of a new grey prediction model in a small sample sequence with a singular datum, which is a key point in the prediction of sulfur dioxide emissions in China.

Due to the recently increasing interest in environmental protection, many studies have focused on predicting energy consumption and pollution emissions. However, the prediction results of such studies remain relatively inaccurate due to changing environmental indicators and nonuniform standards. To solve this problem, some scholars have successfully applied the grey system theory and its prediction models to predict the environmental indicators with small samples. For example, Wang et al. [28] applied the GM to analyze and predict the atmospheric ozone levels in Asia. Lee and Tong [29] proposed the GPGM to forecast the energy consumption in China and achieved significant fitting results. Lin et al. [30] used the GM to predict the carbon dioxide emissions in China. Pan et al. [31] and Dai et al. [32] applied the GM to predict the degrees of air pollution in Tianjin and Shenzhen, respectively. These studies indicate that the grey prediction model is suitable and useful in forecasting the energy consumption and pollution emission in China. However, the grey prediction model has yet to be investigated with small samples containing singular data, which have become increasingly common in Chinese economic data to reflect the decline in the economic growth.

Based on the literature review above and practical trends in Chinese pollution indicators, this study proposes a new grey model to predict small samples with singular data and applies it to predict recent annual sulfur dioxide emissions in China. This paper is organized as follows: Section 2 describes the basic GM and MGM. Section 3 analyzes the limitations of the GM and MGM and then proposes the improved MGM called IMGM, based on which it provides the corresponding modeling steps. In Section 4, a practical example is provided to demonstrate the effectiveness and the practicability of the new grey prediction model. The paper ends in Section 5 with conclusions.

2. The GM(1, 1) and MGM(1, 1)

2.1. The Modeling Mechanism of the GM(1, 1)

Grey system theory, the name of which was derived from cybernetics and a clear degree of information, was proposed by Chinese scholar Julong Deng in 1982 [33]. In economics management processes and scientific research, incomplete information can arise in a general situation belonging to a grey system, as defined in grey system theory. As such, a grey prediction model has a comprehensive scope of applications in our society. However, a prerequisite for the application of grey prediction models, such as the GM, is that the original data must be monotonically increasing or decreasing. If this prerequisite is not met, the prediction results are likely to seriously deviate from the reality. This prerequisite and its influence are investigated through the following analysis of the modeling mechanism of the GM.

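Let a nonnegative small sample sequence $X^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ be the original data sequence, and $x^{(0)}(k) \geq 0$, $k = 1, 2, \ldots, n$. As a small sample sequence, $n$ is generally small. Deng [33] proposed the GM and its modeling steps to fit and predict based on $X^{(0)}$ as follows.

Step 1. Use the one-time accumulated generating operation (1-AGO) to obtain $X^{(1)}$, in which $X^{(1)} = (x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n))$ and

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n. \tag{1}$$

Step 2. Calculate the background value $Z^{(1)}$, in which $Z^{(1)} = (z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n))$, $k = 2, 3, \ldots, n$, and

$$z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right). \tag{2}$$

Step 3. Construct the original form of GM, and then estimate the developing coefficient $a$ and the grey action quantity $b$ based on the ordinary least squares method presented as follows:

$$x^{(0)}(k) + a z^{(1)}(k) = b, \tag{3}$$

$$\hat{a} = (B^{T}B)^{-1}B^{T}Y, \tag{4}$$

where $\hat{a} = (a, b)^{T}$, $Y = (x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n))^{T}$, and

$$B = \begin{pmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{pmatrix}.$$

Step 4. Calculate the time response sequence $\hat{x}^{(1)}(k+1)$ through the following formula:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a}, \quad k = 0, 1, 2, \ldots. \tag{5}$$

Step 5. The time response sequence $\hat{x}^{(1)}(k+1)$ is transformed into the restored time response sequence $\hat{x}^{(0)}(k+1)$ through the following formula:

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak}, \quad k = 1, 2, \ldots. \tag{6}$$

The abovementioned steps provide the modeling process of the GM, which also is its modeling mechanism. Here, for $k \geq n$, $\hat{x}^{(1)}(k+1)$ is the cumulative prediction value and $\hat{x}^{(0)}(k+1)$ is the restored prediction value; for $k < n$, $\hat{x}^{(1)}(k+1)$ is the cumulative fitting value and $\hat{x}^{(0)}(k+1)$ is the restored fitting value.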
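The five modeling steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the standard GM formulas (1-AGO, mean background values, ordinary least squares for the two parameters, an exponential time response, and restoration by differencing), and the function name `gm11` and its `horizon` parameter are hypothetical.

```python
import math


def gm11(x0, horizon=0):
    """Fit the GM to x0 and return (a, b, restored values).

    The returned list holds len(x0) fitted values followed by
    `horizon` predicted values.
    """
    n = len(x0)
    if n < 3:
        raise ValueError("need at least 3 points to estimate a and b")

    # Step 1: 1-AGO, x1[k] is the running sum of x0.
    x1 = []
    total = 0.0
    for value in x0:
        total += value
        x1.append(total)

    # Step 2: background values, the mean of neighbouring 1-AGO terms.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(x0[1:])

    # Step 3: ordinary least squares for x0(k) = -a * z(k) + b,
    # solved through the 2x2 normal equations.
    m = n - 1
    sz = sum(z)
    sy = sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    a = -slope
    b = (sy - slope * sz) / m
    if abs(a) < 1e-12:
        raise ValueError("a is numerically zero; the exponential form degenerates")

    # Step 4: time response of the accumulated sequence (k = 0, 1, ...).
    def x1_hat(k):
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Step 5: restore by differencing; the first value is kept as x0(1).
    restored = [float(x0[0])]
    for k in range(1, n + horizon):
        restored.append(x1_hat(k) - x1_hat(k - 1))
    return a, b, restored
```

Calling `gm11(x0, horizon=3)` on five observations returns five fitted values followed by three predictions. The closed-form normal equations are used here only to keep the sketch free of external libraries; a matrix least-squares routine would serve equally well.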

With respect to the above process, the GM was found to have two advantages and two shortcomings. The advantages are as follows: (1) The model does not require many samples and is simpler and more suitable than general econometric models. (2) It can be used for recent or short-term prediction activities. The shortcomings are as follows: (1) The model is a monotonic exponential prediction approach, which is its modeling prerequisite. (2) It models and predicts from the whole original data sequence at once, so it ignores new information and cannot accurately reflect the characteristics of the current situation.

Liu et al. [34] proposed the metabolism grey model, namely, MGM, by updating the modeling data and introducing new information to overcome the second shortcoming, which is a significant improvement as explained in Section 2.2. In Section 3, we propose and analyze another new GM based on the above MGM to overcome the first shortcoming.

2.2. Construction and Calculation of the MGM(1, 1)

As previously stated, the general grey prediction models, such as the GM, provide forecasting based on the original data without considering the impact of new information or dynamic predictions when information is subsequently supplied and updated. As such, Liu et al. [34] proposed the MGM. As an improved grey prediction model, the MGM can add new information and remove old information in what is also a metabolic process. The corresponding modeling steps of the MGM are presented as follows.

Step 1. Construct the GM for a small sample sequence $X^{(0)}$, also called the original data sequence, and then calculate the time response sequence $\hat{x}^{(1)}(k+1)$ and the restored time response sequence $\hat{x}^{(0)}(k+1)$ based on (5) and (6), where the original data sequence is $X^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$.

Step 2. Eliminate and metabolize the fitting data. In particular, the new prediction datum $\hat{x}^{(0)}(n+1)$ is used instead of the first datum $x^{(0)}(1)$. Thus, we can obtain a metabolic small sample sequence $(x^{(0)}(2), \ldots, x^{(0)}(n), \hat{x}^{(0)}(n+1))$, called the metabolic data sequence, which is similar to the original data sequence and includes a new datum.

Step 3. Use the MGM to calculate and predict, following the same modeling process of the GM. Then, a new prediction datum can be obtained.
If more prediction data is needed, then the above process is repeated to obtain the data. This calculation process demonstrates the modeling steps of the MGM.
Unlike the GM, the MGM uses the prediction data as the original data to predict other data. Therefore, there could be a propagation phenomenon of the prediction error from the MGM, which could further affect the metabolic modeling process. Thus, there are two error inspection methods (Step 4) to improve the MGM.

Step 4. The fitting errors are calculated based on the following equations.

(1) The residual error and its corresponding form are as follows:

(2) The relative error and its corresponding form are as follows:

According to the above calculation process of the MGM, this new grey prediction model is a small sample and updated information prediction model. Therefore, the MGM is more suitable than the GM in the prediction of changing small sample sequences.
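The metabolic loop and the two error checks can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names `gm11_next`, `mgm_predict`, `residuals`, and `relative_errors` are hypothetical, the one-step prediction uses the standard GM restored form, the residual is assumed to be the actual value minus the estimated value, and the relative error is assumed to be the absolute residual divided by the actual value, in percent.

```python
import math


def gm11_next(window):
    """One-step-ahead GM prediction for `window` (standard formulas)."""
    n = len(window)
    x1 = [sum(window[: k + 1]) for k in range(n)]          # 1-AGO
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]   # background values
    y = window[1:]
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)      # least squares
    a, b = -slope, (sy - slope * sz) / m
    # Restored value for position n + 1: (1 - e^a) * (x0(1) - b/a) * e^(-a * n).
    return (1.0 - math.exp(a)) * (window[0] - b / a) * math.exp(-a * n)


def mgm_predict(x0, steps):
    """Metabolic prediction: append each new prediction, drop the oldest datum."""
    window = [float(v) for v in x0]
    predictions = []
    for _ in range(steps):
        nxt = gm11_next(window)
        predictions.append(nxt)
        window = window[1:] + [nxt]   # the window length stays constant
    return predictions


def residuals(actual, estimated):
    """Assumed convention: actual value minus estimated value."""
    return [x - xh for x, xh in zip(actual, estimated)]


def relative_errors(actual, estimated):
    """Assumed convention: absolute residual over actual value, in percent."""
    return [abs(x - xh) / x * 100.0 for x, xh in zip(actual, estimated)]
```

Because every later window contains earlier predictions, any error in the first prediction is carried into the following ones, which is the propagation effect that the error inspection step is meant to expose.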

3. The IMGM(1, 1)

3.1. Shortage Analysis of the GM(1, 1) and MGM(1, 1)

Based on the modeling processes of the GM and MGM and the analysis of their disadvantages in the previous section, both grey prediction models take as a prerequisite that the original data sequence is a monotonic exponential sequence. Therefore, if the monotonic exponential character is changed or if singular data exist in the original data, then the prediction results can be affected, and a larger error could result, which can be proved by (9) and the following analysis:

$$\frac{\hat{x}^{(0)}(k+1)}{\hat{x}^{(0)}(k)} = e^{-a}, \quad k = 2, 3, \ldots. \tag{9}$$

Equation (9) exhibits the following conclusions:

(i) Given that the ratio between $\hat{x}^{(0)}(k+1)$ and $\hat{x}^{(0)}(k)$ in the GM is a constant and is related to the development coefficient $a$, a monotonic exponential trend exists in this grey prediction model.

(ii) This monotonic exponential trend also exists in the MGM, which leads to its prerequisite of the monotonic exponential sequence. In our opinion, this prerequisite is unsuitable.
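The constant ratio behind (9) can be checked in two lines. The derivation below is a sketch that assumes the standard restored form of the GM for (6), namely $\hat{x}^{(0)}(k+1) = (1 - e^{a})(x^{(0)}(1) - b/a)e^{-ak}$ for $k \geq 1$:

$$\frac{\hat{x}^{(0)}(k+1)}{\hat{x}^{(0)}(k)} = \frac{(1 - e^{a})\left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak}}{(1 - e^{a})\left(x^{(0)}(1) - \frac{b}{a}\right)e^{-a(k-1)}} = e^{-a}, \quad k \geq 2,$$

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(0)}(2)\,e^{-a(k-1)}, \quad k \geq 1.$$

Under this assumption, every restored value after the first lies on one geometric curve, increasing when $a < 0$ and decreasing when $a > 0$. A single datum that breaks the trend therefore cannot be reproduced by the fitted curve; it can only shift the estimates of $a$ and $b$, and with them all later fitted and predicted values.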

We provide a simple example below to further show the conclusions and limitations discussed above.

Example 1. Let a small sample sequence with a singular datum be

The distribution and trend of this sequence are shown in Figure 1. This sequence is similar to a monotonic exponential sequence, but it has a singular datum “5.”

Let the first five numbers be the original data sequence, that is, , and let the last three numbers be the future and unknown data sequence for comparing with the prediction results. It is found that this original data sequence almost meets two prerequisites of the GM and MGM, namely, the small sample and the increasing trend. Thus, the modeling processes of these two models can be used, and then the corresponding results can be obtained, which are shown as follows.

Step 1. Use the 1-AGO operator to obtain

Step 2. Calculate the background values as follows:

Step 3. Estimate the developing coefficient and the grey action quantity through the ordinary least squares method and the results shown as follows:

Step 4. Calculate the time response sequence shown as follows:

Step 5. Transform the time response sequence into the restored time response sequence as follows:

Thus, the modeling steps of the GM are completed. Let ; the fitting values of the five original numbers can be obtained. Let ; the three prediction values based on the five original numbers can be obtained. The results and the corresponding fitting and prediction errors are shown in Table 1.

We also provide the following fitting and prediction results using the MGM. In the calculation process of this model, the original data still follow . After two metabolism computations, the results and corresponding fitting and prediction errors are obtained shown in Table 2.

In Figure 2, the three sequences include the original data sequence, the fitting and prediction result sequence of the GM, and the fitting and prediction result sequence of the MGM.

Based on Tables 1 and 2 and Figure 2, the following conclusions can be derived.

(i) A clear monotonic exponential trend is observed in the results of the GM and MGM, which is consistent with the previous analysis.

(ii) The fourth data point is a singular datum in the original data, and it cannot be shown in the fitting results of the GM and MGM.

(iii) The GM and MGM can fit three data points before the singular datum.

(iv) Because of the second conclusion above, the fitting results after the singular datum and the prediction results are inaccurate. The later prediction results are worse than the initial results.

Therefore, the GM and MGM are unsuitable to fit and predict small sample sequences with singular data. Because of this, we propose the IMGM in Section 3.2.

3.2. IMGM(1, 1)

In this subsection, the IMGM is proposed by combining the GM and MGM, which can be used to effectively fit and predict the small sample sequence with a singular datum. This new grey prediction model combines the fitting advantages of the GM in the small sample sequences and the updating advantages of the MGM in new real-time data. It also overcomes their limitations when the predicted results are vulnerable to any singular datum. Thus, this IMGM belongs to an improved grey prediction model, and its modeling steps are summarized as follows.

Let a small sample sequence with a singular datum be where is the singular datum.

Step 1. Construct and apply the GM, and the corresponding calculation steps can be referenced in Section 2.1. Thus, we can obtain the time response sequence and the restored time response sequence , which showed in (5) and (6), respectively, where . Furthermore, the fitting value of the singular datum can be shown as follows:

Step 2. Estimate the singular datum, and then update it by substituting with its first fitting value , and a new original sequence is obtained as follows:

Step 3. Construct and apply the GM again based on the new original sequence . Then we can obtain the second-time time response sequence and the second-time restored time response sequence , which are different from and .

Step 4. Derive the metabolism process for the original sequence which can be referenced in the modeling process of the MGM, and then construct and apply the GM again based on two new sequences.

Step 5. Calculate the fitting errors based on the above error equations.

Thus, the modeling process of the IMGM is completed. Step 2 can effectively eliminate the singular datum and provide a monotonic exponential sequence to construct grey prediction models. Step 4 is a metabolic process similar to the MGM. Steps 1 and 3 are the modeling process of the GM. Step 5 is unique to the IMGM. The new grey prediction model combines the GM and MGM and eliminates the inappropriate influence of the singular datum.
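The first four steps can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the names `gm11_restored` and `imgm_predict` are hypothetical, the GM part uses the standard formulas, and the position of the singular datum is assumed to be supplied by the caller through the hypothetical `singular_index` argument (zero-based, not the first position), because the text does not state how it is detected. The error calculation of Step 5 is left out.

```python
import math


def gm11_restored(x0, horizon=0):
    """Restored GM values: len(x0) fits followed by `horizon` predictions."""
    n = len(x0)
    x1 = [sum(x0[: k + 1]) for k in range(n)]               # 1-AGO
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]    # background values
    y = x0[1:]
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)       # least squares
    a, b = -slope, (sy - slope * sz) / m
    scale = (1.0 - math.exp(a)) * (x0[0] - b / a)
    # The first restored value is the first observation by convention.
    return [float(x0[0])] + [scale * math.exp(-a * k) for k in range(1, n + horizon)]


def imgm_predict(x0, singular_index, steps):
    """Return (repaired sequence, list of `steps` metabolic predictions)."""
    x0 = [float(v) for v in x0]
    # Step 1: first GM fit on the raw data, singular datum included.
    first_fit = gm11_restored(x0)
    # Step 2: replace the singular datum by its first fitted value.
    repaired = list(x0)
    repaired[singular_index] = first_fit[singular_index]
    # Steps 3 and 4: refit on the repaired data, then roll the window
    # forward by dropping the oldest datum and appending each prediction.
    window, predictions = list(repaired), []
    for _ in range(steps):
        nxt = gm11_restored(window, horizon=1)[-1]
        predictions.append(nxt)
        window = window[1:] + [nxt]
    return repaired, predictions
```

For a made-up decreasing sequence whose last value ticks upward, such as `[100, 95, 90, 86, 88]` with `singular_index=4`, the repaired sequence is decreasing again before the metabolic predictions are made.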

Therefore, we think that the IMGM is a new and improved grey prediction model that can be used to fit and predict the small sample sequence with a singular datum. This new model deals with a singular datum only, which is a limitation. Although likely to be a challenging undertaking, future studies should address more general situations than a small sample sequence with a singular datum.

4. Empirical Application and Analysis

4.1. Background

Sulfur dioxide is an environmental pollutant that is a major component of acid rain. It originates from the combustion of sulfur-containing fuels, metal smelting, and petroleum refining, which produce sulfuric acid and silicate products. Given the rapid development of its economy, China has seen its highest concentrations of sulfur dioxide emissions in recent years. Three years ago, China officially promulgated several policies to measure and control sulfur dioxide pollution. As such, we believe that the prediction of sulfur dioxide emissions is important to effectively measure and control such emissions. Although many prediction techniques can be used, these approaches are relatively inaccurate because of changing environmental indicators and nonuniform standards. Thus, small sample prediction approaches are more suitable than other prediction models. In this section, three grey models are applied to predict sulfur dioxide emissions based on the data from 2007 to 2013. Table 3 lists the data from the China Statistical Yearbook (2013).

Table 3 shows that Chinese sulfur dioxide emissions from 2007 to 2013 exhibit an approximately decreasing trend, which is consistent with the modeling prerequisite of the grey prediction models. The GM and MGM can be used to predict and analyze this scenario. The singular datum in 2011 is also found in this sequence. Thus, the GM and MGM could produce larger errors based on the aforementioned analysis. The IMGM is then applied to generate fitting and prediction results based on this sequence. Lastly, a detailed comparison of the three grey prediction models is provided.

4.2. Prediction Analysis of the Sulfur Dioxide Emissions in China
4.2.1. Fitting and Prediction Based on the GM(1, 1)

The first five numbers are considered the original data, that is, 2007-to-2011 data, and the data from 2012 to 2013 are considered the unknown data. This demarcation is the reason why the real fitting and prediction errors can be calculated and obtained and why the more suitable models can be selected for further predictions. Based on the above set, we can obtain the original data sequence; namely, . The following calculation steps and results are first obtained by using the GM.

Step 1. Use the 1-AGO operator, and then we have

Step 2. Calculate the background values, and then we can obtain

Step 3. Estimate the developing coefficient and the grey action quantity through the least squares method, and then we have

Step 4. Calculate the time response sequence , and then

Step 5. Transform into the restored time response sequence , and then we can obtain

Based on the time response sequence and the restored time response sequence , the specific calculation results are obtained using the GM, which are shown in Table 4.

4.2.2. Fitting and Prediction Based on the MGM(1, 1)

Based on the above results and modeling steps of the MGM, the MGM is applied to fit the data of the sulfur dioxide emissions in China from 2007 to 2011 and then to predict the data from 2012 to 2013. Two metabolism calculation processes are used, and the specific results are shown in Table 5. Here, we omit the calculation steps and process.

4.2.3. Fitting and Prediction Based on the IMGM(1, 1)

Table 3 and the original data sequence show that the data from 2007 to 2010 follow a decreasing trend, whereas the 2010-to-2011 data follow an increasing trend. Therefore, the 2011 data is a singular datum. According to the aforementioned analysis, the GM and MGM produce larger errors than the IMGM. Therefore, the IMGM is applied for further calculations and predictions in this subsection.

Step 1. Construct and apply the GM, and obtain the time response sequence and the restored time response sequence as follows:

Thus, we have , which can be taken as the fitting value of the 2011 data.

Step 2. Eliminate the singular datum and replace it by its first fitting data . Then, the following new original data sequence is obtained:

The 1-AGO values of this sequence can be presented as follows:

Here, the new original data sequence exhibits a decreasing trend, which is different from the original data sequence. Thus, this sequence is more suitable to be used in the grey prediction models.

Step 3. Construct and apply the GM again based on this new original sequence. The second-time time response sequence and the second-time restored time response sequence are obtained as follows:

Step 4. Perform the metabolism process by eliminating the fitting data and metabolizing the prediction data , and then construct a metabolic data sequence , based on which construct GM to predict the new data in 2013. The calculation results are shown in Table 6.

Step 5. Compare the fitting and prediction errors, and then check the results which are shown and analyzed in Section 4.2.4.

4.2.4. Comparisons of the Three Grey Prediction Models

The original data and the fitting and prediction results previously provided through three grey prediction models are displayed in Figure 3 together.

Figure 3 exhibits the following conclusions:

(i) If a singular datum exists in the original data sequence, the GM and MGM cannot effectively fit and predict the data.

(ii) If the singular datum does not suitably fit, other fitting data after the singular datum will be inaccurate, their corresponding errors could be amplified, and the prediction results are deflected.

(iii) The IMGM can improve the fitting and prediction capability by handling singular data, which would allow for its classification as an improved grey prediction model.

The residuals and relative errors of the three grey prediction models are compared in Figures 4 and 5.

If the residuals and relative errors are near zero, then the prediction model has high forecast accuracy. Based on this principle and Figures 4 and 5, the following conclusions are derived:

(i) The singular datum exerts an inappropriate influence on the GM and MGM and causes large residuals and relative errors in the prediction results.

(ii) The IMGM can effectively transform the singular datum to a trend datum, thereby meeting the modeling prerequisite of the grey prediction model.

(iii) The IMGM has good fitting and prediction results because it has smaller residuals and relative errors in the prediction process than the GM and MGM.

By distinguishing the fitting and prediction results, we divide the residuals and relative errors into four parts: fitting average residuals, prediction average residuals, fitting average relative errors, and prediction relative errors; their calculated results for this case study are shown in Table 7.
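A four-part summary of this kind can be computed with a small helper. The sketch below is only an illustration: the function name `split_error_summary` and its `n_fit` parameter are hypothetical, and it assumes that the averages are taken over absolute residuals and over relative errors expressed in percent of the actual value, a convention the text does not spell out.

```python
def split_error_summary(actual, estimated, n_fit):
    """Average absolute residual and relative error (%) for the fitted
    part (first n_fit points) and the predicted part (remaining points)."""
    if not 0 < n_fit < len(actual) or len(actual) != len(estimated):
        raise ValueError("n_fit must split two equally long sequences")

    def mean(values):
        return sum(values) / len(values)

    abs_res = [abs(x - xh) for x, xh in zip(actual, estimated)]
    rel = [r / x * 100.0 for r, x in zip(abs_res, actual)]
    return {
        "fit_avg_residual": mean(abs_res[:n_fit]),
        "pred_avg_residual": mean(abs_res[n_fit:]),
        "fit_avg_relative_error": mean(rel[:n_fit]),
        "pred_avg_relative_error": mean(rel[n_fit:]),
    }
```

With five fitted years and two predicted years, `n_fit` would be 5, so the first five pairs feed the fitting averages and the last two feed the prediction averages.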

Table 7 demonstrates the improvements and optimization of the IMGM and shows that the IMGM has smaller fitting average residuals, prediction average residuals, fitting average relative errors, and prediction relative errors than the other models. Therefore, the IMGM is a better grey prediction model than the GM and MGM in fitting and predicting small sample sequences with a singular datum. As such, it would be reasonable to select the IMGM to predict future data on sulfur dioxide emissions in China.

4.3. Further Prediction Based on the IMGM(1, 1)

Based on the results from Section 4.2.3, the prediction results of the sulfur dioxide emissions in China from 2014 to 2018 are shown in Table 8.

Table 8 shows five prediction results for sulfur dioxide emissions in China. These prediction results show no major changes in the sulfur dioxide emission in China over the next five years if the background emissions management and Chinese economic growth remain the same. These results also predicted that the amount of sulfur dioxide emissions in China would be tons in 2015 and will be tons in 2018, which in turn means that the sulfur dioxide emissions goal proposed in the Twelfth Five-Year Plan by the China State Council will be very difficult to achieve.

Recently, the China Statistical Yearbook (2014) was published by the National Bureau of Statistics of China. Thus, we also incorporate the 2014 data on the sulfur dioxide emissions of China and analyze them as follows.

Actual data on 2014 sulfur dioxide emission in China was tons; the data can be found in the China Statistical Yearbook (2014) or at http://data.stats.gov.cn/easyquery.htm?cn=C01. The prediction result based on the proposed IMGM is tons (see Table 8 in this paper). Therefore, the residual error is $39 \times 10^{4}$ tons and the relative error is just 1.98%. This is a good result that illustrates the effectiveness of the proposed model.

5. Conclusions

This study proposed an improved grey prediction model, the IMGM, by combining the fitting advantage of the GM in small sample sequences and the additional advantages of the MGM in new real-time data. Given this combination, the IMGM is able to overcome the limitations of both previous models when the predicted results are vulnerable to any singular datum. The IMGM is suitable in modeling a small sample sequence with a singular datum and in predicting recent data. It can be an effective tool to predict economic and social data when the data are small and include a singular datum.

A practical case study on the sulfur dioxide emissions in China from 2007 to 2013, as well as the development trend, was used to illustrate the improvements and advantages of the proposed model. The results from applying the three grey prediction models lead to the following conclusions.

(1) A singular datum can seriously degrade the predictions of the GM and MGM because it violates their modeling prerequisite.

(2) The IMGM can transform a singular datum into a trend datum and thereby meet the modeling prerequisite of grey prediction models. It can also obtain better fitting and prediction results than other grey prediction models.

(3) The level of sulfur dioxide emissions in China will not decrease to the requisite level proposed in the Twelfth Five-Year Plan by the China State Council in 2015 if effective action is not taken immediately.

The IMGM considers only a small sample sequence with a singular datum and does not include more singular data, which is its limitation. Although likely to be a challenging undertaking, further studies should be conducted to address this limitation.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the Natural Science Foundation of China (nos. 71301141 and 71561026), Humanity and Social Science Youth Foundation of Ministry of Education of China (no. 13YJC630247), Science Foundation and Major Project of Educational Committee of Yunnan Province (no. 2014Z100), Applied Basic Research Programs of Science and Technology Commission of Yunnan Province (no. 2013FD029), Social Science Fund of Yunnan Province (no. YB2015087), and China Postdoctoral Science Foundation (no. 2015M570792).