Abstract

The nonlinear grey Bernoulli model, abbreviated as NGBM(1,1), has been successfully applied in control, prediction, and decision-making, especially for predicting nonlinear small-sample time series. However, there is still room to improve the prediction accuracy of NGBM(1,1). In this paper, we propose a novel optimized nonlinear grey Bernoulli model for forecasting China’s GDP. In the new model, the structure and parameters of NGBM(1,1) are optimized simultaneously. Specifically, the latest item of the first-order accumulative generating operator (1-AGO) sequence is taken as the initial condition; the background value is then reconstructed by optimizing the weights of neighboring values in the 1-AGO sequence, which is done by minimizing the sum of absolute percentage errors; finally, the new model is established based on the rolling mechanism. The prediction accuracy of the proposed model is investigated through simulations and a real application, and the proposed model is applied to forecast the annual GDP of China from 2019 to 2023.

1. Introduction

Macroeconomic monitoring, forecasting, and early warning are important branches of economic research and an essential prerequisite for scientific macroeconomic decision-making. Among the many macroeconomic indicators, GDP is the most prominent. As an important indicator of the size of a country’s economy and market, it is the total value of all final products and services produced in a country or region over a certain period of time. Moreover, the growth rate of GDP also reflects, to a certain extent, the growth of the country’s economy and national strength. Thus, accurately forecasting GDP has been a hot topic in recent years.

China’s annual GDP data should be regarded as a dynamic, nonlinear, and uncertain time series. In previous studies, many methods have been used for time series prediction. In general, these models can be divided into three categories: statistical models, intelligent models, and grey system theory. Statistical models include linear regression [1], semiparametric and partially varying linear regression [2], nonlinear regression [3], and the autoregressive moving average model (ARMA) [4]. The disadvantage of statistical models is that they rely heavily on data collection and parameter estimation. Intelligent models, such as the artificial neural network [5] and the support vector machine [6], share the drawback that they require large amounts of data for training. Grey system theory, first proposed by Deng [7], was developed to solve problems in uncertain and unknown systems; the grey prediction model based on grey system theory is often denoted GM. Over three decades of development, the grey model has been widely studied. For example, to address the disadvantages of the traditional self-adaptive intelligence grey predictive model (SAIGM), Zeng and Liu [8] proposed a novel SAIGM model with a fractional-order accumulating operator, called SAIGM-FO. In 2018, Zeng and Li [9] proposed and optimized a new unbiased grey prediction model (UGM(1,1)) based on extremely limited data on the output of shale gas. Some cutting-edge research can be found in the studies by Xie et al. [10, 11], Ma et al. [12–14], Zeng et al. [9, 15, 16], and other scholars. New research results on grey prediction models continue to appear in the literature.

However, most of these models are single-variable prediction models that are not suitable for time sequences with nonlinear characteristics, as can be seen from their simple mathematical calculation [17] and first-order linear AGO; Xie and Liu [18] also found that the traditional grey model obtains satisfactory results on pure exponential sequences. To this end, Liu et al. [19] introduced a power exponent through the Bernoulli differential equation, but they only discussed the special case in which the power exponent equals 2, so this model is often called the grey Verhulst model. To further expand the applicability of the GBM(1,1) model, Chen [20] proposed a novel optimized GBM(1,1) model termed NGBM(1,1); in his research, the power index can be adapted to the real time sequence and is determined by a computer program. In 2011, Wang et al. [21] proposed an optimized NGBM(1,1) for forecasting the qualified discharge rate of industrial wastewater in China, improving prediction precision by optimizing the weighting parameter of the background value. In addition, Pei et al. [22] proposed a new method combining nonlinear least squares (NLS) with the nonlinear grey Bernoulli model. Wang [23] built a weighted nonlinear grey Bernoulli model (WNNGBM for short) for forecasting nonlinear economic time series with small datasets. To predict the quarterly sales volume of the new energy vehicle industry in China, Pei and Li [24] put forward a data-grouping approach based on the NGBM(1,1) model. These models mainly addressed the optimization of NGBM(1,1) model parameters. Meanwhile, the Nash NGBM(1,1) model, proposed by Chen et al. [25], optimized the background value and the power index simultaneously.
Since then, some scholars have tried to optimize the NGBM(1,1) model through the grey differential equation. For example, Liu and Xie [26] introduced the Weibull cumulative distribution function into the NGBM(1,1) model; the proposed model is abbreviated as WBGM(1,1), and they showed that it combines the advantages of NGBM(1,1) and the Weibull cumulative distribution. To build a more general nonlinear grey forecasting model, Ma et al. [27] proposed a novel multivariate nonlinear grey Bernoulli model (NGBMC(1,n)), which can be considered a combination of NGBM(1,1) and GMC(1,n) with different power parameters. More recent research results can be found in [28–31].

Although these models have many obvious advantages, in practice they can still produce unacceptable errors. To capture the nonlinear trend in China’s annual GDP data and obtain appropriate prediction accuracy, this paper proposes a novel optimized nonlinear grey Bernoulli model. The main contributions can be summarized as follows:
(1) The structure and parameters of the grey prediction model are optimized simultaneously. The latest item of the 1-AGO sequence is taken as the initial condition, and the weight of the neighboring values in the background value is determined by minimizing the sum of absolute percentage errors.
(2) The grey model is established based on the rolling mechanism, according to the principle of priority of new information.
(3) The validity of the proposed model is verified by numerical examples, and the model is applied to forecast China’s annual GDP.

The rest of the paper is organized as follows. In Section 2, we give a brief description of the NGBM(1,1) model. Section 3 introduces the novel optimized NGBM(1,1) model in detail. In Section 4, some simulations are conducted to verify the effectiveness of the proposed model. Section 5 compares the proposed model with others and applies it to predict China’s GDP in the coming years. Section 6 gives the main conclusions and future work. The R program for simulation and prediction is listed in Algorithm 1.

# R program for simulation and prediction with the original NGBM(1,1) model
raw_data <- c(53.8580, 59.2963, 64.1280, 68.5992, 74.0060, 85.0754, 90.0309)
x <- raw_data[1:5]  # training data
y <- x[-1]          # vector Y
m <- length(x)
ago <- cumsum(x)    # 1-AGO sequence
B <- matrix(nrow = m - 1, ncol = 2)
n <- seq(-1, 0.99, 0.01)  # search range of the power index n
# compute the error for every candidate n
err <- NULL
for (i in 1:length(n)) {
  for (j in 1:dim(B)[1]) {
    B[j, 1] <- -0.5 * (ago[j] + ago[j + 1])
    B[j, 2] <- (0.5 * (ago[j] + ago[j + 1]))^n[i]
  }  # construct matrix B
  a <- (solve(t(B) %*% B) %*% t(B) %*% y)[1]
  b <- (solve(t(B) %*% B) %*% t(B) %*% y)[2]
  x_simu <- NULL
  for (k in 1:m) {
    x_simu[k] <- (b/a + (x[1]^(1 - n[i]) - b/a) * exp(-(1 - n[i]) * a * (k - 1)))^(1/(1 - n[i]))
  }
  # compute the sum of absolute percentage errors
  err[i] <- sum(abs(diff(x_simu) - x[-1]) / x[-1])
}
n <- n[which.min(err)]  # optimal power index n
for (i in 1:dim(B)[1]) {
  # index with i here (the extracted listing reused the stale loop variable j)
  B[i, 1] <- -0.5 * (ago[i] + ago[i + 1])
  B[i, 2] <- (0.5 * (ago[i] + ago[i + 1]))^n
}
a <- (solve(t(B) %*% B) %*% t(B) %*% y)[1]  # parameter a
b <- (solve(t(B) %*% B) %*% t(B) %*% y)[2]  # parameter b
for (k in 1:(m + 2)) {
  x_simu[k] <- (b/a + (x[1]^(1 - n) - b/a) * exp(-(1 - n) * a * (k - 1)))^(1/(1 - n))
}
x_pre <- diff(x_simu)  # predicted values
# compute errors
RPE <- (x_pre - raw_data[-1]) / raw_data[-1]
MAPE <- sum(abs(x_pre - raw_data[-1]) / raw_data[-1]) / (length(raw_data) - 1)
# R program for the optimized NGBM(1,1) model
raw_data <- c(53.8580, 59.2963, 64.1280, 68.5992, 74.0060, 85.0754, 90.0309)
x <- raw_data[1:5]    # training data, first rolling window
# x <- raw_data[2:6]  # training data, second rolling window
y <- x[-1]            # vector Y
m <- length(x)
ago <- cumsum(x)      # 1-AGO sequence
B <- matrix(nrow = m - 1, ncol = 2)
n <- seq(-1, 0.99, 0.01)  # search range of the power index n
alpha <- seq(0, 1, 0.01)  # search range of the weight alpha
# compute the error for every pair (n, alpha)
matri_err <- list()
for (i in 1:length(n)) {
  err <- rep(NA_real_, length(alpha))
  for (j in 1:length(alpha)) {
    for (k in 1:dim(B)[1]) {
      B[k, 1] <- -alpha[j] * (ago[k] + ago[k + 1])
      B[k, 2] <- (alpha[j] * (ago[k] + ago[k + 1]))^n[i]
    }  # construct matrix B
    # a singular or non-finite B (e.g., alpha = 0) leaves err[j] as NA
    ab <- tryCatch(solve(t(B) %*% B) %*% t(B) %*% y, error = function(e) NULL)
    if (is.null(ab)) next
    a <- ab[1]
    b <- ab[2]
    x_simu <- NULL
    for (l in 1:m) {
      # initial condition: latest 1-AGO item ago[m]
      x_simu[l] <- (b/a + (ago[m]^(1 - n[i]) - b/a) * exp(-(1 - n[i]) * a * (l - m)))^(1/(1 - n[i]))
    }
    # compute the sum of absolute percentage errors
    err[j] <- sum(abs(diff(x_simu) - x[-1]) / x[-1])
  }
  matri_err[[i]] <- err
}
matri <- do.call(rbind, matri_err)  # rows index n, columns index alpha
dimo <- arrayInd(which.min(matri), dim(matri))  # position of the smallest error
n <- n[dimo[1]]          # optimal power index n
alpha <- alpha[dimo[2]]  # optimal weight alpha
for (k in 1:dim(B)[1]) {
  B[k, 1] <- -alpha * (ago[k] + ago[k + 1])
  B[k, 2] <- (alpha * (ago[k] + ago[k + 1]))^n
}  # construct matrix B
a <- (solve(t(B) %*% B) %*% t(B) %*% y)[1]  # parameter a
b <- (solve(t(B) %*% B) %*% t(B) %*% y)[2]  # parameter b
for (l in 1:(m + 2)) {
  x_simu[l] <- (b/a + (ago[m]^(1 - n) - b/a) * exp(-(1 - n) * a * (l - m)))^(1/(1 - n))
}
x_pre <- diff(x_simu)  # predicted values
# compute first rolling error
RPE <- (x_pre - raw_data[-1]) / raw_data[-1]
MAPE <- sum(abs(x_pre - raw_data[-1]) / raw_data[-1]) / (length(raw_data) - 1)

2. The Description of the Existing NGBM(1,1) Model

In this section, we give a brief introduction to the NGBM(1,1) model. To adjust the traditional grey model and obtain higher prediction precision, Chen [20] first proposed the NGBM(1,1) model; its modeling procedure is described as follows.

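Suppose
$$X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(m)\right) \tag{1}$$
is a nonnegative time sequence.

$X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(m)\right)$ is constructed by the first-order accumulative generating operator, which is given by
$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \tag{2}$$
where $k = 1, 2, \ldots, m$.

The equation
$$x^{(0)}(k) + a z^{(1)}(k) = b \left(z^{(1)}(k)\right)^{n} \tag{3}$$
is called the basic grey differential equation of the NGBM(1,1) model, where $z^{(1)}(k) = 0.5\left(x^{(1)}(k) + x^{(1)}(k-1)\right)$, $k = 2, 3, \ldots, m$, is defined as the background value, and n is often called the power index.

The equation
$$\frac{dx^{(1)}(t)}{dt} + a x^{(1)}(t) = b \left(x^{(1)}(t)\right)^{n} \tag{4}$$
is called the whitening differential equation of the NGBM(1,1) model.

If we let
$$B = \begin{pmatrix} -z^{(1)}(2) & \left(z^{(1)}(2)\right)^{n} \\ -z^{(1)}(3) & \left(z^{(1)}(3)\right)^{n} \\ \vdots & \vdots \\ -z^{(1)}(m) & \left(z^{(1)}(m)\right)^{n} \end{pmatrix}, \qquad Y = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(m) \end{pmatrix}, \tag{5}$$
the model parameters a and b can be estimated by the least squares method, which is
$$(a, b)^{T} = \left(B^{T} B\right)^{-1} B^{T} Y. \tag{6}$$

The general solution of NGBM(1,1) can be written as
$$x^{(1)}(t) = \left[\frac{b}{a} + C e^{-(1-n) a t}\right]^{\frac{1}{1-n}}, \tag{7}$$
where C is an arbitrary constant. We set $t = 1$ and $x^{(1)}(1) = x^{(0)}(1)$, then we have
$$C = \left[\left(x^{(0)}(1)\right)^{1-n} - \frac{b}{a}\right] e^{(1-n) a}. \tag{8}$$

Substituting (8) into (7), the time response function of the NGBM(1,1) model is given by
$$\hat{x}^{(1)}(t) = \left[\left(\left(x^{(0)}(1)\right)^{1-n} - \frac{b}{a}\right) e^{-(1-n) a (t-1)} + \frac{b}{a}\right]^{\frac{1}{1-n}}.$$

To further obtain the discrete time response function, we set $t = k$, then we have
$$\hat{x}^{(1)}(k) = \left[\left(\left(x^{(0)}(1)\right)^{1-n} - \frac{b}{a}\right) e^{-(1-n) a (k-1)} + \frac{b}{a}\right]^{\frac{1}{1-n}}, \quad k = 1, 2, \ldots$$

The predicted value of $x^{(0)}(k)$ is computed by the first-order inverse accumulative generating operator (IAGO for short), which is
$$\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1), \quad k = 2, 3, \ldots$$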
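As a compact illustration of the procedure in this section, the following R sketch fits NGBM(1,1) for a fixed power index and restores predictions with the IAGO. The function names `ngbm_fit` and `ngbm_predict` and the choice n = 0.1 are assumptions made only for this example; they are not part of Algorithm 1, which searches n over a grid.

```r
# Hypothetical helpers (not from Algorithm 1): NGBM(1,1) with a fixed power index n.
ngbm_fit <- function(x, n) {
  m   <- length(x)
  ago <- cumsum(x)                      # 1-AGO sequence x^(1)
  z   <- 0.5 * (ago[-1] + ago[-m])      # background values z^(1)(k), k = 2..m
  B   <- cbind(-z, z^n)                 # matrix B of the least squares problem
  Y   <- x[-1]
  ab  <- solve(t(B) %*% B, t(B) %*% Y)  # least squares estimate of (a, b)
  list(a = ab[1], b = ab[2], n = n, x1 = x[1])
}

# Discrete time response for k = 1..horizon, followed by the IAGO.
ngbm_predict <- function(fit, horizon) {
  k  <- seq_len(horizon)
  e  <- 1 - fit$n
  x1 <- (fit$b / fit$a + (fit$x1^e - fit$b / fit$a) * exp(-e * fit$a * (k - 1)))^(1 / e)
  c(x1[1], diff(x1))                    # first value equals x^(0)(1), the rest come from IAGO
}

x    <- c(53.8580, 59.2963, 64.1280, 68.5992, 74.0060)  # training data from Algorithm 1
fit  <- ngbm_fit(x, n = 0.1)                             # n = 0.1 is an illustrative choice
pred <- ngbm_predict(fit, horizon = 7)                   # 5 fitted values + 2 forecasts
```

Because the initial condition is $x^{(0)}(1)$, the first restored value reproduces the first observation up to rounding.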

3. The Optimized NGBM(1,1) Model

According to previous studies, related to grey forecasting model, it is shown that both structure and parameters have great influence on prediction performance. For this purpose, we propose a novel optimized nonlinear grey Bernoulli model based on the rolling mechanism.

3.1. Optimization of the Initial Condition

In the existing NGBM(1,1) model, the initial condition is set to be $x^{(0)}(1)$, the oldest data point in the original sequence, which means that the information in all items except the first is not fully used by the forecasting model. According to the principle of priority of new information, we choose the latest item of the 1-AGO sequence, $x^{(1)}(m)$, as the new initial condition.

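Theorem 1. Assuming B, Y, and $(a, b)^{T}$ are the same as in Section 2, the following conclusions hold:
(1) The solution of the whitening differential equation of the NGBM(1,1) model can be written as
$$\hat{x}^{(1)}(t) = \left[\left(\left(x^{(1)}(m)\right)^{1-n} - \frac{b}{a}\right) e^{-(1-n) a (t-m)} + \frac{b}{a}\right]^{\frac{1}{1-n}}.$$
(2) The discrete time response function of the NGBM(1,1) model is given by
$$\hat{x}^{(1)}(k) = \left[\left(\left(x^{(1)}(m)\right)^{1-n} - \frac{b}{a}\right) e^{-(1-n) a (k-m)} + \frac{b}{a}\right]^{\frac{1}{1-n}}, \quad k = 1, 2, \ldots$$

Proof. First, we consider the general solution of the NGBM(1,1) model
$$x^{(1)}(t) = \left[\frac{b}{a} + C e^{-(1-n) a t}\right]^{\frac{1}{1-n}}.$$
By setting $t = m$ and $\hat{x}^{(1)}(m) = x^{(1)}(m)$, we obtain
$$\left(x^{(1)}(m)\right)^{1-n} = \frac{b}{a} + C e^{-(1-n) a m}.$$
Then we have
$$C = \left[\left(x^{(1)}(m)\right)^{1-n} - \frac{b}{a}\right] e^{(1-n) a m}.$$
Thus, the time response equation is written as
$$\hat{x}^{(1)}(t) = \left[\left(\left(x^{(1)}(m)\right)^{1-n} - \frac{b}{a}\right) e^{-(1-n) a (t-m)} + \frac{b}{a}\right]^{\frac{1}{1-n}}.$$
By letting $t = k$, it is easy to obtain
$$\hat{x}^{(1)}(k) = \left[\left(\left(x^{(1)}(m)\right)^{1-n} - \frac{b}{a}\right) e^{-(1-n) a (k-m)} + \frac{b}{a}\right]^{\frac{1}{1-n}}.$$
Theorem 1 is then proved.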
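The effect of the new initial condition can be checked numerically: the response curve should pass exactly through the latest 1-AGO value. The R sketch below assumes illustrative parameter values (a = -0.04, b = 36, n = 0.1) that are not estimated from data, and a hypothetical function name `response_latest`.

```r
# Hypothetical helper: discrete time response with the latest 1-AGO item as initial condition.
response_latest <- function(k, a, b, n, x1_m, m) {
  e <- 1 - n
  (b / a + (x1_m^e - b / a) * exp(-e * a * (k - m)))^(1 / e)
}

x   <- c(53.8580, 59.2963, 64.1280, 68.5992, 74.0060)  # training data from Algorithm 1
ago <- cumsum(x)                                        # 1-AGO sequence
m   <- length(x)

# Illustrative parameters only (assumed, not fitted).
x1_hat <- response_latest(1:(m + 2), a = -0.04, b = 36, n = 0.1, x1_m = ago[m], m = m)

all.equal(x1_hat[m], ago[m])  # the curve passes through x^(1)(m)
x0_hat <- diff(x1_hat)        # IAGO: restored values for k = 2..(m + 2)
```

At k = m the exponential factor equals 1, so the expression collapses to $x^{(1)}(m)$ regardless of a, b, and n.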

3.2. Optimization of Parameters

According to Wang et al. [21], the prediction precision of the NGBM(1,1) model is directly influenced by the parameter α in the background value. The background value of the GM(1,1) model and of the grey models derived from it is usually regarded as an approximation of the true value, which is the area between the curve $x^{(1)}(t)$ and the abscissa axis over the interval $[k-1, k]$; this is mathematically expressed as
$$z^{(1)}(k) = \int_{k-1}^{k} x^{(1)}(t)\, dt.$$

According to the mean value theorem, this integral equals the value of $x^{(1)}(t)$ at some point of $[k-1, k]$, so the background value can be expressed through the neighboring values $x^{(1)}(k-1)$ and $x^{(1)}(k)$ with a weight parameter α; generally, α = 0.5 in the GM(1,1) model and its derived models. With this fixed choice, however, the predicted values deviate from the actual values when the original time sequence fluctuates strongly, which means that these models lose accuracy on strongly fluctuating sequences.

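Here, the model parameters a and b can be estimated with the unknown α by the least squares method, which can be written as
$$(a, b)^{T} = \left(B^{T} B\right)^{-1} B^{T} Y,$$
where B is built from the α-dependent background values $z^{(1)}(k)$.

After simplification, with all sums taken over $k = 2, \ldots, m$, we have
$$B^{T} B = \begin{pmatrix} \sum \left(z^{(1)}(k)\right)^{2} & -\sum \left(z^{(1)}(k)\right)^{n+1} \\ -\sum \left(z^{(1)}(k)\right)^{n+1} & \sum \left(z^{(1)}(k)\right)^{2n} \end{pmatrix}, \qquad B^{T} Y = \begin{pmatrix} -\sum z^{(1)}(k)\, x^{(0)}(k) \\ \sum \left(z^{(1)}(k)\right)^{n} x^{(0)}(k) \end{pmatrix}.$$

Thus, a and b are given as
$$a = \frac{\sum \left(z^{(1)}(k)\right)^{n+1} \sum \left(z^{(1)}(k)\right)^{n} x^{(0)}(k) - \sum \left(z^{(1)}(k)\right)^{2n} \sum z^{(1)}(k)\, x^{(0)}(k)}{D}, \qquad b = \frac{\sum \left(z^{(1)}(k)\right)^{2} \sum \left(z^{(1)}(k)\right)^{n} x^{(0)}(k) - \sum \left(z^{(1)}(k)\right)^{n+1} \sum z^{(1)}(k)\, x^{(0)}(k)}{D},$$
where $D = \sum \left(z^{(1)}(k)\right)^{2} \sum \left(z^{(1)}(k)\right)^{2n} - \left(\sum \left(z^{(1)}(k)\right)^{n+1}\right)^{2}$.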
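Since B has only two columns, the least squares estimate can be cross-checked against a closed form obtained from the normal equations by Cramer's rule. The R sketch below compares both routes; the background values use the weight 0.5 and the power index n = 0.1 purely as illustrative assumptions, and the check holds for any positive background values.

```r
x   <- c(53.8580, 59.2963, 64.1280, 68.5992, 74.0060)  # training data from Algorithm 1
n   <- 0.1                                              # illustrative power index (assumed)
ago <- cumsum(x)
z   <- 0.5 * (ago[-1] + ago[-length(ago)])              # background values, k = 2..m
y   <- x[-1]

# Matrix form: (a, b)^T = (B^T B)^{-1} B^T Y
B  <- cbind(-z, z^n)
ab <- solve(t(B) %*% B, t(B) %*% y)

# Closed form from the 2 x 2 normal equations (Cramer's rule)
D <- sum(z^2) * sum(z^(2 * n)) - sum(z^(n + 1))^2
a <- (sum(z^(n + 1)) * sum(z^n * y) - sum(z^(2 * n)) * sum(z * y)) / D
b <- (sum(z^2) * sum(z^n * y) - sum(z^(n + 1)) * sum(z * y)) / D

all.equal(c(a, b), as.numeric(ab))  # both routes should agree up to rounding
```

The closed form avoids a matrix inversion inside the grid search, although for n close to 1 the determinant D becomes small and the matrix route with `solve` gives clearer error reporting.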

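In this study, the parameter α is automatically determined, together with the power index n, by minimizing the sum of absolute percentage errors generated by the NGBM(1,1) model. This can be considered as the optimization problem
$$\min_{\alpha,\, n} \; \sum_{k=2}^{m} \left| \frac{\hat{x}^{(0)}(k) - x^{(0)}(k)}{x^{(0)}(k)} \right|,$$
where $x^{(0)}(k)$ and $\hat{x}^{(0)}(k)$ are the actual value at time k and the corresponding prediction, respectively.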

In the NGBM(1,1) model, the whole dataset is used for prediction. However, constructing the model based on the rolling mechanism has been widely adopted in various grey models in recent years, as can be seen in [32, 33]. In the proposed NGBM(1,1) model, $x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(m)$ are used for forecasting $x^{(0)}(m+1)$. Once this prediction is found, the oldest data point is deleted and the latest data point is added to the modeling data; that is, $x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(m+1)$ are used for forecasting $x^{(0)}(m+2)$. This procedure is repeated until all predictions are obtained.
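The rolling mechanism amounts to sliding a fixed-length window over the series. The R sketch below only enumerates the training windows and their one-step-ahead targets; the function name `rolling_windows` is a hypothetical helper, and the window length 5 follows the commented lines in Algorithm 1.

```r
# Hypothetical helper: list the training windows and targets of a rolling scheme.
rolling_windows <- function(x, m) {
  starts <- seq_len(length(x) - m)  # one window per point that can be forecast
  lapply(starts, function(s) {
    list(train = x[s:(s + m - 1)],  # m consecutive observations
         target = x[s + m])         # the next observation to be forecast
  })
}

raw_data <- c(53.8580, 59.2963, 64.1280, 68.5992, 74.0060, 85.0754, 90.0309)
w <- rolling_windows(raw_data, m = 5)

length(w)      # 2 windows: raw_data[1:5] and raw_data[2:6]
w[[2]]$train   # second window drops the oldest value and adds the newest
w[[2]]$target  # raw_data[7]
```

Each window would then be passed to the fitting procedure on its own, so the parameters a, b, n, and α are re-estimated for every forecast.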

3.3. Modeling Evaluation Criteria and Detailed Modeling Procedures

In order to compare the proposed model with other commonly used grey models, two statistical indicators are used in this study, namely the relative percentage error (RPE) and the mean absolute percentage error (MAPE), which are defined as
$$\mathrm{RPE}(k) = \frac{\hat{x}^{(0)}(k) - x^{(0)}(k)}{x^{(0)}(k)} \times 100\%, \qquad \mathrm{MAPE} = \frac{1}{m-1} \sum_{k=2}^{m} \left| \mathrm{RPE}(k) \right|,$$
respectively. In addition, the criteria of MAPE for prediction error are shown in Table 1.
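The two indicators are straightforward to compute. The R sketch below uses hypothetical function names `rpe` and `mape` and made-up numbers chosen only so that the arithmetic is easy to follow; the values are not taken from the paper.

```r
# Hypothetical helpers: relative percentage error and mean absolute percentage error (in %).
rpe  <- function(actual, predicted) (predicted - actual) / actual * 100
mape <- function(actual, predicted) mean(abs(rpe(actual, predicted)))

actual    <- c(100, 200)  # made-up values for illustration
predicted <- c(110, 190)

rpe(actual, predicted)    # +10% for the first point, -5% for the second
mape(actual, predicted)   # (10 + 5) / 2 = 7.5%
```

RPE keeps the sign and therefore shows whether the model overestimates or underestimates, while MAPE summarizes the overall size of the errors.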

Computational steps of the optimized NGBM(1,1) model can be summarized as follows:
Step 1. Given the original time sequence $X^{(0)}$, calculate the first-order accumulative generating operator sequence $X^{(1)}$ using (2).
Step 2. Construct the matrices B and Y and estimate the parameters a and b, with the coefficient α unknown, using (5).
Step 3. Substitute a and b into (12), and find the optimal parameter α and power index n using (17).
Step 4. Re-estimate the parameters a and b using (5).
Step 5. Compute $\hat{x}^{(1)}(k)$.
Step 6. According to the first-order inverse accumulative generating operator, compute the predicted value $\hat{x}^{(0)}(k)$ using (10).

4. Verification of the Optimized NGBM(1,1) Model

To demonstrate the prediction precision of the proposed model, we also establish the traditional NGBM(1,1) model; the output values of optoelectronic components and optoelectronic applications in Taiwan are taken as examples in this section. More details are available in [4, 17].

The parameters a and b of the two models are estimated by the least squares method, and n and α are determined by minimizing the sum of absolute percentage errors. The power index n is empirically searched in [−1, 0.99]. The search over the parameters is shown in detail in Figure 1, and the simulation results of the two models are listed in Tables 2 and 3.

It is a remarkable fact that, in Figure 1, both n and α will lead the sum of absolute percentage errors to be missing or unacceptably large (e.g., ) in extreme cases. To ensure the graph is clear and smooth, we set the absolute percentage error as 1 when it is missing, and we set the absolute percentage error as 10 when it is larger than 10.
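The clipping rule described above can be written as a small helper applied to the error surface before plotting. The function name `clip_error` and the sample values are assumptions for illustration only.

```r
# Hypothetical helper: replace missing errors by 1 and cap large errors at 10.
clip_error <- function(err, missing_value = 1, cap = 10) {
  err[is.na(err)] <- missing_value  # is.na() is TRUE for both NA and NaN
  err[err > cap]  <- cap            # safe now: no NA left in the comparison
  err
}

clipped <- clip_error(c(0.2, NA, 35, NaN, Inf))
clipped  # 0.2 1 10 1 10
```

Replacing the missing values first matters in R, because a logical index that contains NA is not allowed in a subscripted assignment.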

According to Table 2, the optimized NGBM(1,1) model with has a better prediction performance than that of the original NGBM(1,1) model in optoelectronic components. According to Table 3, prediction accuracy of the optimized NGBM(1,1) model with is higher than that of the NGBM(1,1) model in optoelectronic application. In the optimized NGBM(1,1) model, MAPE are reduced from 4.82% and 4.10% to 4.28% and 3.64%, meaning the optimized NGBM(1,1) model is more effective and applicable than the original NGBM(1,1) model.

5. Application

5.1. Data Source

In this section, we consider the annual GDP data of China, which can be downloaded from the official website of the National Bureau of Statistics of China (http://www.stats.gov.cn/English/) and are listed in Table 4. The data from 2012 to 2016 are used for training the models, and the data from 2017 to 2018 are used for testing them.

5.2. Forecasting Results

To further demonstrate effectiveness and applicability of the optimized NGBM(1,1) model (short for the new model), we also establish the other comparative models, including the NGBM(1,1) model (written as M1), NGBM(1,1) model of initial condition optimization (written as M2), NGBM(1,1) model of background value optimization (written as M3), and NGBM(1,1) model of initial condition and background value optimization without the rolling mechanism (written as M4).

In M1 and M2, the unknown parameters are a, b, and n. In M3, M4, and the new model, there is an additional unknown parameter α. The computed parameters of the five models are shown in Table 5, and the simulation and prediction results are listed in Tables 6 and 7 and plotted in Figure 2. The R program for calculating the model parameters and producing the simulation and prediction results is shown in Algorithm 1.

According to Tables 6 and 7, the values fitted by the five grey models are close to the actual values in the training period; that is, the relative errors generated by the five models are small, within the range [−0.45%, 0.24%]. In the testing period, however, the relative errors of the five models generally exceed 1%; despite this, the prediction performance of the five models is still considered excellent according to the criteria in Table 1. This means that the NGBM(1,1) model performs well in predicting small-sample time sequences with nonlinear characteristics.

We can further observe that the MAPE values of M1, M2, M3, M4, and the new model are 1.55%, 1.54%, 1.48%, 1.45%, and 0.65%, respectively. This means that optimizing either the initial condition or the background value of the original NGBM(1,1) model enhances its prediction accuracy, while optimizing both at the same time performs better still. The benefit of the principle of new information priority can also be seen by comparing the MAPE of M4 with that of the new model. The computational results show that the values predicted by the optimized NGBM(1,1) model are considerably closer to the actual values than those of the other models, and the optimized NGBM(1,1) model has the best prediction performance among these models.

5.3. Forecasting China’s GDP from 2019 to 2023

Since the optimized NGBM(1,1) model shows the best performance in the sections above, we apply it to forecast China’s GDP over the next five years. The predicted values and growth rates are displayed in Figures 2 and 3.

It is not difficult to find from Figure 2 that annual GDP of China will continue to grow rapidly in the next five years. The annual GDP will grow to yuan. According to Figures 3 and 4, we can observe that the increasing rates of annual GDP of China will be increasing and decreasing simultaneously. In summary, China’s GDP will continue to grow steadily. At the same time, we must also see that the economic operation is stable and changeable, and the external environment is complicated and severe. The economy is facing a downward pressure. The problems in progress must be addressed in a targeted manner.

6. Conclusion and Future Work

In this paper, a novel optimized NGBM(1,1) model has been proposed to forecast the annual GDP of China. The numerical results of the simulations and the real example indicate that the optimized NGBM(1,1) model performs better in forecasting the annual GDP than the other commonly used models. The main conclusions are as follows.
(1) The novel optimized model can be considered an extension of the NGBM(1,1) model, but its modeling procedure differs from that of the NGBM(1,1) model. Firstly, we take the latest item of the 1-AGO sequence as the initial condition, according to the principle of priority of new information. Secondly, the parameter in the background value is automatically determined by minimizing the sum of absolute percentage errors. Finally, the model is established based on the rolling mechanism to further increase the prediction precision of the grey model.
(2) As the optimized NGBM(1,1) model shows the best performance among the compared models, we apply it to forecast the annual GDP of China over the next five years; the forecasts indicate that the annual GDP will increase in the next few years, and the growth rates indicate that it will grow steadily.

However, some issues remain to be discussed and solved in future work. First, the size of the sample time sequence used for predictive modeling is chosen manually in this paper, although it has been shown that the sample size affects the grey system model [34]; therefore, how to determine the sample size scientifically should be discussed in future work. In addition, the annual GDP of China is influenced by other factors, such as consumption, investment, and imports and exports, which are changeable and complex. In this paper, the modeling procedure of the optimized model is based on the sample time sequence alone; that is, other information that cannot be ignored is not effectively used. Therefore, future work will concentrate on multivariate grey models.

Data Availability

The data used to support the findings of this study are deposited at http://www.stats.gov.cn/english/.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

Tao Zhang’s research was supported by the National Natural Science Foundation of China (Grant nos. 11861014 and 11561006) and Natural Science Foundation of Guangxi (2018JJA110013). Chengli Zheng’s research was supported by the Humanities and Social Science Planning Fund from Ministry of Education (16YJAZH078), the Fundamental Research Funds for the Central Universities of China (Grant nos. CCNU 19TS062, CCNU19A06043, and CCNU19TD006), and the raising initial capital for High-Level Talents of Central China Normal University (30101190001).