Abstract

Accurate estimations can provide a solid basis for decision-making and policy-making in settings marked by complexity and uncertainty. Accordingly, a multivariable grey convolution model, GMC (1, n), which has correct solutions, has been put forward to deal with such issues in place of the incorrect multivariable grey model, GM (1, n). However, the conventional approach to computing the background values of the GMC (1, n) model is inaccurate, so the model’s forecasting accuracy cannot be guaranteed. A drawback analysis of the GMC (1, n) model is therefore conducted with mathematical reasoning, which explains why this model is inaccurate in some applications. To eliminate the drawbacks, a new optimized GMC (1, n), abbreviated as OGMC (1, n), is proposed, whose background values are calculated with Simpson’s rule, an efficient method for approximating the integral of a function. Furthermore, an extended version that uses the Gaussian rule to discretize the convolution integral, abbreviated as OGMCG (1, n), is proposed to further enhance the model’s forecasting ability. In general, these two optimized models offer a simplified structure, consistent forecasting performance, and satisfactory efficiency. Three empirical studies are carried out to verify these advantages by comparing the optimized models with the conventional GMC (1, n), GMCG (1, n), GM (1, n), and DGM (1, n) models. Results show that the new background values can be calculated effectively with Simpson’s rule and that the optimized models significantly outperform the competing models in most cases.

1. Introduction

Grey system theory has gained extensive attention from researchers worldwide and has been successfully used in many fields with favorable outcomes since it was proposed by Deng in 1982 [1–3]. This theory is capable of addressing issues characterized by uncertainty, insufficient information, and limited data points, thereby providing strong technical support for uncertainty analysis [4, 5]. As the most important part of this theory, the grey prediction approach and its extended versions have attracted enormous attention owing to their capability to provide accurate forecasts, especially when facing sparse data and poor information [6, 7]. In contrast to intelligent techniques and traditional statistical models, which require a large amount of data for model calibration, grey models need only limited data points to estimate the system behavior.

Among the grey models, the GM (1, n) model, in which 1 denotes a first-order differential equation and n stands for the total number of variables, has earned its reputation as a typical forecasting technique because it can deal with system prediction issues that are influenced by a range of relevant factors. Unfortunately, as the solutions to the whitening function of GM (1, n) are rough and incorrect [8], it is prone to produce large errors in applications. Since its introduction, many researchers have therefore carried out studies to further improve its prediction performance in terms of the background values [9], discrete variants [10], model structures [11], fractional models [12], and time-delayed versions [13]. Among these, a novel grey prediction model with convolution integral, namely, GMC (1, n), was designed by Tien [14] with a view to avoiding the incorrect solutions of the GM (1, n) model. As Tien presented, a convolution integral exists in the analytic solution to the whitening function of GMC (1, n), and the Trapezoidal rule is used to discretize the convolution integral. Many cases have shown that the GMC (1, n) model outperforms the typical GM (1, n). Moreover, to adapt to sequences with various characteristics, a range of variants derived from the GMC (1, n) model has been put forward [15–19], including a kernel-based nonlinear version [20].

In addition to these extended versions, more accurate grey approaches are typically obtained by reconstructing the background values, which are of great importance to accurate parameter estimation. Essentially, the background values are used to smooth the data and thereby reduce randomness further [9], and they are customarily represented by the average of two adjacent values of the 1-AGO sequence. Detailed discussions of their reasonability and practicability have been carried out from diverse perspectives [1, 21, 22]. However, such rough estimations of the background value are prone to cause inconsistency between the grey differential equation and its continuous counterpart, which results in inaccurate forecasting performance of the target model. In order to improve the model’s performance, Wang and Hao [23] incorporate dynamic coefficients into the structure of the background values in GMC (1, n), both for the system behavior sequence and for the relevant sequences. Subsequently, to further enhance precision, various heuristic intelligent techniques have been employed to determine the background value coefficient, including Particle Swarm Optimization [24–26], Genetic Algorithm [27, 28], Ant Lion Optimizer [29], and Ant Colony Algorithm [30].

Although the previous improvements in background values, supported by those heuristic intelligent techniques, raise the forecasting ability, they also make the computations of grey models low-level, repetitive, and complicated. Meanwhile, it is troublesome and difficult for beginners or inexperienced users to choose a suitable heuristic intelligent algorithm for solving actual forecasting problems. Therefore, proposing an accurate grey forecasting model that has a simple structure and easily understood computations, with a focus on the background values, is valuable work. The general consensus in the literature is that accurate background values are a crucial requirement for ensuring the reliability and practicability of the model [23, 24], and the forecasting precision is heavily dependent on their correct computation. Supporting this argument, one of the major contributions of this paper is the mathematical derivation of the actual background values, which reveals their differences from the estimated ones in the conventional model. Another equally important contribution is the design of an optimized GMC (1, n), whose background values are reconstructed and calculated based on Simpson’s rule. Moreover, an extended version of the new model, namely, OGMCG (1, n), is derived by using the Gaussian rule, instead of the Trapezoidal rule, to discretize the convolution integral in the solution of the whitening function. Subsequently, three cases on diverse subjects, including the tensile strength of a material, China’s gross industrial output value, and electronic waste in Washington State, are utilized for verifying the efficacy and robustness of the proposed models. Empirical results show that the optimized background values based on Simpson’s rule can supplement the current estimation methods of grey models. Furthermore, the OGMCG (1, n) model having optimized background values performs best among all competitors. The detailed comparisons are elaborated in Section 3.

This work has the following structure. The research methodology is presented in Section 2, which first reviews the procedures and shortcomings of the GMC (1, n) model and then elaborates on the optimized OGMC (1, n) and OGMCG (1, n) models. Section 3 is dedicated to three different cases that depict the accuracy and reliability of the proposed models in comparison with a range of competitors. Conclusions and discussions are provided in Section 4.

The nomenclature utilized in this paper is presented below and detailed explanations related to the formulas employed in each technique are given in the following sections.

2. Methodology

2.1. Overview of the GMC (1, n) Model

The GMC (1, n) model has gained popularity in practical applications because of its accurate estimations and simple structure. Many researchers, such as Tien [14–17], Ma et al. [31, 32], and Duman et al. [24], have elaborated the procedures for this model, which can be outlined as follows.

Step 1: collect the original data. Two categories of series are required: the data sequence of the system’s characteristic variable, called the system behavior sequence, and the data sequences of the relevant factors, called the relevant series. The series are characterized by the number of entries used for establishing the model, the delay period, and the total number of variables n. Moreover, both categories of series are nonnegative and equally spaced over time.

Step 2: generate the 1-AGO sequences by applying the first-order accumulated generating operation (1-AGO) to the original data. The accumulated sequences of the system behavior and of the influencing factors are obtained in this way.

Step 3: build the GMC (1, n) model. Its first-order whitening equation, equation (5), contains the system’s development coefficient and the grey control parameter. Additionally, the terms involving the relevant factor sequences are known as driving terms, and their coefficients are the driving coefficients corresponding to the respective relevant factor sequences. The grey derivative in equation (5) can be approximately estimated by the difference of two consecutive 1-AGO values. In addition, the background values of the system behavior sequence and the relevant series are usually taken as the averages of adjacent 1-AGO values, as in equations (7) and (8). However, recent studies [9, 23, 24] have pointed out that these estimated background values may differ from the actual values, as explained in Section 2.2. Based on the background values, the first-order grey differential equation is then formulated as equation (9).

Step 4: estimate the parameters by using the Ordinary Least Squares (OLS) method. Substituting the values of the sequences into equation (9) yields a linear system in matrix form. By solving this system, the estimated values of the parameters can be calculated.

Step 5: obtain the response function for projections. As introduced above, the whitening equation of the GMC (1, n) model is displayed in equation (5). By substituting the parameter values into equation (5) and taking the first observation of the system behavior sequence as the initial condition, the time response function, which contains a convolution integral, is obtained by solving the whitening equation. As Tien [14] introduced previously, the Trapezoidal rule is originally used to discretize the convolution integral in equation (13) in the traditional GMC (1, n) model, and the response function for the 1-AGO sequences is then determined in discrete form with the aid of a unit step function. However, because of the traditional model’s poor performance, the Gaussian rule has become more popular than its Trapezoidal equivalent in recent studies for discretizing the convolution integral [17, 31]. Consequently, the Gaussian rule is used in this paper as well, which gives a second discrete response function, again written with a unit step function. For convenience, the model whose convolution integral is discretized by using the Gaussian rule is denoted as GMCG (1, n).

Step 6: generate the simulative and predictive values in the original domain. The discrete response function provides the fitted and predicted values of the 1-AGO sequences, which are normally considered intermediate sequences. Subsequently, the values in the original domain are recovered by utilizing the 1-IAGO method; the values within the modeling period are called simulative values, and those beyond it are named predictive values.
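Steps 1 to 6 can be sketched in Python as follows. This is a minimal illustration rather than the authors’ implementation. It assumes no delay period, series of equal length, a whitening equation of the form dX1/dt + b1·X1 = b2·X2 + ... + bn·Xn + u, a parameter vector ordered as [b1, b2, ..., bn, u], and the Trapezoidal discretization written in the form commonly attributed to Tien [14]. The function names `ago`, `fit_gmc`, and `predict_gmc` are hypothetical.

```python
import numpy as np

def ago(x):
    """First-order accumulated generating operation (1-AGO)."""
    return np.cumsum(np.asarray(x, dtype=float), axis=-1)

def fit_gmc(x1, xs):
    """OLS estimate of [b1, b2..bn, u] with mean background values.

    x1: system behavior series, shape (m,)
    xs: relevant series, shape (m,) or (n-1, m)
    """
    X1, Xs = ago(x1), ago(np.atleast_2d(xs))
    z1 = 0.5 * (X1[1:] + X1[:-1])            # background values of X1
    zs = 0.5 * (Xs[:, 1:] + Xs[:, :-1])      # background values of X2..Xn
    B = np.column_stack([-z1, zs.T, np.ones_like(z1)])
    Y = X1[1:] - X1[:-1]                     # grey derivative, equal to x1(k)
    params, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return params

def predict_gmc(params, x1_first, xs):
    """Trapezoidal discretization of the convolution integral, then 1-IAGO.

    xs may extend beyond the modeling period to produce forecasts.
    """
    b1, bs, u = params[0], params[1:-1], params[-1]
    f = bs @ ago(np.atleast_2d(xs)) + u      # assumed forcing f(t), t = 1..T
    X1_hat = np.empty(f.size)
    for t in range(1, f.size + 1):
        conv = sum(
            0.5 * (np.exp(-b1 * (t - tau)) * f[tau - 1]
                   + np.exp(-b1 * (t - tau + 1)) * f[tau - 2])
            for tau in range(2, t + 1)
        )
        X1_hat[t - 1] = x1_first * np.exp(-b1 * (t - 1)) + conv
    # 1-IAGO: difference the accumulated values back to the original domain
    return np.concatenate([[X1_hat[0]], np.diff(X1_hat)])
```

A call such as `predict_gmc(fit_gmc(x1, x2), x1[0], x2_extended)` would return simulative values for the modeling period followed by predictive values, provided `x2_extended` contains the relevant series for the forecast horizon.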

As outlined previously, under particular parameter settings the GMC (1, n) model reduces to the traditional GM (1, n) model, and under further restrictions it becomes the original GM (1, 1) model. Therefore, the GMC (1, n) model is an extension of the GM (1, n) and GM (1, 1) models, indicating that it has wider application fields.
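The single-variable reduction can be checked with a short worked equation. The notation here is assumed rather than taken from the displayed equations: $b_1$ is the development coefficient, $u$ the grey control parameter, there is no delay, and with one variable the forcing term is simply $u$. Solving $dX_1^{(1)}/dt + b_1 X_1^{(1)} = u$ with the first observation as the initial condition gives

```latex
\begin{aligned}
\hat{X}_1^{(1)}(t)
  &= x_1^{(0)}(1)\,e^{-b_1 (t-1)} + \int_{1}^{t} e^{-b_1 (t-\tau)}\,u\,d\tau \\
  &= x_1^{(0)}(1)\,e^{-b_1 (t-1)} + \frac{u}{b_1}\left(1 - e^{-b_1 (t-1)}\right) \\
  &= \left(x_1^{(0)}(1) - \frac{u}{b_1}\right) e^{-b_1 (t-1)} + \frac{u}{b_1},
\end{aligned}
```

which is the familiar time response function of the single-variable grey model.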

2.2. Drawback Analysis of the GMC (1, n) Model

Although the popularity of GMC (1, n) has been increasing steadily, some inherent drawbacks still exist, which may generate unacceptable forecasting errors. A review of the procedures of the GMC (1, n) model shows that its two core components are the model establishment and the parameter estimation, which refer to equations (5) and (9), respectively. The accuracy of the estimated parameters is the precondition for obtaining the response function from equation (5). Specifically, equation (5) is used for predicting the system behavior sequence on the basis of the known parameters, which are estimated by equation (9). Hence, prior to obtaining projections, optimal or near-optimal parameters in the GMC (1, n) and GMCG (1, n) models are essential for better forecasting precision, and these parameters are significantly influenced by the background values.

To expose the inherent weakness of the GMC (1, n) model, the actual meaning of the background value needs to be further analyzed by comparing the grey differential equation in equation (9) with the whitening equation in equation (5). To this end, equation (5) is integrated over the interval [k − 1, k], which gives equations (20)–(22).

Substituting equations (21) and (22) into equation (20), the discrete form of the whitening equation is obtained as equation (23).
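A hedged reconstruction of this integration step is given here for orientation. It assumes the whitening equation has the form $dX_1^{(1)}/dt + b_1 X_1^{(1)} = \sum_{i=2}^{n} b_i X_i^{(1)} + u$ with no delay, and that the integration runs over the unit interval $[k-1, k]$. The symbols are assumptions and may differ from the paper’s own notation.

```latex
\begin{aligned}
\int_{k-1}^{k} \frac{dX_1^{(1)}(t)}{dt}\,dt + b_1 \int_{k-1}^{k} X_1^{(1)}(t)\,dt
  &= \sum_{i=2}^{n} b_i \int_{k-1}^{k} X_i^{(1)}(t)\,dt + u, \\
x_1^{(0)}(k) + b_1 \int_{k-1}^{k} X_1^{(1)}(t)\,dt
  &= \sum_{i=2}^{n} b_i \int_{k-1}^{k} X_i^{(1)}(t)\,dt + u .
\end{aligned}
```

The second line uses $X_1^{(1)}(k) - X_1^{(1)}(k-1) = x_1^{(0)}(k)$. Under these assumptions the integrals play the role of the background values, and replacing each of them with $\tfrac{1}{2}\big(X^{(1)}(k) + X^{(1)}(k-1)\big)$ amounts to a one-step Trapezoidal approximation.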

In equation (23), the real background values on both sides of the equation appear as definite integrals of the system behavior and relevant sequences. Comparing equation (23) with equation (9), we find that in the GMC (1, n) model these actual background values are only approximated by the average values of adjacent neighbor sequences in equations (7) and (8). The two are not always mathematically equivalent. Accordingly, the gap between the actual background values and the approximate ones produces inaccurate parameter estimations, which exert a negative influence on the predictive performance of the whitening equation in equation (5).
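The size of this gap can be illustrated numerically. The sketch below is an illustration under a stated assumption, not a result from the paper: it supposes that the 1-AGO sequence follows an exponential curve c·exp(a·t), for which the integral over [k − 1, k] is known in closed form, and compares it with the conventional mean of adjacent values. The function names are hypothetical.

```python
import numpy as np

def mean_background(X):
    """Conventional background values: mean of adjacent 1-AGO entries."""
    X = np.asarray(X, dtype=float)
    return 0.5 * (X[1:] + X[:-1])

def exact_background_exp(c, a, k):
    """Exact integral of c * exp(a * t) over [k - 1, k]."""
    return c / a * (np.exp(a * k) - np.exp(a * (k - 1)))

a, c = 0.4, 10.0                 # assumed growth rate and scale
t = np.arange(1, 7)
X = c * np.exp(a * t)            # assumed exponential 1-AGO sequence

approx = mean_background(X)
exact = exact_background_exp(c, a, t[1:])
rel_gap = (approx - exact) / exact   # relative overestimate at each k
print(rel_gap)
```

For a convex curve such as this one the mean overestimates the integral by the same relative amount at every step, roughly a²/12 for small a, so the bias grows with the growth rate of the sequence.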

To address the weakness, Wang and Hao [23] put forward a new formula to calculate the background values based on the Mean Value Theorem of Integrals, given in equations (24) and (25), in which dynamic interpolation coefficients weight the two adjacent values. Obviously, equations (24) and (25) are equivalent to equations (7) and (8) when the coefficients equal 0.5, which implies that the optimized model having dynamic background values in [23] has better adaptability than the original one having fixed interpolation coefficients. To determine the dynamic interpolation coefficients, the model has to be combined with metaheuristic algorithms, such as the Particle Swarm Algorithm and the Genetic Algorithm, which have complex structures and require advanced computer skills.
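The idea of a dynamic interpolation coefficient can be sketched as follows. The orientation of the weight (here `lam` multiplies the later value) and the names `weighted_background` and `matching_lambda` are assumptions made for illustration. The formula in [23] may place the coefficient differently. The second function gives the coefficient that reproduces the exact integral when the 1-AGO sequence is assumed to be exponential, which shows why a fixed value of 0.5 is generally not optimal.

```python
import numpy as np

def weighted_background(X, lam=0.5):
    """z(k) = lam * X(k) + (1 - lam) * X(k-1); lam = 0.5 is the plain mean."""
    X = np.asarray(X, dtype=float)
    return lam * X[1:] + (1.0 - lam) * X[:-1]

def matching_lambda(a):
    """Coefficient that matches the exact integral over [k-1, k]
    when X(t) = c * exp(a * t) (an assumed curve, a != 0)."""
    return 1.0 / a - 1.0 / np.expm1(a)

a, c = 0.4, 10.0
t = np.arange(1, 7)
X = c * np.exp(a * t)
lam = matching_lambda(a)
exact = c / a * (np.exp(a * t[1:]) - np.exp(a * t[:-1]))
```

In practice the sequence is not known to be exponential, which is why the literature cited above resorts to search algorithms to tune the coefficient.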

Although the optimized model in [23] provides an effective way to avoid the deviations between the actual and estimated background values, it becomes harder to use, especially for beginners, because of the complex optimization algorithms it introduces. Furthermore, these metaheuristic algorithms may not generate satisfactory parameter estimations because they can stagnate in local optima and consequently fail to find the actual global optimum.

Therefore, to effectively and accurately estimate the parameters in the GMC (1, n) model, an alternative method based on Simpson’s rule is proposed in this paper for accurately measuring the background values. This algorithm, elaborated on in Section 2.3, has a simple structure and is easy for beginners to understand. Furthermore, it can also significantly improve the accuracy of the GMC (1, n) and GMCG (1, n) models, which will be demonstrated by several experiments in Section 3.

2.3. A New Optimized GMC (1, n) Model Based on Simpson’s Rule

In order to overcome the shortcomings of the approaches to estimating background values mentioned in Section 2.2, a new method based on Simpson’s rule is employed to calculate the background values in the GMC (1, n) model. The flowchart of the new optimized model is displayed in Figure 1. Furthermore, the background values in [14, 23] are also presented in Figure 1 for comparison with the new model.

Simpson’s rule, named after the English mathematician Thomas Simpson (1710–1761), is a numerical approach to approximating the integral of a function. This method is particularly useful when integration is difficult or even impossible by standard techniques. The most commonly used form of Simpson’s rule is provided as

$$\int_{a}^{b} f(x)\,dx \approx \frac{b-a}{6}\left[f(a) + 4f\left(\frac{a+b}{2}\right) + f(b)\right]. \tag{26}$$
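The basic three-point form of Simpson’s rule is short enough to verify directly. The sketch below, with the hypothetical function name `simpson`, evaluates the integrand at the two endpoints and the midpoint, and compares the result with the one-step Trapezoidal rule on an exponential integrand.

```python
import math

def simpson(f, a, b):
    """Basic three-point Simpson's rule on [a, b]."""
    mid = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(mid) + f(b))

def trapezoid(f, a, b):
    """One-step Trapezoidal rule on [a, b], for comparison."""
    return 0.5 * (b - a) * (f(a) + f(b))

exact = math.e - 1.0                         # integral of exp over [0, 1]
err_simpson = abs(simpson(math.exp, 0.0, 1.0) - exact)
err_trapezoid = abs(trapezoid(math.exp, 0.0, 1.0) - exact)
```

Simpson’s rule integrates polynomials up to degree three exactly, which is the property that makes it attractive for smooth accumulated sequences.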

As introduced in Section 2.2, the actual background values can be estimated by computing the definite integrals of the 1-AGO sequences, instead of the average values of adjacent neighbor sequences in equations (7) and (8). To be consistent with the form of Simpson’s rule in equation (26), the grey differential equation used for parameter estimation is obtained by integrating equation (5) over the interval [k − 1, k + 1], which gives equations (27)–(29).

Hence, substituting equations (28) and (29) into equation (27) gives equation (30), which is mathematically equivalent to the grey differential equation. Applying Simpson’s rule in equation (26) to the integrals and simplifying, equation (30) becomes equation (34). With the auxiliary quantities defined in equation (35), substituting equation (35) and the values of the sequences into equation (34) yields equation (36), which can be written in matrix form. Solving this matrix equation gives the estimated values of the parameters.
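The parameter estimation with Simpson-based background values can be sketched as follows. This is an illustration under assumptions, not the paper’s exact equations (30)–(38): it assumes no delay, the whitening equation dX1/dt + b1·X1 = b2·X2 + ... + bn·Xn + u, and integration over [k − 1, k + 1], so that each integral becomes (X(k − 1) + 4·X(k) + X(k + 1))/3 and the constant term contributes 2u. The names `simpson_background` and `fit_ogmc` are hypothetical.

```python
import numpy as np

def simpson_background(X):
    """Simpson estimate of the integral of X over [k-1, k+1], interior k."""
    X = np.asarray(X, dtype=float)
    return (X[..., :-2] + 4.0 * X[..., 1:-1] + X[..., 2:]) / 3.0

def fit_ogmc(x1, xs):
    """Least squares estimate of [b1, b2..bn, u] from the equation
    X1(k+1) - X1(k-1) + b1*S1(k) = sum_i b_i*S_i(k) + 2u."""
    X1 = np.cumsum(np.asarray(x1, dtype=float))
    Xs = np.cumsum(np.atleast_2d(np.asarray(xs, dtype=float)), axis=1)
    s1 = simpson_background(X1)
    ss = simpson_background(Xs)
    B = np.column_stack([-s1, ss.T, 2.0 * np.ones_like(s1)])
    Y = X1[2:] - X1[:-2]          # equals x1(k) + x1(k+1)
    params, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return params
```

Because Simpson’s rule is exact for cubic polynomials, this estimator recovers the parameters exactly whenever the accumulated sequences are sampled from polynomial curves of degree three or lower that satisfy the whitening equation, which gives a convenient sanity check.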

Similar to the GMC (1, n) and GMCG (1, n) models, two optimized models can be obtained by using the Trapezoidal and Gaussian rules to discretize the convolution integral in equation (13). Specifically, substituting the parameter values into equations (14) and (17), the discrete response functions in equations (39) and (40) are obtained, in which the unit step function and the accompanying function keep the same definitions as before.

For convenience, the two models in (39) and (40) are denoted as the OGMC (1, n) and OGMCG (1, n) models, respectively. Subsequently, the values in the original domain are recovered by utilizing the IAGO method; the values within the modeling period are called simulative values, and those beyond it are named predictive values.
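The difference between the two discretizations can be sketched in Python. The forms used here are assumptions based on how the Trapezoidal and one-point Gaussian (midpoint) rules are usually written for grey convolution models, with the forcing term known only at integer times so that its midpoint value is replaced by the average of its neighbors. They are not copied from equations (39) and (40). The names `discrete_response` and `iago` are hypothetical.

```python
import numpy as np

def discrete_response(b1, f, x_first, rule="trapezoid"):
    """Discrete response of dX/dt + b1*X = f(t) for t = 1..len(f)."""
    f = np.asarray(f, dtype=float)
    out = np.empty(f.size)
    for t in range(1, f.size + 1):
        tau = np.arange(2, t + 1)            # empty when t == 1
        if rule == "trapezoid":
            terms = 0.5 * (np.exp(-b1 * (t - tau)) * f[tau - 1]
                           + np.exp(-b1 * (t - tau + 1)) * f[tau - 2])
        elif rule == "gauss":
            # kernel at the interval midpoint, forcing term averaged
            terms = np.exp(-b1 * (t - tau + 0.5)) * 0.5 * (f[tau - 1] + f[tau - 2])
        else:
            raise ValueError("rule must be 'trapezoid' or 'gauss'")
        out[t - 1] = x_first * np.exp(-b1 * (t - 1)) + terms.sum()
    return out

def iago(X):
    """Inverse accumulated generating operation (1-IAGO)."""
    X = np.asarray(X, dtype=float)
    return np.concatenate([[X[0]], np.diff(X)])
```

With a constant forcing term the continuous solution is known in closed form, and in that setting the midpoint form has roughly half the error of the Trapezoidal form, with opposite sign. This is consistent with, though it does not by itself prove, the accuracy advantage reported for the Gaussian rule.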

Based on Simpson’s rule, the background values of the GMC (1, n) model can be accurately calculated, in a way that is quite different from those in [14, 23]. Moreover, Figure 1 provides the detailed procedures of the optimized model, which helps beginners to use this model.

2.4. Performance Evaluation

In addition to the modeling procedures, measuring indicators are also essential for evaluating the model’s performance. In this study, three indices are adopted to examine the performance of the competing models: APE (Absolute Percentage Error), MAPE (Mean Absolute Percentage Error), and RMSE (Root Mean Squared Error).

The three indices compare the actual value at each time point with the corresponding simulative or predictive value at that time point.
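The three indices can be computed as sketched below, using their standard textbook definitions. The choice to express APE and MAPE as percentages and to average over whatever points are passed in (for example, the in-sample or the out-of-sample period separately) is an assumption of this sketch.

```python
import numpy as np

def ape(actual, predicted):
    """Absolute percentage error at each time point, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs((actual - predicted) / actual) * 100.0

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(ape(actual, predicted)))

def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))
```

MAPE is scale-free and therefore comparable across cases, whereas RMSE keeps the units of the data and penalizes large individual errors more heavily.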

3. Experiments and Analysis

In this section, for the sake of demonstrating the practicability and efficacy of the new optimized model, three empirical studies are carried out in comparison with traditional multivariable grey prediction models.

3.1. Case One: Predicting the Tensile Strength of a Material

Multivariable grey prediction models, such as GM (1, n) and GMC (1, n), can be used for the indirect measurement of the tensile strength of a material [14]. The efficacy of the OGMC (1, n) and OGMCG (1, n) models can therefore be revealed by comparing their results in predicting the tensile strength of a material with those of the GMC (1, n), GMCG (1, n), GM (1, n), and DGM (1, n) models. The original data are presented in Table 1, which were collected from the experimental observations in Table 117 of Samuel [33]. The tensile strength represents the system behavior sequence, and the Brinell hardness works as the relevant series. Moreover, six data points of the two series are utilized for model calibration, and the remaining four observations are used for checking the predictive performance, with the model parameters set accordingly [33].

As introduced in Sections 2.1 and 2.3, the procedures and flowchart of GMC (1, n) and its extensions have been presented, and the procedures of GM (1, n) and DGM (1, n) are similar to those of GMC (1, n). Following these procedures, the estimated parameter values of the six competing models are given in Table 2.

Taking OGMC (1, n) as an example, we can obtain the parameter values after substituting the original data into equations (35) and (36). Subsequently, substituting the parameter values into equation (39), we will have the discrete response function in equation (44).

By using the time response function in equation (44), the forecasted results and the values of the three accuracy indices are obtained, as shown in Table 3. It is obvious that GMC (1, n) and its three extensions are superior to the traditional GM (1, n) and DGM (1, n) models, whose MAPE values reach 24.94% and 55.91%, respectively. The reason is that GM (1, n) has an incorrect time response function and DGM (1, n) has narrow application fields. In contrast, the GMC (1, n) and GMCG (1, n) models are able to provide accurate forecasts because their modeling values with convolution integral are theoretically exact [14]. However, the OGMC (1, n) and OGMCG (1, n) models seem to obtain slightly better forecasting performance owing to smaller error values, 2.48 and 2.52, respectively. Furthermore, the index values of these two models are much smaller than those of the other four competitors, implying good stability and reliability of the optimized background values based on Simpson’s rule.

In general, the GMC (1, n), GMCG (1, n), OGMC (1, n), and OGMCG (1, n) models share almost the same precision in predicting the tensile strength, and they are equally applicable for further estimations. Additionally, the optimized background values based on Simpson’s rule can effectively contribute to improved forecasting precision.

3.2. Case Two: Estimating the Gross Industrial Output Value

In addition to the numerical example from Tien’s study [14], another actual case concerning the forecasting of China’s industrial output value, collected from Ma and Liu [31], is selected to test the effectiveness and practicability of the proposed models. The raw observations, listed in Table 4, come from the National Bureau of Statistics of China, which publishes correct, official, and complete data. The industrial output value is the system behavior sequence, and the total current assets and fixed assets represent the two relevant series. The GMC (1, n), GMCG (1, n), OGMC (1, n), OGMCG (1, n), GM (1, n), and DGM (1, n) models are employed for forecasting China’s industrial output value. Ten data points of the three variables are used for estimating parameters, and the last three observations of the system behavior series are selected for testing the predictive performance.

Taking the OGMC (1, n) model as an example, we can obtain the parameter values by utilizing equations (27)–(36). Subsequently, substituting the parameter values into equation (39) gives the discrete response function.

Additionally, the parameter values of the other five competing models can also be estimated; they are listed in Table 5.

Subsequently, by using the discrete response function, the forecasted results and the values of the evaluation indicators are obtained, as displayed in Table 6. As this table illustrates, similar conclusions can be drawn: the GMC (1, n), GMCG (1, n), OGMC (1, n), and OGMCG (1, n) models significantly outperform the GM (1, n) and DGM (1, n) models. Moreover, the MAPE values of the GM (1, n) and DGM (1, n) models are 26.52% and 46.22%, which means that these two models are unable to forecast future values of China’s industrial output because of their poor performance. On the other hand, the grey prediction models with convolution integral provide very competitive results in terms of improved accuracy.

As anticipated, with a MAPE value of 5.4% and an RMSE value of 39155.02, the OGMCG (1, n) model provides the highest prediction accuracy, followed by the other three convolution-based models. The reasons for the excellent performance of this optimized model are twofold. On the one hand, the Gaussian rule for discretizing the convolution integral is generally more accurate than the Trapezoidal rule, which is also verified by other studies, such as Ma and Liu [31], Tien [15], and Wang [18]. On the other hand, the improvement of the background values based on Simpson’s rule is equally important for enhancing its forecasting capability, because optimized background values are a crucial requirement for reliable and accurate parameter estimations, as explained in Section 2.2. Similar findings can also be obtained from the comparison of the GMC (1, n) and OGMC (1, n) models, whose difference in accuracy depends heavily on the optimization of background values.

Generally, the OGMCG (1, n) model obtains the highest forecasting ability, making it the optimal technique for the projections of China’s industrial output value. Moreover, Simpson’s rule can be employed for calculating the background values with high precision, which is essential for parameter estimations.

3.3. Case Three: Forecasting Electronic Waste in Washington State

The empirical results of cases one and two demonstrate the superior forecasting performance of the OGMC (1, n) and OGMCG (1, n) models having optimized background values based on Simpson’s rule, in comparison with the other competitors. Furthermore, the OGMCG (1, n) model achieves slightly higher accuracy than the OGMC (1, n) model. Therefore, in addition to the above two examples from published papers, a new case concerning electronic waste is introduced in the following subsections to verify the proposed model’s efficacy and reliability.

3.3.1. Data Description and Model Calibration

As introduced in earlier work [34, 35], e-waste has increasingly become one of the major contributors to municipal solid waste (MSW), and its commonly used influencing factors are the population density and median household income. In this paper, the original data sets of e-waste from 2003 to 2015 are available from the government website (ecology.wa.gov), whereas data since 2016 are unavailable. In addition, observations of these two factors are collected from the Washington State Office of Financial Management website [36, 37]. All the available data of these three variables are presented in Table 7. Data sets from 2003 to 2012 are used for establishing the six models, whereas observations from 2013 to 2015 are employed for comparison. In the GMC (1, n) model and its extensions, the parameters are set to match this split, with three variables, ten entries for modeling, and three entries for testing.

All computations are executed in MATLAB 2016a, and the parameter estimations are given in Table 8. We take one of the optimized models as an example for explaining the main procedures. By using equations (1)–(4) and (27)–(36), we can obtain the estimated parameters, from which the discrete response function follows. Subsequently, the comparison of original and forecasted values is shown in Table 9.

3.3.2. Comparative Analysis of the Forecasting Performance by Six Competing Models

As seen in Table 9, GM (1, n) and DGM (1, n) produce inaccurate results, with MAPE values of 42.55% and 68.51%, respectively. Hence, these two models fail to predict the e-waste amount. In contrast, the GMC (1, n) and GMCG (1, n) models, as expected, provide obvious improvements in forecasting precision. The reason is that the modeling values obtained with the convolution integral in these two models are theoretically correct, while those in the GM (1, n) model are wrong [8]. In addition, the Gaussian rule performs a little better than the Trapezoidal rule, as the GMCG (1, n) model achieves higher predicting accuracy.

Compared with the previous GMC (1, n) and GMCG (1, n) models, the OGMC (1, n) and OGMCG (1, n) models, which have new optimized background values based on Simpson’s rule, produce much more accurate predictions in terms of both MAPE and RMSE values. Thus, Simpson’s rule can work as an efficient alternative technique for calculating background values and can enhance prediction accuracy. Moreover, the OGMCG (1, n) model achieves the highest predicting accuracy, with a MAPE value of 11.67% and an RMSE value of 8019.37. Therefore, the e-waste amount in Washington State from 2016 to 2020 is predicted by the superior model, namely, OGMCG (1, 3).

3.3.3. Future Projections of e-Waste via OGMCG (1, 3)

Future e-waste amounts can only be forecasted if the future values of the population density and median household income are known. For the population density, the Washington State Office of Financial Management has provided the trends on its website [38], which are given in Table 10. For the median household income in Table 10, observations from 2016 to 2017 are available from the above website [36], and its future estimations from 2018 to 2020 are predicted by using a univariate model [39], which is a reliable and practical technique for univariate predictions.

Initially, data sets of the median household income from 2003 to 2017 are used for model calibration. The parameters are then obtained, and the corresponding time response function follows from them.

Subsequently, estimated values from 2018 to 2020 are given in Table 10. The MAPE value in the in-sample period from 2003 to 2017 reaches 3.14%, indicating excellent forecasting capability of the model.

Once we have the values of these two relevant factors from 2016 to 2020, the future projections of the e-waste amount can be forecasted by using the OGMCG (1, 3) model. Following equations (1)–(4) and (27)–(36), the forecasts from 2016 to 2020 are calculated and listed in Table 10.

Furthermore, the visual comparison of the real observations and the values predicted by the OGMCG (1, 3) model from 2003 to 2020 is presented in Figure 2. As can be observed from Figure 2, the predicted values fit the real observations well in the in-sample period, and the projections are likely to keep increasing from 2016 to 2020. Thus, the optimized OGMCG (1, 3) can act as an appropriate tool for the e-waste forecast in this paper.

4. Conclusions and Discussions

In summary, accurate future estimations play an essential role for decision-makers in framing and implementing sensible plans and policies. Thus, a grey prediction model with convolution integral, abbreviated as GMC (1, n), is introduced alongside its improved variant GMCG (1, n). These two models have been applied to various prediction issues by many researchers because of their correct solutions to the whitening function. However, inaccurate methods of calculating the background values may incur large deviations in the parameter estimations, which are essential for accurate future projections. Therefore, the gap between the actual and estimated background values is analyzed mathematically in detail. Moreover, Simpson’s rule is introduced to compute the background values. Subsequently, the OGMC (1, n) and OGMCG (1, n) models are designed by utilizing the optimized background values. Then, a comparative analysis using the same data points is run to assess the performance of the proposed models in the three case studies. Through the discussion and analysis of the experimental studies, several conclusions can be drawn:

(i) The newly proposed OGMC (1, n) and OGMCG (1, n) models generally outperform the previous GMC (1, n), GMCG (1, n), GM (1, n), and DGM (1, n) models because of the optimized background values. Moreover, the OGMCG (1, n) model is normally superior to OGMC (1, n) due to the accurate Gaussian solution.

(ii) The optimized background values based on Simpson’s rule can significantly enhance the forecasting capability of the previous models. Additionally, the reconstructed background values have the merits of simplified structure, high reliability, and wide applicability, as shown in the three cases. Most importantly, the new method can improve the accuracy of model calibration by avoiding the errors generated in passing from the discrete function to the continuous one.

(iii) The OGMCG (1, n) model emerges as a more viable option for future projections of the e-waste amount in Washington State owing to its accurate and reliable performance in the three cases. Moreover, the future e-waste amount is projected to keep increasing in the upcoming years.

A limitation of this paper is caused by the linear structure of the proposed model, which may produce errors when dealing with the nonlinear sequences. Therefore, future work will be carried out by proposing a nonlinear model with optimized background values based on Simpson’s rule.

Nomenclature

AGO:Accumulated generating operation
IAGO:Inverse accumulated generating operation
GM (1, n):Grey model having one order and n variables [8]
DGM (1, n):Discrete grey model having one order and n variables [40]
GMC (1, n):Grey model with convolution integral based on trapezoidal rule [14]
GMCG (1, n):Grey model with convolution integral based on the Gaussian rule [31]
OGMC (1, n):Optimized grey model with convolution integral based on the trapezoidal rule
OGMCG (1, n):Optimized grey model with convolution integral based on the Gaussian rule

Data Availability

The data utilized to support the findings are available in the above context or can be obtained from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was sponsored by the National Natural Science Foundation of China (nos. 71901191, 71971154, 71701024, and 71771119) and Project of the Philosophy and Social Sciences in Hangzhou (M20JC086).