Abstract

Accurate demand forecasting for space science payload components is of great significance to solving the problem of long product lead times and to the development of China’s space science industry. In view of the unsteady, nonlinear, and small-sample characteristics of space science payload component demand, this paper proposes the EEMD-CC&CV-MPSO-SVR model to predict the future demand for space science payload components. First, EEMD is adopted to decompose the normalized demand sequence, and the stationarity of each subsequence is analyzed. Sequence complexity is then distinguished by sample entropy, and optimum-kernel-function CC-MPSO-SVR and CV-MPSO-SVR prediction models are established for high-complexity and low-complexity sequences, respectively. Finally, the prediction results of the subsequences are aggregated to form the total prediction. Experimental results show that the proposed model outperforms single benchmark models and other hybrid models in both prediction performance and robustness. It can effectively predict the quantity and trend of the demand for China’s space science payload components, providing a decision-making basis for government policy formulation, demand-side procurement, and supply-side inventory control.

1. Introduction

With the development of China’s space science, space science payloads have become a focus of widespread attention in the industry. A space science payload is equipment loaded on a spacecraft platform to achieve experimental goals such as on-orbit space science research, space exploration, and verification of new space technologies [1]. Space science payloads are generally characterized by international leadership, diverse needs, and strong exploratory nature across multiple disciplines. Components such as central processing units (CPUs), memory, diodes, and resistors are the basic units of space science payload supply chain management and are the foundation and key to achieving space science mission goals.

However, the current supply chain of China’s space science payload components has shortcomings, and its management practices are relatively backward. The uniqueness of component demand in space science payloads has led to widespread problems of long product lead times and delayed arrivals in the component supply chain. This affects the development schedule of space science payloads and, further, scientists’ chances of achieving scientific results and occupying the commanding heights in related fields.

The main reason for this problem is that space science missions are discontinuous and unsteady, which makes the demand for space science payload components uncertain and nonlinear. In addition, component manufacturers have long supply cycles under the “demand order serial traction scheduling” model, there are no macro-control measures over either the development or the use of components, and the limited historical data scattered across individual suppliers makes market demand difficult to forecast.

This paper accurately predicts and analyzes the total demand and trend of space science payload components. (1) It provides management departments with a reference for financial budget decisions based on the total demand. (2) It helps the government plan and adjust related policies in advance and, according to the development trend and other clear signals, correctly guide the demand side’s purchase planning and the supplier’s inventory strategy. (3) It helps managers identify and warn of supply risks and prepare risk control and emergency management in advance.

Therefore, this paper addresses the problem of demand forecasting for space science payload components and scientifically predicts their future demand quantity and trend, which is of great significance for promoting the output of China’s space science achievements and strengthening the supply chain management of space science payload components. It is imperative to establish a forecasting method suited to the unsteady, nonlinear, and small-sample demand characteristics of space science payload components.

The main contributions of this research are as follows: (1) Aiming at the unsteady, nonlinear, and small-sample data characteristics of space science payload components, this paper introduces the “decomposition-ensemble” idea and proposes the EEMD-CC&CV-MPSO-SVR comprehensive model, which decomposes an unpredictable, violently fluctuating time series into relatively stable time series that are easy to predict. (2) It combines the sample-feature learning ability of phase space reconstruction rolling-window prediction for high-complexity time series with the overfitting-relieving ability of k-fold cross-validation prediction for low-complexity time series. This paper adopts the idea of “divide and conquer” for the first time, applying two different model training methods to demand subsequences of different complexity. (3) Considering the shortcomings of PSO, such as dependence on initial parameters, premature convergence, and a tendency to fall into local optima, the PSO algorithm is modified with a chaos strategy, crossover and mutation operations, and an adaptive control process.

At present, the academic community has formed a relatively mature prediction theory system. Existing forecasting methods include single methods such as time series methods, causal methods, artificial intelligence methods, gray methods, and simulation methods, as well as ensemble approaches such as combined forecasting models, the TEI@I methodology, and decomposition-ensemble forecasting models. The demand forecasting methods adopted in the field of components are basic, comprising time series forecasting and unary causality forecasting. For example, Wang and Chen [2] built a dynamic ARIMA model to adapt to the volatility of semiconductor products and conduct demand forecasting. No scholars have considered single methods such as artificial intelligence, gray, or simulation methods, let alone ensemble methods. The unsteady and nonlinear characteristics of the demand for space science payload components mean that the limited single models currently in use are difficult to apply accurately. The decomposition-ensemble prediction model inspired by the TEI@I methodology [3] has become a frontier of time series prediction research owing to its suitability for complex system analysis and prediction and its strong interpretability. It has been applied to energy prices [4, 5], PM2.5 concentration [6], foreign exchange rates [7, 8], agricultural product futures prices [9], tourism demand [10], and many other fields. However, no research has applied it to demand forecasting for space science payload components.

When the data is a combination of sequences with different frequencies and has nonlinear and fluctuating characteristics, separating these sequences from the chaotic data is a solution that may improve prediction accuracy. Decomposition-based models show better performance than traditional single models in predicting unsteady and nonlinear data. Time series decomposition methods include the wavelet transform (WT) [11], Fourier transform (FT) [12], empirical mode decomposition (EMD) [13], and singular spectrum analysis (SSA) [14]. EMD is superior to other decomposition methods because it is well suited to complex unsteady and nonlinear time series and easy to model, but it suffers from mode aliasing [15]. In recent years, improved EMD methods have been proposed, such as sliding window empirical mode decomposition (SWEMD) [16], ensemble empirical mode decomposition (EEMD) [17], complementary ensemble empirical mode decomposition (CEEMD) [18], complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) [19], variational mode decomposition (VMD) [20], and weighted EMD [21]. EEMD is a multiscale analysis method for decomposing unsteady, nonlinear, and complex time series data, and it has been successfully applied in many fields. It retains all the advantages of EMD while effectively overcoming mode aliasing by adding white noise to the original time series [22]. Li et al. [17] combined signal decomposition methods and ANNs to predict long-term runoff time series; their experiments show that EEMD outperforms EMD and DWT. Therefore, multiscale decomposition-ensemble forecasting represented by EEMD is a new application direction in demand forecasting for space science payload components and is expected to improve forecasting accuracy.

In terms of prediction models, support vector regression (SVR) is an artificial intelligence method suitable for small-sample, nonlinear, and high-dimensional problems. It not only has good prediction performance but also overcomes overfitting. The prediction performance of SVR depends largely on early-stage data preprocessing, such as normalization and feature extraction, as well as on later-stage kernel function selection and parameter optimization.

Existing normalization methods include logarithmic function conversion, min-max normalization, the arctan function, the sigmoid function, and the quantile method. Because the demand for space science payload components is always positive, fluctuates violently, and has abrupt changes, this paper adopts logarithmic function conversion to normalize the data.

The academic community has not yet reached a consensus on kernel function selection and parameter optimization [23]. SVR mainly has two types of kernel functions: global kernels and local kernels. Global kernel functions, such as the linear and polynomial kernels, allow distant data points to influence the kernel value; they have stronger extrapolation ability and focus on finding the global optimum. Local kernel functions, such as the Gaussian radial basis function (RBF) kernel, allow only nearby data points to affect the kernel value; they have better interpolation ability and aim to find a locally optimal solution [24]. This paper tries several types of kernel functions and selects the one most suitable for the nonlinear changes in the demand sequence of space science payload components as the optimum kernel function.

Metaheuristic algorithms are considered intelligent methods for solving the SVR parameter optimization problem. They include the genetic algorithm (GA), Tabu Search (TS), the simulated annealing algorithm (SAA), particle swarm optimization (PSO), and Cuckoo Search (CS) [25]. SVR parameter optimization is a continuous optimization problem, for which the PSO algorithm has clear advantages. Owing to its memory mechanism [26], few parameters, and easy implementation [27], it is well suited to searching for SVR parameters. However, PSO also depends on its initial parameters, converges prematurely, and easily falls into local optima [28], making it difficult to find the global optimum of the SVR parameter set for the multipeaked demand sequence of space science payload components. Therefore, the traditional PSO algorithm needs further improvement. This paper proposes a modified PSO algorithm to overcome these shortcomings; it belongs to the swarm intelligence branch of evolutionary computation.

Currently, the SVR method is applied to realize data-characteristic-driven time series prediction. There are two approaches to model training and prediction. The first uses the embedding dimension and delay time to reconstruct the phase space of the time series and realizes one-step or multistep prediction through rolling windows [6]. The second uses the time series data directly as model input and adopts the k-fold cross-validation (k-Fold CV) method for model training and prediction [29]. The two methods have their own advantages and disadvantages for time series prediction problems of different complexity. Therefore, this paper adopts the idea of “divide and conquer” and uses different training methods for time series of different complexity to further improve the ensemble forecasting accuracy for space science payload components. The complexity of a time series is expressed by its degree of confusion, and complexity measures include sample entropy (SE) [30], information entropy (IE) [31], and permutation entropy (PE) [32]. This paper chooses the commonly used sample entropy to measure sequence complexity. Many methods exist for calculating the embedding dimension and delay time, but each generally estimates only one of the two. The C-C method adopted in this paper differs in that it estimates the delay time and embedding dimension simultaneously based on statistical results [33].

3. Proposed Model

3.1. Ensemble Empirical Mode Decomposition

Ensemble empirical mode decomposition (EEMD) is a local and adaptive time-frequency analysis method proposed by Wu and Huang in 2009, which solves the inherent mode aliasing drawback of empirical mode decomposition (EMD) [34]. The basic principle of EEMD is that a time series is assumed to contain several fluctuation modes, which can be decomposed step by step into intrinsic mode functions (IMFs) of different frequencies and a residual. The time series contains both real information and noise. Random white noise obeying a normal distribution (mean 0; standard deviation σ) is added to the original time series. After ensemble averaging, the noise in the subsequences cancels out, and the mode aliasing phenomenon is significantly reduced. The EEMD algorithm flow for the multiscale decomposition of space science payload component demand data is as follows:

Step 1. Add Gaussian white noise obeying N(0, σ²) to the original data x(t) to generate the time series x_i(t), where i is the iteration number for adding white noise. Initialize i = 1 and the IMF index j = 1.

Step 2. Determine the local maxima and local minima of the sequence x_i(t). Connect all local maxima and all local minima with cubic spline interpolation to form the upper envelope and the lower envelope, then calculate the mean envelope m(t) of the two.

Step 3. Let h(t) = x_i(t) − m(t). If h(t) meets the two conditions of an IMF, namely, (1) the difference between the number of extreme points and the number of zero crossings is at most 1 and (2) the mean of the upper and lower envelopes is 0, then h(t) is the jth IMF; set j = j + 1 and take the residual r(t) = x_i(t) − h(t) as the new x_i(t). If h(t) does not meet the conditions, let x_i(t) = h(t).

Step 4. Repeat steps 2 to 3, and terminate the iteration when the number of extreme points of the residual does not exceed 1.

Step 5. Loop the above four steps NE times, adding different Gaussian white noise each time, and average the NE sets of IMFs as the final decomposition result. Finally, EEMD decomposes the original sequence into a set of IMFs and a residual term. See Appendix A.1 for the description of the variables involved in EEMD.
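As a sketch of Steps 1-5, the ensemble-averaging loop can be written as follows. This is plain Python; `toy_emd` is a hypothetical stand-in for a real sifting-based EMD decomposer (cubic-spline envelope sifting is not implemented here), so only the noise-adding and averaging logic of EEMD is illustrated.

```python
import random

def eemd(signal, emd, ne=100, noise_std=0.2):
    """Ensemble EMD sketch: average the IMFs of `ne` noise-perturbed runs.
    `emd` is any decomposer returning a fixed number of IMF lists."""
    sums = None
    for _ in range(ne):
        # Step 1: add Gaussian white noise to the original series.
        noisy = [x + random.gauss(0.0, noise_std) for x in signal]
        imfs = emd(noisy)                      # Steps 2-4: decompose the noisy copy
        if sums is None:
            sums = [[0.0] * len(signal) for _ in imfs]
        for k, imf in enumerate(imfs):
            for t, v in enumerate(imf):
                sums[k][t] += v
    # Step 5: ensemble average; the added noise cancels out.
    return [[v / ne for v in row] for row in sums]

def toy_emd(x):
    # Toy stand-in: a real EMD extracts IMFs by envelope sifting.
    return [[v * 0.5 for v in x], [v * 0.5 for v in x]]
```

With enough ensemble members, the averaged subsequences converge to the noise-free decomposition, which is exactly why the white noise does not contaminate the final IMFs.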

3.2. Support Vector Regression

Support vector regression (SVR) is a kernel-based nonlinear regression method. Its basic idea is to map the nonlinear raw data into a linear mode of a high-dimensional feature space in order to find the best regression hyperplane with the least structural risk [35]. The specific form of the standard model is as follows:

f(x) = w^T φ(x) + b

Among them, x is the input vector, f(x) is the predicted value, w is the weight vector, φ(x) is the nonlinear mapping function, and b is the bias term.

The original (primal) problem of the standard model is as follows:

min (1/2)‖w‖² + C Σ_i (ξ_i + ξ_i*)
s.t. y_i − w^T φ(x_i) − b ≤ ε + ξ_i,  w^T φ(x_i) + b − y_i ≤ ε + ξ_i*,  ξ_i, ξ_i* ≥ 0,  i = 1, …, l

Among them, ξ_i and ξ_i* are slack variables, y_i is the true value, and l is the number of samples. The insensitive parameter ε indicates the maximum allowable error in the insensitive zone. The penalty factor C is greater than 0 and weighs the complexity of the model against the size of the error loss.
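A minimal sketch of the ε-insensitive loss that the slack variables penalize: the per-sample cost is zero inside the ±ε tube around the prediction and grows linearly outside it.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Errors inside the +/-eps tube cost nothing; beyond it, cost is linear."""
    return [max(0.0, abs(yt - yp) - eps) for yt, yp in zip(y_true, y_pred)]
```

A larger C multiplies this loss more heavily in the objective, trading model simplicity for fit; a larger ε widens the tube and makes the model more tolerant of small errors.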

The original problem can be transformed into a dual problem:

max −(1/2) Σ_i Σ_j (α_i − α_i*)(α_j − α_j*) K(x_i, x_j) − ε Σ_i (α_i + α_i*) + Σ_i y_i (α_i − α_i*)
s.t. Σ_i (α_i − α_i*) = 0,  0 ≤ α_i, α_i* ≤ C

Among them, α_i and α_i* are the Lagrangian multipliers, and K(x_i, x_j) is the kernel function of the inner product φ(x_i)^T φ(x_j). This paper uses the quadratic programming method and the KKT conditions to solve the above dual optimization problem for the optimum α, α*, and b. The regression function is obtained as follows:

f(x) = Σ_i (α_i − α_i*) K(x_i, x) + b

Among them, n is the dimension of the feature space into which φ maps.

The kernel function of SVR is a symmetric function satisfying Mercer’s condition; it maps a low-dimensional space into a high-dimensional space. Table 1 shows five different kernel functions.

Different kernel functions perform differently on different types of problems. This paper applies five alternative kernel functions to the demand forecasting of space science payload components and chooses the one with the best prediction result as the optimum kernel function for this problem.
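For illustration, the common global kernels (linear, polynomial) and local kernels (RBF, sigmoid) discussed above can be sketched in plain Python. Parameter names such as `gamma`, `coef0`, and `degree` follow widespread convention and are assumptions, not the paper's notation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    # Global kernel: every data point influences the kernel value.
    return dot(u, v)

def poly_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    # Global kernel with stronger extrapolation ability.
    return (gamma * dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=1.0):
    # Local (Gaussian RBF) kernel: only nearby points matter.
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq)

def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)
```

Note how `rbf_kernel(u, u)` is always 1 and decays with distance, which is what gives the RBF kernel its local, interpolation-oriented character.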

See the Appendix A.2 for the description of variables involved in SVR.

3.3. Model Parameter Optimization

The optimum kernel function model established in this paper has multiple parameters that must be optimized jointly to improve the accuracy and stability of demand forecasting for space science payload components: the penalty factor C, the insensitive parameter ε, and the kernel function parameters. Depending on the kernel function, the kernel function parameters may contain 0 to 3 hyperparameters. In this section, aiming at the shortcomings of the traditional particle swarm optimization (PSO) algorithm, a modified particle swarm optimization (MPSO) algorithm is proposed to search the parameter set and realize the joint optimization of multiple parameters.

3.3.1. Particle Swarm Optimization Algorithm

The particle swarm optimization (PSO) algorithm is a swarm intelligence algorithm that originated from the foraging behavior of birds. It regards a group of particles as candidate solutions to the problem and finds the best solution by flying through a D-dimensional search space [36]. Each particle i has two attributes: position x_i and velocity v_i. The optimum position in the individual history is pbest_i, and the optimum position in the global history is gbest. The update formulas for the velocity and position of each particle are as follows:

v_i(t+1) = w·v_i(t) + c1·r1·(pbest_i − x_i(t)) + c2·r2·(gbest − x_i(t))
x_i(t+1) = x_i(t) + v_i(t+1)

Among them, i = 1, 2, …, N and d = 1, 2, …, D, where N indicates the number of particles, t indicates the iteration number, and D indicates the number of parameters. w represents the inertia factor, which controls the moving speed of particles and balances global and local search: a larger value helps global search but slows convergence, while a smaller value helps particles converge quickly but may lead to a local optimum. c1 and c2 are the individual and social learning factors, generally taken as 2. r1 and r2 are random numbers uniformly distributed in the interval [0, 1].
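The velocity and position update described above can be sketched as one standard PSO iteration (plain Python; variable names are illustrative, not the paper's notation):

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=0.6, c1=2.0, c2=2.0):
    """One velocity/position update for every particle (standard PSO)."""
    for i in range(len(positions)):
        for d in range(len(positions[i])):
            r1, r2 = random.random(), random.random()
            velocities[i][d] = (w * velocities[i][d]                     # inertia
                                + c1 * r1 * (pbest[i][d] - positions[i][d])  # cognitive pull
                                + c2 * r2 * (gbest[d] - positions[i][d]))    # social pull
            positions[i][d] += velocities[i][d]
    return positions, velocities
```

A particle already sitting at both its personal best and the global best receives zero pull and stays put, which is the fixed point of the update.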

The fitness function of a particle is the mean square error (MSE):

MSE = (1/n) Σ_t (ŷ_t − y_t)²

Among them, ŷ_t represents the predicted value, y_t represents the true value, and n represents the number of samples.

3.3.2. Modified Particle Swarm Optimization Algorithm

In this paper, three methods are used to modify the particle swarm optimization algorithm.

Firstly, in view of PSO’s dependence on its initial parameters, this paper adopts the Logistic chaotic mapping method [37] to initialize the particle swarm so that the initial population is distributed evenly over the predetermined search space, thereby improving population diversity:

z_{k+1} = μ·z_k·(1 − z_k),  x_{i,d} = a_d + z·(b_d − a_d)

Among them, i = 1, …, N and d = 1, …, D, where N represents the number of particles and D represents the number of parameters. The chaotic variable z lies in (0, 1), and the control parameter is μ = 4. a_d and b_d are the lower and upper bounds of the dth optimization variable.
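A minimal sketch of the Logistic chaotic initialization, assuming the standard map z ← μz(1 − z) with μ = 4 (the fully chaotic regime) and an affine rescaling of each chaotic value into the per-parameter search range:

```python
def logistic_init(n_particles, bounds, z0=0.37, mu=4.0):
    """Chaotic (logistic-map) initial positions spread over the search box.
    bounds: list of (low, high) per parameter; mu=4 gives chaotic orbits."""
    z = z0
    swarm = []
    for _ in range(n_particles):
        particle = []
        for low, high in bounds:
            z = mu * z * (1.0 - z)              # logistic map iteration
            particle.append(low + z * (high - low))  # rescale into [low, high]
        swarm.append(particle)
    return swarm
```

Because the logistic orbit is ergodic over (0, 1) at μ = 4, the rescaled positions cover the whole search box more evenly than a poor pseudorandom seed might.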

Secondly, considering that the PSO algorithm tends to converge prematurely, this paper uses crossover and mutation operations [19] to enhance population diversity during the iterations. In each iteration, after the position of each particle is updated, three other particles are randomly selected for each particle, ensuring that all four are distinct. The best solution and the worst solution of the population in the current iteration are also identified. The solutions produced by the crossover and mutation operations are as follows.

Among them, the coefficients involved are random numbers in the range [0, 1]. The particle with the best fitness function value among the six candidate solutions produced is chosen to replace the current particle.

Thirdly, in order to balance the global and local search capabilities of the PSO algorithm and prevent it from falling into a local optimum, this paper introduces a nonlinear inertia factor [38] and dynamically adjusts it through an adaptive control process. As the number of iterations increases, the modified inertia factor decreases nonlinearly from w_max to w_min: a slow drop in the initial stage favors global search, and a fast drop in the later stage enhances local optimization.
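The exact decay schedule is not specified here; a quadratic schedule is one common choice that drops slowly at first and quickly later, matching the description above (w_max, w_min, and the quadratic form are assumptions):

```python
def inertia(t, t_max, w_max=0.9, w_min=0.4):
    """Nonlinear inertia weight: slow decay early (global search),
    fast decay late (local refinement). Quadratic schedule assumed."""
    return w_max - (w_max - w_min) * (t / t_max) ** 2
```

With this schedule the drop over the first half of the run is smaller than the drop over the second half, which is exactly the slow-then-fast behavior the adaptive control process aims for.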

Therefore, this paper proposes a modified particle swarm optimization (MPSO) algorithm to search for the optimal parameter set of the optimum kernel function model. The algorithm flow is shown in Algorithm 1.

  Input: time-series data, number of particles N, number of parameters D, number of iterations T, inertia factor w, individual
  learning factor c1 and social learning factor c2
  Output: the position gbest of the optimal parameter set
1 Randomly generate the chaotic variable z according to formula (12)
2 According to formula (13), the initial positions are obtained, and the initial velocities are randomly generated
3 Calculate the fitness function according to formula (11)
4 Assign the individual historical optimum positions pbest directly, calculate the global historical optimum position gbest, and make t = 1
5 While (t ≤ T)
6   Generate r1 and r2 randomly, adjust the inertia factor dynamically according to formula (26), and update the position and velocity of each particle according to formula (9) and formula (10).
7   Calculate the best value and the worst value of the population in this iteration, perform population crossover and mutation according to formula (14)–formula (25), and update the position of particles.
8   Calculate the fitness function according to formula (11) and update the global historical optimum position gbest.
9   If (the fitness of a particle is better than that of its pbest)
10    Update the best position pbest in individual history
11   End If
12   t = t + 1
13 End While

See the Appendix A.3 for the description of variables involved in MPSO.

3.4. Criteria for Prediction Performance Evaluation

In order to evaluate the pros and cons of prediction methods, this paper constructs reasonable evaluation criteria to examine the prediction performance of the model from multiple angles. Specifically, the mean absolute percentage error (MAPE) and the root mean square error (RMSE) evaluate the accuracy of the prediction, the standard deviation of the absolute percentage error (SDAPE) evaluates the stability of the prediction, and the directional accuracy of the prediction is evaluated by Dstat:

MAPE = (1/n) Σ_t |(y_t − ŷ_t)/y_t|
RMSE = √[(1/n) Σ_t (y_t − ŷ_t)²]
SDAPE = √[(1/n) Σ_t (|y_t − ŷ_t|/y_t − MAPE)²]
Dstat = (1/n) Σ_t a_t × 100%

Among them, n refers to the number of samples, y_t refers to the true value, and ŷ_t refers to the predicted value. a_t is a directional discriminant index: if (y_{t+1} − y_t)(ŷ_{t+1} − y_t) ≥ 0, then a_t = 1; that is, the direction predicted by the model for period t + 1 is correct. Otherwise, a_t = 0, and the model predicts the wrong direction.
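The four criteria named above can be sketched directly from their definitions (plain Python; `dstat` here returns a fraction rather than a percentage, and all names are illustrative):

```python
import math

def mape(y, yhat):
    # Mean absolute percentage error (accuracy).
    return sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    # Root mean square error (accuracy).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def sdape(y, yhat):
    # Standard deviation of the absolute percentage error (stability).
    ape = [abs((a - b) / a) for a, b in zip(y, yhat)]
    m = sum(ape) / len(ape)
    return math.sqrt(sum((e - m) ** 2 for e in ape) / len(ape))

def dstat(y, yhat):
    """Share of periods whose predicted direction of change is correct."""
    hits = sum(1 for t in range(len(y) - 1)
               if (y[t + 1] - y[t]) * (yhat[t + 1] - y[t]) >= 0)
    return hits / (len(y) - 1)
```

A perfect forecast gives MAPE, RMSE, and SDAPE of 0 and a directional accuracy of 1.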

In addition, in order to test the difference in prediction performance between models from the perspective of statistical significance, this paper also introduces the classic DM statistic [39] to determine whether model A’s prediction accuracy is significantly better than model B’s. The null hypothesis of the DM test is that there is no significant difference between the prediction performance of the two models. If the test statistic rejects the null hypothesis at a given significance level, there is a significant difference between the two models’ prediction performance; in that case, if DM < 0, model A’s prediction performance is significantly better than model B’s.

Among them, the average loss differential is d̄ = (1/n) Σ_t d_t with d_t = L(ŷ_{A,t} − y_t) − L(ŷ_{B,t} − y_t), where L(·) is the loss function, and ŷ_{A,t} and ŷ_{B,t} are the predicted values of model A and model B in period t, respectively.
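A simplified DM statistic can be sketched as follows. This variant uses the plain sample variance of the loss differential and omits the autocovariance correction of the full Diebold-Mariano test, so it is an illustration rather than the exact statistic.

```python
import math

def dm_statistic(y, pred_a, pred_b, loss=lambda e: e * e):
    """Simplified Diebold-Mariano statistic on d_t = L(e_A) - L(e_B).
    Negative values favour model A (no autocovariance terms here)."""
    d = [loss(a - t) - loss(b - t) for t, a, b in zip(y, pred_a, pred_b)]
    n = len(d)
    dbar = sum(d) / n
    var = sum((x - dbar) ** 2 for x in d) / (n - 1)   # sample variance of d_t
    return dbar / math.sqrt(var / n)
```

When model A's errors are systematically smaller than model B's, every d_t is negative and the statistic lies far below zero, rejecting the null hypothesis of equal performance.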

See the Appendix A.4 for the description of variables involved in criteria for prediction performance evaluation.

3.5. Demand Forecasting Model Design

Taking into account the nonlinear, unsteady, and small-sample demand characteristics of space science payload components, this paper proposes an EEMD-CC&CV-MPSO-SVR model to predict the demand for space science payload components. The steps of the model are as follows.

Step 1. Data collection and preprocessing: this paper collects historical data of space science payload component demand, divides the training set and testing set, and adopts the logarithmic function conversion method to preprocess the data.

Step 2. Sequence decomposition and verification: EEMD is used to decompose the demand sequence of space science payload components into several IMFs and a residual term. The ADF test is used to check the stationarity of each subsequence.

Step 3. Model training: the sample entropy [30] is introduced to measure the complexity of each subsequence. The CC-MPSO-SVR model is used to train high-complexity sequences with sample entropy greater than 0.8, while the CV-MPSO-SVR model is used to train low-complexity sequences with sample entropy less than 0.8. (1) CC-MPSO-SVR model: firstly, the C-C method [33] is used to determine the embedding dimension and delay time of the subsequence, and the phase space of the subsequence is reconstructed. Secondly, each of the five candidate kernel functions is used as the model kernel function in turn. The training samples are used to train the model for each subsequence, and the modified PSO algorithm is used to optimize the model’s parameter set. (2) CV-MPSO-SVR model: each of the five candidate kernel functions is used as the model kernel function in turn, and the training samples are used to train the model for each subsequence. The modified PSO algorithm and the k-fold cross-validation method are combined to optimize the model parameter set, enhancing the applicability of the model and preventing overfitting.
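The sample entropy threshold in Step 3 can be illustrated with a simplified SampEn implementation. The tolerance r is taken as 0.2 times the series' standard deviation (a common convention, assumed here), and the template counting is slightly simplified relative to the canonical definition, so this is a sketch rather than the paper's exact procedure.

```python
import math

def sample_entropy(series, m=2, r_factor=0.2):
    """Simplified sample entropy SampEn(m, r): higher means more complex."""
    n = len(series)
    mean = sum(series) / n
    std = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    r = r_factor * std

    def count_matches(dim):
        # Count template pairs whose Chebyshev distance is within r.
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        hits = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    hits += 1
        return hits

    b, a = count_matches(m), count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")   # undefined for too-short or too-regular data
    return -math.log(a / b)
```

A strictly periodic series keeps almost all of its length-m matches when extended to length m + 1, so its SampEn is near 0 and it would be routed to the low-complexity (CV-MPSO-SVR) branch.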

Step 4. Model prediction: the trained models are used to predict the testing samples, and the prediction results of the subsequences are summed to form the total prediction. The kernel function with the best prediction result is taken as the optimum kernel function.

Step 5. Model evaluation: the established prediction performance evaluation criteria are used to evaluate the accuracy, stability, and directional accuracy of the model. Finally, a statistical test and robustness analysis are performed.

In summary, the demand forecasting model framework for space science payload components designed in this paper is shown in Figure 1.

4. Experiments

4.1. Data Source and Preprocessing
4.1.1. Data Sources

This paper selects the component procurement data of a research and development institution undertaking Chinese space science payloads from September 2017 to April 2022 as an example for empirical research and analyzes the actual effect of applying the EEMD-CC&CV-MPSO-SVR model to the demand forecasting of space science payload components. The research object is the total demand for components, sampled monthly; the number of sample points is 56.

4.1.2. Data Preprocessing

Due to the violent fluctuations in the total component demand data, some data points exhibit large mutations. In order to eliminate the influence of collinearity and heteroscedasticity, the logarithmic function conversion method is used to normalize the total component demand sequence. The converted sequence is shown in Figure 2. It can be seen that the total component demand data after logarithmic function conversion is unsteady, nonlinear, trending, and random.

The statistical descriptions are shown in Table 2. The minimum and maximum of the demand data differ greatly. The ratio of the standard deviation to the mean gives the coefficient of variation, which is 0.1952, indicating high fluctuation. In addition, the skewness and kurtosis differ greatly from those of a normal distribution.

4.2. Sequence Decomposition and Verification
4.2.1. Sequence Decomposition

EEMD is used to decompose the log-converted total component demand sequence into IMFs and a residual term (RES). According to the formula log₂(N) − 1, where N = 56 is the number of samples, the number of IMFs is 4. The number of iterations is 100, and the white noise standard deviation is 0.2. Each subsequence is drawn as shown in Figure 3.

The high frequency part reflects that the demand for space science payload components is affected by short-term market demand imbalances, embargo policies, and other irregular events, showing transient and frequent fluctuations. The low frequency part reveals that the demand for space science payload components is affected by the country’s long-term macropolicies and other factors, showing a permanent and stable trend of change.

4.2.2. ADF Test

To measure the effect of the time series decomposition, a stationarity test is introduced to compare the sequences before and after decomposition. This paper chooses the most commonly used ADF method to test the stationarity of the log-converted total component demand series and of each subsequence. The null hypothesis is that a unit root exists, that is, that the sequence is nonstationary. If the test statistic is smaller than the critical value at the 1% significance level, the null hypothesis can be rejected at the 99% probability level. The ADF test results obtained with the econometric software Eviews are shown in Table 3.

The ADF test results show that the log-converted total demand sequence is a unit root process with intercept and trend terms, and the sequence is nonstationary. After EEMD, IMF1, IMF2, IMF3, IMF4, and R are stationary without intercept or time trend terms and can be used to predict future demand.

4.3. Model Training and Prediction

This paper calculates the sample entropy of each subsequence, and the results are shown in Table 4. The CC-MPSO-SVR model is used to train and predict IMF1, whose sample entropy is greater than 0.8, and the CV-MPSO-SVR model is used to train and predict IMF2, IMF3, IMF4, and the residual, whose sample entropies are less than 0.8. According to the C-C method, the embedding dimension is 4 and the delay time is 1. After multiple rounds of experiments, the cross-validation fold number k is taken as 3. The parameters of the modified particle swarm optimization algorithm are set as follows: the maximum number of iterations is 100, the number of particles is 20, and the inertia factor is 0.6. The ranges of the penalty factor, insensitive parameter, and kernel function parameter are [0, 100], [0.01, 100], and [0.01, 1], respectively.

4.3.1. Model Training

The EEMD-CC&CV-MPSO-SVR modeling process is shown in Figure 4. The model is implemented in Windows 10, Matlab R2015a environment.

The EEMD-CC-MPSO-SVR model reconstructs each subsequence into a phase space in which the demand of the first four months is used to roll-forecast the demand of the next month, generating 52 sets of data. The first 47 sets are used to train the SVR model, with the MPSO algorithm searching for the parameter set that minimizes the MSE, and the last 5 sets are used for model prediction.
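The rolling-window construction (embedding dimension 4, delay 1, so 56 monthly points yield 52 input/target pairs) can be sketched as follows; the `demand` series here is a stand-in, not the paper's data.

```python
def phase_space(series, m=4, tau=1):
    """Reconstruct (input, target) pairs: m lagged values predict the next one."""
    inputs, targets = [], []
    span = (m - 1) * tau
    for t in range(span, len(series) - 1):
        # Window of m values ending at time t, ordered oldest to newest.
        inputs.append([series[t - k * tau] for k in range(m - 1, -1, -1)])
        targets.append(series[t + 1])        # one-step-ahead target
    return inputs, targets

demand = list(range(56))                 # stand-in for the 56 monthly points
X, y = phase_space(demand, m=4, tau=1)   # 52 pairs, as in the paper
```

Of the 52 pairs, the first 47 would feed training and the last 5 would be held out for prediction, mirroring the split described above.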

After the sequence is decomposed, the EEMD-CV-MPSO-SVR model divides each subsequence, which contains 56 data points, into a training set (the first 51 points) and a testing set (the last 5 points). The training set is split into three equal parts; each part in turn serves as the validation subset, with the remaining two as the training subset. For each of the three splits, the MPSO algorithm iteratively generates candidate parameter sets, and the SVR model is trained on each training subset to obtain the predicted values and MSE of the corresponding validation subset under each parameter set. Averaging the MSE values over the three splits, the parameter set that minimizes the average MSE is chosen as the optimal parameter set of the SVR model for that subsequence.
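The fold-averaged selection step can be illustrated with scikit-learn's `SVR` and `KFold`. In this sketch a small grid search stands in for the MPSO search over (C, epsilon, gamma); the function name and grid values are illustrative assumptions, not the paper's.

```python
import numpy as np
from itertools import product
from sklearn.model_selection import KFold
from sklearn.svm import SVR

def select_svr_params(X, y, grid, n_splits=3):
    """Pick the (C, epsilon, gamma) triple whose average validation MSE over
    n_splits folds is smallest (a grid search standing in for MPSO)."""
    kf = KFold(n_splits=n_splits, shuffle=False)
    best_params, best_mse = None, np.inf
    for C, eps, gamma in product(*grid):
        fold_mse = []
        for train_idx, val_idx in kf.split(X):
            model = SVR(kernel="rbf", C=C, epsilon=eps, gamma=gamma)
            model.fit(X[train_idx], y[train_idx])
            pred = model.predict(X[val_idx])
            fold_mse.append(np.mean((pred - y[val_idx]) ** 2))
        avg = float(np.mean(fold_mse))
        if avg < best_mse:
            best_mse, best_params = avg, (C, eps, gamma)
    return best_params, best_mse
```

The MPSO version differs only in how candidate triples are generated: particles propose parameter sets, and the same fold-averaged MSE serves as the fitness value.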

4.3.2. Model Prediction

For each subsequence, the trained model is used to predict the training set and the testing set. Adding the prediction results of all subsequences gives the log-transformed total demand prediction for space science payload components. Through multiple rounds of comparison, the ensemble prediction result is found to be best when RBF is used as the kernel function (see Section 4.4.1 for the evidence). Therefore, RBF is selected as the optimum kernel function of the demand forecasting model for space science payload components.

The prediction results of each subsequence are shown in Figures 5(a)–5(e). The MSE of each subsequence is calculated, as shown in Table 5.

Figure 5 compares the original values and predicted values of each subsequence obtained after EEMD. Table 5 quantitatively measures the mean square error of the original values and predicted values of each subsequence. The results in Figure 5 and Table 5 show that the predicted values of each subsequence are basically consistent with the original values.

It can be seen from Figure 5 that in recent years, the fluctuation of the demand for space science payload components in China has gradually decreased and stabilized, while the demand quantity has gradually increased, showing a rapid first and then a slow upward trend. The reason for the change in demand is that with the formulation and implementation of national policies, the development of space science payload industry has become increasingly mature, the demand for components has gradually stabilized, and its scale has continued to expand. According to analysis and speculation, the demand for China’s space science payload components will gradually stabilize and slowly rise in the future.

Analyzing the results in Tables 4 and 5, we find that the greater the sample entropy of a subsequence, the greater its corresponding MSE. This provides evidence that higher subsequence complexity makes prediction more difficult and leads to larger prediction errors.

The ensemble prediction result is shown in Figure 6. The forecast result is denormalized to obtain the actual forecast result of the total demand for space science payload components.
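The paper does not state the base of the logarithmic conversion; the sketch below assumes the natural logarithm, in which case denormalization is simply exponentiation of the ensembled forecast.

```python
import numpy as np

def log_normalize(series):
    """Log-transform a strictly positive demand series (the paper's conversion step,
    assuming a natural logarithm)."""
    return np.log(np.asarray(series, dtype=float))

def denormalize(log_series):
    """Invert the logarithmic conversion to recover actual demand."""
    return np.exp(np.asarray(log_series, dtype=float))
```

Because the transform is monotonic, denormalizing the summed subsequence forecasts preserves the predicted trend while restoring the original demand scale.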

It can be seen from Figure 6 that the EEMD-CC&CV-MPSO-SVR model adopting the decomposition-ensemble concept can well predict the total demand for space science payload components.

4.4. Model Performance Comparison

To demonstrate the superiority of the proposed method, this paper introduces several performance evaluation criteria, covering both the model's fitting effect on the training set and its prediction effect on the testing set.

4.4.1. Selection of Kernel Function

The fitting and prediction effects of the model are compared when the kernel function is linear, polynomial, sigmoid, Laplace, or the Gaussian radial basis function (RBF). The experimental results are shown in Table 6.

The experimental results in Table 6 show that, taking the fitness value MSE as the criterion, the priority order of kernel functions is RBF > Laplace > polynomial > linear > sigmoid. RBF is superior to the other kernel functions in both fitting and prediction and is better suited to the nonlinear changes in the demand sequence of space science payload components.

4.4.2. Proof of Hybrid Method

This paper compares the prediction results of the EEMD-CC&CV-MPSO-SVR model with those of 13 other models. The experimental results are shown in Table 7.

To observe the fitting and prediction effects of the 14 models more intuitively, the experimental results in the above table are normalized; the closer an evaluation value is to 1, the better the effect. The reciprocals of the MSE values of the MPSO-SVR model and the EEMD-MPSO-SVR model are too large, so for ease of comparison their normalized MSE values are set directly to 1, while those of the other models are obtained by taking the reciprocal and dividing by 18. A radar chart of the prediction performance of the 14 models is drawn, as shown in Figure 7.

The experimental results in Table 7 and Figure 7 demonstrate the superiority of the EEMD-CC&CV-MPSO-SVR model over the other models on unsteady, nonlinear, and small-sample time series forecasting problems.

Three classical single methods are selected as comparison models: the ARIMA model, the GM(1,1) model, and the ANN model. In terms of fitting and prediction effect, the experimental results show that (1) the method proposed in this paper is superior to the traditional time series method, the gray forecasting method, and the neural network method and (2) the ARIMA model performs better than the GM(1,1) and ANN models.

All the decomposition-ensemble models predict better than the MPSO-SVR model, which proves the necessity of time series decomposition for unsteady, nonlinear forecasting problems. The fitting and prediction effects of the proposed model are also better than those of EMD-CC&CV-MPSO-SVR, which shows that the EEMD method is superior to the EMD method.

Compared with the method proposed in this paper, the EEMD-MPSO-SVR model lacks the phase space reconstruction and cross-validation steps. Experimental results show that training and predicting each subsequence directly gives the model a better fitting effect, but its prediction effect is significantly worse.

The EEMD-CC-MPSO-SVR model is slightly better than the EEMD-CC&CV-MPSO-SVR model in prediction effect, but its fitting effect is very poor. Analysis shows that its good prediction effect is accidental and unstable: the prediction of each individual subsequence is poor, but better results happen to be obtained through ensemble. The fitting and prediction effects of the EEMD-CV-MPSO-SVR model are worse than those of the EEMD-CC&CV-MPSO-SVR model. The EEMD-CC&CV-MPSO-SVR model combines the sample feature learning ability of the EEMD-CC-MPSO-SVR model on high-complexity series with the overfitting mitigation ability of the EEMD-CV-MPSO-SVR model on low-complexity series. Without sacrificing much prediction effect, it obtains more stable fitting and prediction results.

Compared with the EEMD-GS-SVR, EEMD-CC&CV-GA-SVR, and EEMD-CC&CV-PSO-SVR models, the model proposed in this paper has better fitting and prediction effects. This demonstrates the advantage of the modified particle swarm optimization algorithm in the multiparameter combination optimization problem of the SVR model.

Artificial intelligence models are often considered to have better prediction performance and generalization ability than SVR, so we replaced SVR with ANN and LSTM in the EEMD-CC&CV-MPSO-SVR model. The experimental results show, however, that the ANN and LSTM models cannot learn the characteristics of small-sample time series well.

4.4.3. DM Statistics

To further evaluate whether the differences between the proposed model and the other models are statistically significant, this paper uses the Diebold-Mariano (DM) statistic. The results are shown in Table 8.
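A simplified DM statistic can be sketched as below. This is an illustrative one-step-ahead version under squared-error loss, without the long-run autocovariance correction used in the full test; the function name is our own.

```python
import math
import numpy as np

def diebold_mariano(e1, e2, power=2):
    """Simplified Diebold-Mariano statistic comparing two forecast error series
    under an absolute-error-to-the-power loss (h = 1, no autocovariance correction)."""
    e1, e2 = np.asarray(e1, float), np.asarray(e2, float)
    d = np.abs(e1) ** power - np.abs(e2) ** power      # loss differential
    n = len(d)
    dm = np.mean(d) / np.sqrt(np.var(d, ddof=1) / n)   # standardized mean differential
    pvalue = math.erfc(abs(dm) / math.sqrt(2))         # two-sided normal p-value
    return float(dm), float(pvalue)
```

A large negative DM value with a near-zero p value indicates that the first model's errors are significantly smaller, which is the pattern reported in Table 8.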

The DM test results show that the proposed model differs from the other comparison models in most cases. Compared with the ARIMA, GM(1,1), ANN, EEMD-CC-MPSO-SVR, and EEMD-CV-MPSO-SVR models, the DM statistic and p value of the EEMD-CC&CV-MPSO-SVR model are less than -4.2111 and almost zero, respectively. This shows that the EEMD-CC&CV-MPSO-SVR model is superior to the above models at almost the 100% confidence level. In addition, its prediction accuracy is clearly better than that of the EMD-CC&CV-MPSO-SVR, EEMD-CC&CV-PSO-SVR, EEMD-CC&CV-MPSO-ANN, and EEMD-CC&CV-MPSO-LSTM models.

4.4.4. Robustness Analysis

To evaluate the robustness of the 14 prediction models, we calculate the standard deviation of the prediction performance evaluation results of each model over 20 runs. The results are shown in Table 9.

The experimental results in Table 9 show that the EEMD-CC&CV-MPSO-SVR model is the most robust of all prediction models, because the standard deviations of MSE on the training set and of MAPE, RMSE, and SDAPE on the testing set are much smaller than those of the other models. The ARIMA model is the least robust.

4.4.5. Proof of Logarithmic Function Conversion

In the above comparison experiments, the comparisons are made on the log-transformed data. To demonstrate the credibility of the comparison results and the effectiveness of the model, we compare the real values with the denormalized predicted values of all models, taking the EEMD-CC&CV-MPSO-LSTM and EEMD-CC&CV-MPSO-SVR models as examples. The results are shown in Table 10.

The results in Table 10 show that the EEMD-CC&CV-MPSO-SVR model has a smaller error between the denormalized predicted values and the true values and is significantly better than the EEMD-CC&CV-MPSO-LSTM model. Compared with Table 7, although the logarithmic function conversion reduces the error to a certain extent, the model can still predict the future development trend and reflect the relative errors between models. The other models lead to the same conclusion. This proves that the logarithmic function conversion is a credible data engineering step.

4.4.6. Proof of Modified Particle Swarm Optimization Algorithm

To evaluate the performance of the different algorithms, MAPE and calculation time are selected as evaluation indexes for the prediction accuracy and convergence speed of the MPSO algorithm. The test results of the different algorithms are shown in Table 11.

Experimental results show that (1) the MPSO algorithm obtains better prediction accuracy than the GA and PSO algorithms; its convergence speed is slightly worse, but the difference is small and within the acceptable range, and (2) GA is better than PSO in both prediction accuracy and convergence speed.

To further verify the above conclusions, a set of real equipment spare parts demand data is used for forecasting. The raw data are shown in Table 12, and the test results of the different algorithms are shown in Table 13.

The experimental results in Table 13 are consistent with the above conclusions: the MPSO algorithm significantly improves the prediction accuracy with little loss of calculation time.

5. Discussion

Experimental results show that, compared with other models, the EEMD-CC&CV-MPSO-SVR model proposed in this paper has good prediction ability and robustness for time series with unsteady, nonlinear, and small sample characteristics. The model is characterized by high prediction accuracy, good stability, a strong ability to relieve overfitting, and skill at capturing complex time series features. It takes "decomposition first and then ensemble" as the guiding ideology, following the modeling idea of "time series decomposition - decomposed sequence analysis - decomposed sequence prediction - prediction result ensemble - prediction effect evaluation." This idea grasps the intertwined operating rules of complex systems at multiple scales, which reduces the difficulty of modeling complex systems and effectively improves the analysis and prediction performance of the model.

Through a large number of experiments, we found that time series with violent fluctuations and mutations are usually hard to predict. Our research shows that normalizing the data with the logarithmic function conversion method is an effective data engineering step for time series that are always positive, fluctuate violently, and have abrupt changes.

In the process of model training, we found that the phase space reconstruction method can mine more complex time series information and effectively predict high-complexity time series data, but it is prone to overfitting and predicts poorly on low-complexity data. The cross-validation process effectively alleviates the overfitting problem and predicts well on low-complexity data with obvious regularity; however, on high-complexity data its prediction effect is poor and its ability to learn sample features is insufficient. The reason is that phase space reconstruction is based on rolling-window prediction, which can capture the characteristics of high-complexity sequences, while cross-validation splits the training set into blocks that validate each other, so the model parameters are determined by a comprehensive trade-off that avoids overfitting on low-complexity sequences.

The modified particle swarm optimization algorithm proposed in this paper improves the search quality of model parameters. The main reasons are reflected in three aspects. First, the use of chaotic strategies helps to improve the initial population diversity of the PSO algorithm and optimize the quality of the initial parameters. Second, the crossover and mutation operations are used to enhance the population diversity of the PSO algorithm, which is beneficial to alleviate premature convergence. Third, the adaptive control process is introduced to dynamically adjust the inertia factor, so that the PSO algorithm achieves a balance between the global and local search capabilities and avoids falling into the local optimum.
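The three modifications above can be illustrated with a compact sketch (crossover is omitted for brevity, and the logistic-map parameters, mutation rate, and inertia schedule are illustrative assumptions rather than the paper's exact settings).

```python
import numpy as np

def mpso_minimize(f, bounds, n_particles=20, iters=100, seed=0):
    """Sketch of a modified PSO: chaotic (logistic-map) initialization,
    linearly decreasing inertia, and a simple mutation step for diversity."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    dim = len(bounds)

    # Chaotic initialization: iterate the logistic map z <- 4 z (1 - z)
    z = rng.uniform(0.1, 0.9, size=(n_particles, dim))
    for _ in range(50):
        z = 4.0 * z * (1.0 - z)
    x = lo + z * (hi - lo)
    v = np.zeros_like(x)

    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    g = pbest[np.argmin(pbest_val)].copy()

    for t in range(iters):
        w = 0.9 - 0.5 * t / iters           # linearly decreasing inertia factor
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + 2.0 * r1 * (pbest - x) + 2.0 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)

        # Mutation: re-randomize a few coordinates to keep population diversity
        mask = rng.random(x.shape) < 0.05
        x = np.where(mask, rng.uniform(lo, hi, size=x.shape), x)

        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, float(np.min(pbest_val))
```

In the paper's setting, each particle encodes an SVR parameter triple (penalty factor, insensitive parameter, kernel parameter) and the fitness `f` is the training or cross-validated MSE.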

6. Conclusion

The unsteady, nonlinear, and small sample characteristics of China's space science payload component demand data make it difficult to apply existing prediction methods directly. This paper proposes the EEMD-CC&CV-MPSO-SVR model, which can accurately predict the future demand of space science payload components. The main conclusions of this paper are summarized as follows: (1) the EEMD-CC&CV-MPSO-SVR model has unique advantages in solving time series prediction problems with strong volatility and mutation, collinearity and heteroscedasticity, multimodal and multiscale structure, high complexity and dimension, and multipeak parameter optimization; (2) based on the comparison of prediction performance evaluation criteria, the EEMD-CC&CV-MPSO-SVR model has better fitting and prediction effects than single benchmark models and other hybrid models, and the robustness analysis shows that the EEMD-CC&CV-MPSO-SVR model is the most robust while the ARIMA model is the least.

Thus, the method proposed in this paper can be extended to other time series forecasting problems in national major special task supply chains that also have unsteady, nonlinear, and small sample characteristics. However, this article considers only single-point forecasts of the demand for space science payload components. In the future, a variety of influencing factors can be incorporated to predict the demand range and probability of space science payload components from an interval perspective, yielding more reliable prediction results.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Key Technical Talents Program of Chinese Academy of Sciences.

Supplementary Materials

Abbreviations: summarizes and describes the abbreviations used in the paper. Appendix A.1-A.4: summarize and describe the variables involved in EEMD, SVR, MPSO, and the criteria for prediction performance evaluation, respectively. (Supplementary Materials)