Abstract

The competitive market's high demand for innovation has driven an increase in research and development (R&D) investment. By accurately predicting R&D investment, high-tech enterprises can reasonably control R&D costs and effectively manage R&D activities. Because R&D investment in high-tech enterprises is highly uncertain and its information changes frequently, this paper uses the grey metabolic GM (1, 1) model and the exponential smoothing method from time series analysis to establish single prediction models of R&D investment in high-tech enterprises. After analyzing the advantages and disadvantages of each single model, a combined forecast model of R&D investment in high-tech enterprises is established. The model was applied to forecast the R&D investment of a high-tech enterprise in China from 2019 to 2023, and the results verified its higher accuracy and practicability. The model can provide effective support for high-tech enterprises in R&D cost management.

1. Introduction

In the era of the knowledge economy, market competition for innovative products is becoming increasingly fierce. To meet the increasingly personalized needs of customers, product upgrading continues to accelerate [1]. With the constant emergence of new technologies and new products, the R&D expenditure required for enterprise innovation is increasing. Despite the rising intensity of R&D investment, however, most companies have not achieved a proportional increase in product output and revenue, and improper control of R&D costs is still common [2]. In addition, the statistics on science and technology expenditures released by the Ministry of Science and Technology of China show that the intensity of corporate R&D expenditure has been increasing in recent years, as shown in Table 1. This shows that, in the construction of an innovative country, the role of enterprises as the main body of technological innovation has become more prominent. Enterprises are the mainstay of technological innovation, and their R&D activities have an important impact on the promotion of national innovation-driven development [3]. This is especially true of high-tech enterprises, which are the main body of independent innovation, and corporate R&D investment is more effective in the high-tech field [4]. To improve their core competitiveness, high-tech enterprises must pay attention to R&D investment, manage it effectively, and achieve sustainable development based on a reasonable increase in input and output [3]. For this reason, effective management of R&D investment in high-tech enterprises has become an urgent problem.

Cost forecasting is the prerequisite for all R&D activities, and the goal of forecasting is to provide important information for subsequent stages [5], so as to effectively control enterprise costs. At present, there have been many studies on the forecast of R&D investment in China, mainly focusing on overall scientific research investment at the national and regional levels [6, 7]. The R&D activities of enterprises are uncertain, and this uncertainty is more prominent in the R&D activities of high-tech enterprises. It increases the random fluctuations in R&D investment, making accurate forecasting very difficult. As a result, there are only a few relevant studies on forecasting the R&D investment of high-tech enterprises. Li and Hu (2017) used a radial basis function (RBF) neural network and a backpropagation (BP) neural network to construct an investment simulation system of R&D activity, based on the nonlinear relationship between GEM companies’ R&D investment and its influencing factors. The study found that the RBF neural network model has better fitting and forecasting effects than the BP neural network [8]. Chen (2013) studied the forecasting of R&D costs of high-tech enterprises by applying the system dynamics model, effectively improving the accuracy of forecasting enterprise R&D cost [9]. Some scholars recognize the importance of forecasting the R&D costs of enterprises, and they believe that, while disclosing historical data on R&D expenses, attention should also be paid to the provision of future forecast information [10].

Macroeconomic fluctuations and sudden uncertain factors often have a major impact on R&D investment in a given field. Most R&D investment forecasting focuses on the short-term changes in some economic indicators, and long-term forecasting of R&D investment and building more accurate prediction models have always been hot research issues in the field of prediction [11]. R&D investment forecasting is usually used to forecast a single investment indicator or a few closely related ones. Therefore, the methods used for R&D investment forecasting are mainly time series methods [12, 13], semiparametric model methods (e.g., Rounaghi et al. [14] and Dordonnat et al. [15]), uncertain forecasting methods [16–19], and so on. R&D investment forecasting is very important to high-tech enterprises, but the uncontrollability of the forecasting process increases the difficulty of forecasting and management [2]. It is difficult to accurately forecast the investment of R&D funds in enterprises, and at present this is rarely achieved through reasonable quantitative methods [20]. Scientific and reasonable forecasting of R&D funds investment in high-tech enterprises has crucial practical significance for promoting innovative enterprises and national construction.

The competitive market's high demand for innovation has driven an increase in R&D investment, and high-tech enterprises can reasonably control R&D cost and effectively manage R&D activities by accurately predicting that investment. In view of this, and given that R&D investment in high-tech enterprises is highly uncertain and its information changes frequently, this paper takes the R&D funds investment of high-tech enterprises as the research object and uses the grey metabolic GM (1, 1) model and the exponential smoothing method, respectively, to establish single forecasting models of high-tech enterprise R&D investment. Because the prediction accuracy of a single model is limited, a combined forecasting model for the R&D funds investment of high-tech enterprises is then established according to the advantages and disadvantages of each model, with weights assigned by the variance reciprocal method. The forecasting results show that the combined forecasting model has higher forecasting precision and fits the actual situation better. Thus, it is more suitable for the R&D investment forecasting of high-tech enterprises, where information is updated quickly.

2. The Combined Forecasting Model of R&D Investment

The R&D activities of high-tech enterprises are characterized by high risks and high uncertainties. Their R&D investment is affected and restricted by the external environment and various other factors, the information on R&D innovation activities is updated quickly, and the product life cycle is short. There are many models to choose from for the prediction of R&D costs, but different methods have different scopes and characteristics. The choice of model and the method of solving it lead to different accuracy in the results. Although many scholars have conducted extensive and in-depth research on R&D cost forecasting, a single forecasting method is mainly used in actual application, so the information captured by different forecasting methods is not used effectively. Individual studies have improved the prediction accuracy through combination forecasting, but the combined model is mainly limited to the same method. For example, Xiao and Zhou (2006) used the improved ant colony algorithm to solve the weighted average coefficient in the combination forecast, and they verified the feasibility and effectiveness of this method by applying it to R&D funds in China [21].

The grey model can extract valuable information from the generation and development of the sample data itself, which avoids the need to model the relationships with other influencing factors [22]. The improved grey metabolic GM (1, 1) model can make full use of the new information generated as time passes, and its forecasting accuracy is higher. The exponential smoothing method based on time series can consider both new information and historical data. Both forecasting methods can therefore meet the requirements of high-tech enterprises’ R&D investment forecasting. According to the characteristics of R&D investment of high-tech enterprises and the applicable scope of the forecasting models, the metabolic GM (1, 1) model from grey system theory and the exponential smoothing method from time series analysis are selected as the single models for forecasting.

2.1. Grey Metabolic GM (1, 1) Model

The grey system theory was proposed and developed by Deng in 1982. The GM (1, 1) model has been widely used as an effective forecasting tool [23–25], especially in areas with significant uncertainty and lack of data, such as green electronic materials [26], energy consumption [27], and electricity [28]. Grey prediction accumulates irregular historical data with randomness and uncertainty to generate a series that follows an approximately exponential growth law, thereby establishing a prediction model based on grey differential equations [29]. The R&D innovation system of high-tech enterprises is easily disturbed by various uncertain factors during its development. It contains both unknown information and known information, which makes it a typical grey system of “small sample, poor information” [30]. In addition, the rapid development of the system prompts the continuous update and increase of R&D information, and the forecasting of R&D investment of high-tech enterprises needs to give priority to newer information. However, the ordinary GM (1, 1) model uses the first component of the original sequence as the initial condition, and the new information is not fully utilized in the forecasting [31]. According to the new information principle of grey system theory and the characteristics of high-tech enterprise R&D investment, the grey metabolic GM (1, 1) model is selected as the grey forecasting single model.

The ordinary modeling process of GM (1, 1) model is as follows [32].

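Set the nonnegative original time sequence $X^{(0)}$:

$$X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right).$$

Based on the initial sequence $X^{(0)}$, a new series $X^{(1)}$ can be obtained through a one-order accumulating generation operator (1-AGO):

$$X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right),$$

where

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n.$$

Then, generate a sequence $Z^{(1)}$ from the mean generation with consecutive neighbors of $X^{(1)}$:

$$Z^{(1)} = \left(z^{(1)}(2), z^{(1)}(3), \ldots, z^{(1)}(n)\right).$$

In the formula,

$$z^{(1)}(k) = 0.5\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n.$$

So, the basic form of the GM (1, 1) model is the first-order grey differential equation:

$$x^{(0)}(k) + a z^{(1)}(k) = b,$$

where $a$ and $b$ are undetermined constants, $a$ is the development coefficient, and $b$ is the system grey action quantity. Let $x^{(1)}(t)$ satisfy the whitening equation of the GM (1, 1) model:

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b.$$

Moreover, set $\hat{a} = (a, b)^{T}$; then, the parameters can be estimated by the principle of least squares, that is, satisfying $\hat{a} = (B^{T}B)^{-1}B^{T}Y$.

Let

$$B = \begin{pmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{pmatrix}, \qquad Y = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{pmatrix}.$$

It can be solved by substituting $\hat{a}$ into the whitening equation with the initial condition $x^{(1)}(1) = x^{(0)}(1)$, and the time response sequence becomes the following:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a}, \quad k = 1, 2, \ldots$$

Finally, the predicted value of the original sequence can be obtained through an inverse accumulated generating operation:

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak}, \quad k = 1, 2, \ldots$$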
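The GM (1, 1) modeling steps above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function names `gm11_fit` and `gm11_predict` and the sample series are hypothetical, and the two-parameter least squares problem is solved in closed form rather than through the matrix expression.

```python
import math

def gm11_fit(x0):
    """Estimate the development coefficient a and grey action quantity b."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) needs at least 4 data points")
    # 1-AGO: cumulative sums of the original series
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Mean sequence z(k) of consecutive neighbours, k = 2..n
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Least squares for y = -a*z + b (closed-form normal equations)
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b

def gm11_predict(x0, steps):
    """Return (fitted values, forecasts for the next `steps` periods)."""
    a, b = gm11_fit(x0)  # assumes a != 0
    n = len(x0)

    def x1_hat(k):
        # Time response of the whitening equation; k = 0 is the first point
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse AGO: difference consecutive values of the accumulated series
    fitted = [x0[0]] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, n)]
    forecast = [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]
    return fitted, forecast

# Hypothetical series growing 10% per period (not the paper's data)
series = [100 * 1.1 ** k for k in range(6)]
fitted, forecast = gm11_predict(series, steps=2)
```

For a series with roughly exponential growth such as this one, the estimated development coefficient is negative and the forecasts continue the growth trend.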

In the ordinary GM (1, 1) model, all historical data of the observed object are used, and only the few forecasts closest to the present have high accuracy. Over time, random disturbance factors continue to enter the system and affect its future behavior. This leads to a gradual increase in forecast errors and gradually weakens the predictive significance of the model. The grey metabolic GM (1, 1) model is a grey forecasting model that improves on the ordinary GM (1, 1) model. It not only has the advantages of the ordinary GM (1, 1) model but also can account for random disturbance factors and make full use of the latest information carried by the original data sequence [22]. The predicted values are therefore generated during the dynamic development of the system, and the predicted result is more accurate.

The grey metabolic GM (1, 1) model does not forecast all values at once; it forecasts them one by one with successive replacement, continuously adding newly updated information and removing outdated information so that the sequence keeps the same dimension [33]. The new information is the forecast value obtained through the model, which directly affects the R&D system. The oldest information, which is furthest from the forecast period, has only a small impact on the system and can be gradually discarded during the dynamic forecasting process. Keeping the system updated in this way can improve the forecasting accuracy of the model. Specifically, the forecasting process of the grey metabolic GM (1, 1) model first builds a GM (1, 1) model on the known original data sequence $X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$ to forecast the next value $x^{(0)}(n+1)$. Next, by adding the new information $x^{(0)}(n+1)$ to the original sequence and removing the oldest value $x^{(0)}(1)$, whose information significance has decreased, a new prediction sequence $\left(x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n+1)\right)$ is generated [26]. These steps are repeated, building a new GM (1, 1) model each time to forecast one new value and substituting it in turn, until the predetermined forecasting target is reached [34] and the required prediction data are obtained.
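The successive-replacement procedure described above can be sketched as a rolling loop around a one-step GM (1, 1) forecast. This is an illustrative sketch under stated assumptions, not the authors' code: `gm11_next` and `metabolic_gm11` are hypothetical names, the window length equals the length of the input series, and the sample data are invented.

```python
import math

def gm11_next(x0):
    """One-step-ahead GM(1,1) forecast for the series x0."""
    n = len(x0)
    # 1-AGO and mean sequence of consecutive neighbours
    x1 = [sum(x0[:k + 1]) for k in range(n)]
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Closed-form least squares for y = -a*z + b
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det  # assumes a != 0
    b = (szz * sy - sz * szy) / det
    c = x0[0] - b / a
    # Inverse AGO of the time response at the next period
    return c * (math.exp(-a * n) - math.exp(-a * (n - 1)))

def metabolic_gm11(x0, steps):
    """Forecast `steps` periods, refitting on an equal-dimension window."""
    window = list(x0)
    forecasts = []
    for _ in range(steps):
        nxt = gm11_next(window)
        forecasts.append(nxt)
        # Metabolism: append the newest value, drop the oldest one
        window = window[1:] + [nxt]
    return forecasts

# Hypothetical series growing 10% per period (not the paper's data)
series = [100 * 1.1 ** k for k in range(6)]
rolling = metabolic_gm11(series, steps=3)
```

Each pass refits the model on a window of constant length, so the parameters are re-estimated from the most recent values instead of staying fixed at the estimates from the original sequence.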

2.2. Exponential Smoothing Method

The exponential smoothing method is a commonly used time series analysis and forecasting method developed from the moving average method. By calculating smoothed values and combining them with a time series model, it can be used to predict the future development of a phenomenon or system [1]. This method assigns a greater weight to recent observations and a smaller weight to older samples [35]. The observations are then combined by a weighted average in chronological order, so that the model’s predicted value reflects more recent information while still including all historical data, and the forecasting results are more in line with the actual development of the system [36]. The modeling logic and calculation of the exponential smoothing method are easy to understand, and, more importantly, the model forecasting results are relatively stable. According to the number of times that the observations are smoothed, the method can be divided into single exponential smoothing, double exponential smoothing, and cubic exponential smoothing. The choice among these models is mainly based on the trend of the data and on how well the model fits the actual values. The basic formula of the exponential smoothing method is [37]

$$S_{t} = \alpha y_{t} + (1 - \alpha) S_{t-1}.$$

In the formula, $S_{t}$ is the smoothing forecasting value in the time $t$, $y_{t}$ is the actual observation value in time $t$, $S_{t-1}$ is the smoothing forecasting value in the time $t-1$, $\alpha$ is the smoothing coefficient, and its value range is [0, 1]. The value of $\alpha$ in the calculation of exponential smoothing is easily affected by subjective factors, so it is very important to determine a reasonable value of $\alpha$. Generally speaking, if the data fluctuate greatly, the value of $\alpha$ should be larger. The larger the value of $\alpha$, the more attention the exponential smoothing prediction model pays to new information. Yet, in the actual forecasting process, the appropriate smoothing coefficient is mainly selected according to the forecasting accuracy. The basic formula can be extended as

$$S_{t} = \alpha y_{t} + \alpha(1 - \alpha) y_{t-1} + \alpha(1 - \alpha)^{2} y_{t-2} + \cdots + \alpha(1 - \alpha)^{t-1} y_{1} + (1 - \alpha)^{t} S_{0},$$

where $S_{0}$ is the initial smoothing value.

It can be seen from the above formula that the forecasting value of the time $t$ is actually a weighted sum of the observation at time $t$ and the actual observation values of all previous periods, with weights that decay exponentially. After the observation value of the latest period is added, the new data take over the leading weight from the old data, and the weight of the old data gradually weakens, so that the forecasting value can always reflect the latest data structure.
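The basic recursion and its expanded weighted-sum form can be checked numerically with a short Python sketch. It covers only single exponential smoothing, not the cubic variant applied later in the paper; the function names, the initial value, and the sample observations are hypothetical.

```python
def exp_smooth(y, alpha, s0):
    """Single exponential smoothing: S_t = alpha*y_t + (1 - alpha)*S_{t-1}."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    smoothed = []
    s = s0
    for obs in y:
        s = alpha * obs + (1 - alpha) * s
        smoothed.append(s)
    return smoothed

def expanded_weights(t, alpha):
    """Weights on y_t, y_{t-1}, ..., y_1 and the weight left on S_0."""
    weights = [alpha * (1 - alpha) ** j for j in range(t)]
    return weights, (1 - alpha) ** t

# Hypothetical observations (not the paper's data)
y = [10.0, 12.0, 11.0, 15.0]
alpha, s0 = 0.5, 10.0
smoothed = exp_smooth(y, alpha, s0)

# Rebuild the last smoothed value from the expanded weighted sum
weights, w0 = expanded_weights(len(y), alpha)
expanded = sum(w * obs for w, obs in zip(weights, reversed(y))) + w0 * s0
```

Here `smoothed[-1]` and `expanded` agree, and the weights together with the weight on the initial value sum to 1, which is why older observations fade geometrically as new ones arrive.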

The grey metabolic GM (1, 1) model and exponential smoothing method are both suitable for R&D investment forecasting of high-tech enterprises. However, the single forecasting models have certain limitations. The characteristics of the two forecasting methods are different, and they can only reflect the future situation of the R&D system from an individual perspective, which cannot fully meet the forecasting requirements of the research object.

The advantage of the grey metabolic GM (1, 1) model is that it requires little original data [38], and the more recent the data used for prediction, the more effective the model is. A model that passes the posterior variance test is more suitable for medium- and long-term forecasting [39]. However, its short-term forecasting effect is not as good as that of the time series model. The exponential smoothing method weights data from different times unequally [35]: the influence of an observation is gradually reduced as time passes, which can offset or reduce the influence of abnormal factors, so that the forecasting model has higher stability. When external factors change significantly, however, the results of the exponential smoothing method are likely to show large deviations, so the method is more suitable for short-term forecasting.

2.3. Combined Forecasting Model

A single prediction model usually contains only part of the information about the prediction object, so it suffers from defects such as narrow information sources and sensitivity to the model settings. According to the foregoing description, the scopes of application of the metabolic GM (1, 1) model and the exponential smoothing method complement each other. If these two single models are properly combined through certain rules, the effective system information in each single predictive model can be extracted. This reduces the error caused by incomplete consideration in the prediction process and raises the prediction accuracy. Hibon and Evgeniou (2005) also pointed out, based on large sample experiments, that the combination model may not be the optimal model, but the predicted risk of choosing the combination model is less than that of choosing a single model [40]. Therefore, based on the limitations of the single model in practical application and the randomness of system influencing factors, this paper uses the combined forecasting model [41], which aims to use the information provided by each single forecasting method comprehensively to improve the forecasting accuracy [42]. Combined forecasting models assign weights to single models according to certain rules, which improves on the performance of single forecasting models by including more comprehensive information [42–44]. Suppose $X$ is the forecasting value of the combined model, $x_{i}$ is the forecasting value of the $i$-th single model ($i = 1, 2, \ldots, m$), the combined model weight is $W = (w_{1}, w_{2}, \ldots, w_{m})$, and $\sum_{i=1}^{m} w_{i} = 1$; then, the combined forecasting model is given as

$$X = \sum_{i=1}^{m} w_{i} x_{i}.$$

The determination of the weighting coefficient in the combined forecasting model is very important. The distribution of weight directly affects the forecasting process and the final effect, and reasonable weighting can effectively improve the forecasting accuracy of the model [24]. The variance reciprocal method is simple and effective in practical applications, so it is used frequently. In this paper, the variance reciprocal method is used to assign the combination weight. This method assigns different weights to the single models according to the size of their sums of squared errors. That is, the smaller a single model's sum of squared errors, the higher its forecasting accuracy and the greater its weight [20]. Assuming $e_{i}$ as the sum of squared errors of the $i$-th single model among $m$ single models, then the weight $w_{i}$ of the $i$-th single model is

$$w_{i} = \frac{e_{i}^{-1}}{\sum_{j=1}^{m} e_{j}^{-1}}, \quad i = 1, 2, \ldots, m.$$

In the formula,

$$e_{i} = \sum_{t=1}^{n} \left(y_{t} - \hat{y}_{it}\right)^{2},$$

where $y_{t}$ is the observed value of time $t$ and $\hat{y}_{it}$ is the predicted value of the $i$-th method in time $t$.
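The variance reciprocal weighting can be sketched in a few lines of Python. The function names and the numbers below are hypothetical illustrations and are not taken from the paper's tables.

```python
def sse(actual, predicted):
    """Sum of squared errors of one single model."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def variance_reciprocal_weights(actual, predictions):
    """Weights proportional to 1/SSE of each single model; they sum to 1."""
    errors = [sse(actual, p) for p in predictions]
    if any(e == 0 for e in errors):
        raise ValueError("a model with zero error makes 1/SSE undefined")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def combine(predictions, weights):
    """Weighted sum of the single-model forecasts, period by period."""
    periods = len(predictions[0])
    return [sum(w * p[t] for w, p in zip(weights, predictions))
            for t in range(periods)]

# Hypothetical observations and two single-model fits
actual = [10.0, 20.0, 30.0]
model_a = [11.0, 21.0, 31.0]   # SSE = 3
model_b = [12.0, 22.0, 32.0]   # SSE = 12
weights = variance_reciprocal_weights(actual, [model_a, model_b])
combined = combine([model_a, model_b], weights)
```

With these numbers the more accurate model receives a weight of 0.8 and the other 0.2, so the combined forecast stays closer to the better-fitting model without discarding the second one.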

3. Application and Result Analysis of the Combined Forecasting Model

Company A is a high-tech enterprise in the software and information technology service industry. Its investment in R&D from 2011 to 2019 is shown in Table 2. As the company attaches great importance to improving its independent R&D capabilities, it continuously increases its investment in new business R&D every year, and it constantly achieves new breakthroughs in conventional technologies and products, which enhances the company’s product competitiveness. The relevant data of company A comes from the CSMAR database (http://www.csmar.com/Csmar.html).

The grey system theory requires at least four sample data points, and this study selects the data from 2011 to 2015 as the model testing sample. To assess the validity of the model forecasts, the root mean square error (RMSE) and mean absolute percentage error (MAPE) are selected as error evaluation indicators. The conventional GM (1, 1) model and the metabolic GM (1, 1) model were each fitted and used for prediction in MATLAB. In addition, according to the accuracy test requirements of the grey model, a posterior variance test was performed on the two GM (1, 1) models. The test results show that the ratios of mean square deviation were 0.0993 and 0.0861, both of which were less than 0.35. Accordingly, the test results are good and the model accuracy is at the first level. Because the sample data follow a quadratic trend, cubic exponential smoothing is selected as the exponential smoothing method. After repeated measurement and comparison, the smoothing coefficient was chosen as the value giving the highest simulation accuracy. The predicted values of the three individual models are compared with the actual values, and the forecasting error results are shown in Table 3.
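The error indicators and the posterior variance test mentioned above can be computed as in the following Python sketch. It assumes the usual definition of the posterior variance ratio as the standard deviation of the residuals divided by the standard deviation of the original data, with a population (divide by n) convention; the paper does not state which convention its MATLAB code used, and the sample numbers are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

def posterior_variance_ratio(actual, predicted):
    """C = S2 / S1: residual std over data std (assumed convention)."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mean_a = sum(actual) / n
    mean_r = sum(residuals) / n
    s1 = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    s2 = math.sqrt(sum((r - mean_r) ** 2 for r in residuals) / n)
    return s2 / s1

# Hypothetical observations and fitted values (not the paper's data)
actual = [100.0, 200.0, 300.0, 400.0]
fitted = [110.0, 190.0, 310.0, 390.0]
error_rmse = rmse(actual, fitted)
error_mape = mape(actual, fitted)
ratio_c = posterior_variance_ratio(actual, fitted)
```

For this example the ratio is about 0.09, which falls below the 0.35 threshold that the text associates with first-level accuracy.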

Table 3 shows that the mean absolute percentage error (MAPE) of the two grey models is less than 10%, which meets the requirements of forecasting accuracy and indicates good predictability. Comparing the fit for each year, the forecasting error of the conventional GM (1, 1) model fluctuates to a large extent. The indicators of the grey metabolic GM (1, 1) model are all far lower than those of the conventional GM (1, 1) model, which shows higher forecasting reliability. The prediction results in Table 3 therefore show that the metabolic GM (1, 1) model is more suitable for the prediction of R&D investment of high-tech enterprises. The MAPE of the cubic exponential smoothing method is 3.73%, and the fitting result of the model is good, which meets the actual forecasting requirements of high-tech enterprise R&D investment and also reflects its rapidly changing information characteristics.

To make up for the shortcomings of a single model, a combined forecasting model is introduced. The variance reciprocal method is used to calculate the weight of each single model, and the forecasting results and errors are then calculated according to the combined forecasting model. The calculation results are shown in Table 4.

Table 4 shows that the forecasting errors for all years except 2016 are below 3%. Compared with the grey metabolic GM (1, 1) model and the cubic exponential smoothing method, all evaluation indicators of the combined forecasting model are much lower than those of the two single forecasting models. This shows that the model fits the original data very closely and forecasts with high accuracy, and that it can overcome the limitations of the single models, which makes it more suitable for the actual situation of R&D investment forecasting in high-tech enterprises.

Therefore, according to the data from 2011 to 2019, the company’s R&D investment for the next five years is forecasted. The weights are calculated based on the predicted values of the two single models, and the combined prediction model is obtained as the correspondingly weighted sum of the two single-model forecasts.

The results are shown in Table 5. The company will continue to increase its R&D investment in the next few years, and the amount of R&D investment will continue to rise steadily.

4. Conclusion

R&D innovation is an important indicator of the current market competitiveness of enterprises, and it is regarded as the engine room of high-tech enterprises. While R&D investment is increasing, however, effective forecasting of R&D funds is a problem that must be solved. In this paper, based on the information update characteristics of high-tech enterprises’ R&D investment and the limitations of the single forecasting model, a combined forecasting model based on the grey metabolic GM (1, 1) model and the exponential smoothing method is established to forecast the R&D investment of high-tech enterprises. Its application demonstrates the practicability and effectiveness of the combined forecasting model for improving forecasting accuracy.

High-tech enterprises can make more accurate and reasonable forecasting of their own R&D capital demand, and they can rationally allocate their R&D funds according to the forecasting value, so as to effectively manage R&D costs and improve their economic and social benefits. In addition, high-tech enterprises are susceptible to external factors. While committed to the management of their own R&D fund, they must always pay attention to the social and even global technology investment. In the future, we will use neural networks [8, 45], bilateral matching [46, 47], fuzzy information integration [48], distance measures [49], and other methods to compare and analyze our own research and development innovation and closely follow the development trend of domestic and foreign competitive markets.

Of course, the prediction results of any model cannot be completely accurate. For a complex system such as enterprise R&D investment, which is simultaneously affected by internal and external environments, the external economic factors of the enterprise should be considered if a longer-term forecast is to be made. In addition, there are many forecasting methods to choose from, and the selection of single models and the combination of multiple weighting rules on the basis of the characteristics of R&D activities of high-tech enterprises deserve further research.

Data Availability

All data included in this study are available from the corresponding author upon request. Most of the data are already in the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the Philosophy and Social Science Foundation of Guangdong Province (no. GD16XYJ26), the National Statistical Science Foundation of China (no. 2019LY29), and the Characteristic and Innovative Foundation for Humanities and Social Sciences of Education Department of Guangdong Province (no. 2018WTSCX041).