#### Abstract

To accurately predict athletes' performance, this paper adopts an improved GM (1,1) model that takes the characteristics of athletes' performance into account. Considering the defects of the GM (1,1) model, its modelling process is improved by preprocessing the original data with an exponential transformation and reconstructing the background value with a dynamic generation coefficient. An improved bee colony algorithm is used to solve for the globally optimal dynamic generation coefficients, and the exponential-transformation grey model optimized by the improved bee colony algorithm is then established. The model was tested on the training results of athletes in a gymnasium in north China. The experimental results show that the prediction effect of the model is significantly better than that of other models and that the accuracy of training performance prediction is effectively improved, so the model can be well applied to the prediction of athletes' 100 m training performance.

#### 1. Introduction

In modern sports, especially competitive sports, the overall level of sport has improved unprecedentedly, and research on it covers many aspects. However, people's understanding of the internal regularities that drive changes in competitive sports results is, like our understanding of human beings themselves, both familiar and strange [1]. Clearly, judging from the sum of human knowledge, at least for now, our knowledge of the laws of sports is limited. Various factors affect the change of sports performance; some of them we already understand, and others we are still coming to know [2].

The level of athletes is improving, and the performance of athletes has attracted widespread attention. Athletic performance prediction can help coaches and athletes understand their own competitive level. Training programs can be formulated according to athletes' performance, so athletes’ performance prediction plays an increasingly important role [3].

Sports have received more and more attention from the state, which has invested a great deal of money and time in developing them. With the accumulation of a large number of historical athlete training results, it is of great significance to mine these data deeply and predict future training results [4]. The prediction results can provide clear training guidance for athletes and track the development characteristics of sports events [5]. Therefore, how to design a reasonable prediction method for athletes' performance has always been a research topic in the field of sports [6].

The development of athletes' training performance prediction has gone through three stages. In the first stage, the method was manual: coaches or researchers collected the results of certain sports by hand and then estimated athletes' results based on experience [7]. This method involves a large amount of calculation and a complex working process; in addition, when errors occur, the prediction result deviates far from the actual value, which is an obvious limitation [8]. Subsequently, sports performance prediction models based on mathematical statistics emerged, such as linear regression and the Grey Model (GM) [9]. These mathematical modelling methods are mainly used to analyse the prediction of athletes' performance, but they can only describe linear or rising trends [10]. In fact, the trend of sports performance is not necessarily rising or linear; it also exhibits nonlinear characteristics and downward trends. Thus, such models cannot accurately describe the characteristics of changes in athletes' training performance, and their prediction accuracy is not high enough [11].

According to the modelling method, athlete performance prediction models can be divided into two categories: linear prediction models [12] and nonlinear prediction models [13]. Because athletes' performance is affected by a variety of factors, such as the testing environment and athlete psychology, it exhibits clear nonlinear characteristics [14]. Therefore, the prediction error of linear models is large, and their prediction accuracy cannot be guaranteed. Nonlinear models can describe the relationship between the influencing factors and athletes' performance, and they have become the main research direction at present. Nonlinear prediction models fall into two categories. The first type is the athlete performance prediction model based on time series: the characteristics of performance changes are analysed through historical data, and prediction is realized according to the temporal correlation of the data, mainly including ARMA and the exponential smoothing method [15]. The second type is the athlete performance prediction model based on machine learning, mainly including neural networks and support vector machine regression [16]. The two types have their own advantages but also some limitations. For example, the ARMA model requires relatively stable changes in athletes' performance, while the exponential smoothing method requires a large amount of historical data; the network structure of a neural network is difficult to determine, and the kernel function and parameters of a support vector machine need to be optimized. Therefore, it is urgent to establish an effective prediction model that handles the randomness and instability of athletes' performance.

This paper proposes an optimization model based on an improved GM (1,1) model [17] to improve the prediction accuracy of athletes' performance. The method was applied in an experiment on the training performance of athletes in a gymnasium in north China. The results show that the prediction effect of the proposed method is obviously better than that of the traditional method, and the accuracy of training result prediction is effectively improved. Thus, the proposed method can be used to predict athletes' performance in 100 m race training.

The innovations and contributions of this paper are as follows:

(1) The modelling process of the GM (1,1) model is improved by preprocessing the original data with an exponential transformation and reconstructing the background value with a dynamic generation coefficient.

(2) The globally optimal dynamic generation coefficients are solved by using an improved bee colony algorithm.

(3) The exponential-transformation grey model optimized by the improved bee colony algorithm is established.

This paper consists of five main sections: Section 1 is the introduction, Section 2 is relevant theories, Section 3 is the improved optimization model of the GM (1,1) model, Section 4 is the experiment and analysis, and Section 5 is the conclusion.

#### 2. Relevant Theories

The GM (1,1) model uses the accumulation of the original sequence to generate a new sequence, which makes the originally chaotic data show regularity. Even with only a small amount of data, it can obtain good prediction results. It generally includes three steps: cumulative generation, modelling and solving, and cumulative reduction. The details are as follows:

(1) *Cumulative Generation.* Let the original sequence be $x^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right)$. The first-order accumulated generating sequence $x^{(1)}$ is

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n.$$

Then $z^{(1)}$ is the nearest-mean generated sequence of $x^{(1)}$:

$$z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n.$$

(2) *Modelling and Solving.* The whitening differential equation is as follows:

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = h,$$

where $a$ is the development coefficient and $h$ is the grey action quantity, both of which are parameters to be solved. The values of $a$ and $h$ can be estimated by the least-squares method:

$$[a, h]^{T} = \left(B^{T}B\right)^{-1}B^{T}Y,$$

where

$$B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}.$$

Then, the solution of the whitening differential equation is

$$x^{(1)}(t) = \left(x^{(0)}(1) - \frac{h}{a}\right)e^{-at} + \frac{h}{a},$$

and the GM (1,1) model's time response sequence is

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{h}{a}\right)e^{-ak} + \frac{h}{a}, \quad k = 1, 2, \ldots, n.$$

(3) *Progressive Reduction.* The restored value of the original data sequence is

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k).$$
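As a concrete illustration, the three steps above can be sketched in Python. This is a minimal textbook implementation of the standard GM (1,1) procedure, not the paper's code; NumPy's least-squares routine stands in for the closed form $(B^{T}B)^{-1}B^{T}Y$:

```python
import numpy as np

def gm11_fit(x0):
    """Estimate the development coefficient a and grey action quantity h
    of a GM(1,1) model by least squares on the accumulated series."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                        # step 1: cumulative generation
    z1 = 0.5 * (x1[1:] + x1[:-1])             # nearest-mean background values
    B = np.column_stack([-z1, np.ones(len(z1))])
    Y = x0[1:]
    a, h = np.linalg.lstsq(B, Y, rcond=None)[0]   # [a, h] = (B^T B)^-1 B^T Y
    return a, h

def gm11_predict(x0, steps):
    """Fitted values plus `steps` forecasts via the time-response sequence
    followed by progressive (inverse accumulated) reduction."""
    x0 = np.asarray(x0, dtype=float)
    a, h = gm11_fit(x0)
    k = np.arange(len(x0) + steps)
    x1_hat = (x0[0] - h / a) * np.exp(-a * k) + h / a   # step 2: time response
    return np.concatenate([[x0[0]], np.diff(x1_hat)])   # step 3: reduction
```

For a near-exponential series such as `2 * 1.1 ** np.arange(5)`, the fitted values reproduce the inputs to within roughly 0.1%, which is the behaviour the grey model is designed for.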

#### 3. Improved Optimization Model of GM (1,1) Model

##### 3.1. Preprocessing of Original Data Sequence

When dealing with data sequences of insufficient smoothness, the accuracy of the grey model is greatly reduced. A common remedy is to preprocess the original data with a data transformation to improve its smoothness. In this paper, the exponential-transformation grey model EGM (1,1) is established by using an exponential function to preprocess the original data. The specific process is as follows:

Exponential transformation of the original sequence:

$$y^{(0)}(k) = c^{x^{(0)}(k)}, \quad k = 1, 2, \ldots, n,$$

where $c > 1$ is the base. The new sequence $y^{(0)}$ is modelled according to the GM (1,1) model, and the new time response sequence is as follows:

$$\hat{y}^{(1)}(k+1) = \left(y^{(0)}(1) - \frac{h}{a}\right)e^{-ak} + \frac{h}{a}.$$

The restored value of the new sequence is as follows:

$$\hat{y}^{(0)}(k+1) = \hat{y}^{(1)}(k+1) - \hat{y}^{(1)}(k).$$

The reduction value of the original sequence can then be obtained from $\hat{x}^{(0)}(k) = \log_{c}\hat{y}^{(0)}(k)$.
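The transform-model-restore pipeline can be sketched as follows. The sketch assumes the transform $y^{(0)}(k) = c^{x^{(0)}(k)}$ with restore $\hat{x} = \log_{c}\hat{y}$ described above; the base `c = 1.5` is an illustrative choice, not a value from the paper, and the inner GM (1,1) is the standard textbook procedure:

```python
import numpy as np

def _gm11(x0, steps):
    # Standard GM(1,1): accumulate, least-squares (a, h), time response, reduce.
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)
    z1 = 0.5 * (x1[1:] + x1[:-1])
    B = np.column_stack([-z1, np.ones(len(z1))])
    a, h = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    k = np.arange(len(x0) + steps)
    x1_hat = (x0[0] - h / a) * np.exp(-a * k) + h / a
    return np.concatenate([[x0[0]], np.diff(x1_hat)])

def egm11_predict(x0, steps, c=1.5):
    """EGM(1,1) sketch: exponentiate with base c to improve smoothness,
    fit GM(1,1) on the transformed series, then map back with log base c."""
    y0 = c ** np.asarray(x0, dtype=float)   # exponential transformation
    y0_hat = _gm11(y0, steps)               # GM(1,1) on the smoothed series
    return np.log(y0_hat) / np.log(c)       # restore: x_hat = log_c(y_hat)
```

Because the transform maps a linear trend in $x$ to a geometric trend in $y$, a slowly improving 100 m time series (e.g. 12.10 s decreasing by 0.05 s per session) is recovered almost exactly.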

##### 3.2. Background Value Optimization

The background value construction formula of the GM (1,1) model is defective. As shown in Figure 1, the real background value should be the integral $\int_{k-1}^{k} x^{(1)}(t)\,dt$ over the interval $[k-1, k]$, whereas the background value of the traditional modelling method is the trapezoidal area $\frac{1}{2}\left[x^{(1)}(k-1) + x^{(1)}(k)\right]$. When dealing with drastically changing data, the traditional construction method brings large errors, resulting in a decrease in model accuracy.

A dynamic generation coefficient is used in this paper to replace the fixed value of 0.5 and to reduce the background-value error as far as possible by dynamically adjusting the generation coefficient of each interval. The new background value construction formula is

$$z^{(1)}(k) = \lambda_{k}\, x^{(1)}(k) + \left(1 - \lambda_{k}\right) x^{(1)}(k-1), \quad k = 2, 3, \ldots, n,$$

where $\lambda_{k}$ is the dynamic generation coefficient, $\lambda_{k} \in [0, 1]$.
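A minimal sketch of this construction, assuming the per-interval weighting just given (the function name is illustrative); setting every coefficient to 0.5 recovers the classical trapezoidal background:

```python
import numpy as np

def dynamic_background(x1, lam):
    """Background values with per-interval generation coefficients:
    z1[k] = lam[k] * x1[k] + (1 - lam[k]) * x1[k-1], lam[k] in [0, 1].
    lam = 0.5 everywhere reproduces the classical nearest-mean background."""
    x1 = np.asarray(x1, dtype=float)
    lam = np.asarray(lam, dtype=float)      # one coefficient per interval
    return lam * x1[1:] + (1.0 - lam) * x1[:-1]
```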

Because the dynamic generation coefficients introduce many parameters, they are difficult to solve by general methods. The bee colony algorithm has unique advantages in solving nonlinear, multidimensional, complex optimization problems and can obtain the globally optimal parameter values. Therefore, the bee colony algorithm is used to solve the dynamic generation coefficients of the grey model.

The ABC algorithm is a heuristic intelligent optimization algorithm that imitates the honey-collecting behaviour of bees to search for an optimal solution. The colony contains three types of bees. The leading bee is responsible for finding a honey source and transmitting its information to the following bees. The following bee is responsible for updating the honey source information according to certain rules. The scouting bee is transformed from a leading bee and is responsible for discarding a honey source trapped in a local optimum and randomly generating a new one. The specific process of the algorithm is as follows:

(1) *Initialize Population.* Initialize the basic parameters, randomly generate the initial positions of $W$ honey sources according to (13), that is, the feasible solutions of the problem to be optimized, and calculate the initial fitness value $\text{fit}(x)$ of each honey source according to (14):

$$x_{xy} = L_{y} + \text{rand}(0,1)\left(U_{y} - L_{y}\right), \tag{13}$$

where $x_{xy}$ is the $y$-dimension value of the $x$th solution; $L_{y}$ and $U_{y}$ are the lower and upper bounds of the $y$th dimension; $y = 1, 2, \ldots, D$, where $D$ is the dimension of the solution of the optimization problem;

$$\text{fit}(x) = \begin{cases} \dfrac{1}{1 + f(x)}, & f(x) \geq 0, \\ 1 + |f(x)|, & f(x) < 0, \end{cases} \tag{14}$$

where $f(x)$ is the objective function value of the $x$th solution.

(2) *Leading Bee Stage.* Each leading bee searches the neighbourhood of its honey source:

$$v_{xy} = x_{xy} + r\left(x_{xy} - x_{zy}\right), \tag{15}$$

where $z \in \{1, 2, \ldots, W\}$, $z \neq x$, and $r$ is a random number in $[-1, 1]$.

(3) *Following Bee Stage.* After the following bees receive the nectar source information transmitted by the leading bees, the probability of each nectar source being selected is calculated according to (16), and a nectar source is then selected and updated according to this probability; the updating principle is the same as in the leading bee stage:

$$p_{x} = \frac{\text{fit}(x)}{\sum_{i=1}^{W} \text{fit}(i)}. \tag{16}$$

(4) *Scouting Bee Stage.* If a nectar source is still not updated after repeated searching, it is abandoned, the corresponding leading bee is transformed into a scouting bee, and a new nectar source is randomly initialized.

(5) *Judgment of Termination Conditions.* Judge whether the number of iterations exceeds the maximum. If so, output the globally optimal dynamic generation coefficients $\lambda^{*}$; otherwise, return to Step (2) and continue the cycle until the maximum number of iterations is exceeded.
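The five stages can be sketched as a compact ABC minimizer. This is a generic textbook implementation, not the paper's code; the function name and the defaults for `n_sources`, `limit`, and `max_iter` are illustrative assumptions:

```python
import numpy as np

def abc_minimize(f, lb, ub, n_sources=20, limit=20, max_iter=100, seed=0):
    """Artificial bee colony sketch: employed, onlooker, and scout phases
    minimizing f over the box [lb, ub]."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    D = len(lb)
    X = lb + rng.random((n_sources, D)) * (ub - lb)   # eq. (13): random init
    F = np.array([f(x) for x in X])
    trials = np.zeros(n_sources, dtype=int)

    def fitness(fv):
        # Eq. (14): 1/(1+f) for f >= 0, else 1 + |f|.
        return np.where(fv >= 0, 1.0 / (1.0 + fv), 1.0 + np.abs(fv))

    def try_update(i):
        # Eq. (15): perturb one dimension toward/away from a random peer z != i.
        y = rng.integers(D)
        z = rng.choice([j for j in range(n_sources) if j != i])
        v = X[i].copy()
        v[y] += rng.uniform(-1, 1) * (X[i, y] - X[z, y])
        v = np.clip(v, lb, ub)
        fv = f(v)
        if fv < F[i]:                       # greedy selection
            X[i], F[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    for _ in range(max_iter):
        for i in range(n_sources):                       # leading bees
            try_update(i)
        p = fitness(F) / fitness(F).sum()                # eq. (16) probabilities
        for i in rng.choice(n_sources, n_sources, p=p):  # following bees
            try_update(i)
        worn = trials >= limit                           # scout bees
        X[worn] = lb + rng.random((int(worn.sum()), D)) * (ub - lb)
        F[worn] = [f(x) for x in X[worn]]
        trials[worn] = 0

    best = int(F.argmin())
    return X[best], float(F[best])
```

On a simple 2-D sphere function the routine converges close to the origin within the default budget, which is enough to see the three phases interact.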

The ABC algorithm has a strong global search ability but a poor local search ability, because it performs a purely random neighbourhood search when updating the nectar source and does not make full use of the current optimal solution. A crossover operator is therefore introduced to improve the search strategy, and the following formula is used in place of (15):

$$v_{xy} = x_{xy} + r\left(x_{xy} - x_{zy}\right) + \lambda\left(x_{\text{best},y} - x_{xy}\right),$$

where $x_{\text{best}}$ is the current optimal solution and $\lambda$ is a random number in $[0, 1]$.
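The modified neighbourhood search can be sketched as a drop-in replacement for the update step. The exact replacement formula in the paper is not reproduced here; this sketch assumes a global-best-guided form that keeps the random difference term of (15) and adds a pull toward the current optimum, in the spirit of the text:

```python
import numpy as np

def iabc_update(x, x_peer, x_best, rng):
    """Global-best guided search update (assumed form): keep the random
    difference term r*(x - x_peer), r in [-1, 1], and add a crossover-style
    attraction lam*(x_best - x), lam in [0, 1]."""
    r = rng.uniform(-1, 1, size=x.shape)
    lam = rng.random(size=x.shape)
    return x + r * (x - x_peer) + lam * (x_best - x)
```

When the colony has collapsed onto a single point (`x == x_peer == x_best`), both terms vanish and the update leaves the solution unchanged, so the pull term only acts while diversity remains.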

To sum up, the specific process of the IABC algorithm is shown in Figure 2.

The objective function of this paper is the sum of weighted squared errors:

$$F = \sum_{z=1}^{n} \varphi(z)\left(x^{(0)}(z) - \hat{x}^{(0)}(z)\right)^{2},$$

where $x^{(0)}(z)$ is the original modelling data sequence, $\hat{x}^{(0)}(z)$ is the model fitting value, and $\varphi(z)$ is the weighting coefficient, whose value is the ratio of the squared error at point $z$ to the sum of the squared errors:

$$\varphi(z) = \frac{\left(x^{(0)}(z) - \hat{x}^{(0)}(z)\right)^{2}}{\sum_{i=1}^{n}\left(x^{(0)}(i) - \hat{x}^{(0)}(i)\right)^{2}}.$$

The sum of squared errors is corrected by the weighting coefficient so as to increase the weight of large errors and reduce the weight of small ones, making the overall error distribution more uniform.
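A minimal sketch of this objective, assuming the self-normalized weights just described (the function name is illustrative):

```python
import numpy as np

def weighted_sse(x, x_hat):
    """Sum of weighted squared errors: each weight phi(z) is that point's
    squared error divided by the total squared error, so large errors are
    emphasized and small ones de-emphasized."""
    e2 = (np.asarray(x, float) - np.asarray(x_hat, float)) ** 2
    phi = e2 / e2.sum()             # weights sum to 1 by construction
    return float((phi * e2).sum())
```

For example, with errors of 0 and 1 the whole weight lands on the large error, so the objective equals 1.0 rather than the plain squared-error sum.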

The accuracy test is an important method to measure the quality of the model. Two indexes are used:

(1) *Grey Absolute Correlation Degree.*

$$\varepsilon = \frac{1 + |s| + |\hat{s}|}{1 + |s| + |\hat{s}| + |\hat{s} - s|},$$

where $s$ is the grey absolute correlation calculation parameter of the original data, $\hat{s}$ is the grey absolute correlation calculation parameter of the prediction data, and $\varepsilon$ is the grey absolute correlation degree.

(2) *Average Relative Error.*

$$\Delta = \frac{1}{n}\sum_{k=1}^{n}\left|\frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)}\right|.$$
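Both indexes can be sketched directly. The correlation parameter $s$ is computed here in its common grey-system form, $s = \sum_{k=2}^{n-1}\left(x(k) - x(1)\right) + \frac{1}{2}\left(x(n) - x(1)\right)$, which is an assumption since the paper's definition was not reproduced; the function names are illustrative:

```python
import numpy as np

def avg_relative_error(x, x_hat):
    """Mean of |x - x_hat| / |x| over the series."""
    x, x_hat = np.asarray(x, float), np.asarray(x_hat, float)
    return float(np.mean(np.abs(x - x_hat) / np.abs(x)))

def grey_absolute_degree(x, y):
    """Grey absolute correlation degree in its common form:
    eps = (1 + |s_x| + |s_y|) / (1 + |s_x| + |s_y| + |s_y - s_x|)."""
    def s(v):
        d = np.asarray(v, float) - float(v[0])   # zero-start image
        return d[1:-1].sum() + d[-1] / 2.0
    sx, sy = s(x), s(y)
    return (1 + abs(sx) + abs(sy)) / (1 + abs(sx) + abs(sy) + abs(sy - sx))
```

Identical sequences give a degree of exactly 1, and the degree decreases as the two series' shapes diverge.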

#### 4. Experiment and Analysis

Figure 3 shows the convergence of the GM (1,1) algorithm and the proposed algorithm when optimizing the grey model parameters. As can be seen from the figure, the proposed algorithm not only converges significantly faster than the traditional GM (1,1) algorithm but also shows improved stability.

This paper selects an athlete's 100 m run performance as the research object and collects 200 data points (see Figure 4).

This paper takes the training results of athletes in a gymnasium in north China as the experimental sample data. The data record the gymnasium's daily training results in detail. After eliminating and filling in abnormal data, 8761 valid records were obtained. Because athletes' training performance is random and volatile, this experiment forecasts the data of the first quarter separately.

In data processing, the choice of time dimension also determines the prediction accuracy. If the selected time dimension is too long, historical data that are too distant contribute little to the current prediction, so model training cannot achieve its intended effect, which degrades the model's prediction accuracy. If the selected time dimension is too short, some data features are lost, along with some modal information during modal decomposition; the historical relationship between the training data and the test data then cannot be mapped well, which also affects prediction accuracy. Testing showed that the proposed model's prediction accuracy is highest when the time series uses a dimension of 1 h, so the time dimension is set to 1 h.

In this paper, the sampling period is 1 h. In modelling each mode, the historical data of the previous 24 hours are selected as the feature attributes of the next time node, and the model predicts the athletes' training performance at the next time node (1 h ahead). In the experimental simulation, historical data of the previous 12, 24, and 36 hours were tried as sample points, and the simulation showed that selecting 24 hours of historical data gives the highest prediction accuracy.

In order to achieve the optimal decomposition of OVMD, the central frequency observation method is used to calculate the central frequency of each component under different values of K, from which the total number of modal decompositions K can be determined. Taking the time series of the first quarter as an example, the central frequencies of each mode under different K values are calculated, as shown in Table 1.

As can be seen from Table 1, when K = 6 the centre frequencies of mode 2 and mode 3 are similar, that is, overdecomposition occurs, so the total number of modes is set to 5. OVMD can effectively extract the dominant components representing the changing trend of the original sequence. The penalty factor α is set to 2000, tau to 0, the tolerance tol to 1e−6, DC to 0, and init to 1. After parameter setting, OVMD decomposition was performed on the time series of the first quarter, and the results are shown in Figure 5.

The proposed model was compared with the models of literature [18], literature [19], and literature [20], respectively. The prediction results for the first quarter (90 days in total) are shown in Figure 6. As can be seen from Figure 6, the prediction effect of the proposed model is better than that of the other prediction models. This is because the optimal variational mode decomposition decomposes the nonstationary and complex athlete training performance series into a set of stable subsequences, reducing the complexity of the original sequence. Compared with the model in literature [20], the prediction effect of the proposed model is better because the search optimization ability of SSA can optimize the input parameters of the ELM-AE hidden layer, avoiding the error caused by random parameter generation and thus improving the prediction accuracy of the model. Error correction in this model further improves the forecasting precision: because the OVMD decomposition contains an error component, the GM model additionally forecasts the error series, and the predicted error sequence is superimposed on the preliminary forecast of the athletes' training performance to correct it and obtain the final prediction results.

Table 2 lists the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) of the four models for the first-quarter data. Table 2 shows that the other three models can predict the training performance of the next period with reasonable accuracy, while the overall prediction accuracy of the proposed model is the highest, with an average relative error of 1.1896%, an RMSE of 0.2108, and an MAE of 0.1338. Taking the first quarter as an example, compared with the single SVR and DELM models, the average relative error of the proposed model is improved by 1.9915% and 1.343%, respectively. Comprehensive analysis shows that the proposed model's prediction performance is improved to varying degrees, which not only indicates that athletes' training results change regularly but also reflects that the proposed model has a stronger fitting ability when dealing with data that change stably.

Table 3 compares the first-quarter error results of this paper's model with those of the other three models. Under the mean relative error evaluation index, the proposed model reaches 1.2616%, which is superior to the other comparison models and further verifies its advantage.

Overall, the proposed model can better predict the variation trend of athletes' training results in terms of both the deviation index RMSE and the overall accuracy index MAPE, which fully verifies the validity of the model.

Figure 7 compares the relative errors of the four models. The minimum relative error of the model in this paper is 0.0053, and the maximum relative error is 0.0335; compared with the literature [20] model, its error fluctuation is smaller. Therefore, preprocessing the original data sequence with the exponential transformation and reconstructing the background value formula with the dynamic generation coefficient is effective. The improved model can handle not only the randomness of athletes' training results but also their instability, making up for the defects of the literature [20] model.

Table 4 compares the average relative error and grey absolute correlation degree of the predicted results of the four models. The average relative error of the proposed model is 2.64% lower than that of literature [18], and the grey absolute correlation degree is increased by 0.02. This shows that the prediction accuracy of the proposed model is higher and that it can better predict athletes' training results.

#### 5. Conclusion

The level of athletes is improving, and athletes' performance has attracted widespread attention. Athletic performance prediction can help coaches and athletes understand their competitive level. The prediction of athletes' performance has always been a research focus, but existing prediction methods generally have limitations, so a prediction algorithm based on an improved GM (1,1) model is proposed. An experiment was conducted on the training results of athletes in a gymnasium in north China. The model tracks the characteristics of athletes' training performance with high precision and is an excellent prediction model for athletes' training performance. In future work, the selection of model parameters in different environments will be further optimized, and the error correction scheme needs further study.

#### Data Availability

The labeled dataset used to support the findings of this study is available from the author upon request.

#### Conflicts of Interest

The author declares that there are no conflicts of interest.

#### Acknowledgments

This study was supported by the School of Physical Education, Wuhan University of Science and Technology.