Abstract

Modern information technologies such as big data and cloud computing are increasingly important and widely applied in engineering and management, and data mining has likewise had a positive effect on cold chain logistics. In particular, accurate prediction of cold chain logistics demand helps to optimize management processes and improve management efficiency, which is the main purpose of this research. In this paper, we analyze the existing problems of cold chain logistics in the Chinese market, especially with respect to demand prediction. We then build two forecasting models, one based on a neural network algorithm and one based on grey prediction, using data from 2013 to 2019 and R 4.0.2, to estimate cold chain logistics demand. Both models show high accuracy, and the prediction of the neural network model is closer to the actual value, with a smaller error. The neural network algorithm is therefore the preferred choice when constructing a mathematical forecasting model for cold chain logistics demand, and it provides a more accurate reference for the strategic deployment of logistics management, such as optimizing automation and innovation in cold chain processes to adapt to the trend.

1. Introduction

With economic growth and social development, living standards have improved steadily in recent years. A growing number of consumers hold updated views of health and consumption and pay more attention to quality of life, for example the freshness and variety of the products they purchase, which has greatly promoted the development of cold chain logistics. Cold chain logistics is a special category of supply chain logistics that relies on refrigeration technology to keep products within a specified temperature range throughout production, circulation, and sale. These specialized logistics activities aim to ensure the quality of fresh products and satisfy the needs of consumers; the cold chain is vital to preserving the integrity and freshness of temperature-sensitive products in transit. Fresh agricultural products generally include meat, aquatic products, milk, fruit, and vegetables. Because these products are prone to spoilage and deterioration, they must be transferred, stored, and distributed in an environment where temperature and humidity are specially designed and controlled. To preserve their freshness to the greatest extent and deliver them to consumers in good condition, they have to be handled in a suitable low-temperature environment. The newest data show that China’s cold chain logistics market reached 303.5 billion yuan in 2019. This large scale is due to the continuous enrichment of fresh products, which objectively increases the demand for low-temperature preservation and transportation. According to data from the National Bureau of Statistics of China, the production of fresh agricultural products is rising year by year, except for meat, and the total production of China’s fresh products is increasing overall (see Table 1). The steady growth in annual total output shown in Table 1 has eased the problem of supply shortage and provides a solid material foundation for the development of cold chain logistics.

At the same time, the continuous expansion and improvement of China's cold chain infrastructure provides equipment and technical support for the development of cold chain logistics. In 2019, the total cold storage capacity in China reached 60.53 million tons, and the number of refrigerated trucks reached 21,147 million tons. The increasing popularity of refrigeration equipment has greatly alleviated the problem of “availability of cold” and further released the development potential of cold chain logistics.

Overall, although cold chain logistics in China has made remarkable progress, logistics demand has not been fully met because supply has reacted relatively late to the cold chain market, leading to a mismatch between supply and demand. Worse still, the outbreak of COVID-19 has affected all walks of life, and cold chain logistics is no exception. As social distancing became an important factor in consumer behaviour, “contactless” distribution became increasingly popular and widely adopted, and many e-commerce platforms selling fresh products saw explosive growth during the epidemic. Faced with such a surge in market demand, the problems of the cold chain logistics industry became more severe: unstable product supply, shortage of personnel, lack of transport capacity, system collapse, and other issues that need to be solved. These problems can largely be explained by the rapid increase in cold chain logistics demand and the short-term lag of effective supply. Because cold chain equipment is expensive, most enterprises are unwilling to purchase large amounts of refrigeration equipment in advance when market demand is uncertain; to avoid waste and save expenses, they tend to add supporting facilities only when the demand arises. At the same time, compared with general logistics, cold chain logistics is harder to manage, has higher technical requirements, and has a longer equipment construction cycle, so preparations must be made in advance. Consequently, it is very difficult to supplement supply quickly once a shortage appears, which also explains why cold chain logistics was in short supply when the epidemic broke out. All of this analysis demonstrates the importance of demand forecasting for cold chain logistics.

To solve the above problems, a more reasonable approach is to forecast cold chain logistics demand scientifically by constructing mathematical models. Accurate prediction of the development trend of cold chain logistics is of great strategic significance for the sustainable development of the national economy. At the macro level, cold chain infrastructure construction, cold chain support policies, and similar measures are effective only if their layout is planned according to demand [1]. At the micro level, if the scale and development trend of demand can be predicted in advance, the forecast can serve as a weather vane for individual enterprises to a certain extent. This paper therefore uses data mining to analyze and forecast the demand for cold chain logistics. Following the mathematical logic of grey prediction and the neural network algorithm, we write and run R programs to establish two prediction models, compare their prediction performance, and identify the better one. By constructing a neural network model based on data mining, this paper provides a more accurate and effective prediction method, so as to promote the development of cold chain logistics engineering and management.

2. Literature Review

China’s cold chain logistics industry started late, and its foundation is relatively weak. The relevant statistics and research are not yet comprehensive. The existing literature on cold chain logistics focuses mainly on the following aspects. The first is qualitative research, which emphasizes theoretical analysis of the current status of cold chain logistics and of countermeasures. Although China’s cold chain logistics is booming, many problems remain that cannot be ignored. They mainly include backward infrastructure construction, insufficient investment of construction funds, an incompletely formed cold chain logistics system, and a low level of informatization. From the perspective of the development of the commercial circulation industry, China’s cold chain logistics has deficiencies in transportation cost, cold chain continuity, enterprise investment in development factors, agricultural product standardization, and cold chain awareness, all of which restrict its development [2]. In view of these problems, some scholars argue that cold chain logistics should be promoted through macromanagement, standardization, brand building, and the cultivation of consumer awareness. At the same time, investment in cold chain infrastructure and information construction should be increased, the construction and improvement of the agricultural product cold chain network system accelerated, and the research, development, and application of new cold chain technologies for agricultural products strengthened [3]. The development of cold chain logistics can also be promoted by strengthening coordination and supervision, strengthening policy support, increasing capital investment, and accelerating personnel training [4].

The second research hotspot in the cold chain logistics literature is empirical analysis of different influencing factors and evaluation of the effect and efficiency of cold chain logistics, with an emphasis on data mining and processing. Guo Mingde et al. calculated and analyzed the development level of cold chain logistics in 12 provinces and cities using factor analysis, a multilayer perceptron, and cluster analysis and pointed out that there are significant differences in the cold chain logistics level of agricultural products across China. The development level is highest in the eastern region, followed by the central region and then the western region; in general, the development level of cold chain logistics in China presents a ladder shape [5]. Gao Fan analyzed the factors influencing the deterioration loss of cold chain agricultural products and pointed out the trend in controlling this loss from the perspectives of cold chain inventory, cold chain storage, transportation and distribution time, cold chain investment, and temperature control [6]. Tian Yujie et al. adopted a comprehensive weighting method combining AHP and the entropy weight method to determine the weights of the comprehensive evaluation indices and evaluated the logistics safety of fresh products in terms of storage, transportation, packaging, distribution, and handling [7]. Cao Wujun et al. analyzed the distribution efficiency of cold chain logistics based on system dynamics and found that when enterprises pay attention to dynamic market demand, actively meet the needs of consumers, and increase the supply-demand ratio of chilled and fresh meat products, distribution efficiency can be improved to a certain extent, thus improving customer satisfaction and increasing the market share of enterprises [8]. From the perspective of time and space, Wang Jun and Li Hongchang discussed the role of the intermediate layer organization in agricultural product cold chain logistics and pointed out that this role is affected both by the external environment, such as social economy and policy, and by internal governance performance [9].

The third strand of the current literature combines qualitative and quantitative methods and focuses on practical problems in cold chain logistics, such as the location of distribution points, quality and safety monitoring systems, and path optimization. Addressing the network layout and transportation problems of cold chain logistics networks, Zhang Wenfeng proposed a nonlinear mixed integer programming model with the construction cost and operation cost of the network as the optimization objectives and solved it with a quantum particle swarm optimization algorithm, thereby addressing the layout of precooling stations and cold chain logistics distribution centers in the network [10]. Zhao Zhixue et al. considered economic cost and environmental cost and integrated a road congestion factor into a green vehicle routing optimization model for cold chain logistics [11]. Zhou Qiang et al. established a mathematical model of distribution route optimization with minimum total cost based on the Internet and cloud computing technology, together with an intelligent comprehensive prevention and control system whose core functions include real-time monitoring, safety early warning, and food traceability [12]. Based on real-time traffic information, Yao Yuanguo et al. analyzed the fixed cost, transportation cost, refrigeration cost, damage cost, and penalty cost of cold chain distribution of agricultural products and established a mathematical model of distribution path optimization that minimizes total cost [13]. To address the slow convergence caused by insufficient pheromone in the initial stage of the ant colony algorithm, Fang Wenting et al. constructed a hybrid ant colony algorithm, established a mathematical model of cold chain logistics path optimization with minimum total cost as the objective, and carried out simulation optimization and comparative analysis on an example to verify the effectiveness of the model and algorithm [14]. Similarly, drawing on the technical advantages of the Internet and the characteristics of cold chain logistics, Yao Zhen and Zhang Yi proposed a logistics distribution path optimization model with minimum total cost as the objective function and solved it with an improved genetic algorithm; the effectiveness and rationality of the model and algorithm were demonstrated by an example [15].

Demand forecasting is also a research hotspot in the cold chain logistics literature. It can help avoid oversupply or undersupply of cold chain logistics and plays a role in indicating the direction of investment, so it has attracted growing scholarly attention in recent years. Using a weight-distribution combination method, Wang Xiumei combined three single forecasting methods (partial least squares, time series ARIMA, and quadratic exponential smoothing) to predict the cold chain logistics demand trend for China’s major agricultural products, namely aquatic products, meat, poultry, eggs and milk products, and fruit and vegetable products, and pointed out that the forecast accuracy of the combination method is better than that of the three single methods [16]. Yuan Jing compared single-method forecasts of agricultural cold chain logistics (extended trend, exponential smoothing, neural network algorithms, regression methods, and grey forecasting) with a positive-weight combination forecast and found that the combination forecast is closer to the true value [17]. Based on the grey model, support vector machine, BP neural network, RBF neural network, and genetic neural network, Wang Xiaoping and Yan Fei established forecasting models of cold chain logistics demand for agricultural products. They found that the ability of the five models to analyze this demand ranks as follows: genetic neural network model > RBF neural network model > BP neural network model > support vector machine model > grey model; that is, the genetic neural network model has advantages in cold chain logistics demand analysis [18]. In follow-up research, they constructed an index system from the five perspectives of agricultural product supply, social economy, cold chain development, humanities, and logistics demand scale. Using the global search capability of the genetic algorithm, they constructed a BP neural network to forecast the cold chain logistics demand for urban agricultural products [19]. Yang Yangwei and Cao Wei integrated traceability information and monitoring information in the supply chain and also used a BP neural network to establish an early warning indicator system for fruits and vegetables [20].

The academic community attaches great importance to research on cold chain logistics technology. Many scholars are keen to explore regional cold chain logistics development strategies and pay attention to the analysis of cold chain demand. However, quantitative research on cold chain logistics demand forecasting is still scarce, especially on the factors that influence demand. Considering that cold chain logistics demand is easily affected by many factors and that the available annual statistics are limited, this research seeks a forecasting model suited to cold chain logistics demand.

3. Research Methods

3.1. Backpropagation Network (BPN)

The first model constructed in this research is the backpropagation neural network prediction model. Specifically, the neuralnet function in R is used to run the backpropagation neural network algorithm and construct a predictive model.

Artificial neural networks are a well-known approach to multitarget prediction problems. They cover a large number of models and learning methods, which are mainly divided into feedforward neural networks and feedback neural networks [21]. By imitating the behavioural characteristics of animal neural networks, they carry out distributed and parallel information processing and thereby form algorithmic mathematical models. Such a network relies on the complexity of the system and processes information by adjusting the weights and thresholds of the interconnections between a large number of internal nodes. The backpropagation neural network is the most widely used neural network. It is a multilayer feedforward network trained by error backpropagation. The basic idea is to use gradient descent to search for the weights that minimize the mean square error between the predicted output and the expected output of the network. The calculation structure of the neural network nodes is shown below (see Figure 1).

As shown in Figure 1, the input layer includes two input nodes (X1, X2), the middle layer includes two hidden nodes (H1, H2), and the output layer has one output node (Y1). If the input vector received by a node is denoted by X, the vector of weights linking the node to the upper layer by W, and the threshold of the node by $\theta_j$, then the adder of the jth node is defined as follows:

$$U_j = \sum_{i=1}^{n} W_{ij} X_i - \theta_j \tag{1}$$

In (1), n is the number of nodes in the upper layer, $X_i$ is the output of the ith node in the upper layer, and $W_{ij}$ is the link weight between the ith node in the upper layer and the jth node in this layer. The activation function y = F(U), using the sigmoid function, is defined as

$$y = F(U) = \frac{1}{1 + e^{-U}} \tag{2}$$

In the backward pass phase, the link weights between nodes are corrected in the reverse direction according to the prediction error. After repeated corrections, the predicted output of the BPN approaches the target value. The error function is as follows:

$$E = \frac{1}{2} \sum_{i} \left(T_i - Y_i\right)^2 \tag{3}$$

In (3), $T_i$ is the expected output value of the ith output neuron in the output layer and $Y_i$ is the predicted output value of the ith output neuron in the output layer. The weight correction is obtained from the partial derivative of the error function with respect to the weight as follows:

$$\Delta W = -\eta \frac{\partial E}{\partial W} \tag{4}$$

In (4), $\Delta W$ is the correction amount of the link weight of each layer of neurons and $\eta$ is the learning rate parameter of the neural network, which mainly controls the learning speed of the BPN.
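
As a minimal sketch of Eqs. (1) to (4), the forward pass and one gradient-descent weight correction for the 2-2-1 network of Figure 1 can be written in base R. The function names, the initial weights, the sample `x`, the `target`, and the learning rate `eta = 0.5` are hypothetical values chosen for illustration. The sketch assumes that thresholds are subtracted from the weighted sum and that the sigmoid is also applied at the output node, so the target is taken to be scaled into (0, 1). It is not the neuralnet implementation used later in this paper.

```r
# Sigmoid activation, Eq. (2)
sigmoid <- function(u) 1 / (1 + exp(-u))

# Forward pass through a 2-2-1 network; thresholds are subtracted as in Eq. (1)
forward <- function(x, p) {
  h <- sigmoid(as.vector(x %*% p$W1) - p$theta1)  # hidden outputs H1, H2
  y <- sigmoid(sum(h * p$W2) - p$theta2)          # network output Y1
  list(h = h, y = y)
}

# One gradient-descent correction, Eqs. (3)-(4), for a single sample
bp_step <- function(x, target, p, eta = 0.5) {
  f <- forward(x, p)
  delta_out <- (f$y - target) * f$y * (1 - f$y)    # dE/dU at the output node
  delta_hid <- delta_out * p$W2 * f$h * (1 - f$h)  # dE/dU at the hidden nodes
  p$W2     <- p$W2 - eta * delta_out * f$h         # Delta W = -eta * dE/dW
  p$theta2 <- p$theta2 + eta * delta_out           # plus sign because dU/dtheta = -1
  p$W1     <- p$W1 - eta * outer(x, delta_hid)
  p$theta1 <- p$theta1 + eta * delta_hid
  p
}

# Hypothetical starting weights and one scaled training sample
p <- list(W1 = matrix(c(0.1, -0.2, 0.3, 0.4), 2, 2), theta1 = c(0, 0),
          W2 = c(0.2, -0.1), theta2 = 0)
x <- c(0.2, 0.4)
target <- 0.7

sq_error <- function(p) 0.5 * (target - forward(x, p)$y)^2  # Eq. (3)
e0 <- sq_error(p)
for (i in 1:200) p <- bp_step(x, target, p)
e1 <- sq_error(p)
```

Comparing `e0` and `e1` shows the error after repeated corrections relative to the starting error; a larger `eta` speeds up learning at the risk of overshooting.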

3.2. Grey Prediction

Grey prediction identifies the degree of difference in development trends among system factors, processes the original data to find the law of system change, generates a data sequence with strong regularity, and then establishes the corresponding differential equation model to predict the future development trend [22]. It constructs a grey prediction model from a series of quantitative values, observed at equal time intervals, that reflect the characteristics of the predicted object, in order to predict the feature quantity at a certain future time or the time at which a certain feature quantity will be reached [23]. To weaken the randomness of the original time series, the series must be preprocessed before the grey forecast model is established. The processed time series is called the generated sequence. The data processing methods commonly used in grey systems are accumulated generation and inverse accumulated generation. The steps required to run the grey prediction model are as follows.

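Generate the original modelling sequence as follows:

$$X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right) \tag{5}$$

Calculate the accumulated generating sequence as follows:

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n \tag{6}$$

Calculate the mean of adjacent accumulated values as follows:

$$z^{(1)}(k) = \frac{1}{2}\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n \tag{7}$$

Construct matrix B and matrix Y as follows:

$$B = \begin{pmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{pmatrix}, \quad Y = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{pmatrix} \tag{8}$$

Solve for the model parameters (a, b), where a is the development coefficient and b is the grey action quantity, as follows:

$$\begin{pmatrix} a \\ b \end{pmatrix} = \left(B^{T} B\right)^{-1} B^{T} Y \tag{9}$$

Substitute (a, b) into the improved grey prediction model formula and calculate for k = 1, 2, 3,..., n as follows:

$$\hat{x}^{(0)}(k+1) = (-a)\left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak} \tag{10}$$

The relative error of the grey prediction model is as follows:

$$e(k) = \frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)} \tag{11}$$

The average relative error of the grey prediction model is as follows:

$$\bar{e} = \frac{1}{n} \sum_{k=1}^{n} \left|e(k)\right| \tag{12}$$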

Generally, the average prediction accuracy should be higher than 80%. If the prediction accuracy is above 90%, it means that the prediction effect of the model is better.
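
The error measures above can be sketched in a few lines of R. The function names `relative_error` and `prediction_accuracy` are hypothetical, and the definition of accuracy as one minus the mean absolute relative error is an assumption, since the text gives only the 80% and 90% thresholds. The usage lines apply the functions to the 2019 figures reported in Section 4.3 (actual value 3.78, grey forecast about 4.052).

```r
# Relative error of each prediction: (actual - predicted) / actual
relative_error <- function(actual, predicted) (actual - predicted) / actual

# Accuracy in percent, ASSUMED to be (1 - mean absolute relative error) * 100
prediction_accuracy <- function(actual, predicted) {
  (1 - mean(abs(relative_error(actual, predicted)))) * 100
}

# 2019 values from Section 4.3: actual 3.78, grey forecast about 4.052
err_2019 <- relative_error(3.78, 4.052)
acc_2019 <- prediction_accuracy(3.78, 4.052)
round(err_2019, 3)  # -0.072
round(acc_2019, 1)  # 92.8
```

Both functions are vectorized, so passing the full actual and fitted series instead of single values gives the average over all years.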

4. Prediction Modelling

4.1. Data Source

The data for this study come from the official website of the National Bureau of Statistics of China, the Cold Chain Committee of China Logistics Alliance, and iiMedia Data Center (see Table 2). The time span of the data is from 2013 to 2019. When establishing the neural network model, the input indicators include per capita disposable income of residents and per capita consumption expenditure of residents, and the output indicator is the scale of China’s cold chain market.

In the past literature, the production of fresh agricultural products was usually taken directly as the demand for cold chain logistics. Although this definition is reasonable and easy to apply, it has a defect: it ignores cases in which the target market and the place of origin are the same. The sales destinations of fresh agricultural products can be divided mainly into urban and rural areas. Products sold in urban areas do require cold chain logistics, but some products sold in rural areas go to villages near the place of origin. The transportation distance is short and the transit time is very brief, so the demand for cold chain logistics for these products is weak. In addition, some fresh agricultural products are sold directly at the place where they were grown, and this part in fact generates no effective demand for cold chain logistics. Moreover, there is no clear correlation between changes in the total output of fresh agricultural products and the regional distribution of sales. That is to say, when output increases, we cannot tell whether the additional output is transported to cities and towns, transported to the surrounding rural areas, or kept at the place of origin for local sale. If the output of fresh agricultural products is used as the demand for cold chain logistics, it cannot fully or accurately reflect real cold chain demand. This article therefore takes China's cold chain market size as a proxy indicator of cold chain demand. There is a significant positive correlation between demand and market size, and the main purpose of predicting future demand is to understand the development trend of cold chain logistics accurately; market size is also an important measure of the development trend of the industry. The scale of China’s cold chain market is thus a more reasonable proxy indicator of cold chain demand, and it is used in the construction of both models. In addition to the expected output variable, however, the neural network algorithm needs exogenous influencing factors as input variables, so this article selects the per capita disposable income of residents and the per capita consumption expenditure of residents as proxy indicators. The reason is that the role of cold chain logistics is mainly to ensure the quality of fresh food, which reflects residents’ pursuit of a quality life, and these two indicators largely reflect residents’ income level and consumption attitudes, which are the economic foundation and endogenous driving force of that pursuit. Therefore, per capita disposable income and per capita consumption expenditure can be used as important factors affecting the demand for the cold chain.

4.2. Model Based on Neural Network Algorithm

According to the mathematical logic of the neural network and the rules of the R language, we write the neural network program for data mining and substitute the relevant index values into it. The data from 2013 to 2018 are used to train and revise the model, and the latest data of 2019 are used to test the difference between the predicted value and the actual value. The complete R program is as follows:

    install.packages("neuralnet")
    library(neuralnet)

    # Training data, 2013-2018: two input indicators and the output indicator
    traininginput1 <- as.data.frame(c(1.83, 2.02, 2.20, 2.38, 2.60, 2.82))
    traininginput2 <- as.data.frame(c(1.32, 1.45, 1.57, 1.71, 1.83, 1.98))
    trainingoutput <- as.data.frame(c(1.26, 1.5, 1.8, 2.25, 2.55, 3.035))
    trainingdata <- cbind(traininginput1, traininginput2, trainingoutput)
    trainingdata
    colnames(trainingdata) <- c("Input1", "Input2", "Output")
    trainingdata

    # Fit the network: two hidden layers (5 and 3 neurons), 50 repetitions
    f <- Output ~ Input1 + Input2
    net.sqrt <- neuralnet(f, trainingdata, hidden = c(5, 3), rep = 50,
                          algorithm = "rprop+", threshold = 0.001,
                          linear.output = TRUE)
    print(net.sqrt)
    plot(net.sqrt)

    # Test data, 2019: inputs and the actual (expected) output
    testdata1 <- as.data.frame(c(3.07))
    testdata2 <- as.data.frame(c(2.16))
    testdataout <- as.data.frame(c(3.78))
    testdata <- cbind(testdata1, testdata2)
    colnames(testdata) <- c("testdata1", "testdata2")
    testdata

    # Predict 2019 and show it next to the actual value
    net.results <- compute(net.sqrt, testdata)
    cleanoutput <- cbind(testdata1, testdata2, testdataout,
                         as.data.frame(net.results$net.result))
    colnames(cleanoutput) <- c("testdata1", "testdata2",
                               "Expected Output", "Neural Net Output")
    print(cleanoutput)

The code above establishes the neural network. The parameter “hidden” determines the number of hidden layers and the number of neurons in each. This study uses a multilayer structure, so “hidden = c(5, 3)” is specified, which means that there are two hidden layers: the first has 5 neurons, and the second has 3 neurons. The parameter “threshold” sets the stopping threshold for the partial derivatives of the neural network error function. This study sets it to 0.001, which means that training stops once these partial derivatives fall below 0.001. The parameter “rep” sets the number of times the neural network training is repeated. Generally speaking, the more repetitions, the higher the prediction accuracy. However, the plotting capacity of R 4.0.2 has an upper limit of 63, that is, rep < 63, and testing showed that the prediction results with more than 50 repetitions are not ideal: in the interval [50, 63] the prediction accuracy decreases rather than increases. This study therefore sets the number of training repetitions to 50. The parameter “algorithm” selects the training algorithm of the neural network; the options include backprop, rprop+, rprop-, etc. Because backprop performed poorly in tests with the neuralnet function, this study uses resilient backpropagation (rprop+) to construct the prediction model. The parameter “linear.output” sets whether the output neurons are linear: if it is set to “TRUE”, the activation function is not applied to the output and the result is a continuous numeric value; otherwise, the activation function is applied to the output. This study requires numeric output, so it is set to “TRUE.” Running the above R program yields the neural network structure diagram (see Figure 2).

In the neural network structure diagram, the value on each black line is the link weight between neurons, and the value on each blue line is the bias term added at that step of the network. The figure shows that a total of 21198 steps were taken to construct the model and that the error was only 0.0019. After running the code in R more than 50 times, we find that the output of the neural network algorithm once reached 3.60, which differs only slightly from the expected output, that is, the true value of 3.78. The backpropagation neural network therefore has effective predictive ability.

4.3. Model Based on Grey Prediction

According to the R language rules and the mathematical logic of grey prediction, a grey prediction program is written, and the required data are substituted into the program to run. The specific program and process are as follows:

    # Original sequence, 2013-2019
    X0 = c(1.26, 1.5, 1.8, 2.25, 2.55, 3.035, 3.78)
    X1 = 0
    X2 = 0
    z = 0
    X1 = cumsum(X0)                        # accumulated sequence
    n = length(X0)
    for (k in 2:n) z[k] = (1/2) * (X1[k] + X1[k - 1])   # adjacent means

    # Build B and Y and solve for (a, b) by least squares
    b1 = -z[2:n]
    b2 = rep(1, length(z) - 1)
    b3 = c(b1, b2)
    B = matrix(b3, nrow = length(b1), ncol = 2)
    Y = X0[2:n]
    Y = matrix(Y, nrow = length(b1), ncol = 1)
    bata <- solve(t(B) %*% B) %*% t(B) %*% Y
    a = bata[1]
    b = bata[2]

    # Fitted values; 1:(n - 1) is what the original 2:n-1 evaluates to in R
    X2[1] = X0[1]
    for (k in 1:(n - 1)) {X2[k + 1] = (-a) * (X0[1] - b/a) * exp(-a * k)}
    err = (X0 - X2) / X0                   # relative error for each year

As shown in Figure 3, the demand for cold chain logistics in 2019 obtained by the grey forecast model is about 4.052. Compared with the real value of 3.78, the relative error is −0.072. Although the absolute value of this error is small, it is still slightly larger than that of the neural network model, whose relative error is −0.048. In other words, the neural network algorithm model has higher accuracy and gives better forecasts of cold chain logistics demand.
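
The grey program in this section fits the model on all seven observations, including 2019, whereas the neural network is trained on 2013 to 2018 only. The sketch below wraps the same grey prediction steps in a function so that the model can also be fitted on 2013 to 2018 and used to forecast 2019 as a held-out year. The function name `gm_forecast` and its argument `h` are hypothetical, and this holdout evaluation is an illustration rather than part of the procedure reported here.

```r
# Fit the grey model on x0 and return fitted values plus h forecasts beyond
# the end of the series, using the (-a) * (x0[1] - b/a) * exp(-a * k) form.
gm_forecast <- function(x0, h = 1) {
  n  <- length(x0)
  x1 <- cumsum(x0)                         # accumulated sequence
  z  <- (x1[-1] + x1[-n]) / 2              # adjacent means z(2), ..., z(n)
  B  <- cbind(-z, 1)
  Y  <- matrix(x0[-1], ncol = 1)
  ab <- solve(t(B) %*% B) %*% t(B) %*% Y   # least-squares estimate of (a, b)
  a  <- ab[1]
  b  <- ab[2]
  k  <- 1:(n - 1 + h)
  c(x0[1], (-a) * (x0[1] - b / a) * exp(-a * k))
}

# Cold chain market size series, 2013-2019, as used in the program above
market <- c(1.26, 1.5, 1.8, 2.25, 2.55, 3.035, 3.78)

in_sample <- gm_forecast(market, h = 0)       # fit on all seven years
holdout   <- gm_forecast(market[1:6], h = 1)  # fit on 2013-2018, forecast 2019
rel_err   <- (market[7] - holdout[7]) / market[7]
```

Here `in_sample[7]` reproduces the fitted 2019 value of about 4.052 quoted above, while `holdout[7]` and `rel_err` give the forecast and relative error when 2019 is excluded from fitting.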

5. Conclusions

This paper analyses the problems of cold chain logistics in the Chinese market and constructs demand prediction models based on data mining. The neural network algorithm and grey prediction are implemented in R to study cold chain logistics demand. The results show that both the grey prediction model and the neural network model are highly accurate and predict well, while the neural network model has a smaller error and slightly higher accuracy than the grey prediction model. The neural network model is therefore more suitable for predicting demand in cold chain logistics management.

The contribution of this paper is to identify the better of the two mathematical models based on data mining. According to the findings, neural network models should be used more widely for demand forecasting in cold chain logistics and supply chain management, and more appropriate mathematical models should be used for data mining to provide forward-looking guidance for production. First, forecasting can guide the production scale of farmers who grow fresh agricultural products, ensuring an abundant supply of goods without excessive waste. Second, it helps logistics companies clarify their budgets for refrigeration equipment and staffing, such as the number of refrigerated trucks and cold storage facilities, so that cold chain transportation capacity is secured without imposing too great a financial burden on the enterprise. Third, it promotes the maintenance and upgrading of enterprises' information management systems and helps them clarify their investment in data management. In short, forecasting cold chain demand is very important. It gives a clearer understanding of the future scale and development trend of cold chain logistics demand, provides forward-looking guidance for logistics management, and better ensures the balance of supply and demand.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Social Science Foundation of China (NSSFC) under grant no. 17BGL238.