Abstract

Container throughput forecasting plays an important role in port capacity planning and management. The container throughput of Tianjin-Hebei Port Group is an incomplete grey information system affected by various factors, so a single forecasting model often gives unsatisfactory results. This paper therefore studies the problem by combining the fractional GM (1, 1) model with a BP neural network. The comparison results show that the combination model achieves higher forecasting accuracy than each of the single models considered. The combination model is then used to forecast the container throughput of Tianjin-Hebei Port Group from 2021 to 2025, providing a data reference for the future development and optimization of the group’s container operations.

1. Introduction

Sea transport is the most important mode of transportation in international trade. Between 80% and 90% of China’s total import and export goods are carried by sea [1], with container transport accounting for the largest share. This is due to the development of container transportation, which not only moves operations toward consolidation and rationalization but also saves packaging materials and miscellaneous costs, protects cargo integrity, shortens transport time, and thus reduces transport cost. Tianjin-Hebei Port Group, located on the west bank of China’s Bohai Economic Rim (Figure 1), is one of the shipping hubs of northern China and mainly comprises Tianjin Port, Tangshan Port, Qinhuangdao Port, and Huanghua Port. In the first quarter of 2021, Tianjin Port handled a container throughput of 4.469 million TEU, up 20.4% year on year and a record high for the period [2]. Tangshan Port, Qinhuangdao Port, and Huanghua Port together handled 882,000 TEU, an increase of 31.9% in the first quarter of 2021 [3]. Against the background of the international and domestic dual circulation, the Tianjin-Hebei Port Group has become a primary support for the Beijing-Tianjin-Hebei region’s participation in the international division of labor, cooperation, and competition, and a key driver of the region’s economic development.

The structural arrangement of this paper is as follows: the second part reviews the relevant literature, the third part introduces the research methods, the fourth part gives the forecasting results and discusses the results, and the fifth part gives the research conclusions of this paper.

2. Literature Review

2.1. Importance of the Port Throughput Forecasting

Port throughput is a main scale index of port development and plays a crucial basic role in port planning and management. Short-term forecasting of port throughput supports port enterprises in resource prescheduling and intelligent port scheduling, while long-term forecasting affects port strategic planning and national development strategy. Port throughput forecasting plays a particularly important role in port management for the following reasons. (1) Port infrastructure has a long technical life, and investments in it are indivisible and irreversible [4]. Once the infrastructure is in place, the characteristics of the port are determined for a long period [5]. Furthermore, port planning processes may take 5–15 years from the initiation of the masterplan to its final approval [6], so port capacity must be determined with port throughput forecasts in mind. (2) Port projects require large capital and fixed investments with long payback periods. The financial viability of these investments must therefore be assessed from projections of port throughput and commodity flows [7]. A relatively accurate throughput forecast provides valuable reference for port investment and construction; an inaccurate one is likely to cause wrong investment decisions and incalculable economic losses. (3) Under the “Belt and Road Initiative,” China’s national sea strategy has received more and more attention, and the optimization of port construction has become a top priority. For port construction to serve future market demand accurately, accurate forecasting of port throughput is of great practical significance for improving port freight efficiency and raising economic benefits.

2.2. Influencing Factors and Various Methodologies on the Issue

Many scholars have carried out extensive research on container throughput forecasting using different methods. Four main factors dominate port container throughput: (1) the impact of the world economy; (2) the impact of the port’s external environment; (3) the impact of port supply and demand; and (4) the impact of the port’s own conditions. The issue also faces miscellaneous and volatile uncertainties. For example, the outbreak of the US-China trade war has affected the development and pattern of global trade, changing the cost, circulation, and price of commodities and thus having an unpredictable impact on the shipping industry and the demand for cargo transportation [8]; the outbreak of COVID-19 has made cargo flows uncertain, increasing the challenges of decision-making in port development projects [9]. Chen et al. [10] noted that, because of the various factors affecting throughput, it is difficult to rely on a single linear or nonlinear model when the forecasted data fluctuate. Based on a broad survey of the literature, Xiao et al. [11] summarized the quantitative prediction methods for port throughput, which mainly include the following:

(1) Time Series Method. This method establishes mathematical models based on historical throughput data, including the autoregressive integrated moving average (ARIMA) model, exponential smoothing, the grey model (GM), and the decomposition method (X-11). For example, Rashed et al. [12] used the ARIMA intervention model to predict the container throughput of the Port of Antwerp. Chen et al. [13] used the improved GM (1, 1) model to predict the container throughput of Shanghai Port.

(2) Causal Analysis Models. This method examines the correlation between the port hinterland throughput and a series of economic indicators and establishes a throughput forecast model from the relevant indicators. Currently, these methods mainly include regression analysis and the elastic coefficient method. For example, Rashed et al. [14] used the autoregressive distributed lag model combined with economic scenarios to analyze and predict the relationship between container throughput and the EU trade index of 19 countries.

(3) Nonlinear Dynamics Forecasting Models. The time series and causal analysis models can obtain satisfactory forecasting performance when the container throughput time series is linear or nearly linear. However, the factors affecting container throughput are complex, and throughput fluctuations often reveal strong nonlinear dependencies, so forecasts that rely only on these linear models may perform very poorly. Recently, nonlinear dynamic prediction models such as the artificial neural network (ANN) and genetic programming (GP) have been introduced into container throughput forecasting. For example, Fang and Fang [15] built a port throughput prediction model based on the BP neural network algorithm. Chen and Chen [16] studied port container throughput prediction using a genetic programming-based approach. Eskafi et al. [17] used Bayesian estimation models, taking into account the epistemic uncertainty affecting macroeconomic variables, to forecast the annual throughput of the multipurpose Port of Isafjordur in Iceland.

(4) Combined Forecasting Method. This method combines two or more prediction models so that they compensate for each other’s defects and improve data processing, giving more stable and accurate results. For example, Fang and Fang [15] studied port throughput prediction in Guangdong Province using a multivariate combination model of the genetic algorithm (GA) and the back propagation neural network (BPNN). Chen et al. [10] studied combinations of the pearl curve model, GM (1, 1), and double exponential smoothing, which in their case study performed better than any of the single models being combined.

These studies provide valuable reference for port throughput forecasting research, but some of the models have limitations: some lack input data, which limits their performance, increases uncertainty, and reduces the reliability of the prediction results [18]; others treat uncertainty only to a limited extent, considering internal factors and ignoring external ones [5]. Chen et al. [10] noted that a single model may produce inaccurate predictions because of the numerous influencing factors.

2.3. Introduction to the Methodology in This Paper

To achieve better performance and more accurate results, and considering the many interacting factors and the research reviewed above, we establish a combination model for container throughput forecasting consisting of the fractional GM (1, 1) model and the BP neural network model.

In 1981, Professor Deng [19] put forward and introduced [20] the concept of the grey system. Over the last three decades, many scholars have studied and developed modified grey models to remedy its defects and improve performance. Wu et al. [21, 22] first introduced fractional accumulation into grey system models, an innovation that markedly improved the prediction precision of grey models. In recent years, many new grey models have been put forward and relevant research has been carried out [23, 24]. Among these models, the fractional-order GM (1, 1) is practically mature and widely applied, as is the BP neural network. The BP neural network was proposed by a group of scientists led by Rumelhart and McClelland in 1986. It is a multilayer feed-forward neural network trained with the error back propagation algorithm, and it is the most widely used neural network [25]. For recent examples, Gao et al. [26] used the BP neural network to forecast short-term rainstorms; Deshwal et al. [27] established a language recognition system using the BP neural network model; Duddu et al. [28] used the BP neural network model to predict visibility at the road connectivity level; and Liu et al. [29] used fractional GM (1, 1) and the BP neural network for power load forecasting.

In this paper, we combine the fractional GM (1, 1) model and the BP neural network model, use the detailed data of the Port Statistical Yearbook of China [30] and the officially published information from the Ministry of Transport of China [3], forecast the container throughput, and provide 5-year reference data for port enterprises in resource prescheduling and intelligent port scheduling.

3. Methodologies

3.1. Modeling and Testing Method of FGM (1, 1) Model

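The advantage of the GM (1, 1) model is that it can handle grey information and small samples, but in some cases the model suffers from large errors and performs unstably. To improve on it, the fractional GM (1, 1) model (FGM (1, 1)) selects an appropriate accumulation order, which reduces the error and yields better prediction results [21]. The basic process of the FGM (1, 1) model is as follows:

(1) From the original nonnegative data, the original sequence is

$$X^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\}.$$

(2) Based on the original nonnegative sequence, the $r$-order accumulation sequence is

$$X^{(r)} = \{x^{(r)}(1), x^{(r)}(2), \ldots, x^{(r)}(n)\},$$

where

$$x^{(r)}(k) = \sum_{i=1}^{k} C_{k-i+r-1}^{k-i}\, x^{(0)}(i), \quad k = 1, 2, \ldots, n,$$

with $C_{r-1}^{0} = 1$, $C_{k}^{k+1} = 0$, and

$$C_{k-i+r-1}^{k-i} = \frac{(k-i+r-1)(k-i+r-2)\cdots(r+1)\,r}{(k-i)!}.$$

(3) The whitening differential equation is established as

$$\frac{dx^{(r)}(t)}{dt} + a\,x^{(r)}(t) = b.$$

Its solution is an exponential function. The parameters are solved by the least square method:

$$[\hat{a}, \hat{b}]^{T} = (B^{T}B)^{-1}B^{T}Y,$$

where

$$B = \begin{bmatrix} -\tfrac{1}{2}\left(x^{(r)}(1) + x^{(r)}(2)\right) & 1 \\ -\tfrac{1}{2}\left(x^{(r)}(2) + x^{(r)}(3)\right) & 1 \\ \vdots & \vdots \\ -\tfrac{1}{2}\left(x^{(r)}(n-1) + x^{(r)}(n)\right) & 1 \end{bmatrix}, \quad Y = \begin{bmatrix} x^{(r)}(2) - x^{(r)}(1) \\ x^{(r)}(3) - x^{(r)}(2) \\ \vdots \\ x^{(r)}(n) - x^{(r)}(n-1) \end{bmatrix}.$$

(4) The time response function is

$$\hat{x}^{(r)}(k+1) = \left(x^{(0)}(1) - \frac{\hat{b}}{\hat{a}}\right)e^{-\hat{a}k} + \frac{\hat{b}}{\hat{a}},$$

where $\hat{x}^{(r)}(k+1)$ is the fitted value at time $k+1$.

(5) For the sequence $\hat{X}^{(r)} = \{\hat{x}^{(r)}(1), \hat{x}^{(r)}(2), \ldots\}$, the reduction of the sequence is

$$\alpha^{(r)}\hat{X}^{(r)} = \{\alpha^{(1)}\hat{x}^{(r)(1-r)}(1), \alpha^{(1)}\hat{x}^{(r)(1-r)}(2), \ldots\},$$

where $\hat{x}^{(r)(1-r)}$ denotes the $(1-r)$-order accumulation of $\hat{X}^{(r)}$ and

$$\alpha^{(1)}\hat{x}^{(r)(1-r)}(k) = \hat{x}^{(r)(1-r)}(k) - \hat{x}^{(r)(1-r)}(k-1).$$

Through this reduction operation, the prediction sequence is $\hat{x}^{(0)}(1), \hat{x}^{(0)}(2), \ldots, \hat{x}^{(0)}(n), \ldots$.

(6) The model is evaluated using the mean absolute percentage error (MAPE), where

$$\mathrm{MAPE} = \frac{1}{n}\sum_{k=1}^{n}\left|\frac{x^{(0)}(k) - \hat{x}^{(0)}(k)}{x^{(0)}(k)}\right| \times 100\%.$$

When $r = 1$, the FGM (1, 1) model is the grey GM (1, 1) model.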
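The six steps above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes the standard fractional accumulation with generalized binomial weights from Wu et al. [21], and it carries out the reduction step as an accumulation of order $-r$, which is algebraically the same as taking the first difference of the $(1-r)$-order accumulation. The function names and the sample series are hypothetical.

```python
import numpy as np


def frac_coeff(order, j):
    """Generalized binomial weight order*(order+1)*...*(order+j-1) / j!."""
    c = 1.0
    for m in range(j):
        c *= (order + m) / (m + 1)
    return c


def frac_accumulate(x, order):
    """Accumulation of the given order; a negative order undoes it."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    w = np.array([frac_coeff(order, j) for j in range(n)])
    # out[k] = sum_{i<=k} w[k-i] * x[i]
    return np.array([np.dot(w[:k + 1][::-1], x[:k + 1]) for k in range(n)])


def fgm11_fit_predict(x0, r, horizon=0):
    """Fit FGM(1,1) of order r; return fitted values plus `horizon` forecasts."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    xr = frac_accumulate(x0, r)                      # step (2)
    B = np.column_stack([-0.5 * (xr[:-1] + xr[1:]), np.ones(n - 1)])
    Y = xr[1:] - xr[:-1]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]      # step (3)
    k = np.arange(n + horizon)
    xr_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a  # step (4)
    return frac_accumulate(xr_hat, -r)               # step (5)


def mape(actual, predicted):
    """Mean absolute percentage error in percent, step (6)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)


# Hypothetical series growing 10% per step (not data from this paper).
series = np.array([100.0, 110.0, 121.0, 133.1, 146.41])
fitted = fgm11_fit_predict(series, r=1.0, horizon=2)
print(len(fitted), mape(series, fitted[:len(series)]))
```

With `r=1.0` the weights are all ones, so `frac_accumulate` becomes a cumulative sum and the sketch reduces to the classical GM (1, 1) model.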

3.2. Establishment Process of BP Neural Network Based on Layer Training

The BP neural network is a multilayer feed-forward neural network (MLFNN). Its main features are forward transmission of the signal and back propagation of the error. The input signal is processed by the hidden layer and transmitted to the output layer. If the output layer nodes fail to reach the expected output, the network enters the error back propagation phase: the output error is passed back in some form through the hidden layer toward the input layer and apportioned among the nodes of each layer. The error signal of each layer unit obtained in this way is the basis for modifying the weights of each unit.
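The forward transmission and error back propagation just described can be sketched with NumPy. The sketch assumes one hidden layer, sigmoid activations, a mean square error loss, and full-batch gradient descent; the layer size, learning rate, function names, and toy data are hypothetical and are not the configuration used in this paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_bp(X, y, hidden=4, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network; return (parameters, loss history)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward transmission: input -> hidden layer -> output layer.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Back propagation: the output error is apportioned to each layer.
        # (The constant factor 2 of the MSE gradient is absorbed into lr.)
        d_out = err * out * (1.0 - out)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        # Weight and threshold updates from each layer's error signal.
        W2 -= lr * (h.T @ d_out) / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_h) / len(X)
        b1 -= lr * d_h.mean(axis=0)
    return (W1, b1, W2, b2), losses


def predict(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)


# Hypothetical toy data already scaled into [0, 1].
X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
y = 0.2 + 0.6 * X
params, losses = train_bp(X, y)
print(losses[0], losses[-1])
```

The sigmoid output keeps predictions inside (0, 1), which is one reason the inputs and targets of such a network are normalized before training.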

The BP algorithm uses only the first derivative (gradient) of the mean square error function with respect to the weights and thresholds, so it converges slowly and easily falls into local minima. To address this problem, Hinton and Salakhutdinov [31] proposed an unsupervised greedy layer-wise training algorithm, a machine learning method for deep neural networks inspired by how the human brain learns, which brought hope for solving the optimization problems of deep structures. The main idea of the layer-by-layer training algorithm is to train only one layer of the network at a time. It first trains a network with only one hidden layer, then a network with two hidden layers, and so on (Figure 2). In each step, the first k − 1 trained layers are fixed and layer k is added, taking the output of the already trained k − 1 layers as its input. The weights obtained from these individually trained layers are used to initialize the weights of the whole BP neural network, and all layers are then trained together to minimize the training error on the labeled training set.

The BP neural network used in this paper can only be trained on data normalized to the range [0, 1]; therefore, the output of the FGM model must be normalized before being used to train the BP neural network.
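The paper does not state which normalization is used. A common choice that maps data into [0, 1] is min-max scaling, sketched below together with its inverse, which is needed to turn network outputs back into throughput units. The function names and sample values are hypothetical.

```python
import numpy as np


def minmax_normalize(x):
    """Scale x linearly into [0, 1]; also return the bounds needed to invert."""
    x = np.asarray(x, dtype=float)
    lo, hi = float(x.min()), float(x.max())
    if hi == lo:
        raise ValueError("cannot normalize a constant sequence")
    return (x - lo) / (hi - lo), (lo, hi)


def minmax_denormalize(z, bounds):
    """Map normalized values back to the original scale."""
    lo, hi = bounds
    return np.asarray(z, dtype=float) * (hi - lo) + lo


# Hypothetical values (not data from this paper).
values = np.array([1230.0, 1301.0, 1406.0, 1411.0, 1452.0, 1507.0])
scaled, bounds = minmax_normalize(values)
restored = minmax_denormalize(scaled, bounds)
print(scaled.min(), scaled.max())
```

Keeping the bounds from the training data and reusing them for denormalization ensures that forecasts are mapped back with the same scale.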

The objective function of the BP neural network above is the mean relative error (MRE). The structure of the BP neural network is shown in Figure 2.

4. Testing, Forecasting, and Results

4.1. Comparing Test between Combination Model and Other FGM Models

This paper forecasts the container throughput of Tianjin-Hebei Port Group. The data for 2012–2020 come from the Port Statistical Yearbook of China and the official website of the Ministry of Transport and cover Tianjin Port, Tangshan Port, Qinhuangdao Port, and Huanghua Port. We first substitute the data of 2012–2017 into the models and compare the forecasted results for 2018–2020 against the actual figures for those years, which verifies the forecasting accuracy of the combination model. We use only six years of data for two reasons. (1) The port group develops rapidly: the base figures of each port are small and the year-to-year changes are large, so earlier data have little influence on current and future data. (2) The grey model performs well on small samples.

We substitute the container throughput data from 2012–2017 into the GM (1, 1) model and into FGM (1, 1) models with different accumulation orders, and obtain the simulation data for 2018–2020 (Tables 1–4).

As can be seen from the above, the MAPE values differ across the ports and for the total. At the best-performing accumulation order for Tianjin Port, Tangshan Port, Qinhuangdao Port, Huanghua Port, and the total, the minimum MAPE values are 1.137%, 1.568%, 2.459%, 6.777%, and 0.800%, respectively, as shown in Figure 3.
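One way to pick the accumulation order is a simple grid search: fit the FGM (1, 1) model on the training years for each candidate order and keep the order with the smallest MAPE on the held-out years. The sketch below assumes that procedure and the standard fractional accumulation weights; the candidate grid, function names, and data are hypothetical and are not the figures reported in the tables.

```python
import numpy as np


def accumulation_matrix(n, order):
    """Lower-triangular matrix of fractional accumulation weights."""
    w = np.ones(n)
    for j in range(1, n):
        w[j] = w[j - 1] * (order + j - 1) / j
    A = np.zeros((n, n))
    for k in range(n):
        for i in range(k + 1):
            A[k, i] = w[k - i]
    return A


def fgm11(x_train, r, horizon):
    """Fitted values for the training span followed by `horizon` forecasts."""
    x_train = np.asarray(x_train, dtype=float)
    n = len(x_train)
    xr = accumulation_matrix(n, r) @ x_train
    B = np.column_stack([-0.5 * (xr[:-1] + xr[1:]), np.ones(n - 1)])
    a, b = np.linalg.lstsq(B, xr[1:] - xr[:-1], rcond=None)[0]
    k = np.arange(n + horizon)
    xr_hat = (x_train[0] - b / a) * np.exp(-a * k) + b / a
    # Accumulating with order -r reverses the order-r accumulation.
    return accumulation_matrix(n + horizon, -r) @ xr_hat


def best_order(x_train, x_test, orders):
    """Return the order with the lowest hold-out MAPE and all scores."""
    x_test = np.asarray(x_test, dtype=float)
    scores = {}
    for r in orders:
        pred = fgm11(x_train, r, len(x_test))[len(x_train):]
        scores[r] = float(np.mean(np.abs((x_test - pred) / x_test)) * 100)
    return min(scores, key=scores.get), scores


# Hypothetical throughput figures (not data from this paper).
train = [100.0, 112.0, 121.0, 135.0, 146.0, 160.0]
test = [172.0, 187.0, 201.0]
orders = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
r_best, scores = best_order(train, test, orders)
print(r_best, scores[r_best])
```

Including `1.0` in the grid lets the classical GM (1, 1) model compete on the same footing as the fractional orders.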

In particular, Huanghua Port has the largest MAPE because it is an emerging port with the smallest throughput and the most random factors and unpredictable influences. Nevertheless, we proceed to test the BP neural network model with the output data.

As the FGM model is mainly suitable for small-sample modeling while the BPNN is more suitable for large-sample modeling, the bootstrap is applied: the small sample used by the FGM is expanded by random sampling into a larger sample for the BPNN. Because the BP neural network model can only handle data normalized to [0, 1], we normalize the original data and substitute them into the BP neural network model for a test, with the error percentage calculated against the original data, as in Table 5.
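The paper does not detail how the bootstrap is applied. A minimal sketch, assuming the usual resampling with replacement of (input, target) pairs, is given below; the function name, sample size, and values are hypothetical.

```python
import numpy as np


def bootstrap_pairs(inputs, targets, n_samples, seed=0):
    """Draw n_samples (input, target) pairs with replacement."""
    rng = np.random.default_rng(seed)
    inputs = np.asarray(inputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # The same random indices are used for both arrays so pairs stay aligned.
    idx = rng.integers(0, len(inputs), size=n_samples)
    return inputs[idx], targets[idx]


# Hypothetical normalized FGM outputs and actual values for six years.
fgm_out = np.array([0.00, 0.21, 0.40, 0.58, 0.81, 1.00])
actual = np.array([0.00, 0.25, 0.37, 0.61, 0.78, 1.00])
big_in, big_target = bootstrap_pairs(fgm_out, actual, n_samples=500)
print(big_in.shape, big_target.shape)
```

Resampling enlarges the training set without adding new information, so it mainly stabilizes training rather than extending what the network can learn from the six observations.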

From Table 5, we can see that the error after BPNN processing is very small; in particular, the error for the most recent year, 2020, is only −0.92%, which is satisfactory for forecasting. The test thus indicates that the BPNN model is feasible for the combination model. Therefore, after normalizing the grey forecasting results of the 4 ports (Tianjin Port, Tangshan Port, Qinhuangdao Port, and Huanghua Port), we substitute the optimized output data from the FGM into the BP neural network for training, which amounts to processing the data with the FGM (1, 1) and BP neural network combination model; the results are given in Table 6.

It can be seen that the data obtained by the FGM (1, 1) and BP neural network combination model have a low error relative to the original data and outperform the other grey models in forecasting accuracy and stability, as in Figure 4. Notably, the runner-up is the GM (1, 1) model, even though, as mentioned above, it can perform unstably. To some extent, this again shows that grey models fit container throughput forecasting well.

4.2. Forecasting Process of the Grey Combination Model and Results

As the grey combination model is feasible for forecasting, we proceed with the actual figures to forecast for the next 5 years. Given that we used six years of data for testing, we also use six years of data in the forecasting. In the first step, we substitute the data from 4 ports and the total, respectively, into the FGM (1, 1) with optimized orders for processing. Results of the first step using FGM (1, 1) model are shown in Table 7.

In the second step, we normalize the output data, apply the bootstrap to expand them into a larger sample, and substitute them into the BP neural network for training. The training error progress is shown in Figure 5.

As shown in Figure 5, during training the error decreases rapidly and reaches the target error of 0.01 after nearly 2200 epochs. The normalized forecasts of the combination model are thus obtained. After denormalization, the forecasted results and the development trend of container throughput are obtained, as shown in Table 8 and Figure 6.

5. Conclusion

Since the container throughput of Tianjin-Hebei Port Group is an incomplete-information grey system affected by various factors, a single forecasting model can hardly cover this variety of factors, which leads to insufficient forecasting accuracy and stability. The traditional GM (1, 1) model is suitable for such grey systems and has the advantage of handling small samples, but it also has disadvantages such as strong dependence on the original sequence and slow convergence. In this paper, we use a combination of the FGM model and the BP neural network that compensates for the defects of each single model. The results show that the combination model fits the data better and is more stable than the other grey models considered, and thus meets the forecasting requirements for the container throughput of Tianjin-Hebei Port Group. This paper forecasts the container throughput of Tianjin-Hebei Port Group for the next 5 years. The results show that the throughput keeps increasing year by year over that period, which provides a data reference for the further exploitation of the ports’ resources, shipping schedule planning, related infrastructure construction, and other measures for handling the increasing container throughput.

Data Availability

The port throughput data from 2012-2017 that support the findings of this study are openly available in the China Statistical Yearbook (2013-2018) published by the National Bureau of Statistics of China. The port throughput data from 2018-2020 that support the findings of this paper are openly available in the Government Information Disclosure column of the Comprehensive Planning Division on the official website of the Ministry of Transport of the People's Republic of China (https://xxgk.mot.gov.cn/jigou/zhghs/201905/t20190513_3198922.html, https://xxgk.mot.gov.cn/2020/jigou/zhghs/202006/t20200630_3321297.html, and https://xxgk.mot.gov.cn/2020/jigou/zhghs/202101/t20210121_3517383.html).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was partially supported by Handan Science and Technology Research and Development Program (19422303008-72), Handan Philosophy and Social Science Planning (2020030), Major Projects of China National Social Science Foundation (20 & zd129), and General Program of National Natural Science Foundation of China (72073018).