Abstract

Research on forecasting marine traffic flows can provide a basis for port planning, water area layout planning, and ship navigation management, and it offers a practical background for evaluating the sustainable development of shipping. Most traditional marine traffic volume forecasting studies focus on how the traffic volume of a single port or section varies over time, and few examine the traffic correlation between associated ports in shipping networks. To reveal the spatial-temporal autocorrelation characteristics of the shipping network and to establish a suitable space-time forecasting model for marine traffic volume, this paper uses AIS data from 2011 to 2016 for the South China Sea to construct a regional shipping network. An adjacency discrimination rule based on network correlation is proposed, and the traffic demand between ports is estimated based on the gravity model. On this basis, the STARMA (space-time autoregressive moving average) model is introduced to derive the interaction between the traffic volumes of adjacent ports in the shipping network. The experimental results show that (1) there is a significant positive correlation in time and space in the South China Sea shipping network, and this spatial-temporal correlation is dynamic in time and heterogeneous in space; (2) the forecasting accuracy of the marine traffic volume based on the spatial-temporal model is better than that of the traditional time-series-based forecasting model, and the spatial-temporal model can better portray the spatial-temporal autocorrelation of maritime traffic.

1. Introduction

With the deepening of economic globalization, marine transportation, an important pillar of international trade, has developed greatly [1]. At the same time, with advances in science and technology and continuously rising market demand, marine traffic is becoming increasingly busy, and the conflict between busy marine traffic and navigation safety, navigation efficiency, and navigation resources is becoming increasingly prominent [2]. To alleviate this conflict, better safeguard navigation, improve navigation efficiency in water areas, and plan the layout of water areas scientifically and rationally, it is necessary to forecast and study marine traffic flows in order to determine future development trends and promote the sustainable development of shipping.

Marine traffic volume forecasting uses historical social, economic, and transportation data to predict traffic conditions at sea for a future period, and choosing a reasonable forecasting method is the key to success. According to the nature of the forecasting method itself, the common forecasting methods can be summarized into three categories: linear-based forecasting methods, nonlinear-based forecasting methods, and combined forecasting methods. Forecasting methods based on linear theory predict a future period from the periodic pattern of a time series and are widely used in traffic volume forecasting [3]. However, the disadvantage of the time-series model is that it is only applicable when the traffic volume trend is stable. As the nonstationarity of the traffic volume increases, the forecast deviates substantially from the observed value [4]. With the development of emerging intelligent methods, such as artificial intelligence, machine learning, neural networks [5, 6], support vector machines [7, 8], and genetic algorithms, knowledge-discovery forecasting models provide a new way to model spatial-temporal series. Researchers feed the time series into intelligent calculation methods to forecast future marine traffic and reflect seasonal changes in the marine traffic volume [9], and the results show that the nonlinear forecasts are more accurate and reliable [10]. Although marine traffic forecasting based on intelligent methods can be applied to nonstationary marine traffic, it suffers from problems such as a slow convergence rate and a tendency to fall into a local optimum [11], and it is difficult to achieve higher accuracy by relying on a single forecasting model.
To overcome the deficiencies of a single forecasting model, scholars have studied the use of multiple forecasting models for joint forecasting and have achieved better results [12]. The forecasting model based on hybrid theory can effectively improve the forecasting accuracy of traffic volume to a certain extent, but due to the relatively complex model, the setting of its weight parameter is the key to the accuracy of the model [13].

In summary, each forecasting model has its own applicable conditions, and the merits of a given model cannot be judged simply. In addition, most of the above methods for predicting maritime traffic volume focus on how the traffic volume of a single port or section varies over time [14, 15]; few address large-scale shipping networks or take into account the links between traffic conditions at different ports. For the urban road network, which, like the shipping network, is a transportation network, spatial adjacency [16] is mainly defined by the upstream and downstream relationships of the road network [17]. However, marine traffic is a complex system, and the influence of traffic conditions does not spread isotropically between adjacent routes [18]. At the same time, the network topology also strongly affects and constrains the diffusion of traffic conditions [19, 20]. Relying solely on simple spatial adjacency to determine the autocorrelation of traffic conditions is therefore biased. In addition, to depict the complex geographical phenomena in the shipping network more accurately, a temporal or spatial model should not be used in isolation. Spatial-temporal models, such as space-time dynamics models and space-time Bayesian models, are widely used in climate change [21], environmental monitoring [22], urban transport [23], public health [24], and other fields. However, most spatial-temporal models are built for specific applications and do not adequately reflect the geographic characteristics of spatial-temporal data.
As an important spatial-temporal statistical model, the STARMA (space-time autoregressive moving average) model characterizes the spatial-temporal dependence between different regions by deriving the interaction between adjacent regions within the system and is well suited to modelling geographic spatial-temporal data [25]. Therefore, to reveal the spatial-temporal autocorrelation characteristics of shipping networks and to establish a suitable spatial-temporal prediction model for marine traffic volume, this paper builds on the definition of road network adjacency to propose an adjacency rule based on network correlation and extends the STARMA model to derive the relationship between the traffic volumes of ports.

The basis for marine traffic volume forecasting is historical data on marine traffic. In previous marine traffic surveys, little data were available, usually for only a few major ports. The lack of accurate marine traffic data was a bottleneck in previous studies and made forecasting the marine traffic volume very difficult [26]. AIS (automatic identification system) data contain a large number of marine traffic characteristics, from which the characteristics of ship behavior can be obtained. In recent years, the number of shore-based AIS base stations and low-orbit satellites has increased continuously, and the continuity and validity of ship trajectories have improved steadily, making it possible to forecast the marine traffic volume.

In view of the above, this paper studies the STARMA model to establish the spatial-temporal model of traffic volume in the shipping network. First, the time series of traffic flow attached to the port are taken as the objects, and the spatial-temporal correlation characteristics of the marine traffic flow network are analyzed by using spatial statistics. Second, the port correlation in the network that meets the adjacency characteristics is determined based on the subdivision hierarchy of the marine traffic flow network, and the traffic demand between ports is estimated based on the gravity model. On this basis, the spatial weight matrix is constructed. Finally, we use the measured marine traffic volume of ports in the South China Sea and its surrounding sea to verify the applicability of the space-time autoregressive model.

2. Materials and Methods

2.1. Study Area

The study area of this paper lies within 96°E∼126°E and 5°S∼26°N, and the specific range is shown in Figure 1. The total study area is approximately 4.2 × 10^6 km² and includes the areas surrounding the South China Sea, the Gulf of Thailand, the Philippines, the Strait of Malacca, and the sea surrounding Kalimantan. Because the layout of the shipping network is regional, the study area covers the ports of the countries around the South China Sea so as to reflect the regional shipping system more accurately. Hereafter, “South China Sea” refers to the study area of this article.

The South China Sea is one of Asia’s three largest marginal seas and is known as the “Mediterranean of Asia.” The South China Sea shipping system is composed of numerous ports and routes, and the differences among the ports are relatively large. Because of the large population and the large economic aggregates involved, its development potential is huge. The South China Sea is a necessary passage for westbound routes from East Asia and Southeast Asia, and its special geographical location means that it carries a large traffic flow [27]. According to the statistics of the World Shipping Council, 25% of the world’s sea transport takes place through the South China Sea, and an average of 300 ships travel through the South China Sea every day. In addition, the Malacca Strait, which is located between the Indian Ocean and the Pacific Ocean, is one of the most important channels for international trade transport, and it bears one-third of the global trade volume of goods and more than one-half of the oil transport volume. For these reasons, this paper focuses on analyzing the spatial-temporal characteristics of traffic flow around the South China Sea and, on this basis, builds a spatial-temporal model of the marine traffic volume.

2.2. Data and Data Processing

The experimental data used in this paper are the AIS data within the study area from January 2011 to December 2016. The ship types in the AIS data include cargo ships, passenger ships, tugboats, pilot ships, and oil tankers. The other types of data are excluded, and only the cargo ship data are retained as the data for the experiment; the amount of AIS data involved in each year is 1.27, 1.36, 1.52, 1.64, 1.58, and 175 million, respectively.

In this paper, the grid DBSCAN algorithm [28] is used to identify ship stay areas in the study area, and the near-shore stay areas are taken as ship port arrivals. On this basis, the ship’s stop position at the port of departure, the start and end times of the stop, and the ship track point series in the open sea area are used to extract the ship trip information.

Through the above steps, thousands of ship stay areas were obtained. However, analysis of the data showed that a port city has multiple stay areas, and in reality, many of these stay areas are just one of several port areas in the same port city, such as Singapore’s main port areas of Jurong, Pulau Bukom, Sembawang, Tanjong, and Penjuru. We therefore merged the stay areas into 155 port cities; that is, this article uses the port city as a shipping network node. Based on this, we obtained a total of 368 port pairs with more than 50 voyages each among the ports surrounding the South China Sea, which constitute the connections between ports in the shipping network. In addition, to uncover the internal correlation characteristics of the ports in the shipping network and to reveal the spatial-temporal correlation of the marine traffic flow, the time series of each port’s traffic flow was calculated. The skewness and kurtosis values of some ports were found to be relatively large (skewness > 20 and kurtosis > 5), and after further verification with a normal distribution test, these abnormal ports were deleted. Finally, the 60 ports whose mean monthly voyages are greater than 10 were selected as the modelling objects of the spatial-temporal model (Table 1).

Figure 2(a) is the spatial distribution map of the 60 representative cargo ports in the study area. Figure 2(b) shows the monthly time series of marine traffic over 6 years in the three ports of Thailand, Hong Kong, and Singapore in the South China Sea shipping network. The series show an upward trend with obvious periodic characteristics. To test the accuracy of the spatial-temporal model, data from 2011 to 2015 were used as training samples, data from 2016 were used as test samples, and the forecast horizon was 12 months.

3. Method

The STARMA (space-time autoregressive moving average) model fully considers the autocorrelation in geographic spatial-temporal data sets and is suitable for processing spatial-temporal series. However, STARMA has limitations: it has difficulty modelling series that are nonstationary in space and time, and existing decision rules for the spatial weight matrix do not effectively describe the correlation between spatial variable series. In view of these problems, this paper builds a spatial-temporal model of marine traffic volume in the shipping network based on the STARMA model. First, Moran’s I and the temporal autocorrelation function are used to test the spatial and temporal autocorrelations of the spatial-temporal data. Second, the hierarchical subdivision structure of the shipping network is used to determine the port correlations in the network that meet the adjacency rules, and the supply end and the demand end are considered together to estimate the traffic demand between ports. On this basis, the spatial weight matrix of the spatial-temporal model of the shipping network is constructed. Finally, the applicability of the proposed spatial-temporal model based on network correlation is verified using the measured ship traffic volume data of the ports in the South China Sea and its surrounding seas.

3.1. Spatial and Temporal Characteristics of Traffic Volume in Shipping Network

The traditional linear regression method cannot capture the autocorrelation contained in spatial-temporal data, which makes it impossible to find spatial-temporal patterns. In other words, such models can be used to simulate traffic volume in the shipping network only on the premise that the data are spatially independent and evenly distributed. To explore the spatial-temporal heterogeneity and nonstationarity of the spatial-temporal series in the shipping network, we use the Moran’s I statistic to describe the spatial autocorrelation of traffic volume in the shipping network and use the temporal autocorrelation function (ACF) and partial autocorrelation function (PACF) to describe its temporal autocorrelation.

Moran’s I statistic is the most common type of spatial autocorrelation statistic in current applications and is divided into global Moran’s I and local Moran’s I according to the difference in the metric regions. The Global Moran’s I statistic is used to characterize the degree of correlation between all spatial objects in the spatial range to show whether the spatial distribution pattern among the spatial objects is significant. The local Moran’s I statistic is used to characterize the local spatial distribution pattern.
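As an illustration of the two statistics described above, the following is a minimal sketch using the textbook definitions of the global and local Moran's I. The function names, the four-port toy data, and the binary chain-shaped weight matrix are hypothetical and are not taken from the paper, whose values were computed with GeoDa.

```python
def global_morans_i(values, weights):
    """Global Moran's I for n observations and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    total = sum(d * d for d in dev)
    return (n / s0) * (cross / total)


def local_morans_i(values, weights):
    """Local Moran's I for every observation (one value per port)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n  # second moment of the deviations
    return [dev[i] / m2 * sum(weights[i][j] * dev[j] for j in range(n))
            for i in range(n)]


# Hypothetical example: four ports along a coast, neighbours share a weight of 1.
arrivals = [10.0, 9.0, 2.0, 1.0]
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
print(global_morans_i(arrivals, chain))
print(local_morans_i(arrivals, chain))
```

A positive global value indicates that similar traffic volumes cluster among neighbouring ports, and a positive local value at a port with above-average traffic corresponds to a high-high pattern.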

Temporal autocorrelation is one of the main features of time-series data and expresses the dependence of an object’s attributes over time. The autocorrelation function (ACF) measures the degree of correlation between time-series values at different times. It ranges from −1 to 1, and the closer the value is to 1, the stronger the positive autocorrelation of the time series. The ACF plot shows how the correlation of the time series changes with the lag period k. The partial autocorrelation function (PACF) is another tool for describing the structure of a stationary stochastic process: each coefficient $\phi_{kk}$ is the autocorrelation between $x_t$ and $x_{t-k}$ after excluding the influence of the intermediate variables $x_{t-1}, \ldots, x_{t-k+1}$.
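The two functions can be sketched as follows, assuming the usual sample ACF estimator and the Durbin-Levinson recursion for the PACF. The function names and the short monthly series are hypothetical, and the paper does not state which estimator or software it used.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelation for lags 0..max_lag (Durbin-Levinson)."""
    r = acf(series, max_lag)
    result = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1} from the previous iteration
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_kk itself.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result


# Hypothetical monthly voyage counts for one port.
monthly = [120, 132, 128, 141, 150, 147, 160, 158, 171, 169, 180, 190]
print(acf(monthly, 3))
print(pacf(monthly, 3))
```

A PACF that drops to near zero after lag 1 while the ACF decays gradually is the pattern that suggests a first-order autoregressive term.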

3.2. Spatial Weight Matrix considering Network Node Correlations

The purpose of the spatial weight matrix is to reflect the correlation between spatial geographical units accurately and comprehensively. However, because of differences in the economic development of port hinterlands and in the geographic location of traffic flows, marine traffic flow does not spread isotropically. Even for the geographically closest ports, the traffic status may not be directly related. This nonuniform spread makes it difficult to find the spatial correlation of marine traffic when adjacent ports are identified by geographical distance alone. To solve these problems, first, the port correlations in the network that meet the adjacency rules are determined based on the hierarchical subdivision structure of the shipping network obtained in our previous study [29]. Second, the traffic demand between ports is estimated based on the gravity model. Finally, a spatial weight matrix is constructed from these two steps.

3.2.1. Definition of Spatial Adjacency

To set the spatial weights of the shipping network, it is first necessary to define spatial adjacency, which is also the basis of the spatial pattern measure. Commonly used definitions of spatial adjacency include the coboundary method, the topological graph method, and the distance method. However, these rules are mostly based on their own particular perspectives and ignore the network aggregation characteristics of traffic flow in the shipping network. This paper defines the adjacency rules of any two ports in the shipping network as follows: (1) the two ports are directly adjacent; that is, there is navigation between the two ports; (2) the two ports are not directly adjacent, in which case connecting the two ports requires passing through one or more transit ports. Thus, from the perspective of a complex network, the shipping network is identified as a hierarchical subdivision structure (Figure 3).

In our previous study [29], the shipping network in the South China Sea was shown to be a typical scale-free network. The network has obvious hierarchical and community characteristics, and within a community, ports at adjacent levels are directly connected. This relationship between nodes in a complex network expresses the adjacency relationship encoded in the spatial adjacency matrix of the port nodes. Thus, the spatial adjacency matrix of the shipping network is defined as follows:

$$a_{ij} = \begin{cases} 1, & i, j \in M \text{ and } i, j \text{ are hierarchically adjacent} \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

where i and j are nodes in the shipping network; M and N are different communities in the shipping network; $a_{ij} = 1$ indicates that i and j are adjacent in the network, that is, they belong to the same community and are hierarchically adjacent; and $a_{ij} = 0$ indicates that i and j are not adjacent, that is, they do not belong to the same community (for example, $i \in M$ and $j \in N$) or are not hierarchically adjacent.
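The adjacency rule can be sketched in code as follows. This is only an illustration: it assumes that each port carries a community label and an integer hierarchy level, and that "hierarchically adjacent" means the levels differ by exactly one. The port names, labels, and levels are hypothetical.

```python
def adjacency_matrix(community, level):
    """Build a 0/1 adjacency matrix from community labels and hierarchy levels.

    Assumption (not stated in this form in the paper): two ports are
    hierarchically adjacent when their integer levels differ by exactly one.
    """
    ports = sorted(community)
    n = len(ports)
    matrix = [[0] * n for _ in range(n)]
    for i, p in enumerate(ports):
        for j, q in enumerate(ports):
            same_community = community[p] == community[q]
            adjacent_level = abs(level[p] - level[q]) == 1
            if i != j and same_community and adjacent_level:
                matrix[i][j] = 1
    return ports, matrix


# Hypothetical ports: A, B, C in community "M"; D in community "N".
community = {"A": "M", "B": "M", "C": "M", "D": "N"}
level = {"A": 1, "B": 2, "C": 3, "D": 2}
ports, first_order = adjacency_matrix(community, level)
for name, row in zip(ports, first_order):
    print(name, row)
```

In this toy case A and C share a community but sit two levels apart, so they are not first-order neighbours, and D has no neighbours because it belongs to a different community.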

3.2.2. Definition of Spatial Weight

A spatial weight matrix that reflects both geographical spatial relationships and attribute values captures the spatial agglomeration and spatial differences of spatial units more readily. The monthly marine traffic volume is affected by many factors, such as weather conditions and navigable water level, and it shows a certain degree of periodicity and seasonality. According to previous research, the gravity model can better explain the spatial adjacency of ports in shipping networks.

Spatial weight is a quantitative measure of adjacency. The gravity model is an important method for explaining the spatial adjacency among ports in shipping networks [30]; that is, voyages between adjacent ports are more frequent than voyages between distant ports. If $d_{ij}$ is the distance from port i to port j, then the trend of the interaction between them is expressed by the lag function $f(d_{ij})$, and the number of voyages between them satisfies $T_{ij} \propto O_i D_j f(d_{ij})$, where $O_i$ is the total number of voyages leaving port i and $D_j$ is the total number of voyages arriving at port j.

In estimating the weight of marine traffic between ports, the traffic attraction between the supply end and the demand end is taken into consideration, and the gravity model is used to estimate the traffic volume between ports. The traffic between a supply end and a demand end is proportional to the traffic volumes at the two ends and inversely proportional to the traffic impedance between them. The following equation is used to estimate the traffic volume between ports:

$$T_{ij} = P_i^{\alpha} P_j^{\beta} f(d_{ij}) \qquad (2)$$

where $T_{ij}$ is the amount of traffic generated between ports i and j, $P_i$ and $P_j$ are the total traffic volumes of ports i and j, and α and β are traffic volume coefficients that range from 1 to 2 and are set to 1 here. In this paper, $f(d_{ij})$ is the distance attenuation function, and its attenuation parameter is set to 0.59 [31].
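A minimal sketch of the gravity estimate follows. It assumes a power-law distance attenuation, $f(d) = d^{-b}$ with $b = 0.59$, which is one common reading of the attenuation setting. The functional form, the function name, and the toy numbers are assumptions, not details confirmed by the text.

```python
def gravity_flow(volume_i, volume_j, distance, alpha=1.0, beta=1.0, decay=0.59):
    """Estimated traffic between two ports under a gravity model.

    Assumption: the distance attenuation is a power law, distance ** (-decay).
    alpha and beta default to 1, as stated in the text.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return (volume_i ** alpha) * (volume_j ** beta) * distance ** (-decay)


# Hypothetical ports: the same pair of volumes at two different distances.
near = gravity_flow(100.0, 50.0, distance=200.0)
far = gravity_flow(100.0, 50.0, distance=800.0)
print(near, far)  # the nearer pair receives the larger estimated flow
```

With equal exponents the estimate is symmetric in the two ports, so it acts as a weight on an adjacency link rather than as a directional flow.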

3.3. STARMA Model

The STARMA model uses spatial-temporal delay operators to express spatial-temporal variables that are influenced by both temporal lag and spatial lag [25]. The spatial lag operator L indicates that the traffic volume of a port is affected by its adjacent ports, so the forecast value of a port’s traffic volume can be expressed as a weighted average of the traffic volumes of the adjacent ports. Its quantified expression is as follows:

$$z(t) = \sum_{k=1}^{p} \sum_{h=0}^{m_k} \phi_{kh} L^{(h)} z(t-k) - \sum_{l=1}^{q} \sum_{h=0}^{n_l} \theta_{lh} L^{(h)} \varepsilon(t-l) + \varepsilon(t) \qquad (3)$$

where $z(t)$ is the vector of port traffic volumes at time t, p is the temporal autoregressive order, q is the temporal moving average order, $L^{(h)}$ is the spatial lag operator of order h, $m_k$ is the spatial order of the kth temporal autoregressive term, $n_l$ is the spatial order of the lth temporal moving average term, $\phi_{kh}$ and $\theta_{lh}$ are parameters, and $\varepsilon(t)$ represents the random error.
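To make the double sum over temporal and spatial lags concrete, here is a sketch of a one-step forecast from the autoregressive part of the model only. It assumes that the spatial lag of order h is applied by multiplying with a row-normalized weight matrix and that order 0 leaves the vector unchanged. Moving average terms are omitted, and all names and numbers are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def starma_ar_forecast(history, weights, phi):
    """One-step forecast using only the autoregressive terms.

    history: past traffic vectors in chronological order (history[-1] is t-1).
    weights: [W1, W2, ...], weight matrices for spatial lags 1, 2, ...
    phi:     phi[k-1][h] is the coefficient for temporal lag k and spatial lag h
             (h = 0 refers to the port's own past value).
    """
    n = len(history[-1])
    forecast = [0.0] * n
    for k, coefficients in enumerate(phi, start=1):
        past = history[-k]
        for h, coefficient in enumerate(coefficients):
            lagged = past if h == 0 else mat_vec(weights[h - 1], past)
            for i in range(n):
                forecast[i] += coefficient * lagged[i]
    return forecast


# Hypothetical two-port network in which each port is the other's only neighbour.
w1 = [[0.0, 1.0], [1.0, 0.0]]
last_month = [10.0, 20.0]
print(starma_ar_forecast([last_month], [w1], phi=[[0.5, 0.25]]))
```

Each port's forecast mixes its own previous value with the weighted average of its neighbours' previous values, which is the "weighted average of adjacent ports" idea in the paragraph above.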

The basic idea of the STARMA model is to work through the spatial-temporal series term by term and calculate the spatial-temporal average of each term in turn, so as to simulate the long-term trend. Periodic factors give the traffic volume in the shipping network a strong seasonality, while trend factors give it long-term trend characteristics. Therefore, according to the time-series statistics, the ship traffic flow shows both short-term temporal autocorrelation and long-term trend change. In this paper, the traffic flow time-series data are decomposed into a large-scale spatiotemporal nonlinear trend u and a small-scale spatiotemporal variation e. The large-scale variability refers to the deterministic global trend, while the small-scale variability refers to the stationary sequence separated from the spatial-temporal data. Such a nonstationary spatiotemporal sequence can be expressed as $z(s, t) = u(s, t) + e(s, t)$, where $z(s, t)$ is the observation, $u(s, t)$ is the large-scale space-time deterministic trend, and $e(s, t)$ is the small-scale periodic variation with a constant mean. The linear part is modelled by STARMA, and the nonlinear part is modelled by a neural network. On this basis, the linear forecasting results are combined with the nonlinear results to obtain the final results.

4. Results

4.1. Traffic Flow Time-Series Analysis
4.1.1. Spatial Autocorrelation Analysis

Spatial autocorrelation is an important feature of spatial geographic units, and this paper analyzes the distribution of marine traffic at each port from two aspects, global and local, as follows:

(1) Global Spatial Autocorrelation. This paper uses the global Moran’s I to analyze the global spatial autocorrelation of arrival voyages at the ports surrounding the South China Sea during 2011∼2016. Taking the arrival voyages of each port as the variable and using a distance weight matrix, we calculated the global Moran’s I value with GeoDa software. As Table 2 shows, the global Moran’s I value of arrival voyages at the ports of the South China Sea is positive, and the result passed a Z-value test at a significance level of 0.01. The fact that the global Moran’s I value is greater than 0 in each year indicates that the marine traffic volume of the ports surrounding the South China Sea has a positive spatial correlation and shows a certain degree of spatial aggregation. At the same time, Moran’s I index significance level values are all below 0.001, indicating that the spatial correlation of shipping development in the ports of the South China Sea is relatively low, and the overall spatial agglomeration effect is not obvious. In general, marine traffic flow is weakly stationary in space, and in the modelling, the marine traffic flow in the shipping network can be treated as stationary data.

(2) Local Spatial Autocorrelation. The spatial correlation distribution of the marine traffic volume in the South China Sea was obtained through the local Moran’s I analysis. As Figure 4 shows, the marine traffic flow mainly presents a high-high aggregation pattern in China’s coastal area and a low-low aggregation pattern in the Philippines. This is understandable: the route density in China’s coastal areas is high, and the links between ports are closer than in other regions. China’s foreign trade has continued to grow rapidly and voyages between ports have continued to hit new highs, while the Philippines has fewer voyages because of its smaller economic aggregate and lower dependence on foreign trade. In addition, Ho Chi Minh, Manila, and their surrounding areas show a high-low correlation pattern. These ports are either economic centers of the countries around the South China Sea or occupy important geographical locations, and all play a pivotal role in connecting with other ports. Therefore, their marine traffic is relatively large and forms a high-low correlation pattern with the surrounding ports.

4.1.2. Temporal Autocorrelation Analysis

The ACF plot shows how the autocorrelation coefficient changes between time periods in the time series. Figure 5 shows the ACF and PACF plots of the monthly time-series data of the total traffic volume of the ports surrounding the South China Sea from January 2011 to December 2016. Figures 5(a) and 5(b) show the original time series of the total traffic volume and the differenced time series, respectively. Figure 5(c) shows the ACF plot of the differenced spatial-temporal series for each port, where the abscissa is the lag period, the ordinate is the corresponding autocorrelation function value (that is, the degree of correlation), and the dashed lines mark the approximate 95% confidence bounds. Figure 5(d) is the PACF plot of the differenced spatial-temporal series for each port, and the ordinate is the corresponding partial autocorrelation function value.

In Figure 5(c), the step length of each lag period is one month. Within 12 months (lag period k < 13), the ACF is greater than 0.2, and it gradually decreases as the lag period k increases. This shows that the correlation of the spatial-temporal series of each port’s traffic volume gradually weakens as the lag period k increases. In addition, the traffic volume of each port has a strong positive correlation within a certain time range, and the traffic volume in the current period is correlated to some degree with the traffic volume in subsequent periods. As can be seen in Figure 5(d), the PACF rapidly approaches zero from 0.7 and exhibits an alternating exponential decay, which shows that the differenced data form a stationary time series. Furthermore, the PACF is truncated after the first-order time lag, indicating that the temporal autoregressive order is 1.

4.2. STARMA Model Building

In our previous study [29], the port communities in the shipping network were named the China Coastal Community, Taiwan Community, Beibu Community, ASEAN Community, and Philippine Community (Figure 6). Combined with the centrality of ports, the network is presented as a hierarchical subdivision structure. From the above definition of the spatial adjacency relationship, the first-order and second-order spatial adjacencies of the ports around the South China Sea are obtained. The diagonal elements of the spatial adjacency matrix are 0. If an off-diagonal element is 0, there is no adjacency relationship between the spatial units corresponding to the row and the column; if an off-diagonal element is 1, the spatial units corresponding to the row and the column are adjacent. Then, by weighting the adjacency matrix with equation (2) and normalizing it, the first-order spatial weight matrix W(1) and the second-order spatial weight matrix W(2) are obtained.
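The weighting and normalization step can be sketched as follows, assuming that "normalization" means row normalization so that each port's weights sum to one. The paper does not spell out the normalization scheme, and the adjacency and flow values below are hypothetical.

```python
def spatial_weight_matrix(adjacency, flows):
    """Weight a 0/1 adjacency matrix by gravity flows, then row-normalize.

    Assumption: each row is divided by its sum; ports without neighbours
    keep an all-zero row.
    """
    n = len(adjacency)
    weights = []
    for i in range(n):
        row = [float(adjacency[i][j] * flows[i][j]) if i != j else 0.0
               for j in range(n)]
        total = sum(row)
        weights.append([x / total for x in row] if total > 0 else row)
    return weights


# Hypothetical three-port example: port 0 is adjacent to ports 1 and 2.
adjacency = [[0, 1, 1],
             [1, 0, 0],
             [1, 0, 0]]
flows = [[0.0, 30.0, 10.0],
         [30.0, 0.0, 5.0],
         [10.0, 5.0, 0.0]]
w_first = spatial_weight_matrix(adjacency, flows)
for row in w_first:
    print(row)
```

The same function could be applied to a second-order adjacency matrix to obtain a second-order weight matrix. Note that the flow between ports 1 and 2 is ignored because the two are not adjacent.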

In this paper, the ST-ACF and ST-PACF values are calculated from the spatial-temporal series data of each port in the shipping network. The calculation results are shown in Tables 3 and 4.

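It can be seen from Tables 3 and 4 that the ST-ACF and ST-PACF values decrease progressively in both the temporal and spatial dimensions. The ST-PACF shows truncation after the first-order spatial and temporal lags. Therefore, it can be concluded that the sample is a spatial-temporal autoregressive process with a first-order temporal lag and a second-order spatial lag. The initial form of the model is therefore STARMA (1, 0). The specific model is as follows:

$$z(t) = \phi_{10} z(t-1) + \phi_{11} W^{(1)} z(t-1) + \phi_{12} W^{(2)} z(t-1) + \varepsilon(t) \qquad (4)$$

where $z(t)$ is the vector of port traffic volumes in month t, $W^{(1)}$ and $W^{(2)}$ are the first-order and second-order spatial weight matrices, $\phi_{10}$, $\phi_{11}$, and $\phi_{12}$ are the parameters to be estimated, and $\varepsilon(t)$ is the random error.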

After obtaining the above model, the parameters in the above equation are further estimated using the sample data from 2011 to 2015. The estimated value of each parameter in the equation and its test value are shown in Table 5.
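The text does not state which estimator was used. As one possible approach, the sketch below fits a STARMA (1, 0) model with spatial lags up to order two by ordinary least squares, pooling all ports and months. The least-squares choice, the function names, and the three-port toy network are assumptions made for illustration only.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_starma_10(series, w1, w2):
    """Least-squares estimates of [phi10, phi11, phi12] (an assumed estimator)."""
    rows, targets = [], []
    for t in range(1, len(series)):
        prev = series[t - 1]
        lag1, lag2 = mat_vec(w1, prev), mat_vec(w2, prev)
        for i in range(len(prev)):
            rows.append([prev[i], lag1[i], lag2[i]])
            targets.append(series[t][i])
    # Normal equations: (X'X) phi = X'y
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(3)] for a in range(3)]
    xty = [sum(r[a] * y for r, y in zip(rows, targets)) for a in range(3)]
    return solve(xtx, xty)


# Hypothetical noise-free data generated from known coefficients.
w1 = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
w2 = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
true_phi = [0.5, 0.3, 0.1]
series = [[1.0, 2.0, 4.0]]
for _ in range(10):
    z = series[-1]
    l1, l2 = mat_vec(w1, z), mat_vec(w2, z)
    series.append([true_phi[0] * z[i] + true_phi[1] * l1[i] + true_phi[2] * l2[i]
                   for i in range(3)])
estimated = fit_starma_10(series, w1, w2)
print(estimated)
```

On noise-free data the fit recovers the generating coefficients up to rounding error. Real port series would additionally require the significance tests reported in Table 5.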

Table 5 lists the estimates of the parameters $\phi_{10}$, $\phi_{11}$, and $\phi_{12}$ in the equation and the corresponding significance test results. The p values of the three parameters are all less than 0.05; therefore, the hypothesis that a coefficient is unrelated to the dependent variable is rejected; that is, the independent variables can explain the change in the dependent variable. Substituting these estimates into the model gives the fitted spatial-temporal model, equation (5).

4.3. Model Comparison

To test the forecasting model of shipping network traffic, we used the nonspatial time-series prediction method STL (seasonal-trend decomposition procedure based on Loess) to verify the results obtained. The STL method is a time-series decomposition method proposed by Cleveland in 1990 [32], and this method decomposes the time-series data into independent long-term trend terms, seasonal components, and residuals to achieve predictions. Although the STL method cannot achieve a spatial-temporal forecast, it can obtain the evolutionary characteristics and rules of things over time by processing the time-series data and then forecast the future development of things [33].

Based on the obtained shipping network model equation (5), spatial-temporal forecasts are made using the first-order and second-order spatial weight matrices constructed previously. The traffic volume time-series data of the 60 ports around the South China Sea from 2011 to 2015 are used as training data, and the data for 2016 are used as the test sample for the model's forecasting ability. In the spatial dimension, Figure 7 shows the forecasting results of the spatial-temporal model and the STL method for marine traffic volume in the South China Sea shipping network in 2016. In the temporal dimension, Figure 8 shows the forecasting results of marine traffic volume in the South China Sea shipping network in 2016.

The forecast results in Figure 7 show that, in the spatial domain, the spatial-temporal model of the shipping network adapts better than the STL method to ports of different sizes and yields forecasts closer to the observed values. Figure 8 shows that, in the temporal domain, both models achieve a good fit. Furthermore, when the observations are highly stationary over time, both models approximate them very well; when the observations vary greatly with time, the spatial-temporal model captures such changes more easily.

4.4. Model Accuracy

In the previous section, we compared the fitting results of marine traffic volume in the time domain and the space domain. In this section, we compare and analyze the generalization accuracy of the two models in forecasting the traffic volume of the shipping network in the South China Sea. To compare the forecasting accuracy of the two models, three indicators are selected, namely, Pearson's correlation coefficient (R), root mean square error (RMSE), and mean absolute percentage error (MAPE). Table 6 shows the evaluation results.
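For reference, the three indicators can be computed as in the following sketch, which uses their standard definitions. The function names and the sample observed and predicted values are made up for illustration and are not taken from Table 6.

```python
import math

def pearson_r(obs, pred):
    """Pearson's correlation coefficient between observations and forecasts."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)

def rmse(obs, pred):
    """Root mean square error, in the same units as the traffic volume."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mape(obs, pred):
    """Mean absolute percentage error in percent; observations must be nonzero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

# Hypothetical monthly traffic volumes and forecasts for one port.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(pearson_r(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
```

A higher R indicates a better fit, whereas lower RMSE and MAPE values indicate a better fit. RMSE penalizes large errors more heavily than MAPE, so the two can rank models differently when a few ports have unusually large deviations.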

As shown in Table 6, the correlation coefficient R of the STARMA model (0.8827) is considerably higher than that of the STL model (0.4469), indicating that, compared with the time-series model, the spatial-temporal model is better able to capture the periodicity and trend of the monthly traffic volume of the ports in the shipping network. In terms of RMSE, the STARMA model improves on the STL model by 12.15%, indicating that the spatial-temporal model responds better to outliers in the traffic observations; that is, it explains spatial-temporal variations more readily. Similarly, the MAPE of the STARMA model is better than that of the STL model, an improvement of 19.66%, indicating that the forecasting accuracy of the spatial-temporal model is better than that of the model based on the time series alone.

The model comparison shows that the spatial-temporal model fits better than the time-series model in forecasting the regional shipping network traffic volume and can better explain the spatial-temporal variation in regional shipping networks. The reason is that the spatial-temporal model not only extracts the large-scale nonlinear trends in the spatial-temporal series but also captures the small-scale random spatial-temporal variation. Therefore, the spatial-temporal model can improve the generalization accuracy of the fitting to some extent.

5. Conclusion

This study uses ship AIS data from 2011 to 2016 in the waters surrounding the South China Sea to build a regional shipping network based on data mining technology. Then, considering the influence of the correlation between the ports in the shipping network on the traffic volume, a set of closely connected ports is identified by utilizing the topological features of the shipping network and is used to construct the spatial weight matrix of the shipping network in the South China Sea. Finally, the STARMA model is used to derive the interaction between traffic volumes at adjacent ports in the shipping network and to obtain traffic forecasts for ports in the regional shipping network. The conclusions are as follows:

(1) By analyzing the spatial and temporal characteristics of the time-series data of ship traffic in the ports of the South China Sea shipping network, it can be concluded that the maritime traffic volume shows both regularity and variation in time and space; that is, the port traffic volume in each region shows a certain degree of local autocorrelation under the traditional distance-based weight matrix, and it shows a clear spatial-temporal correlation in the network environment.

(2) Marine traffic has strong regularity. This regularity is reflected not only in the time-series autocorrelation of each port's traffic volume but also in the strong correlation between spatially related ports. The network adjacent rule proposed in this paper, which is based on the community characteristics of the shipping network, considers not only the geographical constraints but also the role of the port in the network, and it captures the spatial correlation of the marine traffic volume better than a spatial adjacent rule that considers only the geographical distance.

(3) The applicability of the proposed ship traffic volume forecasting method, which is based on spatial-temporal correlation characteristics, is validated using the measured marine traffic volume data of the ports surrounding the South China Sea. Several evaluation indexes, namely R, MAPE, and RMSE, are used to evaluate the prediction model's results. The results show that the spatial-temporal model fits better than the time-series model and can better explain the spatial-temporal variation in the regional shipping network. These results indicate that the prediction model proposed in this paper is reasonable and effective. In particular, the proposed method uses the data from one period as training data to forecast the subsequent traffic volume. Therefore, it is also applicable to forecasting traffic volume in later years, for example, 2018 and 2019.

In addition, the results show that the model constructed in this paper reflects the system characteristics and spatial-temporal correlation characteristics of the regional shipping system and captures both the integrity of the shipping network traffic and the interrelationship of the various parts within the system. To a certain extent, this paper reveals the complexity, changing laws, and development trends of regional shipping systems, provides a scientific basis for planning the future development scale of ports and for formulating long-term strategic decisions, and supplements existing port system research. Since shipping is a complex and volatile industry that rises and falls with the global economy, it is recommended that, in the medium and long term, each country analyze the future development trend of the regional shipping system and, on this basis, scientifically plan the layout adjustment of domestic ports and keep port construction moderate so as to achieve the sustainable development of regional shipping.

This paper starts from the influence of the correlation of the shipping network nodes on maritime traffic, analyzes the spatial-temporal autocorrelation of ship traffic volume at sea, and builds a corresponding forecasting model on this basis. However, deficiencies remain and need to be further explored. This paper mainly considers the impact of the node correlation of a shipping network on maritime traffic; the influence of the spatial structure and topological characteristics of the network on maritime traffic needs further study. In addition, different volumes of historical data lead to different prediction results, so a sensitivity analysis with respect to the volume of historical data should also be carried out.

Data Availability

The AIS data used to support the findings of this study were supplied by Elane Inc. (http://www.elane.com) under license and so cannot be made freely available.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This paper was supported by the National Key R&D Plan (Grant no. 2017YFB0504205), the Major Program of National Social Science Foundation (Grant no. 14ZDA078-5), and the National Natural Science Foundation of China (Grant no. 41401450).