Abstract

The South China Sea is China’s largest marginal sea and is rich in oil and gas resources; thus, estimating its sea level changes is of practical significance. Based on linear and nonlinear sea level change characteristics, this paper decomposes the 1992–2019 monthly mean sea level anomaly time series in the South China Sea into trend, seasonal, and random terms. The seasonal autoregressive integrated moving average (SARIMA) and Prophet models are compared for estimating the trend and seasonal terms, the long short-term memory (LSTM) and radial basis function (RBF) models are compared for estimating the random term, and the more suitable model in each pair is selected. A Prophet-LSTM combined model is then developed based on the accuracy results. The combined model is used to study the effect of known data length on the experimental results and to determine the best prediction duration. The results show that the combined model is suitable for short-term and medium-term estimations of 12–36 months. The accuracy (RMSE) at 36 months is 0.962 cm, which indicates that the combined model has high application value for estimating sea level changes in the South China Sea.

1. Introduction

The South China Sea is China’s largest marginal sea and a hub for maritime energy transportation; it also has abundant reserves of oil and gas resources [1]. In recent years, the sea level of the South China Sea has continued to rise [2], and, over the next 30 years, the sea level along the coast is expected to rise by a further 50–180 mm [3]. This rise will not only have serious impacts on the natural environment, ecosystems, and social economy of the coastal areas [4] but will also pose challenges to maritime transport and energy extraction. Research on these changes can assist in measuring regional climate change, contribute disaster warning information, and provide a scientific basis for the coordination of maritime traffic and the rational exploitation of energy. Therefore, research on the estimation of sea level change trends in the South China Sea is necessary.

In terms of sea level estimation, previous studies have tried different methods based on mathematical statistics, physical mechanisms, or combined model predictions, depending on the region and data type. Mathematical statistics methods are mainly based on the mathematical law of sea level change time series, which is used to fit and extrapolate data [5]. Early mathematical statistical methods included simple linear regression [6], multivariate stepwise regression, maximum entropy spectrum analysis [7], and Kalman filtering [8]. The physical mechanism method considers climate change and sea temperature and salinity. Chen et al. used the CCSM3 climate system model to simulate sea level changes, and the results showed that the global sea level will rise by 30 cm in the 21st century [9]. Zhang et al. used a sea-atmosphere coupled model to predict the trends and spatial distribution of sea level in the South China Sea in the 21st century [10]. These methods are relatively simple to use but do not take into account the nonstationary and uncertain characteristics of sea level changes. With the emergence of various neural network models, estimating the nonstationary characteristics of sea level changes has become more feasible; moreover, combining mathematical methods with neural networks can improve accuracy. Zhao et al. used a model that combined least squares and a radial basis function neural network to predict sea level anomaly series in offshore China; the model proved reliable for short-term predictions, with an accuracy of 0.65 cm [11]. Among the methods selected in this article, SARIMA is widely used in epidemiological prediction [12], and the Prophet model performs well in user traffic prediction [13]. As neural network models, LSTM and RBF have been applied to rainfall and river flow predictions [14, 15].

Currently, few studies have focused on the estimation of sea level changes in the South China Sea. To estimate these changes more accurately, this paper divides the monthly mean sea level anomaly time series, according to the sea level change characteristics of the South China Sea, into two series:
(i) Trend term and seasonal term combination series
(ii) Random term series

In terms of model selection, the SARIMA [12] and Prophet [16] models are suitable for fitting stationary and trending series and the LSTM [17] and RBF [18] models perform well when fitting nonlinear and random series. Therefore, the SARIMA and Prophet models are selected to fit the combination series, the RBF and LSTM models are selected to fit the random term series, and a combination of estimation models which is more suitable for the study area is determined by accuracy evaluation standards, such as RMSE and R2. The experimental steps are shown in Figure 1.

2. Material and Methods

2.1. Data and Study Area

The satellite altimetry data used in the experiment are from the GDR data sets of T/P, Jason-1, Jason-2, and Jason-3, which were all released by the French Space Centre (CNES). This paper processes the data in accordance with the steps in [19]. The data obtained after processing are the mean sea level anomaly (SLA) time series data from October 1992 to December 2019 in units of months, as shown in Figure 2.

The research area of this paper is the South China Sea and it is shown in Figure 3. The coordinates are 110°–119°E, 14°–23°N, and the total area is approximately 1.19 million square kilometers. The area is located between the Pacific Ocean and the Indian Ocean, and it includes many important shipping lanes for material transportation. In addition, there are abundant oil and gas resources.

2.2. Seasonal Autoregressive Integrated Moving Average Model

The SARIMA model evolved from the ARIMA model and takes into account the seasonal factors of the time series [20–22]. It adopts the method of seasonal difference to estimate parameters and can effectively predict time series with seasonality, trend, and periodicity. The SARIMA model has performed well in industrial and medical research in recent decades [23]. The general form of the SARIMA model is SARIMA (p, d, q) (P, D, Q) [S], where p is the autoregressive order, P is the seasonal autoregressive order, q is the moving average order, Q is the seasonal moving average order, d is the difference order, D is the seasonal difference order, and S is the seasonal period. The SARIMA model is expressed in the following equation:

$$\Phi_p(B)\,\Phi_P(B^S)\,(1-B)^d\,(1-B^S)^D\,y_t = \theta_q(B)\,\Theta_Q(B^S)\,\mu_t$$

where $y_t$ is the time series, $\mu_t$ is the random term, $B$ is the backshift operator, $\Phi_p(B)$ is the nonseasonal AR(p) part, $\Phi_P(B^S)$ is the seasonal AR(P) part, $(1-B)^d$ is the d-order progressive difference, $(1-B^S)^D$ is the D-order seasonal difference, $\theta_q(B)$ is the nonseasonal MA(q) part, and $\Theta_Q(B^S)$ is the seasonal MA(Q) part.
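The differencing operators in the SARIMA form can be illustrated with a short sketch. The helper names `difference` and `sarima_difference` and the toy series are illustrative assumptions, not part of the original experiment; the code only shows how a seasonal difference with S = 12 removes a repeating annual pattern and a linear trend.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag) once: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sarima_difference(series, d=0, D=1, S=12):
    """Apply (1 - B)^d (1 - B^S)^D to a list of numbers."""
    out = list(series)
    for _ in range(D):          # seasonal differences first
        out = difference(out, lag=S)
    for _ in range(d):          # then ordinary differences
        out = difference(out, lag=1)
    return out


# Toy monthly series (made-up values): linear trend plus a fixed 12-month pattern.
pattern = [0, 1, 3, 5, 6, 5, 3, 1, 0, -1, -2, -1]
toy = [0.5 * t + pattern[t % 12] for t in range(48)]

# One seasonal difference leaves only the constant yearly increment of the trend.
stationary = sarima_difference(toy, d=0, D=1, S=12)
```

Each seasonal difference shortens the series by S values, which is why 48 toy months yield 36 differenced values.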

2.3. Prophet Model

The Prophet model is a time series curve-fitting tool developed by Taylor and Letham [24]. This model is suitable for fitting time series with strong seasonal effects, is robust to missing values and trend changes, and usually handles abnormal values well. The basic form of the model is shown in the following equation:

$$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$$

where $g(t)$ is the trend term used to fit aperiodic changes in the time series, $s(t)$ is the periodic term, which uses a Fourier series to approximate the periodic component, $h(t)$ is the holiday term, and $\varepsilon_t$ is the error term.
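The Fourier-series form of the periodic term can be sketched as follows. This is a simplified illustration rather than the Prophet library itself: the trend is assumed to be a single straight line (Prophet uses a piecewise trend with changepoints), the holiday term is set to zero, and the function names and coefficients are hypothetical.

```python
import math


def fourier_features(t, period, order):
    """Return [cos(2*pi*n*t/P), sin(2*pi*n*t/P)] for n = 1..order."""
    feats = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.append(math.cos(angle))
        feats.append(math.sin(angle))
    return feats


def seasonal_term(t, coeffs, period=12.0):
    """s(t): weighted sum of Fourier features; len(coeffs) must be even."""
    order = len(coeffs) // 2
    feats = fourier_features(t, period, order)
    return sum(c * f for c, f in zip(coeffs, feats))


def additive_model(t, k, m, coeffs, period=12.0):
    """g(t) + s(t) with an assumed linear trend g(t) = k*t + m and h(t) = 0."""
    return k * t + m + seasonal_term(t, coeffs, period)


# Hypothetical coefficients: t is in months, so period=12 gives a yearly cycle.
value_at_start = additive_model(0.0, k=0.03, m=1.0, coeffs=[2.0, 0.0])
```

Because every feature has period P/n, the seasonal term repeats exactly every 12 months, which matches the behaviour described for the seasonal component.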

2.4. Long Short-Term Memory Model

Hochreiter and Schmidhuber proposed the LSTM model [25–28], which is considered a special recurrent neural network. It mitigates the vanishing and exploding gradient problems and automatically learns sequence features; therefore, it performs better in the prediction of longer time series and is more suitable for the prediction of sea level changes. The historical update information in the LSTM model is controlled by the input gate, forget gate, and output gate.

The steps are as follows:
(1) Calculate the candidate memory unit value at the current moment, where $W_{xc}$ and $W_{hc}$ are the weights applied to the input data and to the unit output at the previous moment, respectively
(2) Calculate the input gate value, where $x_t$ represents the current input data, $h_{t-1}$ represents the unit output value at the previous time, and $C_{t-1}$ represents the memory unit value at the previous time
(3) Calculate the forget gate value $f_t$, which controls the influence of historical information on the current state
(4) Calculate the state value $C_t$ of the memory unit at the current moment, where $\odot$ represents the point-by-point product
(5) Calculate the output gate value $O_t$
(6) Calculate the output $h_t$ of the final LSTM unit
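For reference, one standard LSTM formulation with peephole connections that is consistent with the six steps above can be written as follows. Only $W_{xc}$, $W_{hc}$, $x_t$, $h_{t-1}$, $C_{t-1}$, $C_t$, and $O_t$ are named in the text; the remaining weight matrices, the bias vectors $b$, and the symbols $\tilde{C}_t$, $i_t$, and $\sigma$ (the logistic sigmoid) are assumed notation.

```latex
\begin{aligned}
\tilde{C}_t &= \tanh\left(W_{xc}\,x_t + W_{hc}\,h_{t-1} + b_c\right) \\
i_t &= \sigma\left(W_{xi}\,x_t + W_{hi}\,h_{t-1} + W_{ci}\,C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf}\,x_t + W_{hf}\,h_{t-1} + W_{cf}\,C_{t-1} + b_f\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
O_t &= \sigma\left(W_{xo}\,x_t + W_{ho}\,h_{t-1} + W_{co}\,C_t + b_o\right) \\
h_t &= O_t \odot \tanh\left(C_t\right)
\end{aligned}
```

The fourth line shows why the forget gate controls the influence of history: $f_t$ scales the previous state $C_{t-1}$ before the gated candidate value is added.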

2.5. Radial Basis Function Neural Network Model

The RBF neural network model is a feedforward neural network with strong nonlinear mapping ability, and it can approximate a nonlinear function with arbitrary precision [29, 30]. Therefore, it is more suitable for the prediction of random terms in time series. The network consists of an input layer, a hidden layer, and an output layer. The radial basis function performs the high-dimensional mapping between the input layer and the hidden layer, and the linear least squares method is used to calculate the weights between the hidden layer and the output layer. The model is generally expressed as shown in the following equation:

$$y_t = \sum_{i=1}^{M} \omega_t(i)\,\varphi_i(x_t)$$

where $y_t$ is the network output, $M$ is the number of hidden-layer nodes, $\varphi_i(x_t)$ is the radial basis function of the ith node, $\omega_t(i)$ is the corresponding regression coefficient, and i = 1, 2, …, M.
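The two stages described above, a radial basis mapping followed by linear least squares for the output weights, can be sketched in plain Python. The Gaussian basis, the fixed centers, the spread value, and all function names are assumptions made for illustration; the sine target is toy data, and the network used in this paper adds neurons incrementally rather than using fixed centers.

```python
import math


def gaussian_rbf(x, center, spread):
    """Gaussian radial basis function of a scalar input."""
    return math.exp(-((x - center) ** 2) / (2.0 * spread ** 2))


def design_matrix(xs, centers, spread):
    """Hidden-layer outputs: one row per sample, one column per node."""
    return [[gaussian_rbf(x, c, spread) for c in centers] for x in xs]


def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (aug[r][n] - s) / aug[r][r]
    return w


def fit_rbf(xs, ys, centers, spread):
    """Least-squares output weights from the normal equations (Phi^T Phi) w = Phi^T y."""
    phi = design_matrix(xs, centers, spread)
    m = len(centers)
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(m)] for i in range(m)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(m)]
    return solve_linear(gram, rhs)


def predict_rbf(x, centers, spread, weights):
    """Network output: weighted sum of the M hidden-node responses."""
    return sum(w * gaussian_rbf(x, c, spread) for w, c in zip(weights, centers))


# Toy data: approximate sin(x) on [0, 6] with M = 5 hidden nodes.
xs = [i * 0.5 for i in range(13)]
ys = [math.sin(x) for x in xs]
centers = [0.0, 1.5, 3.0, 4.5, 6.0]
spread = 1.0
weights = fit_rbf(xs, ys, centers, spread)
```

The normal equations are adequate here because the five basis functions overlap only moderately; for many closely spaced centers, a numerically safer least-squares solver would be preferable.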

3. Results and Discussion

The curve in Figure 2 approximately reflects the characteristics of sea level changes in the study area. Peaks occur in summer and autumn, and troughs occur in winter and spring, so the sea level changes in this area are seasonal and cyclical. To consider both linear and nonlinear features when predicting sea level changes, the monthly mean SLA time series in the study area is decomposed additively into “season,” “trend,” and “random” terms (Figure 4).
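An additive decomposition of this kind can be sketched with a classical moving-average procedure. The exact decomposition routine used for Figure 4 is not stated, so the centered 2 × 12 moving average, the function names, and the toy series below are assumptions for illustration only.

```python
def centered_moving_average(y, period=12):
    """2 x period centered moving average; None where the window does not fit."""
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        # Half weights on both end points so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def additive_decompose(y, period=12):
    """Split y into trend, seasonal, and random terms with y = T + S + R."""
    trend = centered_moving_average(y, period)
    # Average the detrended values for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t, (obs, tr) in enumerate(zip(y, trend)):
        if tr is not None:
            buckets[t % period].append(obs - tr)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period  # force the seasonal term to average to zero
    seasonal = [means[t % period] - offset for t in range(len(y))]
    residual = [None if tr is None else obs - tr - s
                for obs, tr, s in zip(y, trend, seasonal)]
    return trend, seasonal, residual


# Toy monthly series (made-up values): slow rise plus a zero-mean annual cycle.
cycle = [-3, -2, -1, 0, 1, 2, 3, 2, 1, 0, -1, -2]
toy = [0.1 * t + cycle[t % 12] for t in range(48)]
trend, seasonal, residual = additive_decompose(toy)
```

On this noise-free toy series the trend recovers the straight line, the seasonal term recovers the annual cycle, and the random term is zero wherever the moving-average window fits.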

In Figure 4(a), the trend term indicates that, as the global climate warms, the sea level of the South China Sea gradually rises; the falling parts of the curve correspond to periods with strong El Niño phenomena. The trend term thus reflects the slow rise and occasional decline in sea level observed from 1992 to 2019. In Figure 4(b), the seasonal term presents the same changes every year, thus reflecting the seasonal characteristics of sea level changes. Figure 4(c) presents the noise generated by uncertain factors during sea level change. This paper chooses the 1992–2017 SLA data as the training set and the 2018–2019 SLA data as the test set. Then, according to both the stationary and nonstationary characteristics of sea level change, the above decomposition results are divided into two groups: the trend-seasonal series and the random term series. The SARIMA and Prophet models are used to fit the trend-seasonal series, and the LSTM and RBF models are used to fit the random term series. The root mean square error (RMSE) and the coefficient of determination R2 are used as the criteria for evaluating the estimation results.

3.1. Estimation of Trend-Seasonal Series
3.1.1. Trend-Seasonal Series Estimated Using the SARIMA Model

The selection and fitting of the SARIMA model mainly include the following steps:
(i) Determine the main structure of SARIMA (p, d, q) (P, D, Q) [S] through experience and autocorrelation and partial autocorrelation function graphs.
(ii) Experiment to obtain other unknown parameters.
(iii) Evaluate the degree of fit through a residual test.
(iv) Perform predictions based on known data [23].

From the above steps, D = 1, d = 0, P = 0, and Q = 2. To determine other parameters of the SARIMA model, this paper adopts an experimental method that takes the autoregressive order p and the moving average order q from 0 to 5. The Akaike information criterion (AIC), RMSE, and R2 were used to evaluate the degree of model fit (Table 1).

The AIC is a measure of the fitting effect of a statistical model. The smaller the AIC value, the better the model fits. It is defined in the following equation:

$$\mathrm{AIC} = 2k - 2\ln(L)$$

where k is the number of parameters and L is the maximized value of the likelihood function.

The RMSE reflects the deviation between the predicted value and the true value and is shown in the following equation:

$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}$$

where m is the number of predicted values, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value.

The coefficient of determination R2 reflects the degree to which the model fits the test data. The closer the R2 value is to 1, the better the model fit, as shown in the following equation:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

where n is the number of predicted values, $y_i$ is the actual value, $\bar{y}$ is the mean of the true values, and $\hat{y}_i$ is the predicted value.
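The three evaluation criteria can be computed directly from their definitions. The sketch below uses hypothetical function names and made-up numbers; it is not tied to the values in Table 1.

```python
import math


def aic(k, log_likelihood):
    """AIC = 2k - 2 ln(L), taking ln(L) (the maximized log-likelihood) as input."""
    return 2.0 * k - 2.0 * log_likelihood


def rmse(actual, predicted):
    """Root mean square error between paired actual and predicted values."""
    m = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / m)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Made-up example values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
example_rmse = rmse(actual, predicted)
example_r2 = r_squared(actual, predicted)
example_aic = aic(k=3, log_likelihood=-10.0)
```

Note that R2 computed this way can be negative when the predictions are worse than simply predicting the mean of the actual values, so a small RMSE does not by itself guarantee a large R2.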

The R2 values of the 4 groups in Table 1 are all small, indicating that the fit is poor and that the SARIMA model is not applicable here, which may be related to the weak sea level rise trend in 2018 and 2019. Thus, the advantages of the SARIMA model cannot be exploited for this series.

3.1.2. Trend-Seasonal Series Estimated Using the Prophet Model

The Prophet model decomposes a time series into components on different time scales and then adds the overall trend. The model is highly encapsulated and has few adjustable parameters. According to the characteristics of the sea level change time series, the estimation frequency is set to “month,” the holiday term is left empty, and the seasonal mode is set to “Multiplicative” and “Additive” in turn. The accuracy of the final estimation results is shown in Table 2.

The RMSE of the Additive mode is smaller and its degree of fit is relatively satisfactory; thus, the Additive mode is more suitable for estimating the time series of the sea level changes in this area.

3.2. Estimation of Random Term Series
3.2.1. Random Term Series Estimated Using the LSTM Model

The LSTM model can learn and remember long-term series information and perform selective forgetting. It is therefore suitable for fitting and estimating the random terms of the time series in the study area. This paper uses 327 months of time series data from 1992 to 2019 as input data, and the number of output values is 24. The optimal LSTM estimation model was selected by adjusting the number of hidden layers. According to the RMSE and R2 results in Table 3, the LSTM model fits best with 2 hidden layers, with an RMSE of 0.937 cm.

3.2.2. Random Term Series Estimated Using the RBF Model

The RBF neural network model has a strong nonlinear mapping ability and is suitable for fitting random terms. It has 3 main parameters: the radial basis expansion speed S, the maximum number of neurons MN, and the number of neurons added each time, DF. Generally, these parameters are determined based on experience, and this paper tried multiple parameter combinations (Table 4). Although the RMSE of the model in Table 4 is small, R2 is also very small, indicating that the fit is not good. Therefore, the RBF model is not suitable for the estimation of random terms in this time series.

3.3. Prediction by the Combination of Prophet and LSTM Models

According to the results of the model selection experiment, the Prophet model predicts the trend-seasonal series better and the LSTM model predicts the random term series better. This paper therefore chooses the combination of Prophet and LSTM models to predict sea level changes in this area. To explore the influence of the known series length on the results and determine the best prediction duration, this paper sets up training samples and test samples of different lengths to test the prediction effect of the combined model. The experimental results are shown in Table 5. The mean absolute error (MAE) represents the average value of the absolute error between the predicted value and the observed value, which avoids the mutual cancellation of errors and accurately reflects the actual prediction error. The calculation formula is as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|$$

where n is the number of predicted values, $\hat{y}_i$ is the predicted value, and $y_i$ is the actual value.
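The combination step and the MAE can be sketched as follows, assuming that the combined prediction is the element-wise sum of the trend-seasonal prediction and the random term prediction, which is what the additive decomposition implies. The function names and the numbers are hypothetical.

```python
def combine_predictions(trend_seasonal_pred, random_pred):
    """Assumed combination rule: add the two component predictions month by month."""
    if len(trend_seasonal_pred) != len(random_pred):
        raise ValueError("component predictions must cover the same months")
    return [a + b for a, b in zip(trend_seasonal_pred, random_pred)]


def mae(actual, predicted):
    """Mean absolute error between paired actual and predicted values."""
    return sum(abs(p - y) for y, p in zip(actual, predicted)) / len(actual)


# Made-up values in cm, for illustration only.
combined = combine_predictions([1.0, 2.0], [0.5, -0.5])
example_mae = mae([1.5, 2.0], combined)
```

Because the absolute value is taken before averaging, an overestimate in one month cannot cancel an underestimate in another.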

According to the results in Table 5, when the prediction duration is 12–36 months, the accuracy indicators, such as the RMSE, are relatively good, indicating that the combined Prophet-LSTM model is suitable for short- and medium-term estimations of 12–36 months in the study area; the best RMSE obtained is 0.962 cm. Therefore, this paper uses all known monthly time series of sea level anomalies in the South China Sea from 1992 to 2019 as training samples to estimate sea level changes from 2020 to 2022. The results are shown in Figure 5.

4. Conclusions

The current research aims to evaluate the capability of different models in estimating sea level variability in the South China Sea. Based on satellite altimetry data from 1992 to 2019, this paper compares the estimation effects of the SARIMA, Prophet, LSTM, and RBF models by grouping, and the combination of Prophet and LSTM models was selected. The detailed results are as follows:
(1) A comparison of the estimation accuracy of the SARIMA and Prophet models shows that the Prophet model can better predict the trend-seasonal series
(2) A comparison of the estimation accuracy of the LSTM and RBF models shows that the LSTM model can better predict random term series
(3) The combined model has high accuracy and good performance for 12–36-month short- and medium-term sea level change predictions

The estimation in this paper considers only the changing characteristics of the time series itself. If the temperature, salinity, tides, ocean currents, and climate anomalies of the sea can be included as reference parameters, the accuracy could be further improved.

Data Availability

The Excel data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest in this paper.

Acknowledgments

The authors acknowledge the AVISO website of the French Space Center (CNES) for providing the satellite data. This research has been supported by the Fundamental Research Funds for the Central Universities (17CX02071), NSFC (61571009), and the Key R&D Program of Shandong Province (2018GHY115046).