Abstract

Effective prediction of ionospheric total electron content (TEC) is very important for Global Navigation Satellite System (GNSS) positioning and other related applications. This paper proposes an ionospheric TEC prediction method using the nonlinear autoregressive with exogenous input (NARX) neural network, which uses previous TEC data and external time parameter inputs to establish a TEC prediction model. During the years of different solar activities, 12 datasets of 3 stations with different latitudes are used for experiments. Each dataset uses the first 120 days for training and the next 20 days for testing. For each test dataset, a sliding window strategy is adopted in the prediction process, wherein the TEC of future 2 days are predicted by the true TEC values of the previous 2 days. The results show that in the year with active solar activity (2011), the TEC prediction with the NARX network can improve the accuracy by 32.3% and 43.5%, compared with the autoregressive integrated moving average (ARIMA) model and the 2-day predicted TEC product, named C2PG. While in the year with calm solar activity (2017), the prediction accuracy can be improved by 20.7% and 22.7%.

1. Introduction

The ionosphere is an important part of the near-earth space, and it is distributed in the range of 60–1000 km from the Earth's surface [1]. Due to the influence of solar radiation and various cosmic rays, the atmospheric molecules in this part are ionized to a large number of free electrons. Total electron content (TEC) is an important parameter to characterize the degree of electromagnetic activity of the ionosphere, which, at a certain point, means the integral of free electrons on the propagation path of electromagnetic signals. The international unit is TECU (). Therefore, TEC is often used as an indicator to represent the ionospheric activity and the signal delay caused by the ionosphere.

In Global Navigation Satellite System (GNSS) applications, the delay encountered by satellite signals passing through the ionosphere is one of the main errors, which greatly restricts the performance of GNSS high-precision positioning [24]. Thus, effective prediction of TEC has always been very important for GNSS positioning and other related applications. For this problem, there are several commonly used methods including the International Reference Ionosphere model (IRI) [5], time series analysis [6], and artificial neural network [7, 8]. The IRI models (e.g., IRI2007 and IRI2012) are empirical models proposed by the International Federation of Radio Science, based on long-term worldwide observations from ionosondes and radars and are combined with years of ionospheric research results. Although the model can obtain long-term ionospheric prediction results, the input parameters are more complicated and the error is large under certain conditions [9, 10]. In 1999, Muhtarov and Kutiev [11] proposed an autocorrelation method for ionospheric interpolation and short-term prediction. The autocorrelation or normalized autocorrelation parameters were determined based on past observation data, and the autocorrelation model was further obtained. This model can be used for temporal interpolation over a period of time and for extrapolation (prediction). Ratnam et al. [12] used the dual-frequency GNSS observations from the Indian low-latitude stations to make a short-term prediction of the vertical TEC according to the autoregressive moving average (ARMA) model. Compared with the original observation and the IRI2007 model, the ARMA model showed good performance during the magnetic storm period.

Because of the nonlinear characteristics of the ionospheric change, the artificial neural network is also widely used in the short-term prediction of the ionosphere [13, 14]. Williscroft and Poole [15] firstly applied the neural network to the prediction of ionospheric foF2, which obtained better results than the IRI model. Based on error compensation, Zheng et al. [16] proposed an ionospheric foF2 prediction method, which combined the back propagation neural network (BPNN) and IRI2012 model. The prediction accuracy was improved by 19% compared with the IRI2012 model. The nonlinear autoregressive with exogenous input (NARX) is a kind of recurrent neural network (RNN) [17], which predicts future values from past values of the time series and second time series as inputs. It can establish the relationship between the two time series of inputs and outputs and has been demonstrated to be suitable for nonlinear time series modeling [18]. In addition, the NARX network has also been proven to perform better on issues involving long-term dependencies [19]. About the using of NARX on atmosphere research, Cai et al. [20] firstly used NARX to establish a model for predicting the SYM-H index using solar wind and interplanetary magnetic field (IMF) parameters in advance. The correlation coefficient was at least 0.9 even during a strong magnetic storm, which proved the practicability and performance of this method. It can be expected that NARX can also be used in the prediction of ionospheric TEC [21, 22]. In this paper, a NARX-based ionospheric TEC prediction method is proposed. According to the previous TEC data and exogenous time parameters, the NARX is used for modeling for single-point TEC time series prediction and obtaining 2-day predicted TEC in advance.

2. Materials and Methods

2.1. NARX Neural Network

The inputs of NARX neural network contain not only the exogenous information of the current time but also the output data of the previous time, and thus the network has the feature of memory and feedback, making the information of system more complete. The mathematical expression of the NARX model can be expressed as follows:where is the output value of the NARX at the time and the predicted value for the time . is the number of input and output delays. are the past value of the time series. are the past value of the time series of exogenous inputs. is the nonlinear mapping function of the neural network, which is unknown and needs to be trained.

After feedback, the NARX network is similar in architecture to the multilayer perceptron (MLP), including the input layer, the hidden layer, and the output layer. Other parameters include the number of neuron nodes, the number of network layers, the connection weights, and the activation function. The MLP provides an effective nonlinear problem solution that can be used to train and obtain nonlinear mapping relationships.

The output of the hidden layer nodes iswhere is the activation function of the hidden layer nodes. is the connection weights between the ith hidden layer node and input neuron . is the connection weights between the ith hidden layer node and feedback . is the bias of hidden layer nodes. The value of the output layer node iswhere is the output value of the NARX network. is the connection weight between the ith hidden layer node and output layer. is the bias of the network. is the number of hidden layer nodes.

2.2. NARX for TEC Prediction

On the base of NARX, this paper chooses to use the International GNSS Service (IGS) TEC map as the ionospheric prediction target. After downloading the ionospheric TEC map files from IGS [23], the TEC at any point can be obtained by interpolation. Then, all TEC maps are interpolated for a period of time at the same point to constitute a single-point TEC time series. The time series can represent the change of TEC during this period, which can be expressed as follows:where is the length of time series.

The input parameters of NARX are the time factors related to the variations of TEC, which are selected based on the periodicity characteristics of ionosphere. Diurnal variation is considered as a distinct feature of TEC. The TEC increases to a maximum after noon and then decreases to a minimum during midnight. Season alternation is also an important factor. TEC is large in spring and autumn, but less in winter and summer, generally. So, the time information is taken as input parameters including local time (0–23), day of year (DOY, 1–365), and season. And then, these parameters are normalized by the sine and cosine with an interval (−1, 1).where is the hour. is the DOY. is the virtual value expressing spring, summer, autumn, and winter with 1, 2, 3, and 4. For different types of time data, the inputs are normalized to using trigonometric functions with different periods. The length of the time delay is set to 2 days. When the sampling interval is 2 h, is 24. The number of hidden layer is 1, the number of hidden layer nodes is 20, and the activation function is selected as . The structure of the NARX network is shown in Figure 1.

After training, the accuracy of prediction can be represented by root mean square error (RMSE), regression values (R), and relative error (REL). They can be expressed as follows:where, is the number of samples. is predicted TEC values. is target TEC (IGS TEC maps are used as true values). shows the absolute error between the and . represents the correlation between the two sets of time series, and represents the relative error magnitude.

3. Experiments and Results

3.1. Data Preparation

The experimental TEC data comes from the global ionospheric TEC grid product (IONEX) released by IGS in 2011 and 2017, with an interval of 2 hours and a grid size of . The three stations, different latitudes and basically the same longitude, in the northern hemisphere were selected to analyze the influence of latitude on the ionospheric prediction, namely, Beijing (BJ), Wuhan (WH), and Sanya (SY). The approximate coordinates of the stations are shown in Table 1. The TEC of the three stations in the zenith direction was interpolated according to the TEC map. The 1-year TEC series consists of daily average value which is shown in Figure 2.

In Figure 2, the TEC series have obvious seasonal changes. The TEC in spring and autumn are significantly higher than that in summer and winter, reflecting a trend of fluctuating, and the magnitude of the change is very large. By comparing the data of 2011 and 2017, it can be found that the TECs in 2011 are significantly higher than those in 2017 and even doubled in spring and autumn. In addition, due to the different latitudes of the three stations, it can be clearly seen that the TECs of the low-latitude area are higher than that of the high latitude area.

According to the activities of the ionosphere, the TEC data of four time periods are selected for TEC prediction experiments, and the relationship between ionospheric activity and prediction accuracy at each period are analyzed. The four datasets are called 2011-S, 2011-A, 2017-S, and 2017-A. A total of 1440 TEC values in each dataset are used as training samples in the previous 120 days, and 240 TEC data are used as test samples in the next 20 days. Three stations conduct the prediction experiments, for a total of 12 datasets. The datasets are shown in Table 2. Part of training samples is shown in Table 3.

According to the NARX ionospheric prediction algorithm proposed in this paper, the above 12 datasets are used for TEC prediction experiments. In experiments, the future TECs are predicted by the true values of 24 TEC values in the previous 2 days. Then, slide the predicted window forward until the completion of the 20-day prediction instead of one prediction for 20 days.

3.2. NARX Prediction

The NARX method is used to predict 20-day TECs using all datasets. The predicted TEC series in 2011 and 2017 are shown in Figures 3 and 4. The blue curve in the figures represents target TEC , and the red curve represents prediction TEC . The error statistics and prediction performance are shown in Table 4.

Depending on the predicted of the different datasets, the periodicity of the ionospheric changes can also be proved. The trend of the predicted TEC and the target series are coincident, and the errors are in a small range. For the same period, the RMSE for predicted TEC of the three stations increases with the decreasing latitudes. Because the latitude of Beijing is highest, the prediction error is the smallest. On the contrary, the prediction error is the largest in Sanya. For different time periods, the RMSEs of the three stations are consistent with the ionospheric activity. When the ionosphere is active, the RMSEs are large, and the RMSEs are small when the ionosphere is calm. In general, the RMSE is the maximum in 2011-A, the minimum in 2017-S, and the RMSEs of 2011-S and 2017-A are roughly the same, which are between the first two. Overall, the minimum RMSE is 1.049 TECU in 2017-S-BJ, and the maximum RMSE is 3.722 TECU in 2011-A-SY.

However, the trends of RMSEs are not exactly the same for the relative errors. The RMSEs are related to the ionospheric activity and show periodic fluctuations with time of datasets, but RELs show an increasing trend with time, as shown in Figure 5. It is because the RELs are not only related to the prediction biases but also related to the magnitude of the target TECs as the denominators. Taking 2017-A as an example, although the prediction errors are small, the RELs are large due to the low level of TEC target value. In the 12 datasets, the RELs are less than 10% in 11 datasets, and the REL of only one dataset is 10.71%. Therefore, we can explain that the relative error of NARX is better than 11%.

In addition, the regression coefficients between the predicted values and the target values of the 12 datasets are all greater than 0.9, indicating a very strong correlation.

3.3. Comparison with ARIMA and C2PG

We also used different prediction methods or products to compare with the NARX using the same datasets. Firstly, the 2-day forecast product released by CODE, namely, C2PG, is used for comparison, which can be downloaded on the IGS website in advance. It has the same format as IONEX. Secondly, the ARIMA method introduced by Box and Jenkins [24] is also adopted, and the main formula is as follows:

The corresponding nonnegative integer is the autoregressive order, and is the moving average order. is the regression coefficient. is the moving average coefficient. represents Gaussian white noise. In this article, after testing multiple dataset, the ARIMA model with  = 12 and q = 4 is adopted, and the data processing is the same as that of NARX. Taking the Wuhan station as an example, the residuals of the prediction results of the three methods are shown in Figure 6.

In Figure 6, the red line represents the prediction residuals of C2PG, the green line represents the ARIMA, and the blue line represents the NARX. Comparing the residuals in 2011 (Figures 6(a) and 6(b)) and 2017 (Figures 6(c) and 6(d)), the curves in 2011 fluctuate drastically than those in 2017, indicating that the performances of 3 methods in the active period of the ionosphere are not as good as the calm period. Comparing the three prediction methods, the curve of NARX is more stable around the x-axis. There are more outliers in the residuals of the C2PG and ARIMA, and NARX has the better suppression of outliers. It is illustrated that the NARX has a good performance in TEC prediction in different periods and different areas.

The RMSEs of the three methods are shown in Figure 7 and Table 5. It can be seen that, during the active period of the ionosphere, the NARX has the greatest prediction accuracy, with an average of 2.519 TECU, followed by the ARIMA (3.723 TECU), and the C2PG (4.459 TECU) is the worst. In the ionospheric calm period, the accuracy of the NARX remains as high as 1.772 TECU. The accuracies of the C2PG and the ARIMA are equivalent, with 2.235 TECU and 2.293 TECU, respectively.

4. Discussion and Conclusion

In this paper, a TEC prediction method based on the NARX network is proposed. According to the periodic change of the ionosphere, the TEC information during a period of time is processed as a time series. The TEC series are treated as the inputs and outputs in the NARX at the meantime, and the time parameters are the exogenous inputs including hour, DOY, and season. In the experiments, the TEC data come from the IGS ionospheric grid products and can also use the actual raw observations of GNSS receiver. As a result, the experimental data and the time parameters are easy to obtain, and this method is easy to conduct in GNSS application.

According to the ionospheric activity, representative data in different periods are used for ionospheric TEC predictions. RMSE and REL are used to evaluate the prediction performance and compared with the IGS predicted products and the ARIMA method. The results show that the prediction using the NARX has the highest accuracy, and the RMSEs are between 1.501 TECU and 3.722 TECU in 2011 and between 1.049 TECU and 2.728 TECU in 2017. The accuracies of the ARIMA and C2PG are almost equivalent, and the mean RMSEs are 3.723 TECU and 4.459 TECU in 2011 and 2.293 TECU and 2.235 TECU in 2017. The NARX can have better prediction performance even during the active period of the ionosphere.

At the same time, the training of the NARX network requires a large amount of previous data, and the outputs are dependent on the training results heavily. So, the prediction results may not achieve satisfactory accuracy in the absence of past data or data without enough accuracy. And, we only use TEC map products in this paper, and the purpose is to verify the effectiveness and universality of the NARX algorithm in TEC prediction. Because the product is easy to obtain and has high accuracy, it can provide good experimental data.

For future works, we will further study the various characteristics of the ionospheric TEC and simplify or explore more input parameters related to the ionospheric TEC. The architecture of the NARX network will be optimized for improving the prediction accuracy of the ionospheric TEC. Raw GNSS observations will be used for TEC prediction with NARX network to improve the real-time performance and to weaken the system errors.

Data Availability

The global ionospheric products used in this paper are available at https://cddis.nasa.gov/archive/gnss/products/ionex/(accessed on 1 October 2021).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities (grant no. 2242020R40135) and the National Natural Science Foundation of China (grant nos. 41904022 and 41774027).