Abstract

Efficient and accurate flight trajectory prediction is a key technology for promoting intelligent, information-driven air traffic management and for improving the operational capability and predictability of air traffic. To extract the hidden information in historical trajectory data, an approach must accurately select the high-dimensional features related to the prediction target and overcome the short-term memory of time-series models. Herein, we present a novel trajectory prediction model based on a dual-self-attentive (DSA)-temporal convolutional network (TCN)-bidirectional gated recurrent unit (BiGRU) neural network. In this model, the TCN provides highly stable training, high parallelism, and a flexible receptive field. The self-attentive mechanism in the TCN structure focuses on the features that contribute the most to the output. After the TCN, a BiGRU network combined with the self-attentive mechanism further mines, in both directions, the connections between the features and outputs of the trajectory sequence, and a Bayesian algorithm optimises the hyperparameters of the model. A comparison with current well-known neural network models (i.e., CNN, TCN, GRU, and their variants) shows that the DSA-TCN-BiGRU model with Bayesian hyperparameter optimisation performs best. The improved predictive model is therefore applicable and valuable, providing a basis for decision-making in future trajectory-based operations.

1. Introduction

The continuous growth of air traffic poses critical challenges to current air traffic control (ATC) systems, and the international civil aviation community is undergoing a new round of ATC system upgrades. In this new generation of ATC automation systems, a trajectory-based operation mode has been proposed as the core technology for the next phase [1]. In response to the global ATC integration requirements proposed by the International Civil Aviation Organisation [2, 3], the United States and Europe have taken the lead in developing future ATC plans, including the Single European Sky Program, the U.S. Central Terminal Area Control Automation System, and the Distributed Ground and Air Traffic Management System. Together, these support the operational concepts for trajectories and performance, forming a trajectory-based ATC system that provides flight efficiency, energy savings, and capacity assessments. The key technical basis of the trajectory-based operation concept is accurate trajectory prediction [4], as such predictions provide the most direct reference for aircraft surveillance and control prognosis. In addition, the management of airspace is changing from static to dynamic, and automatic dependent surveillance-broadcast (ADS-B) systems are being used for real-time surveillance. The need for more detailed airspace management has become even more pressing in light of airspace delineation and use characteristics. Therefore, the next-generation ATC system should not only update the concept of operation and popularise and standardise ADS-B surveillance technology but also strengthen the independent research and development of critical technologies to achieve more automated, intelligent, accurate, safe, and efficient air traffic management. There is thus a clear practical need for, and essential application value in, real-time and effective trajectory prediction from ADS-B data.

Thus, trajectory prediction is an important technology for new air traffic management automation systems, and the importance of trajectory prediction technology has made it a research topic of considerable interest in the context of ATC systems in recent years. Research into trajectory prediction technology has led to the proposal of many models for the prediction task. They can be categorised as follows.

Kinematic models: a kinematic model must consider the performance parameters of the aircraft, divide the flight process into stages, and establish the corresponding equations of motion. Fan et al. [5] proposed a multiobjective constraint-based trajectory prediction method for the aircraft descent segment, which improves the accuracy of the predicted trajectories by considering aircraft performance data, characteristic parameters, flight path restrictions, and cost index requirements. Li [6] combined aircraft energy states to simplify the equations of motion and obtain trajectory characteristics at minimum cost, and then used an equirectangular trajectory computational model to fit the flight parameters with the maximum likelihood estimation method to achieve a cost-optimised 4D trajectory prediction. Du et al. [7] constructed a prediction method for the vertical profile of a continuous aircraft climb that accounts for thrust intent, using the aircraft total energy equation together with the wind speed vector and temperature information. Pan et al. [8] used a hidden Markov model to predict the future position of an aircraft at one-minute intervals, and the horizontal and vertical errors improved significantly relative to the baseline model. However, the idealised assumptions required to build a kinematic model limit its application scenarios and prediction accuracy.

Data mining models: data mining-based trajectory prediction techniques have become popular as a ‘big data’ research subject in ATC. Ma and Gao [9] used a clustering algorithm to group the historical trajectories generated on a certain route into typical trajectories, and then performed a trajectory correction on the position information at a certain moment to predict the position information at the next moment. Zazzaro et al. [10] integrated a clustering algorithm, classification-based supervised learning, and an uncertainty model to calculate the probability of collision using OpenSky public data, and implemented conflict risk detection in the terminal area. Carlos et al. [11] obtained more variable features by preprocessing the trajectory coordinates and flight plan data, and applied them, as far as possible, to a k-means clustering algorithm to support supervised trajectory classification. Choi et al. [12] combined a data-driven model with a physics-based state estimation model and compared the result with a machine learning-only approach, substantially improving accuracy. However, algorithms based on state estimation can produce significant errors in long-term forecasts because they must capture the long-term movement features of the aircraft over time. Ma [13] proposed classifying the trajectories of different flights using a generalised cluster analysis method combined with the hidden Markov model to obtain a high-density 4D trajectory prediction. In general, a data mining approach can improve the prediction accuracy using different clustering algorithms, but it also generates problems concerning data storage and computational overhead to different degrees.

Deep learning models: the development of neural networks in deep learning has provided new ideas for nonlinear models, especially those good at handling time series. Thus, such networks have been applied to the study of trajectory prediction [14–16]. Zhou et al. [17] reconstructed and combined the predictive capabilities of multiple neural networks over different time spans to improve prediction accuracy. Sahadevan et al. [18] extracted data through sliding windows and used a bidirectional long short-term memory (BiLSTM) model to learn the dependencies of the trajectory data in both directions, which, in turn, improved the ground system trajectory prediction accuracy. However, the proposed model has yet to consider the spatial feature extraction of the data. Tran et al. [19] provided a reliable model for a conflict detection system by combining encoder–decoder modelling with tactical intent, and verified using actual data that the prediction accuracy exceeded those of existing models. In 2020, Lv et al. [20] proposed using a temporal convolutional network (TCN) for trajectory prediction and compared it with a conventional model. They found that the short-term effect was comparable and the long-term prediction was improved. Currently, gated recurrent unit (GRU) neural networks are primarily used for trajectory prediction [21–23]. Wu et al. [24] proposed combining GAN networks with Conv1D, Conv2D, and LSTM networks and verified with actual data that Conv1D-GAN could achieve accurate long-term trajectory prediction, which shows the great promise of combined neural networks in the field of trajectory prediction. Shafienya and Regan [25] proposed a CG3D model that fully extracts the spatial and temporal features of ADS-B data by combining CNN-GRU and 3D-CNN networks. The uncertainty of the model is also considered, and the combined model has higher prediction accuracy than a single model. Chen et al. [26] proposed a combination of inception and LSTM modules for prediction, effectively extracting the spatiotemporal features of the data. However, the model mainly considers the reliability condition of localisation, and the error is significant. Han et al. [27] used k-means to cluster similar trajectories, followed by online prediction via a GRU to achieve real-time trajectory prediction. Ma et al. [28] proposed a CNN combined with an LSTM network, which achieved good performance in short-term prediction but still had a significant error in multistep prediction as no attention mechanism was used.

To achieve more accurate short- and long-term forecasts while taking computational efficiency into account, we adopted a novel TCN-bidirectional GRU (BiGRU) model based on an attention mechanism, with the following innovations.

(a) In this study, the TCN model is fused with the BiGRU model, and a dual-self-attentive (DSA) mechanism is introduced. The DSA-TCN-BiGRU model can extract multisource time-series features and maintain causal convolutional properties. High-order trajectory features are extracted by introducing a dilated convolution to improve the processing efficiency of long-time-span memory units. In addition, the model uses residual links to transfer the complete underlying features across layers, thereby enriching the feature results while optimising the overall network-learning process. Subsequently, the BiGRU model combined with the DSA mechanism can bidirectionally extract the impact of each time node in the hidden layer state on the prediction results. The important problems concerning the high-dimensional feature extraction and long-term dependence of the time series are effectively solved. To the best of our knowledge, this is the first time that a DSA-TCN-BiGRU model has been used to forecast aircraft trajectories.

(b) The mapping from points in the hyperparameter space to the model generalisation performance can be viewed as a complex black-box function with a high evaluation cost, which is difficult to handle with general optimisation methods. We used a Bayesian optimisation (BO) algorithm to optimise the hyperparameters of the whole model to improve the model prediction performance and training speed, thereby overcoming the time consumption and low accuracy of manual parameter tuning.

(c) To verify the performance of the model in terms of trajectory prediction, three sets of comparisons were made. In particular, the TCN and DSA mechanisms were investigated to ensure that the important trajectory history information was fully utilised in the extraction of the trajectory data, and that the sequential nature of the trajectories was preserved to compensate for the shortcomings of the convolutional neural network (CNN). By comparing the BiGRU model with the GRU and LSTM, the bidirectional temporal feature extraction capability of the BiGRU model was verified, and the time complexity was simultaneously reduced. In addition, we tested the model for single-step and multistep prediction, and the results verified that the DSA-TCN-BiGRU approach has high stability and trajectory prediction capability.

The remainder of this paper is organised as follows. Section 2 analyses the data and preprocessing. Section 3 presents the DSA-TCN-BiGRU model. Section 4 presents the experimental and comparative validation results. The final section concludes this paper.

2. Trajectory Description and Data Analysis

As a new technology promoted by the Civil Aviation Administration of China, ADS-B generates massive amounts of data that contain hidden but important aircraft flight information [29]. The dynamic data, which are generated in real time, imply the future movement trend of the aircraft.

2.1. ADS-B Trajectory Data Format

An ADS-B message returns the trajectory information of an aircraft at a certain point during flight. The messages form a series of discrete trajectory points that are aggregated into discontinuous trajectory data for each aircraft. The set of trajectories is represented as follows:

$$T = \{T_1, T_2, \ldots, T_i, \ldots, T_N\}$$

Here, $T$ denotes the set of aircraft trajectories, $i$ denotes the trajectory number, $i \in \{1, 2, \ldots, N\}$, $N$ denotes the total number of trajectories, and $T_i$ denotes the flight path of flight $i$.

A trajectory consists of a series of multidimensional trajectory points ordered by time.

$$T_i = \{p_{i1}, p_{i2}, \ldots, p_{ij}, \ldots, p_{in}\}$$

Here, $p_{ij}$ denotes the $j$th multidimensional trajectory point of trajectory $i$, and $n$ denotes the number of trajectory points.

The information contained in each waypoint is expressed as follows:

$$p_{ij} = (\mathrm{lat}_t, \mathrm{lon}_t, \mathrm{alt}_t, v_t, \theta_t)$$

Here, for the point recorded at time $t$, $\mathrm{lat}_t$ denotes the latitude of the aircraft; $\mathrm{lon}_t$ denotes the longitude of the aircraft; $\mathrm{alt}_t$ denotes the altitude of the aircraft; $v_t$ denotes the speed of the aircraft; and $\theta_t$ denotes the heading of the aircraft.

2.2. Data Preprocessing

The ADS-B data are often incomplete and inaccurate owing to receiving terminal failures, communication link delays, and navigation signal losses. Accordingly, in this study, data with duplicate 3D position points and time points were deleted to regenerate the trajectory sequence. Missing points were handled by recalculating the trajectory sequence points at 5-s intervals using cubic spline interpolation, as shown in Figure 1.

In Figure 1, the red colour represents the trajectory information after the ADS-B data are parsed. The basic route characteristics can be obtained, but trajectory points are missing in the middle. If trajectory points are missing at the edge of the time period over which features are extracted, the statistics will contain errors, and the validity of the experiment will be questionable. Therefore, all aircraft trajectory information was interpolated (as shown in black).
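The cleaning and resampling step can be sketched as follows. This is a minimal illustration rather than the preprocessing code used in the study: it assumes SciPy is available, removes duplicates by time stamp only, and the function name `resample_trajectory` and the sample values are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def resample_trajectory(t, points, step=5.0):
    """Drop duplicate time stamps, then resample every `step` seconds.

    t: 1-D sequence of time stamps in seconds.
    points: (n, d) array of features recorded at those time stamps.
    """
    t = np.asarray(t, dtype=float)
    points = np.asarray(points, dtype=float)
    # np.unique sorts the time stamps and returns the first row of each duplicate.
    t_unique, first_idx = np.unique(t, return_index=True)
    points_unique = points[first_idx]
    # Fit one cubic spline per feature column along the time axis.
    spline = CubicSpline(t_unique, points_unique, axis=0)
    t_new = np.arange(t_unique[0], t_unique[-1] + 1e-9, step)
    return t_new, spline(t_new)

# Hypothetical altitude samples: a duplicate at 5 s and a gap between 10 s and 25 s.
t = [0, 5, 5, 10, 25, 30]
alt = np.array([[1000.0], [1050.0], [1050.0], [1100.0], [1250.0], [1300.0]])
t_new, alt_new = resample_trajectory(t, alt)
```

After resampling, the sequence has one point every 5 s, including the previously missing points at 15 s and 20 s.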

When the amount of data is small, the prediction accuracy may be poor owing to overfitting during the deep learning training process. We therefore construct and expand the features by adding a distance feature and an angle feature between each trajectory point and the centre reference point of the destination, which reflect the pilot’s intention.
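One way to compute such distance and angle features is sketched below. The exact definitions used in the study are not recoverable from the text, so this example assumes the great-circle (haversine) distance and the initial bearing from the trajectory point to the reference point; the function name and the Earth radius constant are illustrative choices.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumed constant

def distance_and_bearing(lat, lon, ref_lat, ref_lon):
    """Return (great-circle distance in km, initial bearing in degrees)
    from the point (lat, lon) to the reference point (ref_lat, ref_lon)."""
    phi1, phi2 = math.radians(lat), math.radians(ref_lat)
    dphi = phi2 - phi1
    dlam = math.radians(ref_lon - lon)
    # Haversine formula for the central angle between the two points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    distance = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
    # Initial bearing measured clockwise from north, mapped to [0, 360).
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return distance, bearing

# A point on the equator with the reference point one degree to the east.
dist_km, bearing_deg = distance_and_bearing(0.0, 0.0, 0.0, 1.0)
```

Both values can be appended to each waypoint as extra input features before normalisation.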

Normalisation: the input data are normalised to solve the problem of nonuniform scales across the input features, mapping the original input data $x$ to the normalised input data $x'$.
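A common choice for this step is per-feature min-max scaling, sketched below. The text does not state which normalisation formula was used, so min-max scaling is an assumption here, and the helper names are hypothetical.

```python
import numpy as np

def min_max_normalise(x):
    """Scale each feature column of x to [0, 1]; also return the statistics
    needed to map predictions back to physical units."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    # Use a span of 1 for constant columns to avoid division by zero.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (x - x_min) / span, x_min, span

def denormalise(x_norm, x_min, span):
    """Invert min_max_normalise."""
    return np.asarray(x_norm, dtype=float) * span + x_min

raw = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
scaled, x_min, span = min_max_normalise(raw)
restored = denormalise(scaled, x_min, span)
```

Keeping `x_min` and `span` from the training set allows the same scaling to be applied to test data and the predictions to be converted back before computing errors in metres or degrees.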

2.3. Sample Construction

In the sample construction, we select the data [30] using sliding windows with a sliding window size of six to achieve single- and multistep predictions, as shown in Figure 2.
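The sliding-window construction can be sketched as follows, using the window size of six stated above. The function name `make_samples` and the `horizon` parameter (number of future steps used as the target) are illustrative names, not taken from the original work.

```python
import numpy as np

def make_samples(series, window=6, horizon=1):
    """Slice a (time, features) array into input windows and prediction targets.

    Each input holds `window` consecutive points; each target holds the
    `horizon` points that immediately follow that window.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for the requested window and horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    y = np.stack([series[i + window:i + window + horizon] for i in range(n)])
    return X, y

# Hypothetical trajectory with 10 time steps and 2 features per step.
traj = np.arange(20, dtype=float).reshape(10, 2)
X1, y1 = make_samples(traj, window=6, horizon=1)  # single-step prediction
X3, y3 = make_samples(traj, window=6, horizon=3)  # three-step prediction
```

Setting `horizon` to 1 gives single-step samples, and larger values give multistep samples from the same trajectory.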

3. Methods

3.1. Self-Attentive Mechanism

When using neural networks to process large amounts of sequence data, one can borrow from the attention mechanism of the human brain and selectively process certain primary information while skipping secondary information to improve the network performance. Sequence coding based on convolutional or recurrent neural networks can only establish local dependencies. One way to establish long-distance dependencies between the sequences is to increase the number of layers in the network to obtain long-distance information interactions through deeper networks. The other is to use full connectivity, which, although capable of establishing long-distance dependencies, cannot handle variable-length input sequences. A self-attention model [31] can dynamically generate weights for handling variable-length sequences of information and can establish long-range dependencies.

The self-attention model typically uses a query-key-value model, with the scaled dot product as the attention scoring function, as shown in Figure 3. Assume that the input sequence is $X = [x_1, \ldots, x_N]$ and the output sequence is $H = [h_1, \ldots, h_N]$. Each input $x_i$ is linearly mapped to three different spaces to obtain the query vector $q_i$, key vector $k_i$, and value vector $v_i$, respectively. The mapping is calculated as follows:

$$Q = W_q X, \quad K = W_k X, \quad V = W_v X$$

In these equations, $W_q$, $W_k$, and $W_v$ are the linear mapping weight matrices, and $Q$, $K$, and $V$ are the query, key, and value matrices, respectively.

When the scaled dot product is used as the attention scoring function, the output vector sequence is calculated as follows:

$$H = V\,\mathrm{softmax}\left(\frac{K^{\mathsf{T}} Q}{\sqrt{D_k}}\right)$$

Here, $D_k$ is the dimension of the query and key vectors, and $\mathrm{softmax}(\cdot)$ is a function normalised by column.
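The scaled dot-product self-attention described above can be sketched in NumPy as follows. The sketch assumes that each column of the input matrix is one time step, so that the softmax is normalised by column; the matrix sizes and random weights are illustrative only.

```python
import numpy as np

def softmax_columns(a):
    """Softmax applied independently to each column."""
    a = a - a.max(axis=0, keepdims=True)  # subtract the column maximum for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=0, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention with time steps stored as columns.

    X: (D_x, N) input sequence; W_q, W_k: (D_k, D_x); W_v: (D_v, D_x).
    Returns the output sequence H of shape (D_v, N) and the (N, N) weights.
    """
    Q, K, V = W_q @ X, W_k @ X, W_v @ X
    d_k = K.shape[0]
    weights = softmax_columns(K.T @ Q / np.sqrt(d_k))
    return V @ weights, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 6))  # 5 features, 6 time steps
W_q, W_k, W_v = (rng.normal(size=(4, 5)) for _ in range(3))
H, A = self_attention(X, W_q, W_k, W_v)
```

Column `n` of `A` holds the weights that output step `n` assigns to every input step, which is why each column sums to one.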

Self-attention models can effectively handle variable-length time series and establish long-range dependencies [32].

3.2. Self-Attention Temporal Convolutional Network

The TCN networks [33] applied in trajectory prediction have the following main characteristics: (i) to meet the requirements for trajectory prediction, TCN networks have temporal causality, that is, the current model output is only related to the past and not to the future; (ii) the output sequence length of TCN networks applicable to trajectories can be adjusted arbitrarily; and (iii) multilayer TCN networks have long memory distances despite their shallow depths.

3.2.1. Causal Dilated Convolution Network

The TCN causal dilated convolution structure is shown in Figure 4. The distinguishing feature of this structure is that its specific form is determined by the convolution kernel and the dilation factor, which set the number of input elements read by each unit and the spacing between those elements in the layer below, respectively. The dilation factor increases exponentially with the depth of the network, i.e., $d = 2^{l}$, where $l$ is the index of the network layer. The larger the dilation factor, the wider the range over which information is extracted. Therefore, the TCN can obtain a relatively large receptive field with a relatively small number of layers.

The receptive field of a TCN neuron, i.e., the network memory length, is determined by the convolution kernel size, dilation factor, and number of convolutional layers. The output of the dilated convolution at position $s$ is as follows:

$$F(s) = (x *_{d} f)(s) = \sum_{i=0}^{k-1} f(i)\, x_{s - d \cdot i}$$

Here, $x$ is the input sequence, $*_{d}$ is the convolution operation with dilation factor $d$, $k$ is the convolution kernel size, $f(i)$ is the $i$th element in the convolution kernel, and $x_{s - d \cdot i}$ is the element of the input sequence corresponding to the convolution kernel operation.
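A direct implementation of a causal dilated convolution on a one-dimensional sequence is sketched below. It assumes zero padding for positions before the start of the sequence, and the receptive-field helper assumes one convolution per layer; both function names are illustrative.

```python
import numpy as np

def causal_dilated_conv(x, f, d):
    """Causal dilated convolution: out[s] = sum_i f[i] * x[s - d*i].

    Only the current and past elements of x contribute to out[s].
    """
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    out = np.zeros_like(x)
    for s in range(len(x)):
        for i in range(len(f)):
            j = s - d * i
            if j >= 0:  # positions before the sequence start act as zero padding
                out[s] += f[i] * x[j]
    return out

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convolutions, one convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = causal_dilated_conv(x, f=[1.0, 1.0], d=2)  # adds each element to the one two steps earlier
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4])
```

With a kernel size of 2 and dilation factors 1, 2, and 4, three layers already cover eight time steps, which illustrates how exponential dilation enlarges the receptive field with few layers.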

3.2.2. Residual Links

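In the residual block [34], the output $F(x)$ of a multilayer network is added to the original input $x$, and the sum is passed through the activation function to give the block output $o$. The computation is as follows:

$$o = \mathrm{Activation}\left(x + F(x)\right)$$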

The self-attentive TCN (SATCN) residual module contains the underlying TCN causal dilated convolution layer, batch normalisation, the activation function (LeakyReLU), and dropout layers. The N residual module structure is shown in Figure 5.

The normalisation of weights can eliminate the gradient explosion problem and effectively speed up the computation. To make the TCN network nonlinear while avoiding vanishing gradients, the LeakyReLU activation function is used, and a dropout layer is added after the LeakyReLU activation layer to prevent overfitting and thus achieve a regularisation effect. The self-attentive mechanism is introduced to focus on the features that contribute more to the output, and a convolution is used to resolve the mismatch in the dimensions of the residual tensor.

3.3. Bidirectional Gated Recurrent Unit Network

The GRU model [35] is a type of recurrent neural network (RNN). Compared with the LSTM, the GRU has only two gating structures, i.e., the reset gate and the update gate. Thus, its structure is simpler while its performance is comparable [36]. The structure of the GRU is shown in Figure 6. Equations (13)–(16) describe the operation of each cell in the GRU network.

Here, $x_t$ is the input at the current moment, $h_{t-1}$ denotes the hidden state at the previous moment, $h_t$ is the hidden state passed to the next moment, $\tilde{h}_t$ denotes the candidate hidden state, $r_t$ is the reset gate, $z_t$ is the update gate, and $\sigma$ is the sigmoid activation function.
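For reference, a commonly used form of the GRU cell update, written with the input $x_t$, hidden states $h_{t-1}$ and $h_t$, candidate state $\tilde{h}_t$, reset gate $r_t$, and update gate $z_t$, is given below. The weight matrices $W_r$, $W_z$, and $W_h$, the omission of bias terms, and the convention that $z_t$ weights the candidate state are assumptions of this sketch and may differ in detail from the formulation used in the study.

```latex
\begin{aligned}
r_t &= \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \\
z_t &= \sigma\left(W_z \cdot [h_{t-1}, x_t]\right) \\
\tilde{h}_t &= \tanh\left(W_h \cdot [r_t \odot h_{t-1}, x_t]\right) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

Here, $[\cdot,\cdot]$ denotes vector concatenation and $\odot$ denotes element-wise multiplication.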

The BiGRU [37] network is a two-layer GRU network consisting of a combination of forward and reverse GRU layers as shown in Figure 7.
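The combination of forward and reverse GRU layers can be sketched in NumPy as follows. The cell uses a common bias-free GRU formulation, which is an assumption, and the weights are random placeholders rather than trained parameters.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU update without bias terms (an assumed, commonly used form)."""
    concat = np.concatenate([h_prev, x_t])
    r = sigmoid(W_r @ concat)  # reset gate
    z = sigmoid(W_z @ concat)  # update gate
    h_cand = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]))  # candidate state
    return (1.0 - z) * h_prev + z * h_cand

def run_gru(xs, params, hidden):
    """Run a GRU over xs (time, features) and return all hidden states."""
    h = np.zeros(hidden)
    states = []
    for x_t in xs:
        h = gru_step(x_t, h, *params)
        states.append(h)
    return np.stack(states)

def bigru(xs, fwd_params, bwd_params, hidden):
    """Concatenate the forward states with the time-aligned backward states."""
    forward = run_gru(xs, fwd_params, hidden)
    backward = run_gru(xs[::-1], bwd_params, hidden)[::-1]  # flip back to the original time order
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(0)
hidden, features, steps = 3, 5, 6
fwd = tuple(rng.normal(scale=0.1, size=(hidden, hidden + features)) for _ in range(3))
bwd = tuple(rng.normal(scale=0.1, size=(hidden, hidden + features)) for _ in range(3))
xs = rng.normal(size=(steps, features))
out = bigru(xs, fwd, bwd, hidden)  # shape (steps, 2 * hidden)
```

At every time step, the first half of the output summarises the past and the second half summarises the future, so each step sees context from both directions.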

3.4. DSA-TCN-BiGRU Model

Insufficient extraction of the spatial and temporal features of trajectory data makes trajectory prediction difficult. Accordingly, we propose the DSA-TCN-BiGRU trajectory prediction model shown in Figure 8. The model uses the SATCN layer as a feature optimiser, as it can effectively extract hidden spatial feature information and temporal relationships and reduce redundant features. Unlike a simple CNN, the SATCN layer provides a memory function, and because its backpropagation path differs from the temporal direction of the sequence, it avoids the gradient problem. In addition, large amounts of data can be processed in parallel, thereby reducing the training time. The BiGRU model performs bidirectional time-series feature extraction to further improve the global integrity of the extracted time-series features. The self-attention mechanism can handle longer time-series data, strengthen long-range dependencies, and adjust the dynamic weights of the acquired sequence information to highlight the globally important points of the trajectory information.

4. Experiments and Results

4.1. Dataset

The data preprocessing and the training and prediction of the network were based on the TensorFlow 2.0 deep learning framework. The experimental dataset was selected from 12 months of historical ADS-B trajectory data for one flight from Beijing Daxing Airport to Shanghai Hongqiao Airport, with an average flight time of approximately two hours. Lateral and vertical profiles of some of the trajectories are shown in Figures 9 and 10. For model training, 80% of the data were used as the training set, and the rest were used for model testing.

4.2. Evaluation Metrics

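To quantitatively evaluate the performance of the proposed DSA-TCN-BiGRU network prediction model, the root mean square error (RMSE) and mean absolute error (MAE) were used as error evaluation metrics. The equations are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

Here, $y_i$ and $\hat{y}_i$ represent the true and predicted values, respectively, and $n$ is the number of samples. Both metrics indicate the stability and accuracy of the model; the smaller the value, the higher the accuracy of the model.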
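Both metrics can be computed in a few lines, as in the following sketch. The sample values are hypothetical and chosen so that the results can be checked by hand.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical values: three exact predictions and one error of 4.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 8.0]
example_rmse = rmse(y_true, y_pred)  # sqrt(16 / 4) = 2.0
example_mae = mae(y_true, y_pred)    # 4 / 4 = 1.0
```

The example shows that the RMSE penalises a single large error more heavily than the MAE does, which is why the two metrics are reported together.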

4.3. Determination of Model Parameters
4.3.1. Bayesian Hyperparameter Optimisation

Bayesian optimisation (BO) [38] uses Bayes’ theorem to exploit the information obtained from previous evaluations of the objective function, and then selects the next most promising sampling point based on the posterior distribution of the objective function. It can find the global optimum of a function with fewer evaluations and also achieves good results for black-box functions, making it suitable for the hyperparameter optimisation of deep learning models [39]. The hyperparameter optimisation process is shown in Figure 11.

(1) Parameter optimisation: the neural network parameters to be optimised include the number of residual blocks, the number of convolution kernels, and the step size of the one-layer TCN; the numbers of neurons in the two-layer BiGRU; the number of neurons in the Dense layer; and the dropout rate, learning rate, batch size, and time step size.

(2) Objective function: the mean squared error (MSE),

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ and $\hat{y}_i$ represent the true and predicted values, respectively, and $n$ is the number of samples.

(3) Search ranges for the Bayesian hyperparameter optimisation: a range is set for each of the following ten hyperparameters: the number of TCN residual blocks, the number of convolution kernels in each TCN residual block, the convolution kernel step of each TCN residual block, the numbers of neurons in the two BiGRU layers, the number of neurons in the Dense layer, the dropout rate, the learning rate, the batch size, and the time step size.

(4) BO parameters: the probabilistic surrogate model is a Gaussian process (GP); the acquisition function is the upper confidence bound (UCB); the number of iterations is 100.
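The GP-plus-UCB search loop can be illustrated on a toy one-dimensional problem, as sketched below. This is not the optimisation code used in the study: the RBF kernel, its length scale, the grid of candidate points, the value of `kappa`, and the toy loss (a stand-in for the validation MSE as a function of one hyperparameter) are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_cand."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_cand)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def bayes_opt(objective, low, high, n_init=3, n_iter=15, kappa=2.0, seed=0):
    """Minimise `objective` on [low, high] with a GP surrogate and UCB acquisition."""
    rng = np.random.default_rng(seed)
    x_cand = np.linspace(low, high, 201)
    x_obs = rng.uniform(low, high, size=n_init)
    losses = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        # UCB maximises, so the GP is fitted to the negated, standardised loss.
        y = -(losses - losses.mean()) / (losses.std() + 1e-12)
        mean, std = gp_posterior(x_obs, y, x_cand)
        x_next = x_cand[np.argmax(mean + kappa * std)]
        x_obs = np.append(x_obs, x_next)
        losses = np.append(losses, objective(x_next))
    best = int(np.argmin(losses))
    return x_obs[best], losses[best], len(losses)

def toy_loss(log_lr):
    """Hypothetical validation loss with its minimum at log_lr = -3."""
    return 0.1 * (log_lr + 3.0) ** 2 + 0.01

best_x, best_loss, n_evals = bayes_opt(toy_loss, -5.0, -1.0)
```

In a real search, each call to the objective would train the network with one hyperparameter setting and return its validation MSE, which is why keeping the number of evaluations small matters.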

According to Figure 12, the loss function of the DSA-TCN-BiGRU model tends to zero after the hyperparameter search by BO, confirming the effectiveness of the hyperparameter search using BO. The final model parameters are listed in Table 1.

4.3.2. Optimiser Selection

The optimiser directly affects the speed of parameter optimisation and the model accuracy, and selecting a suitable optimiser can significantly improve the model training efficiency. In this study, the SGDM, RMSprop, and Adam optimisers were selected for comparison experiments, and the other parameters were kept the same as those mentioned earlier for the model training. The RMSE values of the validation set under the different optimisers are shown in Figure 13. Both the RMSprop and Adam optimisers have high stability and accuracy. The accuracy of the RMSprop optimiser is approximately the same as that of the Adam optimiser; however, the training time consumed by the Adam optimiser is much lower than that of the RMSprop optimiser. The Adam optimiser was therefore chosen based on a comprehensive consideration of the model accuracy and time efficiency.

4.4. Experiments and Comparisons

To verify the accuracy of the DSA-TCN-BiGRU network for single-step prediction, the same dataset and simulation parameters were used to compare it with existing trajectory prediction models (BiGRU, GRU, TCN, SATCN, CNN-BiGRU, DSA-CNN-BiGRU, and TCN-BiGRU). The prediction results for each model are shown in Table 2 and Figure 14. To further illustrate the prediction accuracy of the proposed model, 2D and 3D zoomed-in plots are shown in Figures 15 and 16.

The aforementioned prediction results show that the BO-based DSA-TCN-BiGRU model performs very well and reproduces the actual flight of the aircraft in its entirety. The predicted 2D and 3D trajectory curves of the model are in good agreement with the actual flight trajectory of the aircraft. The altitude, latitude, and longitude RMSEs reach minima of 20.14 m, 0.004°, and 0.009°, respectively. In summary, the proposed model can fully extract the spatiotemporal characteristics of the trajectory data with a better generalisation effect and lower prediction error.

4.5. Performance Analysis
4.5.1. Self-Attentive Mechanism and TCN Performance

To compare the effects of the self-attentive mechanism and the TCN with those of the conventional CNN in trajectory prediction, the following model comparison groups were introduced: DSA-TCN-BiGRU vs. CNN-BiGRU, SATCN vs. TCN, and TCN-BiGRU vs. CNN-BiGRU. The results are shown in Figure 17. Among these three groups of models, the DSA-TCN-BiGRU has the best prediction performance and lowest RMSE, with an altitude RMSE of 20.14 m, a latitude RMSE of 0.004°, and a longitude RMSE of 0.009°.

Compared with the CNN-BiGRU, the DSA-TCN-BiGRU shows RMSE improvements of 85.71%, 92.86%, and 76.34% for the longitude, latitude, and altitude, respectively. Compared with the TCN, the RMSE improvement rates of the SATCN for the longitude, latitude, and altitude are 21.88%, 35.48%, and 59.52%, respectively. Compared with the CNN-BiGRU, the RMSE improvement rates of the TCN-BiGRU for the longitude, latitude, and altitude are 77.78%, 85.71%, and 66.76%, respectively. We can conclude that the hybrid model using the self-attentive mechanism and TCN has higher accuracy. The TCN can more effectively explore the potential features of the trajectory information, and the self-attentive mechanism focuses on the information that is more relevant to the results. By modelling sequence data of variable length, the self-attentive mechanism further enhances the ability to capture long-range dependencies. In summary, the DSA-TCN-BiGRU can extract trajectory features from data more effectively, which is important for improving the accuracy of short-term trajectory predictions.
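An improvement rate of this kind is usually the relative reduction in RMSE with respect to the baseline model. The sketch below assumes that definition, which the text does not state explicitly, and uses hypothetical RMSE values rather than values from the tables.

```python
def improvement_rate(baseline_rmse, improved_rmse):
    """Relative RMSE reduction, in percent, of a model over a baseline."""
    if baseline_rmse <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (baseline_rmse - improved_rmse) / baseline_rmse

# Hypothetical values: a baseline RMSE of 0.5 reduced to 0.1.
rate = improvement_rate(0.5, 0.1)
```

With these hypothetical inputs, the improved model removes 80% of the baseline error.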

4.5.2. BiGRU Performance

To verify the advantages of using the BiGRU network, the DSA-TCN network was combined in turn with the BiGRU, BiLSTM, GRU, and LSTM networks, and the resulting models were compared. As shown in Table 3, the DSA-TCN-BiGRU and DSA-TCN-BiLSTM models have comparable performance with all hyperparameters tuned, and both have lower RMSEs than the models combined with the GRU and LSTM networks. This verifies that the introduction of a bidirectional mining mechanism can effectively improve the prediction accuracy. For a single-step prediction, the DSA-TCN-GRU has the shortest time of 40.1 ms, and the DSA-TCN-BiLSTM has the longest time of 60.2 ms. The BiGRU model is a special type of RNN with the advantages of fast convergence and fewer parameters compared with an LSTM. Thus, it simplifies the structure and improves the operation speed while guaranteeing the prediction performance.

4.5.3. Multistep Prediction

The aforementioned comparative analysis shows the results of the model’s single-step forecasts. This section provides statistics on the mean error values of each feature for single-step and multistep forecasts. In addition, the single-step and multistep forecasts are quantitatively compared using the RMSE evaluation metric. Table 4 shows the results of the DSA-TCN-BiGRU model for predicting s future steps, and Figure 18 shows the multistep prediction level line. As the number of prediction steps increases, the error remains within a certain range. This verifies that the proposed DSA-TCN-BiGRU model still performs well in multistep prediction and can effectively improve the accuracy of trajectory prediction and meet practical requirements.

4.6. Further Study

We chose another flight along the same route to validate the robustness of the proposed DSA-TCN-BiGRU model. The dataset contained data from the period 2021–2022. The prediction results and evaluation metrics comparing the BO-based DSA-TCN-BiGRU model with other models (DSA-CNN-BiGRU, CNN-BiGRU, TCN-BiGRU, SATCN, TCN, BiGRU, and GRU) are shown in Figures 19 and 20, respectively.

Our proposed DSA-TCN-BiGRU model has the lowest MAE and RMSE for the new dataset, and the fitted curve is closest to the actual route. This verifies that the DSA-TCN-BiGRU model based on BO has good robustness and accuracy for different datasets.

5. Conclusion

In this paper, we propose a DSA-TCN-BiGRU-based trajectory prediction method that fuses temporal convolutional networks and bidirectional gated recurrent networks and introduces an attention mechanism and a Bayesian hyperparameter optimisation algorithm. This combines the advantages of the TCN in extracting time-series features with the BiGRU’s ability to learn from the conditions both before and after each point in a sequence, so as to better extract the spatial and temporal features of trajectories while reducing the time complexity. Subsequently, an attention mechanism is introduced to assign different weights to the attributes, and the Bayesian optimisation algorithm is used to optimise the model hyperparameters. We compared the DSA-TCN-BiGRU model with typical neural networks and obtained the lowest RMSEs of 0.004°, 0.009°, and 20.14 m for the latitude, longitude, and altitude, respectively, demonstrating considerable improvements in aircraft trajectory prediction accuracy. In addition, we verified the model’s robustness and predictive performance using different datasets and multistep predictions. The proposed model assists controllers in decision-making by providing accurate information on aircraft trajectories over a period of time, identifying abnormal behaviour such as altitude anomalies and trajectory deviations, and alerting controllers to potential abnormal behaviour, so that they are informed in advance of the aircraft operating situation on the route and in the terminal area. As a result, using the model can reduce the controller’s workload and improve air traffic safety. Trajectory data are also influenced by factors such as meteorology and control. In the future, our approach will be to fuse weather, control, and other uncertainties. AMDAR can provide real-time weather data for aircraft, with the same real-time performance as ADS-B, and is the preferred choice for weather data fusion, considering the time delays associated with data fusion and forecasting. In addition, we can develop a trajectory data visualisation system that further explores the potential value of trajectory data by integrating the trajectory prediction module and displaying the results visually on a map, which will significantly improve ATC performance and predictive capability.

Data Availability

The ADS-B data used in this paper are from the publicly available dataset of Flightradar24 Technologies, Inc. (https://www.flightradar24.com/data).

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was cosupported by the CAAC Vertical Project (Project No. 0242109) and (Project No. 180Z0550), the Civil Aviation Air Traffic Management Authority Horizontal Project (Project No. H2021-14) and (Project No. H2021-57), and the Institute of New Technologies for Civil Aviation Communications Navigation Surveillance (Project No. JG2022-20).