#### Abstract

In this paper, a novel approach to solving the formation online collaborative trajectory planning for fixed-wing Unmanned Combat Aerial Vehicles (UCAVs) is proposed. In order to describe the problem, the formation attack process which consists of communication framework and synergy elements is analyzed. The collaborative trajectory planning model which is based on avoiding the threat zones, reducing the execution time, and accomplishing the mission combines kinematics/dynamics model of UCAV with formation relative motion model to establish the optimal control problem. The approach based on* hp* adaptive pseudospectral method is presented to generate formation trajectory that satisfies the collaborative constraints. When a trigger event is detected, based on the offline planning, the online collaborative trajectory replanning using rolling horizon strategy is carried out. Simulated experiments which are divided into offline scenarios and online scenarios demonstrate that the proposed approach can generate trajectories which can meet the actual flight constraints, and the results verify the feasibility and stability of the proposed approach.

#### 1. Introduction

The research and development activities on applications of Unmanned Aerial Vehicle (UAV) have motivated a significant increase in both civilian and military areas [1]. The main topic in military fields focuses on reconnaissance, surveillance, and attack of Unmanned Combat Aerial Vehicle (UCAV), which will be capable of air combat with the development of artificial intelligence in the future [2]. As the battlefield environment is increasingly complicated, it is difficult to deal with emergencies and changeful environment for the single UCAV [3]. However, compared with single UCAV, the formation with the capacity complementarity and operations coordinating can improve the quality of completing the missions, shorten the execution time, and reduce the risk of military operations. In the process of actual combat mission, the battlefield situation and mission targets are continuously changed over time, which makes UCAVs collaborative operation more complicated.

At present, as one of the key technologies of collaborative operation, UCAV formation collaborative trajectory planning has become a hot research spot all over the world. The problem of trajectory planning with specific mission for UAVs has been considered in several papers. In [4], according to the characteristic of radio communication path loss constraints, the trajectory planning method based on distributed model predictive control is proposed. In [5], a three-dimensional path planning for UAVs using fast marching method is provided for generating paths free from obstacles by adjusting parameters. In [6], trajectory planning algorithm for fixed-wing communication relay UAVs in urban environment is presented by considering dynamic constraints. The UCAV formation collaborative trajectory planning for air-to-ground attack missions and avoiding the radar detection and antiaircraft missile is mainly focused on in military field [7].

Many works on trajectory planning algorithms have been published. Aimed at the problem of collaborative trajectory planning, in [8], a three-dimensional multi-UCAV collaborative trajectory planning model is established by adopting the spatial fuzzy set to indicate the planning space. In [9], a path planning method based on collaborative track nondominant sequencing evolutionary algorithm is proposed. With the development of heuristic stochastic optimization algorithm, many intelligent algorithms that possess fast convergence rate and high precision are used to solve the UCAVs path planning problem. In [10], by introducing the cooperative evolutionary strategy under the multiconstraint conditions of ant subgroups, the improved ant colony algorithm is used to solve the UCAVs multibatch collaborative three-dimensional flight path planning. In [11], the genetic algorithm is introduced to solve path planning of multi-UCAV system. In [12], the self-adaptive *α*-constrained virus colony search based on opposition-based learning is proposed for path planning. In addition to intelligent optimization algorithms, other different techniques of path planning for UCAVs have been studied in many researches: machine learning [13], rapidly exploring random trees [14, 15], core paths graph [16], A search algorithms [17], D search algorithms [18], and probabilistic roadmap method [19]. However, these path planning methods do not take the platform movement characteristics of Unmanned Aerial Vehicle into account [20].

In practice, due to the lack of control variable, the paths which are planned by above algorithms and UCAVs path planners are unenforceable. In order to solve the problem of trajectory planning with UCAV motion characteristics, in [21], adaptive pseudospectral method is adopted and the low detectable attack trajectory of UCAV based on this method is simulated. In [22], a convex programming method based on penalty function sequence is proposed for the convex and discrete multi-UCAV trajectory planning. In [23], through considering the change of speed, climb angle and heading angle, genetic algorithm (GA), and particle swarm optimization (PSO) are applied to solve the three-dimensional flight path planning for UAV on a low altitude terrain following/terrain avoidance mission. In [24], a detailed 3-DOF point-mass model is given to multiple UAVs cooperative trajectory planning using distributed receding horizon control. As a new type of direct method, the pseudospectral method is the search hot spot which most people pay attention to in twenty years. Compared with the traditional direct method, the pseudospectral method takes advantages of the size in the domain of convergence and convergence speed [25]. Because of the limitation of selecting point, Gauss pseudospectral method (GPM) can make an impact on the precision of the solution. In [21], the method is only applied to the single UCAV trajectory planning. In [26], the direct trajectory optimization method based on a variable low-order adaptive pseudospectral method is proposed. In [27], the total penetration trajectory planning process is separated into multiple phases, which is solved by using GPM as optimal control problem.

Due to the changes of the battlefield environment, mission demands, and task information in the implementation process of the actual UCAV collaborative attack task, the current trajectory cannot satisfy the constraints of the real-time environment [28, 29]. In [30], Gauss pseudospectral method with the rolling time domain strategy is adopted to solve the problem of UCAV online attack trajectory planning. In [31], the trajectory replanning method based on rapidly exploring random trees (RRT) is presented. In [32], in order to ensure the safety and reliability of multiple UAVs’ mission, a collision avoidance method based on Legendre pseudospectral method (LPM) with a rolling horizon policy in dynamic environment is proposed. In [33], aimed at drawbacks of the path planning process that cannot meet the timeliness, online collaborative path planning with a view to dynamic task allocation based on chaotic local search grey wolf optimization is proposed. Though the method can solve timeliness problem, the paths planned by the approach cannot meet the maneuverability of UCAVs flight. However, the difference between path planning based on optimization algorithm and trajectory planning based on pseudospectral method is that the trajectories planned by pseudospectral method employed in this paper can satisfy the flight need. Meanwhile, the collaborative trajectory planning model in correlative papers has all or some of the following disadvantages: (i) the changes of environment and tasks are not considered [34]; (ii) relative motion characteristic is not considered [35]; (iii) UCAV kinematics/dynamics model is not considered [36].

This paper addresses the problem of UCAV formation online collaborative trajectory planning using* hp* adaptive pseudospectral method and rolling horizon strategy in dynamic environments during attack process. The contributions of this paper are as follows: the dynamic environments are considered for online trajectory replanning; the relative motion characteristic is denoted in the UCAV formation model; a specific UCAV kinematics/dynamics model with wind power is given. The online collaborative trajectory replanning is based on the offline planning. When a trigger event is detected, the online replanning based on* hp *pseudospectral method in a certain rolling time domain is carried out.

The research of this paper is organized as follows. Section 2 presents the analysis of UCAV formation attack process, which includes the communication of formation trajectory planning and analysis of UCAV formation synergy elements. The model of formation trajectory planning which consists of kinematics/dynamics model of UCAV and formation relative motion model is introduced in Section 3. The proposed approach to solving the problem is denoted in Section 4. Section 5 presents the simulation results of offline planning and online replanning. Finally, the conclusions are presented in Section 6.

#### 2. Analysis of UCAV Formation Attack Process

UCAV formation attack trajectory planning refers to designing the trajectories which are from the starting point to the target point according to flight and attack mission requirements. These trajectories are overall optimal flight trajectories, which make UCAV formation leave the battlefield, make comprehensive cost minimum, and satisfy the formation maneuver performance constraints [37]. There may be many factors in actual formation attack process: (i) the change of battlefield environment, such as new-found threats; (ii) the dynamics of task targets, such as target position maneuver and goals’ addition and deletion; (iii) other factors, such as UCAV system errors and wind impact [38]. Faced with the above situation, UCAV formation trajectory needs to be replanned or adjusted locally in real time. The UCAV formation attack process is given as shown in Figure 1.

The task coordination of the formation includes not only the time coordination, but also the space coordination that consists of the attitude, speed, and height of each UCAV [39]. The UCAV formation collaborative elements’ analysis is given as shown in Figure 2. This paper assumes that the formation is composed of two UCAVs, which are the leader and the follower. The following procedure describes how the communication of formation trajectory planning works:

Task allocation. After confirming the mission target information, according to the specific task information and task structure, the general control Agent assigns the tasks to each UCAV.

Leader trajectory planning. According to assigned task information, perceived environment and relative position, speeds, and angles between two UCAVs, the leader Agent exports the optimal trajectory by trajectory planning module and then sends the attack leader’s trajectory parameters to the general control Agent and the follower Agent.

Follower trajectory planning. According to the attack trajectory parameters of leader UCAV, the follower Agent obtains the optimal task execution trajectory. As part of the formation, the follower UCAV can carry out the attack task itself or cooperate with the leader UCAV to carry out the task.

#### 3. Model of Formation Trajectory Planning

##### 3.1. Definition of Reference Coordinate System

There are many kinds of reference coordinate systems, such as ground coordinate system, body coordinate system, and speed coordinate system [40]. In addition to the ground coordinate system, this paper defines the trajectory coordinate system of UCAV’s centroid motion as reference coordinate for formation. The definition is as follows.

*Definition 1. *The origin is UCAV’s centroid; the direction is consistent with the speed direction of UCAV. The direction is in the plumb plane containing the flying speed direction, which is perpendicular to and points upward. is perpendicular to the plane and points right. The motion of UCAV’s center of mass that is relative to the ground coordinate system is determined by the angle of track inclination , yaw angle , velocity roll angle , and speed . The reference coordinate system is as shown in Figure 3 [41].

##### 3.2. Kinematics/Dynamics Model of UCAV

The paper adopts the centroid motion model of UCAV. Meanwhile, the dynamic model which considers wind power is used [42]. The kinematics/dynamics model of UCAV is obtained as

where is the position of UCAV in the ground coordinate system; is true airspeed for UCAV; and are the components of wind speed and wind acceleration along each axis; is angle of attack; is sideslip angle; is mass of UCAV; is acceleration of gravity; is projection of the axis of the rotation angular velocity in the air flow coordinate system; is lateral force; is thrust; is drag; is lift; is fuel consumption coefficient. Meanwhile, the thrust, drag, and lift are, respectively, calculated aswhere is throttle position; is the maximum available thrust; is air density; is the cross-sectional area of UCAV; is the resistance coefficient; is the lift coefficient.

According to flight envelope constraints which are mentioned in [43], the constraint conditions for UCAV are established aswhere which is an intermediate variable represents the term ; is the maximum angle change rate in the pitch direction; is the maximum angle change rate in the rolling direction; is the maximum lift coefficient; and are the overload in the loader direction and the maximum direction overload; is maximum dynamic pressure head; is minimum safe height.

##### 3.3. Formation Relative Motion Model

The important point in UCAV formation attack process is relative distance between leader and follower. In Figure 4, the relationship between the leader and the follower in position vector is represented [44]. The relative motion model for UCAV formation is shown aswhere the subscripts and are, respectively, represented as the leader and the follower; and , respectively, represent the relative distance and distance change rate between the leader and the follower. The term is the deviation of yaw angle between the leader and the follower.

The kinematics/dynamics model of UCAV formation can be obtained through the combination of kinematics/dynamics model of single UCAV which is represented as formula (1) and formula (2) and formation relative motion model which is denoted as formula (7).

##### 3.4. Objective Function

UCAV formation trajectory planning should not only consider the trajectory planning constraints of the single UCAV, but also the space coordination constraints and time coordination constraints between UCAVs. In the UCAV formation attack process, one of the most commonly used objective functions to evaluate the planning trajectories of the UCAV formation is to minimize number of collisions between UCAVs with minimizing collaborative time and minimizing threat cost.

For the space constraints of UCAV formation, the collision constraint is considered in this paper, and the objective function is established aswhere represents the collision objective function; represents the equivalent collision. When the distance between the leader and the follower is longer than the safe relative distance , the value of is 0; otherwise it is 1.

For the collaborative time constraint, the objective function is established aswhere represents the time objective function; is the time during which UCAV arrives at a specified location. is the instruction time during which UCAV arrives at a specified location.

During the mission, UCAV can be threatened by ground radar and antiaircraft fire. For the threat model of UCAV, the threat vector model is adopted. In order to calculate simply, the paper denotes the concept of threat vector.

*Definition 2. *The vector of threats at different locations represents threat vector for UCAV. For the radar threat, the detection probability of UCAV in a certain position is related to the attitude and distance which are currently relative to radar. For antiaircraft fire, the damaged probability of UCAV in a certain position is connected with current distance from antiaircraft fire to UCAV and flight speed. Threat value is calculated by the type of specific threats. Because modern antiaircraft fire is composed of low level antiaircraft missile and air defense radar, the paper incorporates the radar and antiaircraft missile as joint air defense system. Joint air defense system detection probability is shown aswhere represents the value of the detection probability of the joint air defense system; is the detection probability of the subsystem [42].

In the trajectory planning process, the threat cost is calculated from starting point to ending point. For the threat model of UCAV, the threat vector model is adopted, and the threat objective function is denoted aswhere is the threat cost value.

Based on the analysis of the above objective function and the actual battlefield needs, the integrated objective function is given aswhere is the integrated objective function value. is the terminal time of completing the task for the formation. is the task initial time for the formation. , and , , , and are the weight coefficients of time, threat, equivalent collision frequency, and instruction time error, respectively.

#### 4. Online Trajectory Planning Method

##### 4.1. Online Planning Strategy

The problem of UCAV trajectory planning in uncertain environment is very complex. In addition to considering the changing environment, the problem should consider the algorithm with high convergence accuracy and fast calculation speed. In order to solve the problem, the paper proposes an online trajectory planning method based on* hp* adaptive pseudospectral method. Firstly, UCAV formation offline trajectory for global space is planned. Then, according to the real-time task environment information, the local trajectory is solved by using* hp* adaptive pseudospectral method with less Gauss nodes. The process of method is presented as shown in Figure 5.

###### 4.1.1. Rolling Horizon Strategy

Considering the complexity and dynamics of battlefield environment, offline global planning method cannot meet the real-time demand in attack process. The paper uses Rolling Horizon Control (RHC) strategy for UCAV formation real-time trajectory planning [45]. Firstly, the time length of the initial planning is set as , and the time domain can be divided into phases. In the process of executing current trajectory, entire phase is not all carried out, but partial phase is executed, whose time length is the execution time length . When the current trajectory phase is carried, the planner starts to plan next time domain. Before UCAV formation reaches terminal position of current time domain, new time domain is sent to the execution unit. During the new time domain, planning is repeated, and Figure 6 shows the rolling horizon policy. In general, if the time domain length is longer, global optimal solution performs better. However, the time domain must be limited within the scope of onboard computer processing competence. Meanwhile, the shorter time domain could lead to execution timeliness of trajectory.

###### 4.1.2. Trajectory Replanning

Online trajectory planning strategy of the whole task process is divided into a serial of domains which iterate each other and keep moving towards the terminal position. And according to the changes of the mission environment, the local trajectory must be adjusted or replanned. The following definitions denote trajectory replanning factors.

*Definition 3. *The trigger event of trajectory replanning defined in the paper is sudden threat and the changes of mission targets, which include their maneuvering and number increasing and decreasing. When sudden threats and the changes of target’s positions and target’s number are detected in a certain position during a certain time domain, the planner of UCAV formation receives the information and starts to replan trajectory in a certain phase. The method how the phase is ensured is denoted as next definition.

*Definition 4. *For ensuring the replanning phase, it is very important to design the replanning region size, the starting position, and the terminal position of replanning trajectory.

The selection method of replanning region size must follow the principle that UCAV member should be far from threats with a certain distance as far as possible. The paper assumes that the action radius of threat is and safe distance between threat and UCAV member is . The replanning region length is denoted as follows.

In addition to ensuring the length of replanning region, the starting position and the terminal position must be given to ascertain the replanning region. The paper assumes that the threat position is , trajectory replanning calculating time is , UCAV current position is , and flight speed is during rolling horizon domain time:where is the starting position and is the terminal position.

According to the above analysis of rolling horizon strategy and replanning region, the online trajectory planning steps of UCAV formation are shown as follows.

*Step 1. *According to the data information uploaded by the integrated sensor in real time, the planner estimates whether the trigger event of trajectory replanning occurs. If the event occurs, turn Step 2; otherwise, continue to execute the offline trajectory and repeat Step 1.

*Step 2. *Based on sudden threats and the changes of mission targets, the replanning region length, the starting position, and terminal position are quickly determined. Then, the trajectory replanning is solved by* hp* adaptive pseudospectral method; turn Step 3.

*Step 3. *UCAV formation executes replanning trajectory mission until the optimization conditions are satisfied; otherwise turn Step 1.

##### 4.2. Hp Adaptive Pseudospectral Method

When planner calculates the offline trajectory of UCAV formation, the model based on* hp* adaptive pseudospectral method is adopted. Meanwhile, in the time domain when the trigger event of replanning trajectory is detected, the replanning trajectory using* hp* adaptive pseudospectral method is executed. According to the above analysis of UCAV formation trajectory planning, the optimal control model is described aswhere is Mayer index function in Bolza problem, which can represent terminal constraints cost; is Lagrange index function, which denotes state cost; index function can consist of formula (12); subscripts 0 and represent, respectively, planning initial time and terminal time; formula (17) can consist of (1), (2), and (7); formula (18) can be represented by trajectory constraint set (6); formula (19) denotes boundary value constraints.

Because the trajectory planning model belongs to the nonlinear optimal control model, it is difficult to solve the analytical solution. At present, there are two main numerical methods: indirect method and direct method [46]. The solving accuracy of the indirect method is very high, but its convergence radius is small. In addition to small convergence radius, due to the complexity of the formula derivation and the difficulty of boundary value problem, the indirect method has been rarely used. On the contrary, because the results obtained by the direct method can satisfy the KKT conditions without calculating the costate variables, most of the researchers all over the world tend to learn and use direct method, but the accuracy of direct method is low. According to difference of discrete object, the direct method includes discrete state variables and discrete control variables. The pseudospectral method belongs to the type where state variables and control variables are discrete.

Pseudospectral method uses a set of orthogonal polynomial functions as the primary functions to approximate state variables. Firstly, the time interval is divided into a serial of smaller time intervals, and a certain number of collocation points are given. Then, according to the time intervals and collocation points, the state variables are discretized, which constitutes a nonlinear programming problem. Finally, the problem can be solved by quadratic sequential programming.

In order to increase the accuracy and computational efficiency of pseudospectral method, the paper adopts* hp* adaptive pseudospectral method. In this section, a nonlinear programming model is presented, and then* hp* adaptive pseudospectral strategy is proposed.

###### 4.2.1. *Hp* Adaptive Pseudospectral Strategy

Direct collocation method uses a set of primary functions to approximate state equation at a specific time interval and certain collocation points. Most of direct collocation methods adopt* h* method, which uses the fixed order for state estimation and divides problem into multisegment. By increasing the number of time intervals, the convergence of the numerical discretization is guaranteed. In the region with the maximum trajectory error, through increasing the number of time intervals, the result can be guaranteed to keep in a certain range of precision.

In recent years, the pseudospectral method has become the main method to solve the problem of optimal control. This method gives collocation points based on the accurate integral rules and typical basis functions such as Chebyshev polynomial or Lagrange interpolation polynomial. In contrast to* h* method, the pseudospectral method uses* p* method, which converges by increasing the order number of polynomials in a time interval. However, there are some drawbacks: (i) when nonsmooth problem is handled by even using the high-order polynomial, the approximation effectiveness is not ideal, and the convergence rate of* p* method is very low; (ii) when using high-order global interpolation polynomial, the dimension increases rapidly.

In order to improve the accuracy and computational efficiency of the pseudospectral method, a new method, namely,* hp* adaptive pseudospectral, is proposed by combining* h* method and* p* method [47]. In the method, according to the need of the algorithm and the number and the width of the time interval, the polynomial orders in each time interval are adaptively determined. Whether to increase the number of time intervals or the polynomial order is determined by the relative curvature of each time interval. If the relative curvature is large enough, the current time interval is subdivided; otherwise, the polynomial order of current time interval is increased.

###### 4.2.2. Method Procedure

According to the analysis of above* hp* adaptive pseudospectral strategy, the specific procedure for solving nonlinear optimal control problems based on adaptive pseudospectral method is as follows.

Assuming that the current time is , the current time domain is divided into time intervals, whose cut-points are . In each time interval, the transformation is performed aswhere and are, respectively, starting point and terminal point of the time interval . By (20), is converted into , and the derivation of is given as follows.

Based on formula (21), the objective function of nonlinear optimal control problem can be presented as follows.

Likewise, the state equations, path constraints, and boundary value constraints are, respectively, represented as follows.

In order to keep states continuous, the term is established, where .

After the collocation points are determined, state variables and control variables can be approximated by interpolation polynomial. For continuous Bolza problem, state variables of each time interval are approximately shown aswhere belongs to the scope . is the basis of Legendre polynomial, and . is the Legendre-Gauss-Radau (LGR) collocation points set in the time interval . is 1, which is terminal point and not a collocation point. Similarly, control variables are approximately represented as follows.

In this step, nonlinear optimal control problem is discretized. Firstly, the state equation is discretized as follows.

In a collocation point, the state equation can be approximately obtained aswhere ; is Radau pseudospectral differential matrix in a time interval, whose order is .

Similarly, the discretization of the path constraints can be denoted as follows.The discretization of the boundary value constraints can be expressed as follows.

In order to ensure the continuity of the time interval, the common endpoints of adjacent time interval should be equal.

The objective function discretization is expressed as follows.where represents the Legendre polynomial coefficient.

Through the above steps, the continuous nonlinear optimal control problem is converted into a nonlinear programming problem, which can be solved by nonlinear programming method to obtain approximate solution of optimal control problem.* Hp* adaptive pseudospectral method can modify the number or the distribution of collocation points by judging the relative curvature criterion. Meanwhile, the method can also increase the polynomial order or subdivide the current time interval.

* Hp *adaptive pseudospectral updating mechanism is designed as follows.

In a collocation point of a time interval, the error between state differential equation and its discretization polynomial and the error between the trajectory constraint condition and its discretization constraint condition are compared:where these terms which include , , and really hold, in which is the number of state differential equations and is the number of constraint equations. If each element of and is greater than permitted deviation threshold value , the current time interval can be subdivided or the order of polynomial can be increased [41].

#### 5. Simulations and Analysis

Having taken the start and goal points of UCAV formation collaborative trajectory, the approach based on* hp *adaptive pseudospectral method in the simulated environment plans the trajectory with multiconstraint. Firstly, the simulation environment which includes operational environment and computer environment is established. Then, two experiments have been carried out, where the changeable threats and missions are used to verify the validity and stability of the approach. The first experiment analyzes how the offline trajectory is generated and verifies that the approach is stable. In the second experiment, the online trajectory planning is studied based on the previous experiment, and the simulation shows the method is feasible.

##### 5.1. Scenario of Mission Environment

The simulation experiments on cooperative trajectory planning of the classic two-UCAV formation are carried out, which can verify the effectiveness and stability of the approach. In the initial mission scenario, it is assumed that UCAV formation detects a target at the position (20, 20, 0) km, whose type is the ground fixed target. The current threats in this scenario are, respectively, radars and antiaircraft missiles, whose positions distribution is shown in Table 1, where SAM represents surface-to-air missile or antiaircraft missile. The simulation parameters of threats refer to [21]. The performance parameters of UCAV are shown in Table 2. The simulation parameters of the initial state and exiting battlefield state for UCAV formation are presented in Table 3. The UCAV weapon is a certain semiactive laser-guided missile, whose performance parameters refer to [42]. The weight coefficient of the objective function of UCAV formation cooperative trajectory planning model is set as .

The setting parameters of* hp* adaptive pseudospectral method are the same as those in [21]. In this paper, the number scope of nodes is [6–12]. The state variables of UCAV are , and the control variables are .

In this section, the approach described in the previous section is run under the toolbox pseudospectral method of Matlab 2014a, whose operating system is Windows 7 implemented under the processor Intel(R) Core(TM) i5-3470.

##### 5.2. Offline Scenarios

As a trajectory planner, the offline trajectory is firstly resolved before the attack process is performed. In order to verify the stability of UCAV formation collaborative trajectory planning, the simulations of the experiments from the points of different simulation scenarios and pseudospectral method parameters are carried out.

###### 5.2.1. Different Offline Simulation Scenarios

*(1) Case 1. *Based on the initial offline scenario environment, the first experiment for the classic two-UCAV formation collaborative trajectory planning is simulated to verify that* hp* adaptive pseudospectral method is valid. Figures 7, 8, 9, and 10 show the results of formation trajectory, its control variables, state variables, and relative distance between members.

**(a)**

**(b)**

**(a) change rate of attack angle**

**(b) change rate of sideslip angle**

**(c) change rate of roll angle**

**(d) throttle**

**(a) speed**

**(b) climb angle**

**(c) course angle**

**(d) mass change**

**(e) attack angle**

**(f) sideslip angle**

**(g) roll angle**

Firstly, Figure 7 presents, respectively, the 2-dimensional and 3-dimensional trajectories of UCAV formation, where the trajectory with blue line is leader trajectory and the trajectory with cyan line is follower trajectory. In this experiment, there are six threats that include two kinds of radars and two kinds of SAMs. The radars consist of Radar-A and Radar-B with different detection zones whose corresponding positions and performance parameters are given in Table 1 and [21], respectively. Meanwhile, the different detection zones are modeled as different dimension hemispheres shown in Figure 7. As with radars, due to different attack areas, SAMs which include SAM-A and SAM-B are simulated as different dimension attacking hemispheres, and their corresponding specific positions and performance parameters are shown in Table 1 and [21], respectively. Compared with the scale of trajectory planning space, the size of target which includes airport, building, and other important infrastructures is very small. In Figure 7, the target whose corresponding position and performance parameters are mentioned in previous section is simulated as a black point. As can be seen from Figure 7, UCAV formation does not pass through the threat zones and uses the height advantage to avoid the threat zone. According to the display of trajectory, the planned trajectory is smooth, which can conform to the feasibility of UCAV actual flight trajectory.

Then, the control variables and state variables changing curves of UCAV formation are, respectively, shown in Figures 8 and 9, where the variables are divided into leader’s variables and follower’s variables. As expected from Figure 8, in the whole trajectory planning process, the control variables which consist of the change rate of attack angle, the change rate of sideslip angle, the change rate of roll angle, and the throttle are within the ranges set in Table 2. The flight state variables include UCAV speed, climb angle, course angle, mass change, attack angle, sideslip angle, and roll angle. In Figure 9, all flight attitude variables meet the constraints of UCAV formation collaborative trajectory planning. As can be seen from Figure 8(d), the leader whose mass reduces drops the weapons in 60s to attack the target. Due to decreasing of leader’s mass, in order to keep the corresponding flight attitude stable, the control must be changed. Because of the first threat zone, in starting phase, the control variables are adjusted to avoid the threat zone and keep the formation stable.

Lastly, Figure 10 presents the change of the relative distance marked in blue solid line between UCAV formation’s members. In Figure 10, the blue dotted line represents the minimum safe distance which is 0.7km. It can be seen from the figure that the number of equivalent collisions in planned trajectory is zero, which can meet the condition of no collision. Meanwhile, with task execution time increasing, because the tasks of members are different, the relative distance between leader and follower increases firstly and decreases later. On the whole, the runtime of UCAV formation collaborative trajectory planning is 138.66s, and the objective function is 193.64.

*(2) Case 2. *In order to verify the stability of the approach about operational environment, the new mission environment is simulated. In this new mission, the UCAV formation attacks two targets, whose positions are, respectively, (15, 23, 0) km and (20, 18, 0) km. The type of two targets is the ground fixed target. The new threats are added into the original environment, whose distribution situation is shown in Table 4. The parameters of the new threats, the performance parameters of UCAV, and the UCAV weapons parameters are the same as the parameters of former experiment. At the same time, it is assumed that the UCAV weapons for attacking targets are enough. Through simulating the new experiment with new operational environment, where UCAV formation attacks two targets, the results are presented in Figures 11, 12, 13, and 14.

**(a)**

**(b)**

**(a) change rate of attack angle**

**(b) change rate of sideslip angle**

**(c) change rate of roll angle**

**(d) throttle**

**(a) speed**

**(b) climb angle**

**(c) course angle**

**(d) mass change**

**(e) attack angle**

**(f) sideslip angle**

**(g) roll angle**

Figure 11 shows the 2-dimensional and 3-dimensional trajectories of UCAV formation in the new environment. The new threats are marked with dotted circles in Figure 11(a) and dotted hemispheres in Figure 11(b). Meanwhile, targets and other threats are simulated with different marks as in case 1. As can be seen from the figures, the planned trajectory is relatively smooth and conforms to the perform ability of flight trajectory. According to the information reflected from the figures, the planned trajectory increases properly the length to avoid the threat zones.

The control variables and the state variables are, respectively, given in Figures 12 and 13. On the whole, in the process of total trajectory planning, the control variables and the state variables can meet the constraints of collaborative trajectory planning. Due to the addition of threat zones, in order to avoid the threat zones, the UCAV formation makes some maneuvers, and the control variables and state variables are not more stable than case 1. In Figure 12(d), it can be seen that the leader drops weapons to attack target 1 at 67.75s, and the follower drops the weapons to attack target 2 at 67.95s. Due to dropping of weapons, the UCAV mass decreases.

Figure 14 presents the relative distance variation curve marked in red between members of UCAV formation, which is bigger than blue minimum safe distance and can satisfy the constraint conditions of collision-free. On the whole, the runtime of UCAV formation collaborative trajectory planning is 194.17s, which is longer than case 1. And the objective function is 229.56, which is more than case 1. Because of the addition of threat zones, the addition of calculation can increase runtime and objective function value.

Based on the analysis of above two simulation experiments, the UCAV formation collaborative trajectory planning approach based on* hp* adaptive pseudospectral method is feasible and effective.

###### 5.2.2. Different Pseudospectral Method Parameters

In order to verify the stability of the approach, the offline trajectory planning approach based on* hp* adaptive pseudospectral method is stable under different operational environments and, in addition, a serial of simulation experiments are carried out in scopes of different nodes. According to the definition of* hp* adaptive strategy, the number of nodes in each interval time is controlled by a certain scope. In order to present the impact of nodes on the trajectory planning, we select four experiments about different scopes, whose results are shown in Tables 5 and 6, Figures 15 and 16.

**(a)**

**(b)**

**(c)**

**(d)**

**(a)**

**(b)**

**(c)**

**(d)**Aimed at case 1 of Section 5.2.1, the new experiments under different nodes scopes are simulated to obtain the results in Table 5 and Figure 15. In Table 5, the planning runtimes and objective function values under different nodes scopes are shown. It can be seen from the table that as the number of nodes increases, the planning runtime increases, and the planning objective function value decreases. Figure 15 shows the 3-dimensional spatial trajectories of UCAV formation under different nodes scopes. As the number of nodes increases, the planning trajectory becomes more precise.

With regard to case 2 of Section 5.2.1, the experiments are carried out to verify the stability of the approach further. The results in Table 6 and Figure 16 are similar to the results in above experiments. In summary, with node number increasing, the planning runtime becomes longer, but the accuracy is higher. Through comparing the new experiment results of case 1 and case 2, it can be obtained that with threat zones increasing, the increasing of nodes can make objective function values greater and runtime longer. Therefore, before planning trajectory, according to the environment, we must select an appropriate node scope to balance the relation between runtime and calculating accuracy.

##### 5.3. Online Scenarios

In order to verify the feasibility and validity of online collaborative trajectory planning method, based on offline experiments, two experiments under different dynamic environments are carried out. There are two dynamic environments which include sudden threats and mission changing.

*(1) Case 1.* The case 1 experiment based on sudden threats is simulated to verify the feasibility and effectiveness of approach based on* hp* adaptive pseudospectral. In online scenario environment based on the initial setting of offline scenario, UCAV formation can detect a new threat SAM-B at 10s, whose position is (10, 7, 0) km. Because the replanning approach needs faster running speed, the node scope of local planning is [3–5]. Based on the online planning strategy, the flight trajectory that does not satisfy the flight path constraints must be replanned. The replanning region should meet the runtime of local replanning, so we select 15s as rolling time domain length and 10s as execution length. Figures 17, 18, 19, and 20 show the results of formation trajectory, its control variables, state variables, and relative distance between members under dynamic environment.

**(a)**

**(b)**

**(a) change rate of attack angle**

**(b) change rate of sideslip angle**

**(c) change rate of roll angle**

**(d) throttle**

**(a) speed**

**(b) climb angle**

**(c) course angle**

**(d) mass change**

**(e) attack angle**

**(f) sideslip angle**

**(g) roll angle**

Firstly, in Figure 17, the formation collaborative trajectory based on sudden threats is presented, where the blue and cyan trajectories are the original offline trajectories of leader and follower, respectively, and the red and magenta trajectories are the execution parts of local replanned trajectories of leader and follower, respectively. As can be seen in the figure, when the original trajectory does not meet the threat zone constraint, the replanned trajectory avoids the threat zone.

Then, the control variables and state variables of online trajectory planning are shown in Figures 18 and 19, respectively, where the red curves are parameters variation curves of replanning. On the whole, all variables satisfy the constraints of UCAV formation collaborative trajectory planning. Compared with blue and cyan variation curves, the red and magenta variation curves are not more stable. Due to sudden threat zone that is detected at 10s, in order to avoid the sudden threat, the UCAV formation must adopt some maneuverings to sacrifice length cost and decrease threat cost. In the replanning process, replanning region time is from 25s to 45s.

Lastly, Figure 20 shows the relative distance curves which are marked with red solid line between members of UCAV formation in execution time of replanning phase and blue solid line in offline phase, respectively. In the figure, the relative distance is greater than the minimum safe distance which is marked with blue dotted line. In the general, while selecting fewer nodes, the replanning runtime is 14.74s, and objective function is 56.34.

*(2) Case 2*. After we verify the feasibility and effectiveness of approach based on* hp* adaptive pseudospectral, the stability of online trajectory planning needs to be verified by other online scenarios. Due to the limitation of paper length, we select another dynamic environment in case 2, which combines addition and maneuvering of mission and sudden threat. In case 2, at 50s, the formation receives the new mission and detects new threat. In this scenario, new mission target is carried out by leader, whose position is (17, 22, 0) km. The initial target is carried out by follower, whose position changes from the original position point (20, 20, 0) km to the position point (20, 18.5, 0) km. Meanwhile, the new threat SAM-B is detected, whose position is (19.5, 21, 0) km. According to the definition of online trajectory planning strategy, the replanned region execution time must be less than rolling domain time. Hence, the approach selects the nodes scope [3–5]. At the same time, we select 10s as replanned region execution time and 15s as the rolling domain time. Through simulating new dynamic environment in case 2, Figures 21, 22, 23, and 24 are obtained.

**(a)**

**(b)**

**(a) change rate of attack angle**

**(b) change rate of sideslip angle**

**(c) change rate of roll angle**

**(d) throttle**

**(a) speed**

**(b) climb angle**

**(c) course angle**

**(d) mass change**

**(e) attack angle**

**(f) sideslip angle**

**(g) roll angle**

Figure 21 shows UCAV formation trajectories planning in new mission environment, which include 2-dimensional trajectory and 3-dimensional trajectory. As can be seen in the figure, when blue trajectory does not meet the threat constraints, the red trajectory which is replanned based on* hp* adaptive pseudospectral method avoids the threat zone and executes the dropping of weapons that are used for attacking targets. Meanwhile, due to synergy of UCAVs, in order to avoid colliding with leader, the cyan trajectory of follower needs to be replanned, and then the magenta trajectory is simulated as shown in Figure 21. In Figures 22 and 23, the control parameters and state parameters variation curves are presented, respectively, where red and magenta curves represent the replanned trajectory parameters. According to display of figures, these variables satisfy the collaborative trajectory planning constraints. Because of avoiding the sudden threat and target addition, the UCAV formation must do some maneuverings, and the formation trajectory is not more stable than offline planned trajectory. In the replanning process, replanning region time is from 65s to 85s. Figure 24 presents the relative distance variation curves marked in red between members of formation during replanned phase, which is bigger than minimum safe distance that is 0.7km and meets the conditions of noncollision constraint. On the whole, the replanning runtime is 14.67s and the objective function value is 56.37.

Through combining case 1 and case 2, the approach denoted in the paper can verify the feasibility and stability of the online trajectory planning method.

#### 6. Conclusion

This paper presents a novel approach based on* hp* adaptive pseudospectral method to plan online collaborative trajectory for UCAV formation. The method is proposed to efficiently replan formation trajectories based on offline trajectories in dynamic environment. The collaborative trajectory planning model adopts kinematics/dynamics model of UCAV and formation relative motion model to compute more realistic trajectories. The continuous nonlinear optimal control problem is transformed into the nonlinear programming problem, which uses* hp* adaptive pseudospectral method to generate trajectories. For online collaborative trajectory planning, the* hp* adaptive pseudospectral method is applied iteratively and the replanned trajectories are generated in a subproblem defined by the horizon when a trigger event occurs.

The main advantage of the proposed approach is to consider dynamic environments, relative motion characteristic, and UCAV kinematics/dynamics model together. Most of the papers published on pseudospectral method consider only one vehicle. Another novel aspect is to consider the experiments of offline trajectory planning and online trajectory replanning.

In terms of offline trajectory planning, the simulation results of the static experiments show that the collaborative trajectory planning approach based on* hp *adaptive pseudospectral method achieves the goal of searching the optimal trajectory. In other words, online scenario experiments have been carried out to denote that the online collaborative trajectory replanning approach with using* hp *adaptive pseudospectral method and rolling horizon strategy is effective and stable.

The online trajectory planning of multiple UCAVs which include classical two UCAVs and three or more UCAVs is a very complex problem, where the task allocation problem and anticollision problem need to be considered. Meanwhile, as the number of UCAVs increases, due to the high time-consuming computation of more nodes and the limitation of variable number setting in* hp* adaptive pseudospectral method, it is necessary to combine* hp* adaptive pseudospectral method with metaheuristic optimization algorithm mentioned in previous literature. The approach is the next research orientation in the future.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (51579209) and the Aviation Science Foundation of China (20175196019).