Abstract

In this paper, a real-time optimal attitude controller is designed for staring imaging; its output command is based on prediction of the future. First, the mathematical model of staring imaging is established. Then, the structure of the optimal attitude controller is designed: it consists of a preprocessing algorithm and a neural network. Constructing the neural network requires training samples generated by optimization, whose objective function takes the future control effect into account. After sample creation, the neural network is trained to achieve real-time optimal control. Compared with a PID (proportional-integral-derivative) controller with the best combination of parameters, the neural network controller achieves better attitude pointing accuracy and pointing stability.

1. Introduction

The staring mode of satellites can be used to collect dynamic information about ground targets. Emergency rescue, airport monitoring, and traffic monitoring are all cases where staring imaging is more advantageous than conventional modes.

The optical axis of the camera is required to point fixedly at the ground target during staring imaging. Thus, the attitude controller should run in a three-axis mode. Imaging quality is closely related to attitude control performance, which in turn is affected by the control strategy.

Analytical control laws are widely applied in attitude controllers for staring imaging. In those control laws, a one-step decision is usually considered instead of long-term planning. Lian proposed two different methods of calculating the desired angular velocity for staring imaging [1]. Zhang designed a control strategy under the conditions that the target error can be detected and that the actuator acts with delay [2]. Cui designed a PD adaptive controller for multiobjective staring imaging [3]. Zhang designed a controller to observe spatial cooperative targets [4].

Staring imaging is one application of attitude tracking control. For the more general tracking control problem, one-step decisions rather than long-term planning are also the main concern. Wu [5] designed a combined control strategy consisting of a saturation controller for spatial moving targets and proposed a backstepping controller using multiple actuating mechanisms. Leonard designed a control law to monitor space debris [6]. Hu designed an adaptive fault-tolerant controller with backstepping control for attitude tracking of rigid spacecraft under unknown but constant inertial parameters, unexpected disturbances, and actuator faults or saturation [7]. Lu proposed an FNTSM (fast nonsingular terminal sliding mode) control law with finite-time convergence [8]. Pong designed a control law to achieve spacecraft pointing in a specified direction [9]. Wu implemented attitude tracking control based on iterative learning [10].

Recently, neural networks have mainly been applied to the following three aspects of aerospace guidance and control.

First, neural networks are used to accelerate some evolutionary algorithms in aerospace. Cassioli et al. [11] applied SVMs (support vector machines) to improve initial guesses for some trajectory problems in the GTOP (global trajectory optimization problem) database. Basu et al. [12] deployed a neural network to set initial conditions for the PSO (particle swarm optimization), which reduced the planning time of reorientation for a space telescope.

Another application of neural networks is real-time trajectory planning. Izzo et al. [13] trained a DNN (deep neural network) to design optimal interplanetary trajectories. Furfaro et al. [14] designed neural networks based on LSTM (long short-term memory) to generate optimal trajectories for planetary landing tasks.

Neural networks are also directly used in aerospace controllers. Zou [15] proposed a scheme for finite-time attitude tracking control of spacecraft using the terminal sliding mode and a Chebyshev neural network. Biggs and Fournier [16] trained an MLP (multilayer perceptron) to optimize the thruster firing scheme controlling the attitude of a 12U CubeSat. CNNs (convolutional neural networks) have shown superior performance in image classification, and they are also used as control networks. Furfaro et al. [17] designed a CNN to process simulated moon images and generate landing control commands in a simulated 2D environment.

Different from traditional control methods, the optimal control method proposed in this paper seeks the optimal control command sequence over a period of time in the future; the beginning of that sequence is the current optimal control decision. However, the optimal control method requires iterative calculation, which consumes considerable computational resources, so it is not a real-time control method. Thus, the optimal control method is used to create training samples that capture the relationship between input and output. Finally, the neural network is trained to achieve optimal control and real-time control simultaneously.

The main contributions of this paper are stated as follows. (i) In the aspect of staring imaging, a controller which considers a future period of time and makes the optimal control decision is proposed, and it performs better than the traditional controller. (ii) An algorithm for compressing the satellite state information is proposed. The state information consists of the current dynamics of the satellite as well as the desired sequences of future angles and angular velocities. After compression, only 12 scalars are needed as the input of the neural network.

This paper is organized as follows. Section 2 states the staring imaging problem of the spacecraft system. The structure of the real-time optimal controller for staring imaging is designed in Section 3. In Section 4, auxiliary parts of the controller are designed. In Sections 5 and 6, the training sample set is generated, and the neural network in the controller is constructed, respectively. Simulation results are provided in Section 7. Finally, conclusions are given in Section 8.

2. Problem Formulation

2.1. Definition of the Coordinate System

Inertial frame: its origin is located at the mass center of the Earth. The x-axis points from the Earth's center towards the vernal equinox. The z-axis is along the Earth's rotation axis. The y-axis is determined by the right-hand rule.

Satellite body frame: its origin is located at the mass center of the satellite. The x-, y-, and z-axes lie along the corresponding principal axes of the satellite.

Desired satellite body frame: it is used for describing the desired attitude of the satellite. Its three axes coincide with the axes of the satellite body frame when the satellite is at the desired attitude.

Orbital reference frame: its origin is located at the mass center of the satellite. The y-axis points opposite to the orbital angular momentum. The z-axis points to the Earth's center. The x-axis is determined by the right-hand rule.

2.2. Kinematic and Dynamic Equations

The satellite is considered a rigid body equipped with 3 reaction wheels installed orthogonally along the three axes of the satellite body frame. The camera is fixed to the satellite, and its optical axis is aligned with one axis of the satellite body frame.

The mathematical model contains orbit and attitude motion, both of which are described by kinematic and dynamic equations. The orbital kinematic equation in the inertial frame is given by

$$\dot{\mathbf{r}} = \mathbf{v},$$

where $\mathbf{r}$ is the satellite position and $\mathbf{v}$ is the satellite velocity.

The orbital dynamic equation in the inertial frame is

$$\dot{\mathbf{v}} = -\frac{\mu}{\|\mathbf{r}\|^{3}}\,\mathbf{r},$$

where $\mu$ is the product of the gravitational constant and the mass of the Earth.
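The two-body orbital model above can be sketched in Python; a minimal example, assuming SI units and the standard fourth-order Runge-Kutta integration mentioned later in Section 5.5 (function names are illustrative):

```python
import numpy as np

MU_EARTH = 3.986004418e14  # mu = G * M_Earth [m^3/s^2]

def orbit_deriv(state):
    """Two-body orbital motion: d(r)/dt = v, d(v)/dt = -mu * r / |r|^3."""
    r, v = state[:3], state[3:]
    a = -MU_EARTH * r / np.linalg.norm(r) ** 3
    return np.concatenate([v, a])

def rk4_step(state, dt):
    """One step of the standard fourth-order Runge-Kutta method."""
    k1 = orbit_deriv(state)
    k2 = orbit_deriv(state + 0.5 * dt * k1)
    k3 = orbit_deriv(state + 0.5 * dt * k2)
    k4 = orbit_deriv(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```

For a circular orbit, the propagated radius should stay essentially constant, which is a quick sanity check on the integrator.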

The attitude kinematic equation ([18], pp. 344) is defined using quaternions:

$$\dot{\mathbf{q}}_{v} = \frac{1}{2}\left(q_{4}\,\boldsymbol{\omega} - \boldsymbol{\omega}\times\mathbf{q}_{v}\right), \qquad \dot{q}_{4} = -\frac{1}{2}\,\boldsymbol{\omega}^{T}\mathbf{q}_{v},$$

where $\mathbf{q} = [\mathbf{q}_{v}^{T}, q_{4}]^{T}$ is the quaternion denoting the rotation from the inertial frame to the satellite body frame, $\mathbf{q}_{v}$ is the vector part, and $q_{4}$ is the scalar part. $\boldsymbol{\omega}$ is the angular velocity vector of the satellite body frame with respect to the inertial frame, and it can be expressed in terms of the basis vectors of the satellite body frame as $\boldsymbol{\omega} = [\omega_{x}, \omega_{y}, \omega_{z}]^{T}$.

The attitude dynamic equation is

$$\mathbf{J}\,\dot{\boldsymbol{\omega}} = -\boldsymbol{\omega}\times\left(\mathbf{J}\,\boldsymbol{\omega} + \mathbf{J}_{w}\,\boldsymbol{\Omega}_{w}\right) + \mathbf{T}_{c} + \mathbf{T}_{d},$$

where $\mathbf{J}$ is the inertia tensor of the whole satellite, $\mathbf{J}_{w}$ is the inertia tensor of the 3 reaction wheels, $\boldsymbol{\Omega}_{w}$ is the angular velocity of the 3 reaction wheels relative to the satellite body, $\mathbf{T}_{c}$ is the control torque generated by the reaction wheels, and $\mathbf{T}_{d}$ is the external disturbance torque.
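These kinematic and dynamic equations can be sketched as follows, assuming the common scalar-last quaternion convention of ref. [18]; symbol names are illustrative:

```python
import numpy as np

def quat_kinematics(q, w):
    """Attitude kinematics dq/dt = 0.5 * Omega(w) @ q,
    with q = [q1, q2, q3, q4] (vector part first, scalar part q4 last)
    and w the body angular velocity [wx, wy, wz]."""
    wx, wy, wz = w
    Omega = np.array([
        [0.0,  wz, -wy,  wx],
        [-wz, 0.0,  wx,  wy],
        [ wy, -wx, 0.0,  wz],
        [-wx, -wy, -wz, 0.0],
    ])
    return 0.5 * Omega @ q

def attitude_dynamics(w, h_w, t_c, t_d, J):
    """Euler's equation with stored wheel momentum h_w = J_w * Omega_w:
       J * dw/dt = -w x (J w + h_w) + t_c + t_d."""
    return np.linalg.solve(J, -np.cross(w, J @ w + h_w) + t_c + t_d)
```

At the identity quaternion, the quaternion rate reduces to half the angular velocity in the vector part, which gives a simple consistency check.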

2.3. Attitude Control Effect Evaluation

As illustrated in Figure 1, the imaging time interval separates the instants $t_i$ ($i = 1, 2, 3, \ldots$) at which the images are taken. Imaging quality has a close relationship with the nearby attitude control effect. For example, the image taken at $t_i$ is affected by the attitude control effect during $[t_{i-1}, t_i]$.

Thus, two steps are designed to properly evaluate the attitude control effect for the whole task of staring imaging. First, the attitude control effect near each image is evaluated separately (i.e., the effect near image $i$ is evaluated based on the angle and angular velocity within $[t_{i-1}, t_i]$). Then, the average of those effects is used to evaluate the whole task.

Pointing accuracy and stability are the two major categories that the payload attitude control requirements fall into [19]. Therefore, the mean pointing error [20] and the rate error [19] are included in the attitude control effect evaluation. Assuming that the total number of images is $M$, the pointing accuracy is proposed in Eq. (6), the pointing stability is proposed in Eq. (7), and the control effect of the whole task is obtained by Eq. (8).

In Eq. (6), $|\cdot|$ denotes the absolute value. $N_i$ is the number of simulation steps during $[t_{i-1}, t_i]$, and the Euler angle error sequence during that time is stored as a matrix with one column per step. Each column is an Euler angle error vector representing the rotation from the desired satellite body frame to the real satellite body frame, with the rotation order 3-1-2. Other Euler angles defined in this paper share the same rotation order.

In Eq. (7), the angular velocity error sequence during $[t_{i-1}, t_i]$ is likewise stored as a matrix with one column per step. Each column is an angular velocity error vector representing the difference between the real angular velocity (in the satellite body frame) and the desired angular velocity (in the desired satellite body frame).

In Eq. (8), $k_1$ and $k_2$ are the weights of the two terms.
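Eqs. (6)-(8) themselves are not reproduced here, so the following sketch implements only their verbal description (mean absolute Euler angle error, mean absolute rate error, and a weighted combination averaged over all images); function and weight names are assumptions:

```python
import numpy as np

def pointing_accuracy(E):
    """Eq. (6)-style accuracy: mean absolute Euler angle error over one window.
    E is a 3 x N matrix of Euler angle errors (one column per simulation step)."""
    return float(np.mean(np.abs(E)))

def pointing_stability(W):
    """Eq. (7)-style stability: mean absolute angular velocity error over one window."""
    return float(np.mean(np.abs(W)))

def task_cost(E_windows, W_windows, k1=1.0, k2=1.0):
    """Eq. (8)-style cost: weighted per-image effect, averaged over all images."""
    per_image = [k1 * pointing_accuracy(E) + k2 * pointing_stability(W)
                 for E, W in zip(E_windows, W_windows)]
    return float(np.mean(per_image))
```

A smaller `task_cost` corresponds to a better overall control effect, matching the usage in Sections 5.6 and 7.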

3. Structure of the Real-Time Optimal Controller

As shown in the dashed box named “Attitude Controller” in Figure 2, six modules are outlined. They are established in the following sections of this paper.

The modules named “Calculating Staring Imaging Desired Sequence” and “Network Input Preprocessing” are explained in the section named “Auxiliary Parts of the Real-time Optimal Controller.”

The modules named “Normalization,” “Neural Network,” and “Anti-normalization” are the central parts, which will be explained in two sections of this paper. In order to establish the neural network, the method of creating training samples will first be shown in the section named “Sample Creation.” Then, the design of those three modules will be shown in the section named “Learning the Optimal Control Strategy.”

The module named “PID Control Law” is designed for the case in which the real inputs exceed the designed input range of the neural network. This module only works under large disturbances, and it is tested in “Impulse Response” of “Results and Discussion.” The PID coefficients are shown in Table 1.

The inputs of the controller consist of seven parts: the current Euler angle from the star sensor, the current angular velocity from the gyro, the target longitude and latitude from telecommand, UTC (coordinated universal time) from the on-board computer, and the current position and velocity from GPS (Global Positioning System). The current Euler angle represents the rotation from the orbital reference frame to the satellite body frame.

The output of the controller is the torque command. The reaction wheels receive the torque command and generate the control torque in Eq. (4).

4. Auxiliary Parts of the Real-Time Optimal Controller

4.1. Calculating Staring Imaging Desired Sequence

The module named “Calculating Staring Imaging Desired Sequence” is used to predict and provide the future data for the controller. The outputs consist of the desired Euler angle sequence and the desired angular velocity sequence, whose length is the number of steps used to predict the future in the simulation.

Denote the time taken in a single simulation step as $\Delta t$. The $k$-th column of the desired Euler angle sequence represents the desired Euler angle (the rotation from the inertial frame to the desired satellite body frame) at time $k\Delta t$, and the $k$-th column of the desired angular velocity sequence represents the desired angular velocity (in the desired satellite body frame) at the same time. Both sequences are calculated according to the states of the satellite; the algorithm is given in ref. [1].

4.2. Network Input Preprocessing

The module named “Network Input Preprocessing” is used to calculate the input for the neural network. Its inputs are the current Euler angle, the current angular velocity, and the two desired sequences; its outputs are the Euler angle error, the angular velocity error, the desired angular velocity, and the first derivative of the desired angular velocity.

The desired angular velocity is the first column of the desired angular velocity sequence.

The first derivative of the desired angular velocity is calculated by a forward difference of the first two columns of the desired angular velocity sequence, divided by the simulation time step.

The Euler angle error is obtained by the following steps. First, the satellite's current Euler angle is converted [18] into the coordinate transformation matrix representing the rotation from the orbital reference frame to the satellite body frame. The current desired Euler angle is converted into the coordinate transformation matrix representing the rotation from the inertial frame to the desired satellite body frame. The coordinate transformation matrix from the inertial frame to the orbital reference frame can be obtained [21] from the satellite position and velocity. Multiplying these matrices yields the coordinate transformation matrix from the desired satellite body frame to the satellite body frame, which is finally converted into the Euler angle error.

The current angular velocity error vector represents the difference between the current angular velocity and the desired angular velocity.
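The preprocessing steps above can be sketched as follows. This is a simplified illustration: it replaces the transformation-matrix computation of the Euler angle error with a direct subtraction (valid only for small errors), and all names are illustrative:

```python
import numpy as np

def preprocess(euler_cur, w_cur, Theta_d, W_d, dt):
    """Build the 12-scalar network input from the current state and the
    desired sequences. Theta_d and W_d are 3 x Np matrices holding the
    desired Euler angle / angular velocity sequences (one column per step)."""
    w_d = W_d[:, 0]                      # desired angular velocity (first column)
    dw_d = (W_d[:, 1] - W_d[:, 0]) / dt  # forward-difference first derivative
    de = euler_cur - Theta_d[:, 0]       # Euler angle error (small-angle shortcut)
    dw = w_cur - w_d                     # angular velocity error
    return np.concatenate([de, dw, w_d, dw_d])  # 12 scalars in total
```

The output length matches the 12 compressed scalars stated in the contributions.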

5. Sample Creation

5.1. Structure of the Sample Creation Algorithm

A sample set should be created before the neural networks are trained. The algorithm for creating a sample in the sample set is shown in Figure 3.

To start the algorithm, the interior-point optimizer first provides a three-axis zero torque command sequence, which forms the matrix to be optimized in the loop. The matrix has one column per torque command used in the simulation of the “Attitude Dynamics” module.

The inputs of the “Attitude Dynamics” module are the torque command sequence as well as the initial values, and the outputs of that module are the Euler angle sequence and the angular velocity sequence, with one column per simulation step.

The sample input is also used for calculating the desired Euler angle sequence and the desired angular velocity sequence. The more closely the simulated Euler angle and angular velocity sequences approach the desired ones, the better the torque command sequence is. Finally, the optimal torque command sequence is obtained when the exit condition is met.

5.2. Definitions of Sample Input and Output

Each sample is divided into a sample input vector and a sample output vector. The sample input vector contains the 12 scalars produced by the preprocessing algorithm (the Euler angle error, the angular velocity error, the desired angular velocity, and its first derivative), and the sample output vector is the optimal torque command.

The samples should cover the input range of the neural network. The input vector of each sample is created randomly, with the 12 elements selected within the designated input ranges shown in Table 2. The ranges are selected according to the requirements of staring imaging tasks. If the input is out of the designated range, the torque command will be generated by the PID controller instead of the neural network; this function is tested in the simulation part of this paper.

5.3. Calculating the Desired Sequence

From the desired angular velocity and its first derivative at the initial time, the desired angular velocity sequence can be generated by linear extrapolation: the $k$-th column is $\boldsymbol{\omega}_{d}(0) + (k-1)\,\Delta t\,\dot{\boldsymbol{\omega}}_{d}(0)$, where $\Delta t$ is the simulation time step.

The desired Euler angle sequence is calculated via quaternions. The desired angular velocity sequence can be integrated in time to obtain the quaternion sequence (denoting the rotation from the inertial frame to the desired satellite body frame). It is set that the inertial frame coincides with the desired attitude frame at the beginning of the simulation in sample creation; thus, the quaternion at the initial time is $\mathbf{q}_{0} = [0, 0, 0, 1]^{T}$. The quaternion sequence is then propagated by

$$\mathbf{q}_{k+1} = \mathbf{q}_{k} + \frac{\Delta t}{2}\,\Omega(\boldsymbol{\omega}_{d,k})\,\mathbf{q}_{k},$$

where $\boldsymbol{\omega}_{d,k}$ is the $k$-th column of the desired angular velocity sequence and the function $\Omega(\cdot)$ is defined as

$$\Omega(\boldsymbol{\omega}) = \begin{bmatrix} 0 & \omega_{z} & -\omega_{y} & \omega_{x} \\ -\omega_{z} & 0 & \omega_{x} & \omega_{y} \\ \omega_{y} & -\omega_{x} & 0 & \omega_{z} \\ -\omega_{x} & -\omega_{y} & -\omega_{z} & 0 \end{bmatrix}.$$
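The integration of the desired angular velocity into a quaternion sequence can be sketched as follows, assuming a first-order discrete update with an added renormalization step (the paper's exact discrete form is not reproduced here):

```python
import numpy as np

def omega_matrix(w):
    """Omega(w) matrix for scalar-last quaternion kinematics dq/dt = 0.5*Omega(w)*q."""
    wx, wy, wz = w
    return np.array([
        [0.0,  wz, -wy,  wx],
        [-wz, 0.0,  wx,  wy],
        [ wy, -wx, 0.0,  wz],
        [-wx, -wy, -wz, 0.0],
    ])

def desired_quaternions(W_d, dt):
    """Integrate the 3 x Np desired angular velocity sequence into a 4 x Np
    quaternion sequence, starting from the identity quaternion [0,0,0,1]
    (inertial and desired frames coincide at the initial time)."""
    Np_ = W_d.shape[1]
    Q = np.zeros((4, Np_))
    Q[:, 0] = [0.0, 0.0, 0.0, 1.0]
    for k in range(Np_ - 1):
        q = Q[:, k] + dt * 0.5 * omega_matrix(W_d[:, k]) @ Q[:, k]
        Q[:, k + 1] = q / np.linalg.norm(q)  # renormalize against integration drift
    return Q
```

For a constant rotation rate about one axis, the result should approach the analytic quaternion of the accumulated rotation angle.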

The quaternion sequence can be converted to the desired Euler angle sequence .

5.4. Calculating the Initial Values of Simulation

The desired satellite body frame coincides with the inertial frame at the initial time. Thus, the initial Euler angle of the satellite (denoting the rotation from the inertial frame to the satellite body frame) equals the Euler angle error given in the sample input.

The initial satellite angular velocity in the satellite body frame is the sum of the desired angular velocity and the angular velocity error given in the sample input.

5.5. Attitude Dynamics

It has been defined that the torque command sequence contains a fixed number of commands and that the simulation contains a fixed number of steps. Assuming that $\Delta t_c$ is the time interval between two commands and $\Delta t$ is the simulation time step, $\Delta t_c = k\,\Delta t$, and the number of simulation steps is $k$ times the number of commands, where the ratio $k$ is a positive integer.

The standard fourth-order Runge-Kutta method is used to calculate the kinematics and dynamics of the satellite. The Euler angle sequence and the angular velocity sequence in the whole process are recorded, and they are obtained after the simulation.

5.6. Calculating the Objective Function

The objective function (defined in Eq. (8)) is used to evaluate the performance of the torque command sequence. The smaller the objective function is, the better the control performance is.

5.7. Interior-Point Optimization

At the beginning of sample creation for a sample input, the interior-point optimizer creates a three-axis zero torque command sequence. The objective function is obtained after the simulation in which that sequence is tested. Then, the optimizer makes a judgment according to the objective value. If the exit condition is satisfied, the loop exits; otherwise, a new torque command sequence is generated and the optimization continues.

The terminal value of the torque command sequence is generated when the loop exits. The sample output is its first column.
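The optimization loop can be illustrated with a toy one-axis double integrator standing in for the attitude dynamics, scipy's `trust-constr` method (an interior-point-type algorithm) standing in for the paper's interior-point optimizer, and a smooth quadratic surrogate for the Eq. (8) cost; everything here is an assumption-level sketch:

```python
import numpy as np
from scipy.optimize import minimize

def simulate(u_seq, theta0, w0, dt):
    """Toy one-axis double integrator standing in for 'Attitude Dynamics'."""
    theta, w = theta0, w0
    thetas, ws = [], []
    for u in u_seq:
        w = w + u * dt
        theta = theta + w * dt
        thetas.append(theta)
        ws.append(w)
    return np.array(thetas), np.array(ws)

def objective(u_seq, theta0=0.1, w0=0.0, dt=0.1):
    """Quadratic surrogate for the Eq. (8)-style cost on the predicted trajectory."""
    thetas, ws = simulate(u_seq, theta0, w0, dt)
    return np.mean(thetas ** 2) + np.mean(ws ** 2)

u0 = np.zeros(10)                       # zero torque command sequence (one axis here)
res = minimize(objective, u0, method="trust-constr",
               bounds=[(-0.1, 0.1)] * 10)  # torque saturation bounds (assumed)
u_opt = res.x                           # its first element plays the sample-output role
```

The optimizer iterates exactly as in Figure 3: simulate, evaluate the objective, and update the command sequence until the exit condition is met.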

6. Learning the Optimal Control Strategy

In this section, the neural network, the normalization module, and the antinormalization module in the attitude controller are established; they are the three central parts shown in Figure 2. Together, the three modules can be modeled as one function from the preprocessed input to the torque command. The algorithms of normalization and antinormalization can be found in ref. [22].

6.1. Division and Normalization of the Sample Set

The whole sample set is divided into two parts, which are the training sample set and the test sample set. 75% of the samples are randomly selected as training samples, and the others are test samples.

The training sample set consists of two matrices: the sample input matrix and the corresponding sample output matrix, with one column per training sample. Each row of the two matrices is normalized to [-1, 1] before the training sample set is used to train the neural networks.

The test sample set also consists of two matrices: the sample input matrix and the corresponding sample output matrix, with one column per test sample. Each row of the two matrices is normalized to [-1, 1] before the test sample set is used to test the neural networks.
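The row-wise mapping to [-1, 1] and its inverse (the antinormalization module) can be sketched as:

```python
import numpy as np

def fit_minmax(X):
    """Per-row min and max over the sample set (rows are features, columns samples)."""
    return X.min(axis=1, keepdims=True), X.max(axis=1, keepdims=True)

def normalize(X, lo, hi):
    """Map each row linearly to [-1, 1]."""
    return 2.0 * (X - lo) / (hi - lo) - 1.0

def denormalize(Y, lo, hi):
    """Inverse map from [-1, 1] back to physical units (antinormalization)."""
    return (Y + 1.0) / 2.0 * (hi - lo) + lo
```

The same `lo`/`hi` fitted on the training set must be reused at run time inside the controller, so that network inputs and outputs stay consistent with training.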

6.2. Training the Neural Networks

The structure of the neural network is input layer-feedforward layer-feedforward layer. The stochastic gradient descent (SGD) algorithm is applied in the training process. In order to reduce overfitting, dropout (with the dropout fraction set to 0.1) and weight decay (with the regularization parameter set to 0.01) are also applied.

Different network structure parameters and training parameters are tested to achieve the best performance. The parameter values are selected on linear grids and are shown in Table 3. All parameter combinations are used in training, and 3876 neural networks (one per combination) are obtained.
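The exhaustive grid search can be illustrated as follows; the grids below are hypothetical stand-ins for Table 3, which is not reproduced here:

```python
from itertools import product

# Hypothetical grids standing in for Table 3 (the actual values are in the paper).
grid = {
    "hidden1": [16, 32, 64],
    "hidden2": [8, 16],
    "learning_rate": [0.1, 0.01, 0.001],
    "batch_size": [32, 64],
}

# Enumerate every combination; each one is trained and evaluated on the test
# set, and the network with the smallest test MSE is kept for the controller.
combos = list(product(*grid.values()))
configs = [dict(zip(grid.keys(), c)) for c in combos]
```

With the paper's actual grids, this enumeration yields the stated 3876 combinations.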

6.3. Evaluating the Neural Networks

In the process of evaluating a neural network, the output matrix is obtained by applying the network to each column of the test sample input matrix. The MSE (mean squared error) is then computed over all elements of the difference between the network output matrix and the test sample output matrix.
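The MSE computation over the normalized output matrices can be sketched as:

```python
import numpy as np

def mse(Y_pred, Y_true):
    """Mean squared error over all elements of the output matrices."""
    return float(np.mean((Y_pred - Y_true) ** 2))
```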

Finally, the neural network with the smallest MSE (0.00011686) is obtained. Its parameters used in training are shown in Table 4, and its training process is shown in Figure 4.

7. Results and Discussion

7.1. Simulation Condition Design

The satellite orbit parameters at the initial time are shown in Table 5. The initial time is 2017-05-31-04-40-00 (in yyyy-MM-dd-HH-mm-ss format).

As shown in Figure 5, 100 ground targets are randomly selected within a circle centered at 120°E 30°N with a radius of 100 km. After all the ground targets are tested, the average performance is obtained by averaging the single-test performance defined in Eq. (8) over the 100 targets.

The average pointing accuracy and the average pointing stability are obtained in the same way.

Other simulation parameters are defined in Table 6.

7.2. Searching the Optimal Coefficients for the PID Controller

Different combinations of coefficients in the PID controller are tested to find its best performance as the baseline for comparison. The coefficients of each axis are optimized separately, using the same set of combinations defined in Table 7. The Euler angles and angular velocities used in the PID control law are expressed in degrees.

All the ground targets defined in Figure 5 are tested for each coefficient combination, and the average performance is calculated. Then, the optimal PID coefficient combination is obtained according to Eq. (25), with the weights chosen to balance the pointing accuracy and the pointing stability.

7.3. Performance Comparison of Steady-State Control

The comparison of steady-state control performance between the PID controller and the neural network controller consists of two parts. The first part tests the condition in which the satellite stares at the central ground target (at 120°E 30°N); curves of the Euler angle error and angular velocity error are provided. The second part tests all the ground targets shown in Figure 5 and calculates the average performance.

7.4. Testing the Central Ground Target

The performances are tested for the neural network controller and the PID controller with the optimal coefficient combination when the satellite stares at the central ground target. The neural network controller performs better than the PID controller according to the results shown in Table 8 and the error curves.

Error curves of the PID controller are shown in Figure 6 and Figure 7.

Error curves under the control of neural network are shown in Figures 8 and 9.

7.5. Testing All the Ground Targets

All the ground targets defined in Figure 5 are tested for the neural network controller and the PID controller with the best coefficient combination. The neural network controller performs better according to the results shown in Table 9.

Figure 10 is provided to visualize the performances while excluding the influence of the weights. The x-axis and y-axis denote the pointing accuracy and the pointing stability, respectively. The blue points represent the performances of the PID controller on all the ground targets, and the red points represent those of the neural network controller.

7.6. Performance Comparison of the Dynamic Response
7.6.1. Step Response

In order to test the step response, two simulations are executed for each controller and each ground target. One is the normal simulation, and the other is a simulation in which an extra angle measurement error (0.005° on each of the three axes) is added at 30 s and afterwards. From the difference between the two Euler angle sequences, the overshoot, the settling time (within the specified error band), and the steady-state error are obtained.
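The three step-response indices can be computed from an error curve as follows; the ±2% error band is an assumed value, since the band used in the paper is not reproduced here:

```python
import numpy as np

def step_metrics(t, y, y_final, band=0.02):
    """Overshoot (%), settling time, and steady-state error of a step response.
    band is the settling tolerance as a fraction of the final value (assumed 2%)."""
    overshoot = (np.max(y) - y_final) / abs(y_final) * 100.0
    tol = band * abs(y_final)
    outside = np.nonzero(np.abs(y - y_final) > tol)[0]  # samples outside the band
    settling_time = t[0] if outside.size == 0 else t[min(outside[-1] + 1, len(t) - 1)]
    steady_state_error = abs(y[-1] - y_final)
    return overshoot, settling_time, steady_state_error
```

The per-target values of these indices are then averaged, as reported in Tables 10 and 11.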

The average values of overshoot, settling time, and steady-state error over all the ground targets are shown in Table 10 (PID controller) and Table 11 (neural network controller). The Euler angle difference between the two simulations (the satellite stares at 120°E 30°N) is provided in Figure 11 (PID controller) and Figure 12 (neural network controller). According to the results, the neural network controller shows an advantage over the PID controller in all three indices (in the column “Average of Three Axes”).

7.7. Frequency Response

In order to test the frequency response, an external periodical torque is applied in the simulation in which the satellite stares at the central ground target. The amplitude of the external periodical torque is 0.03 Nm, and the response of each axis is tested separately. Magnitude-frequency characteristic curves of the two controllers are shown in Figure 13 (PID controller) and Figure 14 (neural network controller). The response amplitude of the neural network controller is higher than that of the PID controller when the frequency is lower than 2 Hz, and the amplitudes are similar at higher frequencies.

7.8. Impulse Response

A large disturbance is applied to the system under the control of the neural network in order to test the stability of the system (the satellite stares at 120°E 30°N). The disturbance starts at 30 s and lasts for 0.5 s, and its amplitudes on the three axes are [1.0, 1.0, 1.0] Nm. Figure 15 shows that although the Euler angle error exceeds the input range (defined in Table 2) of the neural network, the PID control law in the proposed attitude controller (Figure 2) manages to converge the system and hands control back to the neural network.

7.9. Testing the Real-Time Control Ability

It takes an average of 6.1 seconds to create a sample on a CPU (i7-8700). As a result, applying the optimization loop (defined in Figure 3) in the satellite attitude controller does not meet the real-time control requirement (e.g., the control period is usually set to 0.1 s). The calculation time of the whole neural network controller on an ARM processor (STM32F417) is 3.3 ms (3.1 ms for neural network forward propagation, 0.2 ms for the other modules), which meets the requirement of real-time control.

8. Conclusions

In this paper, optimal control and machine learning are applied jointly to the attitude control of staring imaging. Compared with the PID controller with the best coefficients, the neural network controller shows advantages in steady-state performance (including pointing accuracy and pointing stability), which is the most important in staring imaging tasks. As for the dynamic response, the neural network controller performs better than the PID controller in the step response and worse in the frequency response. The proposed controller also shows the ability to converge when a large disturbance is applied.

The calculation time of the proposed attitude controller on an ARM processor (STM32F417) is 3.3 ms, which meets the requirement of real-time control.

Data Availability

The training sample data and trained networks used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was partially supported by the Key Laboratory of Spacecraft Design Optimization and Dynamic Simulation Technologies, Ministry of Education. This research was funded by the National Key Research and Development Program of China (No. 2016YFB0501102) and the Key Laboratory of Spacecraft Design Optimization and Dynamic Simulation Technologies, Ministry of Education.