Abstract

A method to solve nonlinear optimal control problems is proposed in this work. The method implements an approximating sequence of time-varying linear quadratic regulators that converge to the solution of the original, nonlinear problem. Each subproblem is solved by manipulating the state transition matrix of the state-costate dynamics. Hard, soft, and mixed boundary conditions are handled. The presented method is a modified version of an algorithm known as “approximating sequence of Riccati equations.” Sample problems in astrodynamics are treated to show the effectiveness of the method, whose limitations are also discussed.

1. Introduction

Optimal control problems are solved with indirect or direct methods. Indirect methods stem from the calculus of variations [1, 2]; direct methods use a nonlinear programming optimization [3, 4]. Both methods require the solution of a complex set of equations (Euler-Lagrange differential equations or Karush-Kuhn-Tucker algebraic equations) for which iterative numerical methods are used. These iterative procedures implement some form of Newton’s method to find the zeros of a nonlinear function. They are initiated by providing an initial guess solution. Guessing an appropriate initial solution is not trivial and requires a deep knowledge of the problem at hand. In indirect methods, the initial value of the Lagrange multiplier has to be provided, whose lack of physical meaning makes it difficult to formulate a good guess. In direct methods, the initial trajectory and control have to be guessed at discrete points over the whole time interval.

This paper presents an approximate method to solve nonlinear optimal control problems. This is a modification of the method known as “approximating sequence of Riccati equations” (ASRE) [5, 6]. It transforms the nonlinear dynamics and objective function into a pseudolinear and quadratic-like structure, respectively, by using state- and control-dependent functions. At each iteration, these functions are evaluated by using the solutions at the previous iteration, and therefore, a series of time-varying linear quadratic regulators is treated. This sequence is solved with a state transition matrix approach, where three different final conditions are handled: final state fully specified, final state not specified, and final state not completely specified. These define hard, soft, and mixed constrained problems, respectively.

The main feature of the presented method is that it does not require guessing any initial solution or Lagrange multiplier. In fact, iterations start by evaluating the state- and control-dependent functions using the initial condition and zero control, respectively. The way the dynamics and objective function are factorized recalls the state-dependent Riccati equations (SDRE) method [79]. These two methods possess some similarities, although the way they solve the optimal control problem is different. As the method is approximated, suboptimal solutions are derived. These could be used as first guess solutions for either indirect or direct methods.

2. The Nonlinear Optimal Control Problem

The optimal control problem requires that, given a set of first-order differential equations the control functions must be determined within initial, final time , , such that the performance index is minimized while satisfying two-point conditions

The problem consists in finding a solution that represents a stationary point of the augmented performance index where is the vector of costate and is the multiplier of the boundary condition. The necessary conditions for optimality, also referred to as Euler-Lagrange equations, are where , the Hamiltonian, is The differential-algebraic system (5) must be solved together with the final boundary conditions (3) and the transversality conditions which define a differential-algebraic parametric two-point boundary value problem whose solution supplies and the functions , , , .

3. The Approximating Sequence of Riccati Equations

Let the controlled dynamics (1) be rewritten in the form and let the objective function (2) be rearranged as where the operators , , , , and have appropriate dimensions. The nonlinear dynamics (8) and the performance index (9) define an optimal control problem. The initial state, , is assumed to be given, while the final condition ( in (3)) can assume three different forms (see Section 4). The problem is formulated as an approximating sequence of Riccati equations. This method reduces problem (8)-(9) to a series of time-varying linear quadratic regulators that are defined by evaluating the state- and control-dependent matrices using the solution of the previous iteration (the first iteration considers the initial condition and zero control).

The initial step consists in solving problem 0, which is defined as follows: Problem 0 is a standard time-varying linear quadratic regulator (TVLQR), as the arguments of , , , , and are all given except for the time. This problem is solved to yield and , , where the superscript denotes the problem that the solution refers to.

At a generic, subsequent iteration, problem has to be solved. This is defined as follows: Problem is again a TVLQR; note that and are the solutions of problem achieved at previous iteration. Solving problem yields and , .

Iterations continue until a certain convergence criterion is satisfied. In the present implementation of the algorithm, the convergence is reached when where is a prescribed tolerance. That is, iterations terminate when the difference between each component of the state, evaluated for all times, changes by less than between two consecutive iterations.

4. Solution of the Time-Varying Linear Quadratic Regulator by the State Transition Matrix

With the approach sketched in Section 3, a fully nonlinear optimal control problem is reduced to a sequence of time-varying linear quadratic regulators. These can be solved a number of times to achieve an approximate solution of the original, nonlinear problem. This is done by exploiting the structure of the problem as well as its state transition matrix. This scheme differs from that implemented in [5, 6], and, in part, is described in [1].

Suppose that the following dynamics are given: together with the quadratic objective function where , , and are positive semidefinite and positive definite time-varying matrices with appropriate dimensions, respectively. The Hamiltonian of this problem is and the optimality conditions (5) read From (18), it is possible to get which can be substituted into (16)-(17) to yield In a compact form, (20) can be arranged as Since (21) is a system of linear differential equations, the exact solution can be written as where the functions , , , and are the components of the state transition matrix, which can be found by integrating the following dynamics: with the initial conditions

If both and were given, it would be possible to compute and through (22)-(23), and therefore the optimal control function with (19). The initial condition is assumed to be given, whereas the computation of depends on the final condition, which, in the present algorithm, can be defined in three different ways.

4.1. Hard Constrained Problem

In a hard constrained problem (HCP), the value of the final state is fully specified, , and therefore, (14) does not account for . The value of can be found by writing (22) at final time and by solving for ; that is,

4.2. Soft Constrained Problem

In a soft constrained problem (SCP), the final state is not specified, and thus in (14) is an positive definite matrix. The transversality condition (7) sets a relation between the state and costate at final time which can be used to find . This is done by writing (22)-(23) at final time and using (28) Equations (29) represent a linear algebraic system of equations in the unknowns . The system can be solved by substitution to yield

4.3. Mixed Constrained Problem

In a mixed constrained problem (MCP), some components of the final state are specified and some are not. Without any loss of generality, let the state be decomposed as , where are the known components at final time, , and are remaining elements. The costate is decomposed accordingly as . With this formalism, in (14) is , and it is pre- and postmultiplied by . The transversality condition (7) is .

The MCP is solved by partitioning the state transition matrix in a suitable form such that, at final time, (22)-(23) read where the dependence of the state transition matrix components on , is omitted for brevity. From the first row of (31), it is possible to get which can be substituted in the second row of (31) to yield Equations (33)-(34), together with the transversality condition , can be substituted in the second row of (32) to compute the component of the initial costate where Once is know, the remaining part of the initial costate, , is computed through (33), and therefore, the full initial costate is obtained as a function of the initial condition, given final condition, initial and final time; that is,  .

5. Numerical Examples

Two simple problems with nonlinear dynamics are considered to apply the developed algorithm. These correspond to the controlled relative spacecraft motion and to the controlled two-body dynamics for low-thrust transfers.

5.1. Low-Thrust Rendezvous

This problem is taken from the literature where a solution is available, for comparison’s sake [10, 11]. Consider the planar, relative motion of two particles in a central gravity field expressed in a rotating frame with normalized units: the length unit is equal to the orbital radius, the time unit is such that the orbital period is , and the gravitational parameter is equal to 1. In these dynamics, the state, , represents the radial, tangential displacements () and the radial, tangential velocity deviations (), respectively. The control, , is made up by the radial and tangential accelerations, respectively.

The equations of motion are with . The initial condition is . Two different problems are solved to test the algorithm in both hard and soft constrained conditions.

Hard Constrained Rendezvous. The HCP consists in minimizing with the final, given condition and , .

Soft Constrained Rendezvous. The SCP considers the following objective function: with , and ( is free).

The differential equations (37) are factorized into the form of (8) as with . Thus, the problem is put into the pseudo-LQR form (8)-(9) by defining and as in (40) and by setting and .

The two problems have been solved with the developed method. Table 1 reports the details of the HCP and SCP, whose solutions are shown in Figures 1 and 2, respectively. In Table 1, is the objective function at the final iteration, “Iter” is the number of iterations, and the “CPU time” is the computational time (this refers to an Intel Core 2 Duo 2 GHz with 4 GB RAM running Mac OS X 10.6). The termination tolerance in (12) is . The optimal solutions found replicate those already known in the literature [10, 11], indicating the effectiveness of the developed method.

5.2. Low-Thrust Orbital Transfer

In this problem, the controlled, planar Keplerian motion of a spacecraft in polar coordinates is studied. The dynamics are written in scaled coordinates, where the length unit corresponds to the radius of the initial orbit, the time unit is such that its period is , and the gravitational parameter is 1. The state, , is made up by the radial distance from the attractor (), the phase angle (), the radial velocity (), and the transversal velocity (), whereas the control, , corresponds to the radial and transversal accelerations, respectively [12, 13]. The equations of motion are and the objective function is with and . The initial state corresponds to the conditions at the initial orbit; that is, . Two different HCPs are solved, which correspond to the final states and , respectively. This setup mimics an Earth-Mars low-thrust transfer. The dynamics (41) and the objective function (42) are put in the form (8)-(9) by defining , , and

The two HCPs have been solved with the developed method. The solutions’ details are reported in Table 2, whose columns have the same meaning as in Table 1. It can be seen that more iterations and an increased computational burden are required to solve this problem. The solution with is reported in Figure 3.

6. Conclusion

In this paper, an approximated method to solve nonlinear optimal control problems has been presented, with applications to sample cases in astrodynamics. With this method, the nonlinear dynamics and objective function are factorized in a pseudolinear and quadratic-like forms, which are similar to those used in the state-dependent Riccati equation approach. Once in this form, a number of time-varying linear quadratic regulator problems are solved. A state transition matrix approach is used to deal with the time-varying linear quadratic regulators. The results show the effectiveness of the method, which can be used to either have suboptimal solutions or to provide initial solutions to more accurate optimizers.