Abstract
This paper studies a discrete-time stochastic LQ problem over an infinite time horizon with state- and control-dependent noises, where the weighting matrices in the cost function are allowed to be indefinite. We mainly use semidefinite programming (SDP) and its duality to treat the corresponding problems. Several relations among stability, SDP complementary duality, the existence of a solution to the stochastic algebraic Riccati equation (SARE), and the optimality of the LQ problem are established. Mean square stabilizability can be tested and the SARE solved via SDP using linear matrix inequality (LMI) methods.
1. Introduction
The stochastic linear quadratic (LQ) control problem was first studied by Wonham [1] and has since become a popular research field of modern control theory that has been studied extensively; see, for example, [2–12]. We should point out that most of the early literature on the stochastic LQ problem assumes that the control weighting matrix is positive definite and the state weighting matrix is positive semidefinite. A breakthrough was made in [9], which found the surprising fact that, for a stochastic LQ problem modeled by a stochastic Itô-type differential system, the original LQ optimization may still be well-posed even if the state and control cost-weighting matrices are indefinite. This finding reveals an essential difference between deterministic and stochastic systems. Follow-up research has since produced many important results. In [10–12], the continuous-time stochastic LQ control problem with indefinite weighting matrices was studied. The authors of [10] provided necessary and sufficient conditions for the solvability of the corresponding generalized differential Riccati equation (GDRE). In [11], the authors introduced LMIs whose feasibility is shown to be equivalent to the solvability of the SARE and developed a computational approach to the SARE via SDP. Furthermore, stochastic indefinite LQ problems with jumps over infinite and finite time horizons were studied in [13, 14], respectively. The discrete-time case was studied in [15–17]. Among these, a central issue is solving the corresponding SARE. A traditional method is to consider the so-called associated Hamiltonian matrix. However, this method does not work when the control weighting matrix is indefinite.
In this paper, we use the SDP approach introduced in [11, 18] to discuss the discrete-time indefinite stochastic LQ control problem over an infinite time horizon. Several equivalent relations between the stabilization/optimality of the LQ problem and the duality of SDP are established. We show that stabilization is equivalent to the feasibility of the dual SDP. Furthermore, we prove that the maximal solution to the SARE associated with the LQ problem can be obtained by solving the corresponding SDP. Our results extend those of [11] from the continuous-time case to the discrete-time case and those of [15] from a finite time horizon to an infinite time horizon.
The organization of this paper is as follows. In Section 2, we formulate the discrete-time indefinite stochastic LQ problem in an infinite time horizon and present some preliminaries including some definitions, lemmas, and SDP. Section 3 is devoted to the relations between stabilization and dual SDP. In Section 4, we develop a computational approach to the SARE via SDP and characterize the optimal LQ control by the maximal solution to the SARE. Some numerical examples are presented in Section 5.
Notations 1. $\mathbb{R}^{n}$: $n$-dimensional Euclidean space. $\mathbb{R}^{n\times m}$: the set of all $n\times m$ matrices. $\mathcal{S}^{n}$: the set of all $n\times n$ symmetric matrices. $A'$: the transpose of matrix $A$. $A\ge 0$ ($A>0$): $A$ is positive semidefinite (positive definite). $I$: the identity matrix. $\sigma(L)$: the spectrum set of the operator $L$. $\mathbb{R}$: the set of all real numbers. $\mathbb{C}$: the set of all complex numbers. $\mathbb{C}^{-}$: the open left-hand side complex plane. $\operatorname{Tr}(M)$: the trace of a square matrix $M$. $L^{*}$: the adjoint mapping of $L$.
2. Preliminaries
2.1. Problem Statement
Consider the following discrete-time stochastic system: where , are the system state and control input, respectively. is the initial state, and is the noise. and are constant matrices. is a sequence of real random variables defined on a filtered probability space with , which is a wide sense stationary, second-order process with and , where is the Kronecker function. belongs to , the space of all -valued, -adapted measurable processes satisfying We assume that the initial state is independent of the noise .
We first give the following definitions.
Definition 2.1 (see [17]). The following system is called asymptotically mean square stable if, for any initial state , the corresponding state satisfies .
Definition 2.2 (see [17]). System (2.1) is called stabilizable in the mean square sense if there exists a feedback control such that, for any initial state , the closed-loop system
is asymptotically mean square stable; that is, the corresponding state of (2.4) satisfies , where is a constant matrix.
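As a minimal illustration of Definitions 2.1 and 2.2, consider the scalar case. The sketch below assumes that system (2.1) takes the form x(t+1) = a x(t) + b u(t) + [c x(t) + d u(t)] w(t), with zero-mean, unit-variance noise w(t) independent of x(t). This form, the scalar names a, b, c, d, k, and the helper functions are assumptions made for the example, not notation taken from the text. Under the feedback u(t) = k x(t), the second moment then satisfies E x(t+1)^2 = [(a + b k)^2 + (c + d k)^2] E x(t)^2, so mean square stability reduces to this factor being less than one.

```python
def ms_growth_factor(a, b, c, d, k):
    """Factor rho with E[x(t+1)^2] = rho * E[x(t)^2] under u(t) = k * x(t).

    Assumes the scalar model x+ = a x + b u + (c x + d u) w with
    E[w] = 0, E[w^2] = 1 and w(t) independent of x(t).
    """
    return (a + b * k) ** 2 + (c + d * k) ** 2


def is_ms_stable(a, b, c, d, k=0.0):
    """Mean square stability of the closed loop for a given gain k."""
    return ms_growth_factor(a, b, c, d, k) < 1.0


def best_gain(a, b, c, d):
    """Gain minimizing the growth factor (a quadratic in k).

    Requires b or d to be nonzero.
    """
    return -(a * b + c * d) / (b * b + d * d)


def is_ms_stabilizable(a, b, c, d):
    """True if some constant gain k makes the closed loop mean square stable."""
    if b == 0 and d == 0:
        # The control has no effect; only the open loop matters.
        return is_ms_stable(a, b, c, d, 0.0)
    return is_ms_stable(a, b, c, d, best_gain(a, b, c, d))
```

For instance, with a = 1.2, b = 1, c = 0.5, d = 0.2 the open loop is not mean square stable, but the minimizing gain k = -1.25 stabilizes it. Minimizing over k also shows that, in this scalar setting, stabilizability is equivalent to (ad - bc)^2 < b^2 + d^2.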
For system (2.1), we define the admissible control set
The cost function associated with system (2.1) is
where and are symmetric matrices with appropriate dimensions and may be indefinite. The LQ optimal control problem is to minimize the cost functional over . We define the optimal value function as
Since the weighting matrices and may be indefinite, the LQ problem is called an indefinite LQ control problem.
Definition 2.3. The LQ problem is called well-posed if
If there exists an admissible control such that , the LQ problem is called attainable and is called an optimal control.
Stochastic algebraic Riccati equation (SARE) is a primary tool in solving LQ control problems. Associated with the above LQ problem, there is a discrete SARE:
Definition 2.4. A symmetric matrix is called a maximal solution to (2.9) if is a solution to (2.9) and for any symmetric solution to (2.9).
Throughout this paper, we assume that system (2.1) is mean square stabilizable.
2.2. Some Definitions and Lemmas
The following definitions and lemmas will be used frequently in this paper.
Definition 2.5. For any matrix , there exists a unique matrix , called the Moore-Penrose inverse, satisfying
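For reference, the defining relations of the Moore-Penrose inverse are the four standard Penrose conditions. Writing $M$ for the given matrix and $M^{+}$ for its Moore-Penrose inverse (these symbol names are assumed here), with a prime denoting the transpose, they read:

```latex
\begin{aligned}
M M^{+} M &= M, & M^{+} M M^{+} &= M^{+},\\
(M M^{+})' &= M M^{+}, & (M^{+} M)' &= M^{+} M.
\end{aligned}
```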
Definition 2.6. Suppose that is a finite-dimensional vector space and is a space of block diagonal symmetric matrices with given dimensions. : is a linear mapping and . Then the inequality is called a linear matrix inequality (LMI). An LMI is called feasible if there exists at least one satisfying the above inequality and is called a feasible point.
Lemma 2.7 (Schur's lemma). Let matrices , and be given with appropriate dimensions. Then the following conditions are equivalent: (1) , (2) , (3) .
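The Schur complement equivalence can be checked numerically in the simplest setting. The sketch below uses the standard statement for a symmetric 2x2 matrix [[m, n], [n, r]] with scalar blocks: the matrix is positive definite if and only if r > 0 and m - n r^{-1} n > 0. The block names m, n, r and both function names are hypothetical and chosen only for this example.

```python
def is_pd_2x2(m, n, r):
    """Positive definiteness of [[m, n], [n, r]] via leading principal minors."""
    return m > 0 and m * r - n * n > 0


def schur_test(m, n, r):
    """Schur complement test: r > 0 and m - n * r^{-1} * n > 0.

    The `and` short-circuits, so the division is never evaluated for r <= 0.
    """
    return r > 0 and m - n * n / r > 0


def tests_agree_on_grid():
    """Compare both tests on a small integer grid (exact in floating point)."""
    for r in (-2, -1, 0, 1, 2, 4):
        for n in range(-3, 4):
            for m in range(-3, 11):
                if is_pd_2x2(m, n, r) != schur_test(m, n, r):
                    return False
    return True
```

The grid uses r values that are powers of two so that n * n / r is computed exactly and boundary cases (a zero Schur complement) are classified consistently by both tests.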
Lemma 2.8 (extended Schur's lemma). Let matrices , and be given with appropriate dimensions. Then the following conditions are equivalent: (1) , , and , (2) , (3) .
Lemma 2.9 (see [11]). For a symmetric matrix , one has (1) , (2) if and only if , (3) .
2.3. Semidefinite Programming
Definition 2.10 (see [19]). Suppose that is a finite-dimensional vector space with an inner product and is a space of block diagonal symmetric matrices with an inner product . The following optimization problem is called a semidefinite programming (SDP). From convex duality, the dual problem associated with the SDP is defined as In the context of duality, we refer to the SDP (2.12) as the primal problem associated with (2.13).
Remark 2.11. Definition 2.10 is more general than Definition 6 in [11].
Let denote the optimal value of SDP (2.12); that is,
and let denote the optimal value of the dual SDP (2.13); that is,
Let and denote the primal and dual optimal sets; that is,
About SDP, we have the following proposition (see [20, Theorem 3.1]).
Proposition 2.12. if either of the following conditions holds.
(1) The primal problem (2.12) is strictly feasible; that is, there is an with .
(2) The dual problem (2.13) is strictly feasible; that is, there is a with and .
If both conditions hold, the optimal sets and are nonempty. In this case, a feasible point is optimal if and only if there is a feasible point satisfying the complementary slackness condition:
3. Mean Square Stabilization
The stabilizability assumption on system (2.1) is basic to the study of the stochastic LQ problem in the infinite-horizon case. We therefore cite some equivalent conditions for verifying stabilizability.
Lemma 3.1 (see [16, 21]). System (2.1) is mean square stabilizable if and only if one of the following conditions holds.
(1) There are a matrix and a symmetric matrix such that Moreover, the stabilizing feedback control is given by .
(2) There are a matrix and a symmetric matrix such that Moreover, the stabilizing feedback control is given by .
(3) For any matrix , there is a matrix such that the following matrix equation has a unique positive definite solution . Moreover, the stabilizing feedback control is given by .
(4) For any matrix , there is a matrix such that the following matrix equation has a unique positive definite solution . Moreover, the stabilizing feedback control is given by .
(5) There exist matrices and such that the following LMI holds. Moreover, the stabilizing feedback control is given by .
Below, we establish the relation between stabilization and the dual SDP. First, we assume that the interior of the set is nonempty; that is, there is a such that and .
Consider the following SDP problem:
By the definition of SDP, we can get the dual problem of (3.6).
Theorem 3.2. The dual problem of (3.6) can be formulated as
Proof. The objective of the primal problem can be rewritten as maximizing . Define the dual variable as where . The LMI constraint in the primal problem can be represented as that is, According to the definition of adjoint mapping, we have , that is, . It follows . By Definition 2.10, the objective of the dual problem is to maximize . Next, we show that the constraints of the dual problem (2.13) are equivalent to the constraints of (3.7). Obviously, is equivalent to the equality constraint of (3.7). Furthermore, notice that the matrix variable does not appear in the above formulation and can therefore be treated as the zero matrix. So, the condition is equivalent to This ends the proof.
Remark 3.3. This proof is simpler than the proof in [11] because we use a more general dual definition.
The following theorem reveals that the stabilizability of a discrete-time stochastic system can also be regarded as a dual concept of optimality. This result is a discrete-time version of Theorem 6 in [11].
Theorem 3.4. The system (2.1) is mean square stabilizable if and only if the dual problem (3.7) is strictly feasible.
Proof. First, we prove the necessary condition. Assume that system (2.1) is mean square stabilizable by the feedback . By Lemma 3.1, there is a unique satisfying
where is a fixed matrix. Set , then . The above equality can be written as
Let , and . Obviously, and satisfy
We have for sufficiently small . By Lemma 2.7, is equivalent to . We conclude that the dual problem (3.7) is strictly feasible.
Next, we prove the sufficient condition. Assume that the dual problem is strictly feasible; that is, , , . It implies that there are , and such that
It follows that
Let . The above inequality is equivalent to
By Lemma 3.1, system (2.1) is mean square stabilizable.
4. Solutions to SARE and SDP
The following theorem will state the existence of the solution of the SARE (2.9) via SDP (3.6).
Theorem 4.1. The optimal set of (3.6) is nonempty, and any optimal solution must satisfy the SARE (2.9).
Proof. Since system (2.1) is mean square stabilizable, by Theorem 3.4, (3.7) is strictly feasible. Problem (3.6) is strictly feasible because is an interior point of . By Proposition 2.12, (3.6) is nonempty and satisfies ; that is, From the above equality, we have the following equalities: Moreover, because of and . Then by (4.4), . Substituting it into (4.2) yields . Recall that , , , satisfy the equality constraint in (3.7). Multiplying both sides by , we have Considering , it follows from Lemma 2.8 that and . By Lemma 2.9, and . So we have Then it follows that . It yields due to .
The following theorem shows that any optimal solution of the primal SDP results in a stabilizing control for LQ problem.
Theorem 4.2. Let be an optimal solution to the SDP (3.6). Then the feedback control is mean square stabilizing for system (2.1).
Proof. Optimal dual variables , , , satisfy (4.2)–(4.6). . Now we show . Let , . The constraints in (3.7) imply Similar to the proof of Theorem 4.1, we have . We conclude that from . Again by the equality constraint in (3.7), we have By Lemma 3.1, the above inequality is equivalent to the mean square stabilizability of system (2.1) with . This ends the proof.
Theorem 4.3. There is a unique optimal solution to (3.6), which is the maximal solution to SARE (2.9).
Proof. The proof is similar to that of Theorem 9 in [11] and is omitted.
Theorem 4.4. Assume that is nonempty, then SARE (2.9) has a maximal solution , which is the unique optimal solution to SDP:
Proof. The first assertion is Theorem 4 in [16]. As to the second part, we proceed as follows. By Lemma 2.7, satisfies the constraints in (4.11). is an optimal solution to (4.11) due to the maximality. Next we prove the uniqueness. Assume that is another optimal solution to (4.11). Then we have . According to Definition 2.4, . This yields . Hence, the proof of the theorem is completed.
Remark 4.5. Here we drop the assumption that the interior of is nonempty.
Remark 4.6. Theorem 4.4 presents that the maximal solution to SARE (2.9) can be obtained by solving SDP (4.11). The result provides us a computational approach to the SARE. Furthermore, as shown in [16], the relationship between the LQ value function and the maximal solution to SARE (2.9) can be established; that is, assuming that is nonempty, then the value function and the optimal control can be expressed as .
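The idea of Remark 4.6 can be illustrated in the scalar case, where the SDP reduces to a one-dimensional search. The sketch below rests on several assumptions that are not taken from the text: system (2.1) has the scalar form x(t+1) = a x(t) + b u(t) + [c x(t) + d u(t)] w(t); the SARE reads p = (a^2 + c^2) p + q - (ab + cd)^2 p^2 / (r + (b^2 + d^2) p); and the LMI constraint of the SDP is the 2x2 matrix built in `lmi_feasible`. Plain bisection stands in for an SDP solver, which is possible only because the feasible set of a scalar LMI is an interval. All function and parameter names are hypothetical.

```python
def lmi_feasible(p, a, b, c, d, q, r):
    """Check that the assumed scalar LMI matrix [[m11, m12], [m12, m22]] is PSD."""
    m11 = (a * a + c * c - 1.0) * p + q
    m12 = (a * b + c * d) * p
    m22 = r + (b * b + d * d) * p
    return m11 >= 0.0 and m22 >= 0.0 and m11 * m22 - m12 * m12 >= 0.0


def max_feasible_p(a, b, c, d, q, r, p0=0.0, iters=200):
    """Largest feasible p, found by bisection starting from a feasible p0.

    Relies on convexity of the LMI: the feasible set is an interval, so every
    point between p0 and its upper endpoint is feasible.
    """
    if not lmi_feasible(p0, a, b, c, d, q, r):
        raise ValueError("p0 must be a feasible point of the LMI")
    step = 1.0
    while lmi_feasible(p0 + step, a, b, c, d, q, r):
        step *= 2.0
        if step > 1e12:
            raise ValueError("feasible set appears unbounded above")
    lo, hi = p0, p0 + step  # lo feasible, hi infeasible
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lmi_feasible(mid, a, b, c, d, q, r):
            lo = mid
        else:
            hi = mid
    return lo


def sare_residual(p, a, b, c, d, q, r):
    """Right-hand side minus left-hand side of the assumed scalar SARE."""
    gamma = a * b + c * d
    beta = b * b + d * d
    return (a * a + c * c) * p + q - gamma * gamma * p * p / (r + beta * p) - p
```

With a = 0.9, b = 1, c = 0.3, d = 0.2, q = 1, the search can be run with r = 1 (starting from p0 = 0) and with the negative weight r = -0.1 (starting from the feasible point p0 = 0.5). In both cases the maximizer makes the determinant of the LMI matrix vanish, so `sare_residual` is zero up to rounding, which is consistent with the claim that the SARE can remain solvable for a mildly negative control weight.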
The above results show that SARE (2.9) may have a solution even if the control weighting matrix is indefinite (or even negative definite). To describe how negative it is allowed to be, we introduce the following definition of the solvability margin.
Definition 4.7 (see [11]). The solvability margin is defined as the largest nonnegative scalar such that (2.9) has a solution for any .
The following conclusion follows immediately from Theorem 4.4.
Theorem 4.8. The solvability margin can be obtained by solving the following SDP:
5. Numerical Examples
Consider the system (2.1) with , , , as follows:
5.1. Mean Square Stabilizability
In order to test the mean square stabilizability of system (2.1), by Lemma 3.1 we only need to check whether or not LMI (3.5) is feasible. Using the LMI feasp solver [22], we find matrices and satisfying (3.5): and the stabilizing control ; that is,
5.2. Solutions of SARE
Let in SARE (2.9). Below, we solve SARE (2.9) via the SDP (4.11) (LMI mincx solver [22]).
Case 1. is positive definite. Choose , and we obtain the maximal solution to (2.9):
Case 2. is indefinite. Choose , and we obtain the maximal solution to (2.9):
Case 3. is negative definite. First we can get the solvability margin by solving SDP (4.12) (LMI gevp solver [22]). Hence (2.9) has a maximal solution when . Choose , and we obtain the maximal solution to (2.9):
6. Conclusion
In this paper, we have applied the SDP approach to discrete-time indefinite stochastic LQ control. We have shown that the mean square stabilizability of system (2.1) is equivalent to the strict feasibility of the SDP (3.7). In addition, the relation between the optimal solution of (3.6) and the maximal solution of SARE (2.9) has been established. These results can be viewed as a discrete-time version of [11]. Of course, many open problems remain. For example, is a basic assumption in this paper. A natural question is whether or not we can weaken it to . This problem merits further study.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (61174078), Specialized Research Fund for the Doctoral Program of Higher Education (20103718110006), and Key Project of Natural Science Foundation of Shandong Province (ZR2009GZ001).