Abstract

Model predictive control (MPC) subject to control and state constraints is studied. Given a terminal cost, a terminal region is obtained through iterative estimation using a support vector machine (SVM). It is proved that the obtained terminal region is the largest terminal region for the given terminal cost. The relationships between terminal cost and terminal region and between terminal cost and total cost are discussed, respectively. Based on these relationships, a simple method to obtain a suitable terminal cost is proposed, and the cost can be adjusted according to need. Finally, some experimental results are presented.

1. Introduction

Model predictive control (MPC), also known as receding horizon control, has become quite popular recently. Its key advantage is the ability to handle control and state constraints. It was pointed out in [1] that MPC solves the standard optimal control (SOC) problem, except that it replaces the infinite-horizon optimization of SOC with a finite-horizon optimization (some papers use quasi-infinite horizon optimization, such as [2, 3]) and the control is computed online.

Generally, an MPC scheme is judged against SOC on two aspects: the domain of attraction and the total cost (the performance index accumulated from the initial time to infinity). If one MPC has a larger domain of attraction and, for any initial state, a lower total cost than another, it is considered the better MPC. Three factors play important roles in these two aspects: the prediction horizon, the terminal region, and the terminal cost. As is well known, lengthening the prediction horizon enlarges the domain of attraction and decreases the total cost, but increases the computational burden of the online optimization. Recently, much attention has concentrated on the latter two factors: how to obtain a large terminal region, and how to obtain a suitable terminal cost. Here, only some representative papers are listed. Chen and Allgöwer [2] derived a terminal cost by the SOC method based on the linearized model of the system and took as the terminal region an ellipsoidal set in which the state can be driven to the equilibrium point by linear feedback control. Cannon et al. [4] replaced the ellipsoidal set with a polytopic set. De Doná et al. [5] took the stabilizable region under saturated control as the terminal region. Ong et al. [6] obtained a very large stabilizable region and a terminal cost under linear feedback control via support vector machine. Limon et al. [7] proved that, for MPC without a terminal state constraint in its online optimization, the terminal region is enlarged by weighting the terminal cost. Most of these papers share a common shortcoming: the terminal region is computed under the precondition that some explicit controller, such as a linear feedback controller or a saturated controller, is given in advance. Consequently, the computed terminal region is somewhat conservative; in other words, it is not the largest one.

In this paper, a novel method is proposed to obtain a terminal state region. Given a terminal cost, a set sequence is obtained by applying a one-step set contraction iteratively. It is proved that, as the number of iterations goes to infinity, this set sequence converges to the maximal terminal region. In this sequence, each set is estimated using a support vector machine (SVM; see [7, 8] for details). Next, the relationships between terminal cost and terminal region and between terminal cost and total cost are discussed, respectively. Then, a simple method to obtain a suitable terminal cost according to need is given. Finally, some experimental results are presented.

2. The Relationship between SOC and MPC

As mentioned, MPC is an approximation to SOC, and SOC is the benchmark against which MPC is evaluated. Hence, the study of MPC begins with an understanding of SOC. Consider the following discrete-time system: where are the state and the input of the system at sampling time , respectively, is the successor state, and the mapping with is known. The system is subject to constraints on both state and control action, given by where is a closed set and a compact set, both containing the origin. The control objective is usually to steer the state to the origin.

The optimization problem of SOC at the initial state can be stated as follows: where is the stage cost, whose form is chosen as , in which are positive definite.

It is well known that stability is guaranteed if has a feasible solution. Here, assume that the solution of is known whenever one exists: is defined as the solution, , the optimal control trajectory, and , the corresponding state trajectory. As is well known, is the total cost of using to drive to .

But solving is not easy, especially when is nonlinear. To avoid this problem, the infinite-horizon optimization of SOC can be approximated by the finite-horizon optimization of MPC (as mentioned, quasi-infinite horizon optimization was used in [2, 3]; for convenience, we consider it to belong to the framework of finite-horizon optimization).

Similarly, the optimization problem of MPC at the initial state can be stated as where is prediction horizon, is terminal region, and is terminal cost satisfying and (the mapping satisfying is continuous and strictly increasing, where ).

There exist many optimization algorithms to compute the solution of . Let be the solution, and let and be the optimal control trajectory and the corresponding predicted state trajectory of , respectively.

At sampling time , is applied to the system. At the next sampling time , is measured, and the control input is computed by solving the optimization problem . By repeating this procedure, two trajectories are obtained. For convenience, they are called the receding horizon control trajectory and the receding horizon state trajectory of MPC with , respectively.
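The receding-horizon procedure just described can be sketched as follows. All names here are illustrative assumptions: `f` stands for the system map, and `solve_mpc` stands in for any solver of the finite-horizon problem that returns an optimal input sequence.

```python
import numpy as np

def receding_horizon(f, solve_mpc, x0, n_steps):
    """Sketch of the receding-horizon procedure.

    f         -- system map giving the successor state from (x, u)
    solve_mpc -- solver returning the optimal input sequence at a state
    x0        -- initial state
    Returns the receding horizon state and control trajectories.
    """
    x = np.asarray(x0, dtype=float)
    xs, us = [x.copy()], []
    for _ in range(n_steps):
        u_seq = solve_mpc(x)      # solve the finite-horizon problem
        u = u_seq[0]              # apply only the first input
        x = f(x, u)               # advance the plant one step
        xs.append(x.copy())
        us.append(u)
    return np.array(xs), np.array(us)
```

Only the first element of each optimal input sequence is ever applied; the optimization is repeated at every sampling time from the newly measured state.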

The introduction of and in (2.3) is to guarantee the closed-loop stability of the system.

Lemma 2.1. Define . For any , if and satisfy the two following conditions:
   being a Lyapunov function, more strictly, for any , there exists
   being an invariant set; in other words, one has , where is the control in (C1).

It is then guaranteed that will be driven to by using . is called the domain of attraction.

The proof can be found in [1].
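Conditions (C1) and (C2) can also be checked numerically on sampled states. The following is a minimal sketch under stated assumptions: `F`, `kappa`, `in_region`, and the tolerance are illustrative stand-ins for the terminal cost, the terminal controller, and the terminal-region membership test; none of these names appear in the paper.

```python
import numpy as np

def check_terminal_conditions(points, F, stage_cost, f, kappa, in_region):
    """Check (C1)-(C2) on sample points: for each candidate x, the
    terminal control kappa(x) must decrease the terminal cost F by at
    least the stage cost (Lyapunov condition, C1) and keep the successor
    inside the candidate region (invariance, C2)."""
    ok = []
    for x in points:
        u = kappa(x)
        xp = f(x, u)                                   # successor state
        c1 = F(xp) - F(x) <= -stage_cost(x, u) + 1e-12  # Lyapunov decrease
        c2 = in_region(xp)                              # invariance
        ok.append(c1 and c2)
    return np.array(ok)
```

A point failing either test cannot belong to a terminal region for the given terminal cost, which is exactly the discriminant the set iteration of Section 3 exploits.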

Obviously, the optimal choice of is . The total cost with this choice is , namely, the least one. But, as mentioned, generally cannot be obtained, so other ways must be found to construct . Certainly, the closer it is to , the better. When is given, can be found to satisfy conditions (C1) and (C2). Many satisfy the conditions, and different will be obtained by different methods. But, to make the domain of attraction the largest, it is desirable to obtain the largest . Here, define as the largest terminal region for a given , and in the next section, a novel method to obtain will be proposed.

3. Maximizing the Terminal Region for MPC

To date, many methods exist to construct . As mentioned, these methods share a common basic idea: some controller, such as a linear feedback controller or a saturation controller, is given in advance; then a stabilizable domain under this controller is computed and serves as the terminal region of MPC. This kind of construction is clearly somewhat conservative, and the computed in this way does not approximate as closely as possible.

In this paper, a novel method is proposed in which is constructed directly from conditions (C1) and (C2).

3.1. Approximating the Largest Terminal Region Asymptotically

Define as where is the solution of the following optimization problem:

Obviously, for an , it cannot be decided from (3.1) and (3.2) whether belongs to when is unknown. The difficulty is that the state constraint in the optimization (3.2) uses itself. To avoid this, the method of asymptotic approximation is adopted. First, an initial set is given, obtained by the following discriminant: where is the solution of

Then, using instead of in the state constraint of the optimization (3.4), can be obtained. Iterating in this way, will be obtained. The whole procedure is pictured in Figure 1.
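The iterative one-step contraction pictured in Figure 1 can be sketched on a finite sample of candidate states. This is only an illustration: `one_step_feasible` is a hypothetical stand-in for the discriminant based on the optimization (3.4), and the membership masks play the role of the set sequence.

```python
import numpy as np

def contract(points, member, one_step_feasible, max_iter=50):
    """Iterate the one-step set contraction on a finite sample of states:
    a point stays in the next set only if it is in the current set and
    passes the one-step feasibility test against the current set.

    points            -- candidate states
    member            -- boolean mask for the initial set
    one_step_feasible -- one_step_feasible(x, mask, points) -> bool,
                         a stand-in for the optimization (3.4)
    Returns the mask of the limit set (fixed point of the contraction).
    """
    mask = member.copy()
    for _ in range(max_iter):
        new_mask = np.array([m and one_step_feasible(x, mask, points)
                             for x, m in zip(points, mask)])
        if np.array_equal(new_mask, mask):   # converged: next set = current set
            break
        mask = new_mask
    return mask
```

The loop stops at a fixed point, which mirrors the convergence of the set sequence claimed in Theorem 3.1.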

In Figure 1, is defined as where is the solution of

Similarly, can be defined as and is the solution of

Obviously, as Figure 1 shows, there exists . As increases, converges to a set denoted by . But is the terminal region we want? Theorem 3.1 provides the answer.

Theorem 3.1. For defined in (3.7), as goes to infinity, converges to ; namely, as , one has .

This theorem is proved by contradiction.

Proof. (A) Assume that there exists which satisfies and as ; then for any , we have and . Obviously, this contradicts being the largest set satisfying (C1) and (C2).
(B) Similarly, assume that there exists which satisfies and as ; then there exists with satisfying and , where denotes the empty set. Choose any ; it is obvious that satisfies and . On the other hand, we know that , so meets the conditions in the definition of and we have . This contradicts .

Remark 3.2. Generally, in the computation of , it is impossible to continue the computation until . So, when the iteration reaches , if is equal to in principle, can be taken as the terminal region we want.

Remark 3.3. The terminal region computed through the method in Remark 3.2 is not itself but an enclosing set of it. The corresponding domain of attraction may then include some points that do not belong to the real domain of attraction. To avoid this problem, an expansion technique, growing from a smaller region to a larger one, can be used in place of the contraction technique (from a larger region to a smaller one) adopted in this paper. The concrete algorithm is not presented here; only the general idea is stated: a known subset of , denoted , is given in advance and serves as the state constraint in (3.4); then a larger region is computed. By the same procedure as in Section 3.1, a terminal region that is a subset of will be obtained.

Obtaining is not easy. The only practical tool is a statistical learning method; here, the SVM is used.

3.2. Support Vector Machine

SVM (see [8, 9]) is the youngest part of statistical learning theory and an effective approach for pattern recognition. The main aim of an SVM classifier is to obtain a function that determines the decision boundary, or hyperplane. This hyperplane optimally separates two classes of input data points.

Take the example of separating into and . For each , an additional variable is introduced. Similarly, for each , is introduced. Define , . SVM is used to find a separating hyperplane between and . Then, an estimated set of , , is obtained. can be obtained by solving the following problem: where denotes the kernel function.

In this paper, the following Gaussian kernel is used: with being the positive Gaussian kernel width.

Many SVM software packages are available on the Internet and can be downloaded and used directly. Using SVM, the support vectors are extracted from and their relevant weights are exported. Denote by the number of support vectors and by the support vector set; the optimal hyperplane is then described as follows: where is a support vector and satisfying is the relevant weight.
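As a minimal sketch, the Gaussian kernel and the support-vector description of the hyperplane can be evaluated as follows; the function names, the bias term, and the kernel width are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d = np.asarray(x) - np.asarray(y)
    return np.exp(-np.dot(d, d) / (2.0 * sigma ** 2))

def decision(x, support_vectors, weights, bias, sigma=1.0):
    """Evaluate the separating hyperplane as a weighted sum of kernel
    terms over the support vectors plus a bias; the sign classifies x
    as inside (+) or outside (-) the estimated set."""
    s = sum(w * gaussian_kernel(sv, x, sigma)
            for sv, w in zip(support_vectors, weights))
    return s + bias
```

Because the decision function depends only on the support vectors and their weights, only those need to be stored to represent the estimated terminal region (cf. Remark 3.4).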

3.3. Estimating the Largest Terminal Region

In the SVM classifier, the training data is the input and the hyperplane is the output. Our only task is to prepare the training data.

Take the separation of from as an example. First, choose arbitrary points , ( is the number of training points); then decide the value of corresponding to by the following procedure:

If , 

else

endif.

When the for all the have been obtained, they are packed to constitute the training data. Then, by inputting the training data into the SVM classifier, an optimal hyperplane and an estimated set of , , are obtained.
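The labeling procedure above can be sketched as follows, with `one_step_feasible` a hypothetical membership test playing the role of the discriminant; the packed pairs are then passed to any SVM package.

```python
import numpy as np

def build_training_data(points, one_step_feasible):
    """Label each training point +1 if it passes the membership test
    (the role played by the discriminant in the text), -1 otherwise,
    and pack (points, labels) as SVM training data."""
    labels = np.where([one_step_feasible(x) for x in points], 1, -1)
    return points, labels
```

The same routine is reused at every iteration: only the membership test changes, as the previous estimated set replaces the state constraint.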

When is known, the training data for separating from can be obtained by a similar procedure. Inputting it into the SVM classifier yields a hyperplane and an estimated set of , .

Repeating this, a hyperplane series , is obtained. When , if for , , there exists , it is deemed that is equal to in principle, and is taken as the final estimation of , where is the support vector set at , is the number of support vectors, and is a tunable threshold. The smaller the threshold, the more precisely approximates .
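The stopping test can be sketched as follows, with `decision_prev` and `decision_curr` denoting the decision functions of two consecutive hyperplanes; the names and the default threshold are illustrative.

```python
def converged(decision_prev, decision_curr, support_vectors, eps=1e-3):
    """Stopping test: two consecutive hyperplanes are considered equal
    in principle when their decision values agree, within a tunable
    threshold eps, at every support vector of the current iterate."""
    diffs = [abs(decision_prev(s) - decision_curr(s))
             for s in support_vectors]
    return max(diffs) <= eps
```

Checking agreement only at the support vectors suffices because, as Remark 3.4 notes, the hyperplane is determined solely by them.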

Remark 3.4. Here, we used the fact that, in the SVM classifier, the hyperplanes are determined solely by the support vectors.

4. Choosing an Appropriate Terminal Cost

In the previous section, a method to maximize the terminal region was proposed, but the method has a premise: the terminal cost is given in advance. In this section, how to obtain a terminal cost is shown. Before this, some properties of the terminal cost are analyzed.

4.1. Weighting Terminal Cost, the Domain of Attraction Will Be Enlarged

From conditions (C1) and (C2), it is known that the terminal region depends on the choice of the terminal cost. We want to know what the relationship between them is and what guidance it offers when choosing the terminal cost. Theorem 4.1 gives the answer.

Denote by the terminal regions of and of with a weighted terminal cost , , respectively. Limon et al. [7] proved that the terminal region is enlarged by weighting the terminal cost for MPC without a terminal constraint in its online optimization. Here, we show that this property also holds in our case.

Theorem 4.1. Consider and satisfying conditions (C1) and (C2). When a weighted terminal cost , , is used, the corresponding terminal region is larger than ; namely, .

Proof. For any , conditions (C1) and (C2) are equivalent to the existence of a control trajectory making the following inequalities hold: where .
It is obvious that, when the terminal cost is , these inequalities also hold with the same control trajectory. So , namely, .

Remark 4.2. When the prediction horizon is given, the larger the terminal region, the larger the domain of attraction. So Theorem 4.1 shows that, for a given prediction horizon, weighting the terminal cost enlarges the domain of attraction.

4.2. Weighting Terminal Cost, the Total Cost Will Increase

It is known from the last section that, when the terminal cost is weighted, the terminal region is enlarged. So, can we weight the terminal cost arbitrarily? The answer is no. This section explains the reason: the total cost increases when the terminal cost is weighted.

Let be the solution of , and let , be the optimal control trajectory and the corresponding state trajectory, respectively; let , be the receding horizon control trajectory and the receding horizon state trajectory of MPC with , respectively. Define as the total costs of using MPC with and with , respectively.

For convenience, consider an assumption.

Assumption 1. For any , where is the domain of attraction of MPC with , the terminal state obtained by solving belongs to ; that is, .

Remark 4.3. Assumption 1 means that, for any , the solution of with as its terminal region is equal to that with as its terminal region. A few points in may not satisfy this assumption; for convenience, their influence is neglected. Under this assumption, it is obvious that and the following lemma holds.

Lemma 4.4. For any , there exists

Proof. From the view of optimality, can be expressed as
Considering Assumption 1 and by optimality, there exists
Similarly, can be expressed as
And by optimality, there exists
Obviously, the result of subtracting from the right-hand side of (4.7) is larger than the result of subtracting the right-hand side of (4.5) from ; in other words,
Finally, the following result can be obtained:

Define as a class of functions with the following property: for any , and , if , then the inequality holds.

To continue discussion, another assumption is needed.

Assumption 2. All of the positive cost functions used in this paper, such as , , , and , and the results of addition or subtraction between them, such as , belong to .

Based on Assumption 2 and Lemma 4.4, it is known that, for any and , there exists

Then, using (4.10) and Assumptions 1 and 2, another lemma, which is key to our study of this issue, can be obtained.

Lemma 4.5. Under Assumptions 1 and 2, for any and any positive cost function satisfying , there exists

Proof. Here, means and means . From Assumption 2, it is known that , so there exists
Here, we used the facts that and .

Then, the reason why we cannot weight the terminal cost arbitrarily can be presented.

Theorem 4.6. Under Assumptions 1 and 2, for any , the following inequality holds:

Proof. It is obvious that meets the condition in Lemma 4.5 because . Choose in Lemma 4.5; there exists
Similarly, for any , the following result can be obtained:
So, choose
By using ,   to replace respectively, there exists
From Assumption 1, it is known that, for , there exists . Replacing with in (4.14), the following inequality can be obtained:
So, there exists
Repeating this procedure, there exists
Letting , the final result is obtained as follows:

Theorem 4.6 shows that weighting the terminal cost increases the total cost. So, when choosing a terminal cost, one should not consider only the need to enlarge the terminal region.

4.3. Getting an Appropriate Terminal Cost

Theorems 4.1 and 4.6 show that the terminal cost is a double-edged sword. In its choice, two factors, the terminal region and the total cost, must be considered. Depending on the emphasis, a different terminal cost should be chosen.

Here, a simple method to obtain a terminal cost is presented. Its basic idea is to obtain an initial terminal cost in advance and then adjust it according to need.

As mentioned, a good terminal cost should approximate as closely as possible. This can be achieved only in a small neighborhood of the origin by the SOC method; see [2] for the continuous-time case.

Consider the linearization of the system (2.1) at the origin with and .

Here, assume that (4.22) is stabilizable; then a terminal cost serving as an initial candidate can be found through the following procedure.

Step 1. Solve the Riccati equation to obtain a preparatory :

Step 2. Obtain a locally stabilizing linear state feedback gain :

Step 3. Computing by solving the following Riccati equation: where , , and is an adjustable parameter satisfying .

Then, can serve as an initial terminal cost. According to need and the properties of the terminal cost, the initial one can be adjusted to obtain an appropriate one, , . For example, if a larger terminal region is wanted and the total cost is not a concern, can be set to a larger number; conversely, if a lower total cost is demanded and the domain of attraction already covers the operating region of the system, a smaller one can be used.
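Steps 1-3 can be sketched numerically. The fixed-point Riccati iteration below and the exact form in which the adjustable parameter (called `rho` here) inflates the weights are illustrative assumptions, not the paper's exact equations.

```python
import numpy as np

def riccati(A, B, Q, R, iters=500):
    """Solve the discrete-time algebraic Riccati equation by fixed-point
    (value) iteration: P = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA.
    Also returns the associated feedback gain K."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return P, K

def terminal_cost(A, B, Q, R, rho=1.0):
    """Sketch of Steps 1-3: a preparatory Riccati solve yielding a local
    feedback gain K (Steps 1-2), then a Lyapunov-type solve for the
    closed loop A - BK with weights inflated by rho >= 1 (Step 3).
    The inflation rho*(Q + K'RK) is an illustrative choice."""
    P0, K = riccati(A, B, Q, R)                      # Steps 1-2
    P, _ = riccati(A - B @ K, np.zeros_like(B),
                   rho * (Q + K.T @ R @ K), R)       # Step 3
    return P, K
```

Increasing `rho` inflates the terminal cost matrix, which by Theorem 4.1 enlarges the terminal region at the price of a larger total cost (Theorem 4.6).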

5. Simulation Experiment

The model is an approximate discrete-time realization of a continuous-time system used in [2]: where , , and the state and control constraints are , , respectively.

The stage cost is chosen as with and . Using the SOC method, the locally linear feedback gain is taken as , and is obtained. Then, choosing , the terminal cost is obtained as with .

To estimate each , 4000 training points are generated. Set ; when , there exists , where is the support vector set at and is the number of support vectors. Then, is deemed equal to in principle, and is taken as the final estimation of . Figure 2 shows the approximation process of : the blue line is the hyperplane at , the black dotted line is that at , and the red lines between them are those at . Let the prediction horizon be . Figure 3 shows the closed-loop trajectories of some points chosen arbitrarily from the domain of attraction.

When the terminal cost is enlarged to , a new terminal region larger than the old one is obtained, as Figure 4 shows: the red line is the new hyperplane and the black dotted line is the old one.

For convenience, let (A) denote the MPC using as its terminal cost and (B) the MPC using . For some points chosen arbitrarily from of (A), Figure 5 shows their closed-loop trajectories under (A) and (B), respectively, where red lines denote the results of (A) and blue dash-dotted lines the results of (B). Table 1 compares the total costs. Obviously, for the same point, the total cost of (A) is smaller than that of (B).

6. Conclusion

This paper discussed the relationships between terminal cost and terminal region and between terminal cost and total cost, respectively. It showed that enlarging the terminal cost enlarges the terminal region but also increases the total cost. A simple method to obtain a suitable terminal cost was proposed; the cost can be adjusted according to need: to obtain a larger terminal region, it can be weighted; to reduce the total cost, it can be unweighted. Given a terminal cost, a novel method was proposed to obtain the maximal terminal region using SVM. With the same prediction horizon, the corresponding domain of attraction is the largest one.