Abstract
Although offline solutions to the robust control problem have been studied, solving it online remains difficult, especially for uncertain systems. In this paper, a novel approach based on online data-driven learning is proposed to address the robust control problem for uncertain systems. To this end, the robust control problem of the uncertain system is first transformed into an optimal control problem of the nominal system by selecting an appropriate value function that accounts for the uncertainties, regulation, and control. Then, a data-driven learning framework is constructed, where Kronecker products and vectorization operations are used to reformulate the derived algebraic Riccati equation (ARE). To obtain the solution of this ARE, an adaptive learning law is designed, which guarantees the convergence of the estimated solution. The stability of the closed-loop system and the convergence of the estimates are proved. Finally, simulations are given to illustrate the effectiveness of the method.
1. Introduction
Most existing control techniques are developed under the assumption that the controlled plant has no dynamical uncertainties. In practical control systems, however, external disturbances and/or model uncertainties are common and affect the system throughout its lifetime. These uncertainties must be taken into account in the controller design so that the closed-loop system responds well even in their presence. A controller is said to be robust if it works even when the practical system deviates from its nominal model. This gives rise to the robust control design problem, which has been widely studied over the past decades [1, 2]. Recent research [1, 3] shows that the robust control problem can be addressed by applying the optimal control approach to the nominal system. Nevertheless, the online solution of the derived optimal control problem is not handled in [1].
Many approaches to optimal control problems have been presented recently [4, 5]. For linear systems, the optimal control problem is posed as the associated linear quadratic regulator (LQR) problem, from which the optimal control law can be obtained. Dynamic programming (DP) has long been used to study the optimal control problem [6]; however, DP has an obvious disadvantage: as the dimensions of the system state and control input increase, the required computation and storage grow dramatically, which is known as the “curse of dimensionality.” To overcome this problem, neural networks (NNs) are used to approximate the solution of the optimal control problem [7], which has led to recent work on adaptive/approximate dynamic programming (ADP). With ADP, the difficult optimal control problem can be tackled and the optimal cost function can be obtained online [8]. Recently, robust control design based on the adaptive critic idea has become one of the research hotspots in the field of ADP. Many methods have been proposed, which are collectively referred to as robust adaptive critic control. A basic approach is to transform the problem so as to establish a close relationship between robustness and optimality [9]. In this literature, the closed-loop system is generally shown to be uniformly ultimately bounded (UUB). These results show that the ADP method is suitable for the robust control design of complex systems in uncertain environments. Since many previous ADP results do not focus on the robust performance of the controller, the emergence of robust adaptive critic control greatly expands the application scope of ADP methods. Moreover, given the common ground in dealing with system uncertainties, the self-learning optimization method that combines ADP with sliding mode control provides a new research direction for robust adaptive critic control [10].
In addition, the robust ADP method is another important achievement in this field. It is worth mentioning that the application of robust ADP methods in power systems has attracted special attention [11], leading to a higher application value in industrial systems.
Based on the above facts, we develop a robust control design for uncertain systems using an online data-driven learning method. For this purpose, the robust control problem of the uncertain system is first transformed into an optimal control problem of the nominal system with an appropriate cost function. Then, a data-driven technique is developed, where Kronecker products and vectorization operations are used to reformulate the derived ARE. To solve this ARE, a novel adaptive law is designed so that the solution of the ARE can be approximated online. Simulations are given to indicate the validity of the developed method.
The major contributions of this paper include the following:
(1) To address the robust control problem, we transform the robust control problem of uncertain systems into an optimal control problem of the nominal system. This provides an approach to the robust control problem.
(2) Kronecker products and vectorization operations are used to rewrite the derived ARE in a linear parametric form. This gives a new pathway to solving the ARE online.
(3) A newly developed adaptation algorithm driven by the parameter estimation error is used to learn the unknown parameters online. The convergence of the estimated parameters to their true values can be guaranteed.
This paper is organized as follows: In Section 2, we introduce the robust control problem and transform it into an optimal control problem. In Section 3, we design an ADP-based data-driven learning method to solve the derived ARE online, where Kronecker products and vectorization operations are used. Section 4 gives some simulation results to illustrate the effectiveness of the proposed method. Some conclusions are stated in Section 5.
2. Preliminaries and Problem Formulation
A continuous-time (CT) uncertain system can be written as where and are the system state and the control action, respectively. is the system matrix and is the input matrix. denotes the uncertain parameter involved in the system, and denotes the bounded nonlinearities. The purpose of this paper is to design a controller that makes the system (1) asymptotically stable under uncertainties .
In this paper, we study the case in which the matching condition is satisfied; in other words, the uncertainty is in the range of ; thus, the uncertainty is in matrix which can be rewritten as for uncertain , where is the nominal value of . Let denote the upper bound of ; then, for all , we have . The problem addressed in this paper is to obtain an online solution to the robust control of the uncertain system (1). The above robust control problem can then be rewritten as
The classical method for obtaining the robust control solution is the linear matrix inequality (LMI) approach [12], which works offline; solving the robust control problem online is not easy. To overcome this problem, the authors of [1, 9] reported that the robust control problem of uncertain systems can be transformed into an optimal control problem of nominal systems, which provides a new pathway to address the robust control problem. Hence, consider the nominal plant of the system (1).
The aim is to find a control action to minimize the following continuous cost function: where and are the weight matrices.
It should also be noted that the upper bound of the uncertainties is involved in the cost function (4) to address their effects. The following Lemma summarizes the equivalence between the robust control of the system (1) or (2) and the optimal control of the system (3) with cost function (4).
Lemma 1 (see [9]). If the solution to the optimal control problem of the nominal system (3) with cost function (4) exists, then it is a solution to the robust control problem for system.
Lemma 1 exploits the relationship between the robust control and optimal control and thus provides a new way to address the robust control.
To address the optimal control problem of (3), an algebraic Riccati equation (ARE) can be derived from the cost function (4) as where is the solution of (5), , and . Then, based on the optimality principle, the optimal action can be given as
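As a minimal numerical sketch of how Lemma 1 and the ARE fit together, consider a hypothetical scalar plant ẋ = (a + bφ)x + bu with |φ| ≤ phi_max, for which the matching condition holds trivially. The code assumes the cost ∫(f x² + q x² + r u²)dt with f = phi_max² and r = 1, and the standard LQR form of the Riccati equation, 2ap + f + q − b²p²/r = 0. These forms, the function names, and all numerical values are illustrative assumptions and are not taken from the paper.

```python
import math

def solve_scalar_are(a, b, q, f, r=1.0):
    """Positive root p of 2*a*p + (q + f) - (b**2 / r) * p**2 = 0."""
    s = b * b / r
    return (a + math.sqrt(a * a + s * (q + f))) / s

def optimal_gain(b, p, r=1.0):
    """State-feedback gain k in u = -k * x, with k = b * p / r."""
    return b * p / r

# Hypothetical unstable nominal plant and uncertainty bound (assumed values).
a, b, q, r = 1.0, 2.0, 1.0, 1.0
phi_max = 0.5
f = phi_max ** 2  # bound on the squared matched uncertainty

p = solve_scalar_are(a, b, q, f, r)
k = optimal_gain(b, p, r)

# Residual of the scalar Riccati equation at the computed p.
residual = 2 * a * p + q + f - (b * b / r) * p * p

# Closed-loop pole a + b*phi - b*k for a few admissible uncertainty values.
poles = [a + b * phi - b * k for phi in (-phi_max, 0.0, phi_max)]
print(p, k, residual, poles)
```

In this sketch, the gain computed for the nominal plant keeps the closed-loop pole negative for every sampled value of the uncertainty, which is the behavior Lemma 1 describes.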
3. Online Solution to Robust Control via Data-Driven Learning
This section proposes a data-driven learning method to solve the robust control problem; the schematic of the proposed control method is given in Figure 1.
To this end, multiplying both sides of ARE (5) by the system states, we have
Applying two operations ( and ) to (7) yields
Since is involved in (8), the dimension of (8) is very high. To overcome this issue, a dimensionality reduction operation on (7) is given as follows; we can then apply the two operations ( and ) to (9), yielding
Hence, equation (10) can be rewritten in the compact form
where , , and .
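The role of the Kronecker product and vectorization can be illustrated with a small sketch. It checks the standard identity x^T P x = (x ⊗ x)^T vec(P) and a reduced form that exploits the symmetry of P, so that only n(n+1)/2 parameters remain instead of n². The helper names and the numerical values are hypothetical, and the reduced regressor shown here is one common choice, not necessarily the exact reduction used in (9).

```python
def kron_vec(x, y):
    """Kronecker product of two vectors given as lists."""
    return [xi * yj for xi in x for yj in y]

def vec(P):
    """Column-stacking vectorization of a matrix given as a list of rows."""
    n_rows, n_cols = len(P), len(P[0])
    return [P[i][j] for j in range(n_cols) for i in range(n_rows)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def quad_form(x, P):
    """Direct evaluation of x^T P x."""
    n = len(x)
    return sum(x[i] * P[i][j] * x[j] for i in range(n) for j in range(n))

def reduced_regressor(x):
    """Distinct quadratic monomials; off-diagonal products are counted twice."""
    n = len(x)
    return [(1 if i == j else 2) * x[i] * x[j]
            for i in range(n) for j in range(i, n)]

def half_vec(P):
    """Upper-triangular entries of a symmetric matrix, row by row."""
    n = len(P)
    return [P[i][j] for i in range(n) for j in range(i, n)]

# Hypothetical state and symmetric matrix (assumed values).
x = [1.0, -2.0, 0.5]
P = [[2.0, 0.3, -0.1],
     [0.3, 1.0, 0.4],
     [-0.1, 0.4, 3.0]]

direct = quad_form(x, P)
full = dot(kron_vec(x, x), vec(P))               # 9 parameters
reduced = dot(reduced_regressor(x), half_vec(P))  # 6 parameters
print(direct, full, reduced)
```

Both forms are linear in the entries of P, which is what allows the quadratic equation in the state to be treated as a linear parametric model.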
3.1. Online Solution of Robust Control
From (11), only the variable is unknown, since it involves the unknown matrices and ; thus, the next step is to design an online learning method to update the unknown variable . Consequently, the unknown matrices and can be estimated online based on the estimate of . To this end, we define two auxiliary variables, i.e., and as with being the learning parameter. Then, its solution can be calculated as
To realize the online estimation for based on the estimation error , an auxiliary vector is defined as
Substituting (11) into (13), we have ; thus, we can rewrite (14) as with being the estimation error. Then, we can design the adaptive learning law as with being the learning gain.
In adaptive law (16), the auxiliary vector of (14), obtained based on and using (15), contains information on the parameter estimation error . Thus, can be used to drive the parameter estimation. Consequently, the parameter estimate is updated along with the estimation error extracted from the measurable system states . This adaptive algorithm therefore clearly differs from the gradient descent algorithms used elsewhere in the ADP literature.
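The idea of an adaptive law driven by the estimation error can be sketched on a toy linear regression y = phi^T theta. The code below assumes the common filtered-regressor construction: P_dot = −ell·P + phi·phi^T, Q_dot = −ell·Q + phi·y, auxiliary vector M = P·est − Q, and learning law est_dot = −gamma·M. These forms, the symbol names, the regressor, and all gains are assumptions made for illustration; the paper's own definitions in (12) to (16) may differ in detail.

```python
import math

def simulate(theta, ell=1.0, gamma=5.0, dt=1e-3, t_end=10.0):
    """Euler simulation of a filtered-regressor estimator for y = phi^T theta."""
    P = [[0.0, 0.0], [0.0, 0.0]]  # filtered regressor matrix
    Q = [0.0, 0.0]                # filtered regressor-output vector
    est = [0.0, 0.0]              # parameter estimate
    for k in range(int(t_end / dt)):
        t = k * dt
        phi = [math.sin(t), math.cos(t)]  # persistently exciting regressor
        y = phi[0] * theta[0] + phi[1] * theta[1]
        # Auxiliary vector M = P*est - Q; since Q = P*theta, M = -P*(theta - est).
        M = [P[i][0] * est[0] + P[i][1] * est[1] - Q[i] for i in range(2)]
        # Learning law est_dot = -gamma * M, driven by the estimation error.
        est = [est[i] - dt * gamma * M[i] for i in range(2)]
        # Filter updates: P_dot = -ell*P + phi*phi^T, Q_dot = -ell*Q + phi*y.
        Q = [Q[i] + dt * (-ell * Q[i] + phi[i] * y) for i in range(2)]
        P = [[P[i][j] + dt * (-ell * P[i][j] + phi[i] * phi[j])
              for j in range(2)] for i in range(2)]
    return est

true_theta = [2.0, -1.0]  # hypothetical unknown parameters
estimate = simulate(true_theta)
errors = [abs(e - t) for e, t in zip(estimate, true_theta)]
print(estimate, errors)
```

Unlike a gradient law driven by the instantaneous output error, the update here uses only the filtered quantities P and Q, which are computed from measured signals yet carry the parameter error explicitly.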
Since holds, we can obtain the following lemma.
Lemma 2 (see [13, 14]). Assume that the variable provided in (12) is persistently excited (PE); then the matrix given in (12) is positive definite, which means that for some positive constant .
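The link between persistent excitation and positive definiteness can be checked numerically on a two-dimensional example. The sketch assumes the filtered matrix obeys P_dot = −ell·P + phi·phi^T with P(0) = 0, which is a common form and may differ from the paper's (12); the regressors, gains, and function names are hypothetical.

```python
import math

def filtered_gram(regressor, ell=1.0, dt=1e-3, t_end=10.0):
    """Euler integration of P_dot = -ell*P + phi*phi^T from P(0) = 0 (2-D regressor)."""
    p11 = p12 = p22 = 0.0
    for k in range(int(t_end / dt)):
        phi = regressor(k * dt)
        p11 += dt * (-ell * p11 + phi[0] * phi[0])
        p12 += dt * (-ell * p12 + phi[0] * phi[1])
        p22 += dt * (-ell * p22 + phi[1] * phi[1])
    return p11, p12, p22

def min_eig_sym2(p11, p12, p22):
    """Smallest eigenvalue of the symmetric matrix [[p11, p12], [p12, p22]]."""
    mean = 0.5 * (p11 + p22)
    radius = math.hypot(0.5 * (p11 - p22), p12)
    return mean - radius

# Rotating regressor: excites both directions over time.
lam_pe = min_eig_sym2(*filtered_gram(lambda t: (math.sin(t), math.cos(t))))
# Constant regressor: only one direction is ever excited.
lam_flat = min_eig_sym2(*filtered_gram(lambda t: (1.0, 1.0)))
print(lam_pe, lam_flat)
```

With the rotating regressor the smallest eigenvalue stays bounded away from zero, whereas the constant regressor leaves the matrix singular, so no positive lower bound exists in that case.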
Lemma 2 shows the positive definiteness of ; we can then summarize the convergence of the proposed adaptive learning law (16) as follows.
Theorem 3. Consider (11) with adaptive learning law (16). When the variable provided in (11) satisfies the PE condition, the estimation error converges to the origin.
Proof. A Lyapunov function can be chosen as ; we can then calculate its time derivative as with . Hence, the estimation error . This completes the proof.☐
The step-by-step implementation of the proposed learning algorithm is given as follows.

Remark 4. The adaptive learning law (16) designed above is driven by the estimation error, which is extracted from the control input and system states; this is clearly different from the existing results [15]. In particular, two operations and are applied to the derived ARE, which enables online learning. Consequently, faster convergence can be achieved compared with previous gradient-based adaptive laws.
Remark 5. Some ADP methods have been successfully applied to the robust control problem. However, most existing ADP techniques focus on the infinity control problem. For the robust control problem considered in this paper, the uncertain parameter is involved in the system matrix such that , so the system can be regarded as containing unmodeled dynamics. To obtain the bound of the uncertain term, some operations are needed such that , which is used in the cost function (4). If the system dynamics are completely unknown, the uncertainty bound may not be usable in the cost function as expected. Hence, the system matrix must be known in this paper; future work will address output-feedback robust control under completely unknown dynamics.
3.2. Stability Analysis
Before the stability analysis of the closed-loop system, we first define the practical optimal control as with being the estimated .
Substituting (18) into (3), we have the closed-loop system dynamics
To complete the stability analysis, we use the following assumption.
Assumption 6. The dynamic matrices and for , the estimated matrix for .
In fact, the above assumptions are not stringent in practical systems and have been widely used in many results [13, 14, 16].
Now, some results can be included as follows.
Theorem 7. Consider the system (3) with adaptive learning law (16). If the variable is PE, then the parameter estimation error converges to zero, and the derived control converges to its optimal control, i.e., .
Proof. Consider a Lyapunov function as
where is the optimal cost function provided in (4) and and are the positive constants.
From (17), we have as
Then, the can be derived from systems (3) and (19) as
Thus, based on (21) and (22), we have as
Then, the parameters and can be chosen to fulfill the following conditions
Therefore, we can rewrite (23) as
where , , and are represented as
Thus, we have for via the Lyapunov theorem; hence, the estimation error converges to zero, i.e., . Consequently, we can obtain the error between and as
This implies that the error between the practical control and the optimal control converges to zero. This completes the proof.☐
4. Simulation
4.1. Example 1: Second-Order System
We consider a CT second-order system as where denotes the uncertainties in the system and is the state variable. The aim is to design a control that makes the system (28) stable. In this example, we define ; then, as stated in Section 2, we can rewrite the system (28) as then we can extract the uncertain term as . Thus, the upper bound can be calculated as
To complete the simulation, we set the initial system states as , the weight matrices are , and the learning gains are and . To show the effectiveness of the proposed algorithm, the offline solution of the ARE is given as
Figure 2 shows the estimate of the matrix obtained with the online adaptive learning law (16); compared with the ideal solution in (31), the estimated solution converges to its optimal solution . This is also seen in Figure 3, which shows the norm of the error, i.e., . The good convergence contributes to the rapid convergence of the system states, as shown in Figure 4, where the system states are bounded and smooth. Since the estimated converges quickly to , the system response is quite fast; this can also be seen in Figure 4. The corresponding control input, given in Figure 5, is bounded.
4.2. Example 2: Power System Application
This section uses a power system to test the proposed learning algorithm; we choose as system states, where is the incremental change of the frequency deviation, defines the generator output, and denotes the governor valve position. The state-space expression of this power system can be given as then we can give some parameters of the proposed power system as follows.
is the time constant of the governor, denotes the time constant of the turbine model, is the time constant of the generator model, indicates the feedback regulation constant, is the gain constant of the turbine model, and shows the gain constant of the generator model.
To complete this simulation, we assume that this system is disturbed by an uncertain term as in Example 1. The initial system states are set as , , and ; the learning parameters are given as and . Similar to Example 1, the offline solution of the ARE can be given as
Figure 6 shows the convergence of estimated matrix ; based on the offline solution given in (33), we have that the estimated solution can converge to its optimal solution ; this in turn affects the system state response (as shown in Figure 7). Figure 7 gives the system state response, which is smooth and bounded. The system control input is given in Figure 8.
5. Conclusion
In this paper, an online data-driven ADP method is proposed to solve the robust control problem for continuous-time systems with uncertainties. The robust control problem is transformed into an optimal control problem. A new online ADP scheme is then introduced to obtain the solution of the ARE using the vectorization operator and the Kronecker product. The closed-loop system stability and the convergence of the robust control solution are analyzed. Simulation results are presented to validate the effectiveness of the proposed algorithm. It is worth noting that the results hold under the matched uncertainty condition. In future work, we will extend the proposed idea to the robust tracking control problem, which will allow us to carry out practical experimental validation on the existing test rigs in our lab.
Data Availability
Data were curated by the authors and are available upon request.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the Shandong Provincial Natural Science Foundation (grant no. ZR2019BEE066) and Applied Basic Research Project of Qingdao (grant no. 196268cg).