Introducing Dynamic Programming and Persistently Exciting into Data-Driven Model Predictive Control

Jianwang, Hong; Ramirez-Mendoza, Ricardo A.; Morales-Menendez, Ruben

doi:https://doi.org/10.1155/2021/9915994

Mathematical Problems in Engineering

Research Article | Open Access

Volume 2021 | Article ID 9915994 | https://doi.org/10.1155/2021/9915994

Hong Jianwang,¹ Ricardo A. Ramirez-Mendoza,¹ and Ruben Morales-Menendez¹

Academic Editor: Shouzhen Zeng

Received 02 Apr 2021

Revised 17 Apr 2021

Accepted 17 May 2021

Published 26 May 2021

Abstract

In this paper, a new data-driven model predictive control scheme is proposed to handle the varying coupling conditions between different parts of a system: each group of linked subsystems is treated as one data-driven group, and each group is controlled independently through a decentralized model predictive control scheme. After combining the coalitional scheme with model predictive control, coalitional model predictive control is used to design each controller. As dynamic programming is mainly used in optimization theory, its idea is applied here, to extend its advantages to control theory, to analyze the minimum principle and stability of the data-driven model predictive control. The goal of this short note is thus to bridge dynamic programming and model predictive control. By adding an inequality constraint to the constructed model predictive control, a persistently exciting data-driven model predictive control is obtained. The inequality constraint corresponds to the condition of persistent excitation, which comes from the theory of system identification. Following numerical optimization theory, the necessary optimality condition is applied to obtain the optimal control input. Finally, a simulation example is used to illustrate the efficiency of the proposed theory.

1. Introduction

The main task of advanced control theory is to design a controller, in an open loop or closed loop structure, that drives the output of a plant to track an expected set point or to satisfy a given target. Controller design falls into two categories: the model-based approach and the data-driven approach. The model-based approach requires a mathematical model of the considered plant before the controller can be designed; in other words, no mathematical model means no controller. Constructing this model for the unknown plant is therefore necessary, and it is also the most difficult step, as it draws on other subjects such as probability theory and linear and nonlinear system theory. This modeling process corresponds to model identification, or system identification, which obtains the mathematical model from data measured in experiments on the considered open loop or closed loop system. System identification consists of four main steps: model structure selection, optimal input design, parameter estimation, and model validation. These four steps are iterated until a satisfactory model is obtained, so system identification is the premise of the subsequent controller design, i.e., the idea of identification for control.

When different system identification strategies are applied to produce a mathematical model, the resulting model may be of high order and strongly nonlinear, which leads to controllers of high order and strong nonlinearity. Because such complex controllers are difficult or costly to design and implement, an extra controller reduction procedure is added in practical applications. In general, a controller designed by the model-based approach depends on the identified model of the unknown plant. The four identification steps above are therefore repeated again and again until the identified model can replace the original plant accurately.

To reduce the controller's dependence on the identified model, the data-driven approach has been widely studied in recent years. Its attractive property is that the controller is designed directly from measured data. As the data-driven approach is still in its infancy, it appears in the references under different names, such as data driven, data based, and model free. To the best of our knowledge, the principles of the data-driven approach and of system identification are similar: the measured data are used to obtain the mathematical model of the unknown plant in system identification, and to obtain the approximated controller in the data-driven approach. The idea of direct data-driven control was first proposed in machine learning and has recently attracted many researchers in the advanced control field. Data-driven theory is now widely applied in the control field, for example, in direct data-driven control, data-driven estimation, data-driven detection, and data-driven optimization. Their common property is that useful information is extracted from the measured data to achieve the main goal. On the other hand, the data-driven approach needs a sufficiently large number of measured data. This requirement is feasible in the information age, and since the data-driven approach was born to overcome the limitations of the model-based approach, it is now widely studied in both theory and practical application.

Because the data-driven approach is widely applied in the control field and shares common ground with system identification, we call their combination identification for control, i.e., system identification for direct data-driven control. More specifically, we give a concise introduction to system identification for direct data-driven control, which belongs to the data-driven approach. In the case of unknown but bounded noise, bounded error identification is proposed to identify unknown systems with time varying parameters. A feasible parameter set is then constructed to include the unknown parameter with a given probability level. In [1], the feasible parameter set is replaced by a confidence interval, which describes more accurately the actual probability that the future predictor will fall into the constructed interval. The problem of constructing this confidence interval is solved by a linear approximation/programming approach [2], which can identify the unknown parameter only for the linear regression model. From the obtained feasible parameter set or confidence interval, the midpoint or center can be taken as the final parameter estimate, and a unified framework for computing the center of the confidence interval is further modified to achieve robustness against other external noises, such as outliers and unmeasured disturbances [3]. This identification strategy, which constructs a set or interval for the unknown parameter, is called set membership identification and deals with unknown but bounded noise. External noise can be described in two ways: a probabilistic description and a deterministic description, the latter corresponding to the unknown but bounded noise considered here [4].
In the probabilistic description, the noise is always assumed to be white noise whose probability density function (PDF) is known in advance. In the deterministic description, by contrast, the only information about the noise is its bound, which relaxes the strict assumption of the probabilistic description. In practice, bounded noise is more common than white noise. Within the deterministic description, set membership identification is adapted to design controllers with two degrees of freedom [5], which corresponds to direct data-driven control or set membership control. Set membership control is applied to design feedback control for a nonlinear closed loop system in [6], where the considered system is identified by set membership identification and the obtained system parameter is then used in the prediction output. After substituting the obtained system parameter into the prediction output to construct a cost function, reference [7] takes the derivative of this cost function with respect to the control input to obtain an optimal input. Set membership identification can be applied not only in MPC but also in stochastic adaptive control [8], where a kernel from learning theory is introduced to approximate the nonlinear function or system. Under bounded noise, many parameters are also known a priori to lie in given intervals, and robust optimal control with adjustable uncertainty sets is studied in [9], where robust optimization is introduced to handle uncertain noise and uncertain parameters simultaneously. To evaluate the expectation operation that depends on the uncertainty, the sample size of random convex programs is considered in [10], so that the expectation is replaced by a finite sum.
Generally, many practical problems in systems and control, such as controller synthesis and state estimation, are formulated as optimization problems [11]. In many cases, the cost function incorporates, in addition to the optimization variables, variables that model uncertainty, and reference [12] describes the uncertainty by probabilistic variables. Reference [13] studies data-driven output feedback controllers for nonlinear systems and applies an event triggered mode to analyze robust stability. Data-driven estimation is used to achieve hybrid system identification in [14], where the nonlinearity is described by a kernel function. In recent years, the first author has also studied direct data-driven control, for example, the close relation between system identification and direct data-driven control [15] and data-driven model predictive control [16]. Based on the above descriptions of direct data-driven control and on our existing research on system identification, model predictive control, direct data-driven control, and convex optimization theory, our mission in this paper is to combine our previous results and apply them in practical engineering. Robust analysis is considered in more detail for data-driven model predictive control in [15], where convex optimization and a semidefinite relaxation scheme are proposed to achieve this goal.
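
The set membership idea described above can be illustrated with a minimal sketch. It assumes a scalar linear regression y_k = theta * phi_k + e_k with noise bounded by delta; the function name `feasible_interval` and all numerical values are hypothetical and chosen only for illustration, not taken from the cited references.

```python
import numpy as np

def feasible_interval(phi, y, delta):
    """Feasible parameter set for y_k = theta * phi_k + e_k with |e_k| <= delta.

    Each sample constrains theta to an interval; the feasible set is the
    intersection of these intervals. Returns (lower, upper), or None when
    the data and the noise bound are inconsistent.
    """
    phi = np.asarray(phi, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = phi != 0.0                      # samples with phi_k = 0 carry no information
    a = (y[keep] - delta) / phi[keep]
    b = (y[keep] + delta) / phi[keep]
    lower = np.minimum(a, b).max()         # dividing by a negative phi_k flips the bounds
    upper = np.maximum(a, b).min()
    return (lower, upper) if lower <= upper else None

rng = np.random.default_rng(0)
theta_true, delta = 2.0, 0.1               # hypothetical values for illustration
phi = rng.uniform(0.5, 2.0, size=200)
e = rng.uniform(-delta, delta, size=200)   # unknown but bounded noise
y = theta_true * phi + e

lower, upper = feasible_interval(phi, y, delta)
theta_hat = 0.5 * (lower + upper)          # center of the interval as point estimate
print(lower, upper, theta_hat)
```

The true parameter always lies inside the returned interval as long as the noise bound holds, and the interval center serves as the point estimate, in the spirit of the midpoint estimate mentioned above.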

In this paper, we continue our research on the data-driven model predictive control scheme and apply the idea of dynamic programming to complete its study. Dynamic programming deals with situations where decisions are made in stages. The outcome of each decision may not be fully predictable but can be anticipated to some extent before the next decision is made. More specifically, data-driven model predictive control is considered in this short note, and a detailed description of the proposed algorithm is provided. As data-driven model predictive control formulates controller design as a constrained optimization problem, the idea of dynamic programming can be applied to analyze the minimum principle of that constrained optimization problem, which corresponds to a general case. Moreover, after a concept of stability is defined for data-driven model predictive control, a stability condition is also derived through the idea of dynamic programming. To the best of our knowledge, dynamic programming has so far been studied mainly in optimization theory rather than in control research. In our published paper [17], dynamic programming is first introduced to balance the desire for low present cost against the undesirability of high future cost in multi-UAV formation anomaly detection. This is therefore the first analysis that brings dynamic programming into data-driven model predictive control. The main contributions of our short note are the minimum principle and the stability analysis for data-driven model predictive control, both derived through the idea of dynamic programming. In the above theoretical analysis of dynamic programming for our constructed data-driven MPC strategy, no other constraint is considered in the numerical optimization problem, i.e., only a simple cost function appears in the whole analysis.
For the sake of completeness, an inequality constraint is therefore added to the original numerical optimization problem. This added inequality constraint is not one of the usual constraints on the observed input or output variables; it corresponds to the condition of persistent excitation. Persistent excitation is crucial in many adaptive schemes, where parameter convergence, one of the primary objectives of an adaptive system, must be guaranteed. After defining persistent excitation and combining this constraint with the original cost function, a quadratic programming problem with an inequality constraint has to be solved. Through simple but tedious calculations, this constrained optimization problem is solved while satisfying the necessary optimality conditions.

2. Problem Formulation

The whole system is divided into its components, i.e., pairs composed of a gate and its downstream reach, forming a set of subsystems. The dynamics of each subsystem are described by the following linear model: $x_i(k+1) = A_{ii} x_i(k) + B_{ii} u_i(k) + w_i(k)$, where $x_i$ and $u_i$ are the state and input, respectively, of subsystem $i$, $A_{ii}$ and $B_{ii}$ are the state transition and input-to-state matrices of compatible dimension, and $w_i$ denotes the influence of other subsystems on subsystem $i$.

The overall system can be described as follows: $x(k+1) = A x(k) + B u(k)$, where $x$ and $u$ are the grouped states and inputs and $A$ and $B$ are the state transition and input-to-state global matrices. All the control systems may communicate through a data network, whose topology is described by means of an undirected graph. Following the study by Wang et al. [17], when agents in the same communication component benefit from cooperation, i.e., from sharing information to aggregate their control links, such components are named data driven. This idea of data driven can be carried out in different ways, through which the information flow and a broadened control feedback are constructed.
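
As a small illustration of how the subsystem models relate to the overall model, the sketch below stacks hypothetical subsystem blocks into global matrices and checks that both descriptions give the same next state. It assumes that the influence term of each subsystem is a linear function of the neighbors' states and inputs; this coupling form, the sizes, and the random matrices are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sub, nx, nu = 3, 2, 1   # hypothetical sizes: 3 subsystems, 2 states and 1 input each

# Hypothetical blocks: A_blk[i][j] maps the state of subsystem j into subsystem i,
# and B_blk[i][j] does the same for the input of subsystem j.
A_blk = [[0.3 * rng.standard_normal((nx, nx)) for _ in range(n_sub)] for _ in range(n_sub)]
B_blk = [[0.3 * rng.standard_normal((nx, nu)) for _ in range(n_sub)] for _ in range(n_sub)]

def local_step(i, x, u):
    """Next state of subsystem i: own dynamics plus the influence w_i of the others."""
    w_i = sum(A_blk[i][j] @ x[j] + B_blk[i][j] @ u[j] for j in range(n_sub) if j != i)
    return A_blk[i][i] @ x[i] + B_blk[i][i] @ u[i] + w_i

# Global matrices obtained by stacking the blocks.
A = np.block(A_blk)
B = np.block(B_blk)

x = [rng.standard_normal(nx) for _ in range(n_sub)]
u = [rng.standard_normal(nu) for _ in range(n_sub)]

x_next_local = np.concatenate([local_step(i, x, u) for i in range(n_sub)])
x_next_global = A @ np.concatenate(x) + B @ np.concatenate(u)
print(np.allclose(x_next_local, x_next_global))
```

Under this assumption the grouped model is simply the block form of the subsystem models, so both views can be used interchangeably.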

3. Data-Driven Model Predictive Control

The partition from a given network topology is defined as follows: where is the cardinality of the set, the set of indices is defined as . The data-driven groups are disjoint sets, such that

Then, the state cost corresponding to is , and local controllers aggregate into a data-driven group, joining their efforts to achieve better performance. At time instant , a control sequence for all subsystems is derived from the following constrained optimization of the model predictive control problem:

Problem (5) is solved independently for each agent , and the overall control problem can be formulated as follows: where is defined as follows:

Problem (6) corresponds to one centralized model predictive control with sums, and it also constitutes a dynamic optimization with mixed integer variables, which is solved by the dynamic programming strategy described later.
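
The partition into disjoint groups can be sketched in code. The example below assumes that the groups are the connected components of the undirected communication graph and that agents are indexed from 0; the helper names `components` and `is_partition` are hypothetical.

```python
def components(n_agents, links):
    """Connected components of an undirected graph given as a list of (i, j) links."""
    parent = list(range(n_agents))

    def find(i):
        # Follow parent pointers to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in links:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n_agents):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def is_partition(groups, n_agents):
    """True when the groups are non-empty, pairwise disjoint, and cover all agents."""
    seen = set()
    for g in groups:
        g = set(g)
        if not g or seen & g:
            return False
        seen |= g
    return seen == set(range(n_agents))

groups = components(5, [(0, 1), (3, 4)])   # hypothetical topology with 5 agents
print(groups, is_partition(groups, 5))
```

Each resulting group would then solve its own model predictive control problem, while the sum over all groups gives the overall problem.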

4. Minimum Principle with the Idea of Dynamic Programming

As our goal in this short note is to study data-driven model predictive control, we focus on the constrained optimization problem (5) and apply the idea of dynamic programming to analyze its minimum principle.

We want to find a control sequence and a corresponding state sequence for the constrained optimization problem (5), and then we minimize subject to the following linear model: and the control constraints as follows:

Firstly, we develop an expression for the gradient as follows: after using the chain rule, and then we obtain that where all gradients are evaluated along the control trajectory and the corresponding state trajectory . Similarly, for all , and substituting that linear model into equation (15), it holds that which can be rewritten in the following form: for an appropriate vector , and where is the Hamiltonian function defined by

From equation (18), we see that the vectors are generated backwards by the following adjoint equation: with terminal condition

If the constraint sets are convex, the optimality condition is applied: for all feasible solutions , and then this condition can be decomposed into the following conditions:
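
For orientation, the derivation above follows the pattern of the standard discrete-time minimum principle. The block below is a sketch of that standard form for a linear model, not a recovery of the paper's own equations: the symbols $g_k$, $g_N$, $p_k$, $H_k$, and $U_k$, as well as the use of $k$ as the stage index, are assumed notation.

```latex
\begin{align}
&\min_{u_0,\dots,u_{N-1}} \; J(u) = g_N(x_N) + \sum_{k=0}^{N-1} g_k(x_k,u_k), \\
&\text{s.t. } x_{k+1} = A x_k + B u_k, \quad u_k \in U_k, \quad k = 0,\dots,N-1, \\
&H_k(x_k,u_k,p_{k+1}) = g_k(x_k,u_k) + p_{k+1}^{T}\,(A x_k + B u_k), \\
&\nabla_{u_k} J(u) = \nabla_{u_k} H_k(x_k,u_k,p_{k+1}) = \nabla_{u_k} g_k(x_k,u_k) + B^{T} p_{k+1}, \\
&p_k = \nabla_{x_k} g_k(x_k,u_k) + A^{T} p_{k+1}, \quad k = 1,\dots,N-1, \qquad p_N = \nabla g_N(x_N), \\
&\nabla_{u_k} H_k(x_k^{*},u_k^{*},p_{k+1})^{T}\,(u_k - u_k^{*}) \ge 0 \quad \text{for all } u_k \in U_k .
\end{align}
```

In this form, the adjoint vectors are computed backwards from the terminal condition, and the last line is the stage-wise optimality condition that holds when each set $U_k$ is convex.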

Generally, based on the above derivations, this minimum principle from the idea of dynamic programming is applied to analyze the constrained optimization problem for the data-driven model predictive control. Then, we obtain the following Theorem 1.

Theorem 1. Suppose that is an optimal control input and is the corresponding state trajectory for our considered data-driven model predictive control. If the constraint sets are convex, then for all , we have where vectors are obtained from the adjoint equation with terminal condition
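
The role of the adjoint equation can be checked numerically. The sketch below assumes a linear model with a quadratic stage cost and the same quadratic terminal cost, computes the gradient of the total cost with respect to every control input by the backward adjoint recursion, and compares it with central finite differences. All matrices, the horizon, and the function names are hypothetical choices for illustration.

```python
import numpy as np

def rollout(A, B, x0, U):
    """States x_0..x_N of x_{k+1} = A x_k + B u_k for the input sequence U (N x m)."""
    X = [x0]
    for u in U:
        X.append(A @ X[-1] + B @ u)
    return np.array(X)

def cost(A, B, Q, R, x0, U):
    """Sum of stage costs x'Qx + u'Ru plus the terminal cost x_N' Q x_N."""
    X = rollout(A, B, x0, U)
    stage = sum(x @ Q @ x + u @ R @ u for x, u in zip(X[:-1], U))
    return stage + X[-1] @ Q @ X[-1]

def adjoint_gradient(A, B, Q, R, x0, U):
    """Gradient of the cost w.r.t. every u_k via the backward adjoint recursion."""
    X = rollout(A, B, x0, U)
    N = len(U)
    p = 2 * Q @ X[N]                     # terminal condition: gradient of the terminal cost
    G = np.zeros_like(U)
    for k in range(N - 1, -1, -1):
        G[k] = 2 * R @ U[k] + B.T @ p    # gradient of the Hamiltonian w.r.t. u_k
        p = 2 * Q @ X[k] + A.T @ p       # adjoint equation, one step backwards
    return G

def finite_difference_gradient(A, B, Q, R, x0, U, h=1e-6):
    """Central finite differences, used only to check the adjoint result."""
    G = np.zeros_like(U)
    for idx in np.ndindex(*U.shape):
        E = np.zeros_like(U)
        E[idx] = h
        G[idx] = (cost(A, B, Q, R, x0, U + E) - cost(A, B, Q, R, x0, U - E)) / (2 * h)
    return G

rng = np.random.default_rng(2)
A = np.array([[1.0, 0.2], [0.0, 1.0]])   # hypothetical model
B = np.array([[0.02], [0.2]])
Q = np.eye(2)
R = np.array([[0.5]])
x0 = np.array([1.0, -0.5])
U = rng.standard_normal((8, 1))

G_adj = adjoint_gradient(A, B, Q, R, x0, U)
G_fd = finite_difference_gradient(A, B, Q, R, x0, U)
print(np.max(np.abs(G_adj - G_fd)))
```

The adjoint recursion needs one forward and one backward pass regardless of the horizon length, whereas the finite-difference check needs two cost evaluations per input component.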

Further, Theorem 1 can be reformulated as one flowchart in Figure 1, where all the detailed processes are covered, and it is similar to the commonly used closed loop system structure.

5. Stability Analysis with the Idea of Dynamic Programming

Here, in order to apply the idea of dynamic programming to analyze the stability of the data-driven model predictive control (5), an explicit form of the cost per stage is needed. Without loss of generality, the cost per stage is quadratic: $g(x, u) = x^{T} Q x + u^{T} R u$, where $Q$ and $R$ are two positive definite symmetric matrices, and we also impose state and control constraints.

For every initial state , the state of the closed loop system satisfies the state and control constraints, and is one stationary controller.

Definition 1. The stability concept is defined as follows: the total cost over an infinite number of stages is finite, i.e., $\sum_{k=0}^{\infty} (x_k^{T} Q x_k + u_k^{T} R u_k) < \infty$. Let be the state and control sequence generated by data-driven model predictive control, so that Denote as the optimal cost of the stage problem solved by data-driven model predictive control, and be the optimal cost starting at of a corresponding stage problem. It means the optimal value of the quadratic cost. Since driving the state to 0 cannot decrease the optimal cost, we have for all From the definitions of and , we have for all , and by using equation (33), we have that and summing these equations for arbitrary values , it holds that Since , it holds that Taking the limit of the above inequality as , we have that This completes the proof of the stability concept (30), whose total cost over an infinite number of stages is finite.
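
The stability concept can be illustrated on a simplified case. The sketch below drops the state and control constraints, computes a stationary linear controller by the backward dynamic programming (Riccati) recursion for a quadratic stage cost, and checks that the accumulated closed loop cost stays finite and matches the dynamic programming value. The matrices and the function name `riccati_dp` are hypothetical, and this unconstrained setting is an assumption, not the constrained scheme analyzed above.

```python
import numpy as np

def riccati_dp(A, B, Q, R, iters=500):
    """Backward DP recursion for the stage cost x'Qx + u'Ru.

    Returns K, with optimal cost-to-go x'Kx, and the gain L of the
    stationary controller u = L x.
    """
    K = Q.copy()
    for _ in range(iters):
        L = -np.linalg.solve(R + B.T @ K @ B, B.T @ K @ A)
        K = Q + A.T @ K @ (A + B @ L)
    L = -np.linalg.solve(R + B.T @ K @ B, B.T @ K @ A)   # gain for the converged K
    return K, L

A = np.array([[1.0, 0.2], [0.0, 1.0]])   # hypothetical model
B = np.array([[0.02], [0.2]])
Q = np.eye(2)
R = np.array([[1.0]])

K, L = riccati_dp(A, B, Q, R)

x0 = np.array([1.0, -0.5])
x, total = x0.copy(), 0.0
for _ in range(2000):                    # long horizon as a stand-in for the infinite sum
    u = L @ x
    total += x @ Q @ x + u @ R @ u
    x = A @ x + B @ u

rho = max(abs(np.linalg.eigvals(A + B @ L)))   # spectral radius of the closed loop
print(rho, total, x0 @ K @ x0)
```

A spectral radius below one together with a finite accumulated cost is the unconstrained counterpart of the finite total cost required in Definition 1.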

6. Persistently Exciting Data-Driven Model Predictive Control

Looking at equation (21) again, no constraint is considered there apart from a simple cost function of quadratic form. When an inequality constraint is added to equation (21), a quadratic programming problem with one inequality constraint is obtained; this constraint is named the condition of persistent excitation.

First, persistent excitation is defined as follows.

Definition 2 (persistent excitation). A bounded input signal is persistently exciting with a level of excitation if there exist constants such that This condition can be interpreted as a condition on the energy of the input signal in all directions, and persistent excitation guarantees that the parameter estimate converges exponentially fast in system identification and adaptive control.
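
A numerical check of this kind of condition can be sketched as follows. The example assumes one common discrete-time form of persistent excitation, in which the smallest eigenvalue of the sum of u_k u_k^T over every window of a fixed length must stay above a level alpha0; the window length, the level, and the function names are assumptions, and the exact constants of Definition 2 may differ.

```python
import numpy as np

def excitation_level(U, window):
    """Smallest eigenvalue of sum_k u_k u_k^T over all windows of the given length.

    U has one input sample per row (N x m).
    """
    N = U.shape[0]
    levels = []
    for t in range(N - window + 1):
        W = U[t:t + window]
        levels.append(np.linalg.eigvalsh(W.T @ W)[0])   # eigvalsh returns ascending eigenvalues
    return min(levels)

def is_persistently_exciting(U, window, alpha0):
    """True when every window has energy of at least alpha0 in all directions."""
    return excitation_level(U, window) >= alpha0

rng = np.random.default_rng(3)
U_white = rng.standard_normal((200, 2))                               # excites both directions
U_rank1 = np.sin(0.1 * np.arange(200))[:, None] * np.array([[1.0, 2.0]])  # one direction only

level_white = excitation_level(U_white, window=20)
level_rank1 = excitation_level(U_rank1, window=20)
print(level_white, level_rank1)
```

The white noise input carries energy in every direction, while the second input always points along the same direction, so its excitation level is numerically zero.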
Combining equations (21) and (23) and inequality (39), the persistently exciting data-driven MPC is formulated as the following quadratic programming problem with an inequality constraint: The goal of the following theoretical analysis is to solve this constrained optimization problem and to give an explicit form for the optimal control input , while satisfying the condition of persistent excitation.
To simplify the cost function in equation (40), we have where is the identity matrix with dimension , and where vector and are defined as follows: To simplify notation, set , then Similarly, we get where Through these two equations (28) and (29), the cost function in (25) can be rewritten as follows: For the right part of the persistent excitation condition (39), we rewrite it as follows: It means that Similarly, we have Combining inequality constraints (49) and (50), we obtain that Set Based on the above constructed matrices and vectors, we get Theorem 2.

Theorem 2. The constructed quadratic programming problem with inequality constraint (49) can be reformulated as follows: In order to obtain the optimal input signal , we need to identify the optimization variable , as is included in m, i.e., .
Observing the quadratic programming problem with inequality constraint (53) again, a Lagrangian multiplier is introduced for the inequality constraint, and the following Lagrangian function is constructed: Taking the partial derivative with respect to the optimization variable and setting the derivative equal to zero, it holds that and substituting the optimal variable into the Lagrangian function again, we have So, the dual problem is Applying the necessary optimality condition to get , it means that , where . After substituting it into equation (58) again, we have where

Then, we choose the last elements as the optimal input variable .
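
The Lagrangian steps above can be sketched on a generic problem. The example assumes a strictly convex quadratic program with a single linear inequality, min 0.5 m'Hm + f'm subject to a'm >= b, which may differ from the exact constraint used in Theorem 2; the matrices and the function name are hypothetical. The multiplier is obtained from the dual problem and substituted back into the stationarity condition.

```python
import numpy as np

def solve_qp_one_inequality(H, f, a, b):
    """Solve min 0.5 m'Hm + f'm  s.t.  a'm >= b, with H symmetric positive definite.

    Lagrangian: 0.5 m'Hm + f'm + lam * (b - a'm), lam >= 0.
    Stationarity gives m = H^{-1} (lam * a - f); maximizing the resulting
    concave dual over lam >= 0 gives the multiplier in closed form.
    """
    Hinv_f = np.linalg.solve(H, f)
    Hinv_a = np.linalg.solve(H, a)
    lam = max(0.0, (b + a @ Hinv_f) / (a @ Hinv_a))
    m = lam * Hinv_a - Hinv_f
    return m, lam

H = np.array([[2.0, 0.5], [0.5, 1.0]])   # hypothetical problem data
f = np.array([-1.0, -1.0])
a = np.array([1.0, 1.0])

m_act, lam_act = solve_qp_one_inequality(H, f, a, b=3.0)   # constraint active
m_free, lam_free = solve_qp_one_inequality(H, f, a, b=0.0) # constraint inactive
print(m_act, lam_act, m_free, lam_free)
```

When the unconstrained minimizer already satisfies the inequality, the multiplier is zero and the solution is unchanged; otherwise the solution is moved onto the constraint boundary.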

In summary, the two sections above show that, once the data-driven model predictive control is constructed, the dynamic programming strategy applies to the simple cost function without an inequality constraint, while the persistently exciting data-driven model predictive control handles the case with an inequality constraint.

7. Simulation Example

Now, we give an example to illustrate the above results. The data-driven model predictive control strategy is applied to flutter suppression for a small helicopter hovering in the laboratory, shown in Figure 2.

Due to the limitations of the experimental physical devices, some constraints exist on the control input and the observed variables. Helicopters have complex dynamics and special flight conditions, and their dynamics change with the flight altitude and flight conditions. The helicopter model corresponds to a state space system with three inputs and three outputs. More specifically, the three inputs are spray angle, acceleration, and rise angle, and the three outputs are relative height, forward speed, and roll angle. The corresponding state space system is given as follows: where the observed noise is white noise with zero mean and covariance . We collect 1000 data samples in one experiment with a sampling period of 0.2 seconds. One initial control is chosen as . The reference signal is also white noise with zero mean, and its covariance is . Two matrices are , and constraints are considered as follows:

During the test, sensors and actuating devices are needed to measure the relevant signals. The selection of the devices, the layout of their locations, and the selection of excitation points therefore need to be considered. For dynamic systems, controllability and observability require that the excitation point and the sensors are not placed at a node of the studied flutter mode. When the excitation point or all sensors are at the node of a certain order, that mode and its frequency multiples cannot be identified. From the perspective of mechanics, when the excitation point is at a node, the excitation energy cannot be input to the mode, and the modal motion cannot be excited; on the other hand, the response measured by the sensors is a subset of the system response. The first-order modal node does not contain the modal component in the measurement subset.

For the optimization problem (53) of the data-driven model predictive controller, the dynamic programming scheme is used to solve for the optimal predictive controller. In order to verify the performance of the closed loop system, the initial state of the system is selected as . The specific process of controlling the helicopter attitude with a joystick is as follows. First, the helicopter bows down to about 3 degrees and holds this action for 3 seconds until it is back at 0 degrees, before the stick is pushed. Second, the stick is released for 5 seconds, and the helicopter is pulled back to 3 degrees after it returns to zero degrees. The simulation response curve of the closed loop system under the above operation is shown in Figure 3.

Figure 4 shows that the attitude control of the model machine in the experiment is relatively stable under the considered data-driven model predictive controller: the attitude control accuracy is about 1 and the heading control accuracy is about 3, which suppresses the effect of flutter well.

8. Conclusion

In this paper, data-driven model predictive control is considered to handle the varying coupling conditions between different parts of the system. The idea of dynamic programming is used to analyze the minimum principle and the stability condition of the data-driven model predictive control. By adding an inequality constraint to the constructed model predictive control, a persistently exciting data-driven model predictive control is obtained. The inequality constraint corresponds to the condition of persistent excitation. As this is a first theoretical analysis of dynamic programming, persistent excitation, and model predictive control, how to combine game theory and dynamic programming for persistently exciting data-driven model predictive control is our ongoing work.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was partially supported by Jiangxi Provincial National Science Foundation (no. 20202BAL202009). The authors are grateful to Professor Eduardo F. Camacho for his warm invitation to his control lab at the University of Seville, Seville, Spain.

References

[1] T. Alamo, R. Tempo, and E. F. Camacho, “Randomized strategies for probabilistic solutions of uncertain feasibility and optimization problems,” IEEE Transactions on Automatic Control, vol. 54, no. 11, pp. 2545–2559, 2009.
[2] D. P. Bertsekas, “Affine monotonic and risk-sensitive models in dynamic programming,” IEEE Transactions on Automatic Control, vol. 64, no. 8, pp. 3117–3128, 2019.
[3] D. P. Bertsekas and V. Goyal, “On the power and limitations of affine policies in two stage adaptive optimization,” Mathematical Programming, vol. 134, no. 2, pp. 491–531, 2012.
[4] L. Blackmore, M. Ono, and B. C. Williams, “Chance-constrained optimal path planning with obstacles,” IEEE Transactions on Robotics, vol. 27, no. 6, pp. 1080–1094, 2011.
[5] G. C. Calafiore, “Random convex programs,” SIAM Journal on Optimization, vol. 20, no. 6, pp. 3427–3464, 2010.
[6] M. C. Campi and S. Garatti, “Wait and judge scenario optimization,” Automatica, vol. 50, no. 12, pp. 3019–3029, 2016.
[7] M. C. Campi, S. Garatti, and M. Prandini, “The scenario approach for systems and control design,” Annual Reviews in Control, vol. 33, no. 2, pp. 149–157, 2009.
[8] D. Callaway and I. Hiskens, “Achieving controllability of electric loads,” Proceedings of the IEEE, vol. 99, no. 1, pp. 184–199, 2011.
[9] X. Zhang, M. Kamgarpour, A. Georghiou, P. Goulart, and J. Lygeros, “Robust optimal control with adjustable uncertainty sets,” Automatica, vol. 75, no. 1, pp. 249–259, 2017.
[10] X. Zhang, S. Grammatico, and G. Schildbach, “On the sample size of random convex programs with structured dependences on the uncertainty,” Automatica, vol. 60, no. 10, pp. 182–188, 2015.
[11] S. Garatti and M. C. Campi, “Modulating robustness in control design: principles and algorithm,” IEEE Control Systems Magazine, vol. 33, no. 2, pp. 36–51, 2013.
[12] M. Farina and L. Giulioni, “Stochastic linear model predictive control with chance constraints: a review,” Journal of Process Control, vol. 44, no. 2, pp. 53–67, 2016.
[13] M. Abdelrahim, R. Postoyan, and J. Daafouz, “Robust event triggered output feedback controllers for nonlinear systems,” Automatica, vol. 75, no. 1, pp. 96–108, 2017.
[14] P. Gianluigi, “A new kernel based approach to hybrid system identification,” Automatica, vol. 70, no. 2, pp. 21–31, 2016.
[15] J. Wang and R. A. Ramirez-Mendoza, “Finite sample properties of virtual reference feedback tuning with two degrees of freedom controllers,” ISA Transactions, vol. 99, no. 6, pp. 37–49, 2020.
[16] J. Wang, Y. Wang, and R. A. Ramirez-Mendoza, “Iterative identification and model predictive control: theory and application,” International Journal of Innovative Computing, Information and Control, vol. 17, no. 1, pp. 93–109, 2021.
[17] J. Wang, R. A. Ramirez-Mendoza, and X. Tang, “Robust analysis for data driven model predictive control,” Systems Science & Control Engineering, vol. 9, no. 1, pp. 393–404, 2021.

Copyright

Copyright © 2021 Hong Jianwang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
