An Improved Approach for Robust MPC Tuning Based on Machine Learning
A robust tuning method based on artificial neural networks is proposed for model predictive control (MPC) of industrial systems with parametric uncertainties. First, an efficient approach is established to characterize the mapping between the controller parameters and the robust performance indices. Because MPC tuning normally involves multiple conflicting robust performance indices, a neural network is further used to fuse the indices into a single label representing the acceptance level of the robust performance. Finally, an automated algorithm is proposed to tune the MPC parameters of the considered uncertain system so that the desired robust performance is achieved. The effectiveness of the proposed robust tuning algorithm is verified on the regulation of the pH value of a sewage treatment system.
Model predictive control (MPC) has been widely used in industry because of its robustness and its ability to handle safety constraints and inaccurate models [1, 2]. PID control is normally applied at the base layer of a control system, while MPC controllers are usually employed at the supervisory layer. In MPC applications, the prediction horizon, control horizon, and weighting matrices in the cost function significantly affect the closed-loop performance of the controlled system, so the selection of these parameters is one of the most important tasks in MPC design. As control systems become more complex, factors such as input and output coupling, external disturbances, and time delay make effective MPC tuning even more difficult.
The MPC tuning methods in existing industrial applications are mainly based on engineering experience or numerical methods, which makes controller design largely a matter of trial and error and, at the same time, consumes a lot of computation time. Besides, since the model is only an approximation of the real process, a certain level of uncertainty is inevitable, and robust MPC tuning becomes a necessity. In , the authors proposed a robust MPC tuning method based on min-max optimization. Such an approach handles the model uncertainty explicitly and, meanwhile, keeps the MPC controller from choosing large prediction and control horizons, so that the online calculation time is reduced. In , the authors reduced the number of effective tuning parameters by modifying the controller structure and redesigning the MPC cost function. In , two robust tuning strategies were put forward for SISO uncertain paper-making processes, which incorporated the total variation specification into user-friendly performance indices. In , the authors further proposed a rapid tuning strategy, based on the closed-loop system structure, for the MPC parameters of MIMO paper-making systems with first-order-plus-dead-time subsystems and uncertain model parameters. In , by adopting a sequential procedure, the authors developed a tuning method that took the reachable trajectories of each operating point of the controlled system as the reference to pursue improved robust performance. They then applied the method to an electrical system to verify its feasibility. In , the authors proposed a tuning method using the worst-case control scenario, which is characterized by the Morari resiliency index and the condition number, and a nonlinear multiobjective performance criterion. The resulting constrained nonlinear optimization problem is solved with PSO. In , based on the measured data in various operating conditions, a novel approach to the real-time compensation of asymmetric behavior was investigated, which led to improved control performance.
With the recent rapid development of AI technology, researchers have begun to apply machine learning techniques to MPC tuning. In , the authors put forward a framework that uses machine learning to approximate the tuning experience of human experts, along with a gradient-free optimization algorithm, to tune the MPC parameters. In , the authors proposed an online tuning method for the MPC parameters using particle swarm optimization (PSO) and the online sequential extreme learning machine (OS-ELM). In , the authors used an artificial neural network to tune FS-MPC for power electronic converters and verified its effectiveness on a practical FS-MPC regulated voltage source converter (VSC) for an uninterruptible power supply (UPS) system. In , a dynamic system based on the projection neural network (PNN), which is well known for its parallel computational capability, is established to optimize the objective function of MPC; the computational efficiency is thereby improved significantly, which makes the proposed control design more practical.
Note that although MPC parameter tuning methods using neural networks and PSO already exist, none of them focuses on the parameter tuning of an uncertain system based on robust time-domain performance indices. As model-plant mismatch is unavoidable in industrial applications, it is of great importance to develop a tuning method that handles this uncertainty. Besides, compared with frequency-domain indices, their time-domain counterparts are more familiar to site engineers, and thus, it is also necessary to incorporate time-domain indices in MPC tuning.
Given such a new problem, this paper uses the machine learning technique to develop an MPC tuning method to deal with both parametric uncertainty and robust time-domain performance indices. The contribution can be summarized as follows:
(i) A novel approach to characterize the relationship between the MPC parameters and the robust time-domain performance indices, e.g., worst-case overshoot, is established based on RBF neural network. According to the definition of parametric uncertainty and robust performance indices, the parameters of the neural network and the corresponding data acquisition method are also specifically designed.
(ii) As there normally exist conflicts between different time-domain robust indices, it is difficult to specify a suitable target for MPC tuning, and thus, BP neural network is employed to fuse the indices to produce a scalar label representing the acceptable level of the robust performance, such that the MPC tuning problem can be efficiently solved via the PSO algorithm.
The paper is organized as follows. Section 2 describes the structure of the model predictive controller and formulates the tuning problem. Section 3 provides a tuning algorithm for MPC parameters via machine learning. Then, in Section 4, a real system is used to verify the effectiveness of the proposed algorithm. Finally, the concluding remarks are given in Section 5.
2. Preliminary and Problem Formulation
2.1. Nominal Model and Model Uncertainty
In this paper, we consider multiple-input multiple-output systems which can be expressed by the following discrete-time transfer function model: in which denotes the transfer function between the output and the input.
Following industrial experience [9,10], the common FOPDT model structure is adopted for each subsystem, which can be described as follows: where , , and denote the process gain, time constant, and time delay. Since is hard to know accurately, a nominal model is defined to approximate it, which can be expressed as
The model parameters , , and are identified from the input and output data of the real process and are used by the MPC controller for prediction. However, it is inevitable that is different from , and to account for such a model mismatch, the parametric uncertainty is used, which refers to the difference in the model parameters: where . Note that compared with other types of uncertainty specifications, parametric uncertainty is easier for site engineers to specify based on their knowledge of the controlled system and thus is employed in this work. Based on the considered parametric uncertainty, a set of possible models can be denoted as follows:
Furthermore, the state-space model can be obtained as follows: where is the input variable, is the output variable, is the state-space vector, and is the sampling instant.
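To make the nominal and perturbed FOPDT subsystems concrete, the following minimal Python sketch simulates the unit-step response of a single first-order-plus-dead-time model under a zero-order hold. The function name, the numerical parameter values, and the assumption that the delay is rounded to a whole number of sampling periods are illustrative only and are not taken from the paper.

```python
import numpy as np

def fopdt_step_response(gain, time_constant, delay, ts=1.0, n_steps=100):
    """Unit-step response of gain * exp(-delay*s) / (time_constant*s + 1).

    Zero-order-hold discretization; the delay is rounded to a whole
    number of sampling periods (an assumption of this sketch).
    """
    a = np.exp(-ts / time_constant)      # discrete-time pole
    d = int(round(delay / ts))           # dead time in samples
    y = np.zeros(n_steps)
    for k in range(n_steps - 1):
        u_delayed = 1.0 if k - d >= 0 else 0.0   # unit step applied at k = 0
        y[k + 1] = a * y[k] + gain * (1.0 - a) * u_delayed
    return y

# Hypothetical nominal model and one perturbed model from the uncertainty set.
nominal = fopdt_step_response(gain=2.0, time_constant=5.0, delay=3.0)
perturbed = fopdt_step_response(gain=2.4, time_constant=4.0, delay=4.0)
```

Comparing `nominal` and `perturbed` shows how a mismatch in gain, time constant, and delay changes both the final value and the timing of the response, which is the kind of mismatch the parametric uncertainty is meant to capture.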
2.2. MPC Formulation
In industrial applications, the MPC cost function with constraints can usually be designed in the following way: where and are the controller parameters to be tuned, is the reference signal of , and and are the prediction and control horizons.
Therefore, the predicted output of the model can be expressed via the following matrix expression: in which indicates the matrix transpose, and and is the current state vector of the system. Then, the MPC can be represented as the following quadratic programming problem, from which the control signals can be obtained:
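For the unconstrained case, the prediction and optimization steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the controller used in the paper: the names `Hp`, `Hu`, `Q`, and `R` are placeholders for the horizons and weighting matrices, the cost penalizes the input magnitude rather than input increments, the input is held at its last value beyond the control horizon, and constraints are ignored so that the quadratic program reduces to a linear solve.

```python
import numpy as np

def prediction_matrices(A, B, C, Hp, Hu):
    """Return F, Phi such that the stacked outputs satisfy Y = F @ x0 + Phi @ U."""
    ny, nu = C.shape[0], B.shape[1]
    F = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(1, Hp + 1)])
    Phi = np.zeros((Hp * ny, Hu * nu))
    for i in range(1, Hp + 1):           # predicted output at step i
        for l in range(i):               # input applied at step l
            j = min(l, Hu - 1)           # input is held after the control horizon
            Phi[(i - 1) * ny:i * ny, j * nu:(j + 1) * nu] += (
                C @ np.linalg.matrix_power(A, i - 1 - l) @ B
            )
    return F, Phi

def unconstrained_mpc(A, B, C, Q, R, x0, ref, Hp, Hu):
    """Minimize (Y - Ref)' Qbar (Y - Ref) + U' Rbar U without constraints."""
    F, Phi = prediction_matrices(A, B, C, Hp, Hu)
    Qbar = np.kron(np.eye(Hp), Q)        # output weights repeated over Hp
    Rbar = np.kron(np.eye(Hu), R)        # input weights repeated over Hu
    target = np.tile(ref, Hp) - F @ x0   # reference minus free response
    H = Phi.T @ Qbar @ Phi + Rbar
    U = np.linalg.solve(H, Phi.T @ Qbar @ target)
    return U.reshape(Hu, -1)             # one row per control move

# Hypothetical scalar example.
A = np.array([[0.9]])
B = np.array([[0.1]])
C = np.array([[1.0]])
U = unconstrained_mpc(A, B, C, Q=np.eye(1), R=1e-3 * np.eye(1),
                      x0=np.zeros(1), ref=np.array([1.0]), Hp=10, Hu=3)
```

With constraints present, the same `H` and linear term would be passed to a quadratic programming solver instead of `np.linalg.solve`.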
2.3. Tuning Problem
In this work, we need to tune and so that the system remains robustly stable against the parametric uncertainty and each output tracks its target with a fast and stable response. However, these goals conflict with each other. For example, a small overshoot often comes with a large settling time, while a small settling time can be associated with a large overshoot. To balance them, we need to find the most appropriate set of parameters.
We choose overshoot and settling time as the main performance measures for the MPC tuning because they are simple and well suited for end users to evaluate the control effect. In future applications, other control performance indices can be added directly according to specific needs by following a similar procedure.
3. Robust MPC Tuning Based on Machine Learning
3.1. Controller Tuning Framework Based on Machine Learning
In this paper, an approach for tuning the aforementioned MPC design parameters for systems with parametric uncertainties is developed based on machine learning; the overall framework is shown in Figure 1.
3.2. Robust Performance Calculation Based on Machine Learning
Many actual industrial control applications obtain the desired closed-loop performance through a tuning process. Tuning is essentially an optimization problem, which, however, is normally solved by a human (see Figure 2). In this work, we propose a method that obtains the desired controller parameters in a way similar to that of a human expert. As the MPC cost function captures a cost-benefit relationship between multiple, competing objectives, the performance of the system being considered often cannot be analyzed explicitly. Therefore, a common approach is to approximate it by exploiting the innate human ability to recognize different patterns. More specifically, the human expert acceptance level (denoted as ) of a given closed-loop performance (denoted as ) is considered as the tuning objective. Then, the tuning problem can be expressed as follows: where represents the MPC tuning parameter and indicates a set of admissible controller parameters. is the robust performance of the uncertain system given a selected .
The explanation of is given in Figure 3. For the two curves with the same reference output, the blue curve has the performance we hope to obtain in the process of controller parameter tuning rather than the red curve. So its value is less than that of the red curve.
Note that as there is no explicit relationship between and , a machine learning technique is adopted in this work to characterize the robust performance and the corresponding acceptance level. Then, the tuning problem is approximated as follows: where is a feature extractor function that transforms the outputs of the system (i.e., time-domain signals) into a vector of relevant performance indices, and is an approximation to . The tuning framework can then be expressed as the diagram shown in Figure 4.
Now, there are three major steps in the tuning of MPC parameters: (i) extract the robust performance of the system with parameter , (ii) obtain the human expert’s acceptance level of the obtained performance, and (iii) optimize the MPC tuning parameter for the desired performance.
3.2.1. Robust Time-Domain Performance Calculation
Given the perturbed systems in , the time-domain robust performance indices are employed to characterize the robust performance, as they are more intuitive to end users in industry. More specifically, the worst-case overshoot and settling time are considered, which are defined as follows.
Definition 1. (worst-case overshoot). The worst-case overshoot of a set of responses with the same final value is the maximum value over all the responses minus the final value, divided by the final value.
Definition 2. (worst-case settling time). The worst-case settling time of a set of responses with the same final value is the maximum time required for all the responses to enter and stay within a predetermined percentage band around the final value. The calculation of the worst-case performance is shown in Algorithm 1.
The illustration of the abovementioned indices is shown in Figure 5.
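Definitions 1 and 2 can be illustrated with a short Python sketch. It is only an illustration of the two definitions, not a reproduction of Algorithm 1; the function names, the 2% band, the positive final value, and the choice to report a zero overshoot when no response exceeds the final value are assumptions of the sketch.

```python
import numpy as np

def overshoot(y, final_value):
    """Relative overshoot of one response (clamped at zero, positive final value assumed)."""
    return max(0.0, (np.max(y) - final_value) / final_value)

def settling_time(y, final_value, ts=1.0, band=0.02):
    """Time after which the response stays within +/- band of the final value."""
    outside = np.abs(np.asarray(y) - final_value) > band * abs(final_value)
    if not outside.any():
        return 0.0
    last_outside = np.flatnonzero(outside)[-1]   # last sample outside the band
    return (last_outside + 1) * ts

def worst_case_indices(responses, final_value, ts=1.0, band=0.02):
    """Worst-case overshoot and settling time over a set of responses."""
    wc_overshoot = max(overshoot(y, final_value) for y in responses)
    wc_settling = max(settling_time(y, final_value, ts, band) for y in responses)
    return wc_overshoot, wc_settling

# Two hypothetical responses that share the final value 1.0.
y1 = np.array([0.0, 0.5, 1.2, 1.0, 1.0])
y2 = np.array([0.0, 0.8, 1.1, 1.05, 1.0, 1.0])
wc_os, wc_ts = worst_case_indices([y1, y2], final_value=1.0)
```

In this example the worst-case overshoot comes from `y1` and the worst-case settling time from `y2`, which shows that the two worst-case indices need not be produced by the same response.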
Note that there may exist a certain relationship between the system model parameters and the robust time-domain performance indices, but such a relationship is implicit and cannot be expressed by a definite formula. Thus, we use an artificial neural network to establish the mapping relationship between the MPC controller parameters and the robust time-domain indices (i.e., in equation (11)) for the system with parametric uncertainty.
Due to the large dimensionality of the inputs and outputs, a local approximation network, the radial basis function (RBF) network, is selected to ensure a fast learning convergence speed; depending on the sample size, a standard RBF network or a generalized RBF network can be selected. Here, the standard RBF network is taken as an example, assuming that no ill-conditioning problem arises.
In this paper, each sample of the RBF neural network contains a set of controller parameters and and the corresponding robust time-domain performance indices of uncertain system. In practice, if the overshoot or settling time is too big, such control is considered meaningless. In order to ensure the representativeness of the samples, the grid method is used to sample the performance parameters within the acceptable range, and then, the corresponding controller parameters are derived as the training database of the RBF network. If each index takes numbers within its allowable range, there are training set samples in total. Finally, groups of controller parameters and their corresponding robust performance indices are randomly generated as test sets. If the training set samples cannot reach the required network accuracy, the segmentation of the above interval can be further refined.
Note that every group of worst-case performance indices needs ( is the number of uncertain parameters in the system model) curves to generate a reasonable result. Considering the requirement of the BPNN in this work, the groups of datasets obtained when training the RBF network are directly utilized, and therefore, it is necessary to have at least curves to generate a reasonable result. The reason to consider curves is that, in order to characterize the worst-case performance, the polytopic system representation  in robust control theory is employed, which indicates that the worst performance of the uncertain system mostly appears at the vertex systems of the polytope. Therefore, the largest and smallest possible values of each model parameter of the uncertain system need to be considered, resulting in curves for each group of robust indices. Note that, compared with existing methods for evaluating the worst-case time-domain performance (e.g., the brute-force search method), the aforementioned method is much simpler, since the required number of curves is significantly reduced, which helps to train the network more efficiently.
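The vertex idea can be sketched as follows: each uncertain parameter is set to either its smallest or its largest possible value, and every combination defines one vertex system. The sketch assumes one simulated curve per vertex, so a model with n uncertain parameters yields 2**n parameter sets; the function name and the bounds are hypothetical.

```python
from itertools import product

def vertex_models(bounds):
    """Enumerate the vertex parameter sets of a box-shaped uncertainty region.

    bounds maps each parameter name to a (low, high) pair; the result
    contains 2**n dictionaries for n uncertain parameters.
    """
    names = list(bounds)
    return [dict(zip(names, combo))
            for combo in product(*(bounds[name] for name in names))]

# Hypothetical bounds for one FOPDT subsystem.
bounds = {"gain": (1.6, 2.4), "time_constant": (4.0, 6.0), "delay": (2.0, 4.0)}
vertices = vertex_models(bounds)
```

Each entry of `vertices` can then be simulated in closed loop, and the worst-case indices are taken over the resulting set of responses.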
The input layer of the RBF network has two inputs, which represent the two parameters and of the controller, respectively. The number of neurons in the hidden layer is , which is the same as the number of samples in the training set; the number of neurons in the output layer is , which represents the worst-case dynamic time-domain performances of the model uncertain system. Gaussian radial basis function is used in the network.
The training of the RBF neural network takes four steps: (1) preprocessing; (2) data standardization; (3) determination of the center vector and standardized constant of the Gaussian function; and (4) obtaining the weight matrix of the output layer by the recursive least squares method. More specifically, the output function is given by

$$y_j = \sum_{i} w_{ij}\,\varphi_i(x),$$

in which $x$ is the input vector, $y_j$ is the $j$th output variable, $w_{ij}$ is the weight from hidden-layer node $i$ to output node $j$, and $\varphi_i$ is the basis function of the RBF network. The basis function of the RBF network in this paper is the Gaussian function. The specific expression is as follows:

$$\varphi_i(x) = \exp\left(-\frac{\lVert x - c_i \rVert^2}{2\sigma_i^2}\right),$$

in which $\sigma_i^2$ is the variance of the Gaussian function and $c_i$ is the center of the Gaussian function.
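A standard Gaussian RBF network with one hidden neuron per training sample can be sketched in Python as follows. The sketch makes several assumptions that are not from the paper: a single shared width `sigma` is used for all neurons, the output weights are fitted by batch least squares instead of the recursive least squares method mentioned above, and the training data are synthetic placeholders standing in for (controller parameters, worst-case indices) pairs.

```python
import numpy as np

class GaussianRBF:
    """Gaussian RBF network with fixed centers and a shared width."""

    def __init__(self, centers, sigma):
        self.centers = np.asarray(centers, dtype=float)
        self.sigma = float(sigma)
        self.weights = None

    def _phi(self, X):
        # Squared distances between every input and every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def fit(self, X, Y):
        # Batch least squares for the output-layer weights.
        Phi = self._phi(np.asarray(X, dtype=float))
        self.weights, *_ = np.linalg.lstsq(Phi, np.asarray(Y, dtype=float), rcond=None)
        return self

    def predict(self, X):
        return self._phi(np.asarray(X, dtype=float)) @ self.weights

# Synthetic placeholder data: two inputs on a 5 x 5 grid, two outputs.
grid = np.linspace(0.0, 1.0, 5)
X_train = np.array([(a, b) for a in grid for b in grid])
Y_train = np.column_stack([np.sin(X_train[:, 0]) + X_train[:, 1],
                           X_train[:, 0] * X_train[:, 1]])

# One hidden neuron per training sample: the centers are the samples themselves.
model = GaussianRBF(centers=X_train, sigma=0.25).fit(X_train, Y_train)
Y_fit = model.predict(X_train)
```

Placing one center on every training sample makes the network interpolate the training set, which matches the description of the hidden layer above but can become ill-conditioned when samples lie very close together relative to `sigma`.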
3.2.2. Performance Label Calculation
As there may exist a conflict between different robust performance indices, we employ a performance label to characterize the acceptance of a given pair of robust indices based on the experience of human experts. Note that although different expert’s is likely to be personalized to some extent, the codomain of each human cost function to achieve is a set of elements representing the quantities’ assessment of the relevant data. In this work, these data consist of the worst-case time-domain robust performance indices of the system, while the codomain elements are the nonnegative real numbers, referred to as performance labels.
For the purpose of demonstration, the performance label belongs to the interval , capturing the “acceptable” level of the closed-loop performance. More specifically, label 0 denotes the best possible performance, a label greater than 0 denotes worse performance, and 1 denotes the least “acceptable” performance. Naturally, any label greater than 1 captures “unacceptable” closed-loop performance. The label assignments corresponding to these worst-case robust performance indices are obtained by collecting expert experience.
Given the dataset above, approximating the human cost function is a typical supervised learning problem. Note that there may exist a difference between the label values from each expert because he or she may have different preferences and concerns about characterizing the performance. Therefore, the regression method is used to approximate the human cost function. In this work, BP neural network is used to establish the mapping relationship between the system performance indices and the label.
The training data of the BP network are obtained by investigation. First of all, we generate a number of output curves to illustrate the considered robust time-domain indices (e.g., Figure 5), and these outputs are given performance labels by experts.
Then, the required BP neural network can be established. The number of nodes in the input layer is , which is decided by the output dimension of the system. The output layer has just one node, indicating the value of the label. We chose the number of nodes in the hidden layer by an empirical formula , where the value range of is usually .
There are input and output training sample vectors, represented by and , respectively. The input vector is , the output vector of the network is , , and the target output vector is . Note that is the weight of the component of the input vector mapped to the component of the output vector, which is randomly allocated in the first calculation. The BP network modifies the weights by the steepest gradient descent method through the feedback of the output result, so that the sum of squared errors between the output value and the target value is minimized, that is, (14). This is repeated until the error is less than the set threshold. Here, is the learning rate of the network. The activation function of the BP network is the sigmoid function, i.e., $f(x) = 1/(1 + e^{-x})$.
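The weight update described above can be sketched as a small one-hidden-layer network trained by gradient descent on the sum of squared errors. Everything specific in the sketch is an assumption for illustration: the rule of thumb in `hidden_size` (square root of the number of inputs plus outputs, plus a small constant) is a commonly used formula and may differ from the one in the paper, the gradients are averaged over the samples, and the training data are synthetic labels rather than expert assessments.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_size(n_in, n_out, a=4):
    # Common rule of thumb (assumed here): sqrt(n_in + n_out) + a.
    return int(round(np.sqrt(n_in + n_out))) + a

def train_bp(X, t, n_hidden, lr=0.5, epochs=2000, tol=1e-4, seed=0):
    """Train a sigmoid network by gradient descent; returns (predict, loss history)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))   # random initial weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)                 # hidden-layer output
        y = sigmoid(h @ W2 + b2)                 # network output (label)
        err = y - t
        loss = 0.5 * np.sum(err ** 2)            # sum of squared errors
        history.append(loss)
        if loss < tol:                           # stop once below the threshold
            break
        delta2 = err * y * (1.0 - y)             # output-layer error signal
        delta1 = (delta2 @ W2.T) * h * (1.0 - h) # back-propagated error signal
        W2 -= lr * h.T @ delta2 / len(X)
        b2 -= lr * delta2.mean(axis=0)
        W1 -= lr * X.T @ delta1 / len(X)
        b1 -= lr * delta1.mean(axis=0)

    def predict(X_new):
        return sigmoid(sigmoid(X_new @ W1 + b1) @ W2 + b2)

    return predict, history

# Synthetic placeholder data: two normalized indices mapped to a label in [0, 1].
rng = np.random.default_rng(1)
X_idx = rng.random((50, 2))
labels = 0.6 * X_idx[:, :1] + 0.4 * X_idx[:, 1:]
predict_label, loss_history = train_bp(X_idx, labels, hidden_size(2, 1))
```

Because the output neuron is also a sigmoid, the predicted label always lies strictly between 0 and 1, so labels above 1 (the "unacceptable" range) would need a different output scaling.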
3.3. Tuning Process
As mentioned above, using the two trained neural networks, the performance label can be quickly obtained in the process of MPC parameter optimization. The specific algorithm for robust performance labeling is shown in Algorithm 2.
Interpretations: the position of each particle contains all the information of the and matrices. The larger the particle swarm, the larger the search range and the easier it is to find the global optimal solution, but the longer the running time. is the maximum number of particle iterations, and is the precision of the optimization target. , , and are the local speed, global speed, and flight acceleration of the particles, respectively. The faster the particles fly, the faster the optimization proceeds, but the easier it is to miss the optimal location. The final is the optimal set of MPC control parameters given the considered model uncertainty; is the corresponding performance label.
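A generic particle swarm optimizer of the kind described above can be sketched in Python as follows. It is a textbook PSO rather than the paper's algorithm: the inertia weight `w` and the acceleration coefficients `c1` and `c2` are conventional defaults, and the objective `label` is a hypothetical stand-in for the performance label that the two trained networks would return for a candidate pair of weighting parameters.

```python
import numpy as np

def pso_minimize(objective, lower, upper, n_particles=20, max_iter=100,
                 tol=1e-6, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over a box; returns (best position, best value)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    pos = rng.uniform(lower, upper, (n_particles, lower.size))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                   # personal best positions
    pbest_val = np.array([objective(p) for p in pos])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]     # global best so far
    for _ in range(max_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)           # keep particles in the box
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
        if gbest_val < tol:                              # target precision reached
            break
    return gbest, gbest_val

# Hypothetical surrogate label with its minimum at (2.0, 0.5).
label = lambda p: (p[0] - 2.0) ** 2 + (p[1] - 0.5) ** 2
best_params, best_label = pso_minimize(label, lower=[0.0, 0.0], upper=[5.0, 5.0])
```

In the tuning setting, `objective` would evaluate the RBF network to get the worst-case indices for a candidate and then the BP network to get its label, so each PSO evaluation avoids a closed-loop simulation.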
4. Industrial Example
This section applies the developed tuning algorithm to the pH adjustment process of the sewage treatment system shown in Figure 6 to illustrate its efficiency. In the pH neutralization process, the inlet wastewater flow, the buffer fluid flow, and the acid neutralizer flow enter the neutralization tank and determine the wastewater flow at the outlet and the liquid level of the storage tank. The flow rate of the acid neutralizer and the flow rate of the sewage at the inlet are taken as the control variables, and the pH value of the sewage at the outlet and the liquid level in the storage tank are taken as the outputs. A model predictive controller is used, and the proposed method is employed to tune the relevant controller parameters in order to regulate the pH value of the sewage.
The process can be characterized by a two-input two-output system:
Considering potential model-plant mismatch, the real model parameters are considered to be within the following ranges:
The nominal system considered is shown as follows:
Note that the model is identified via an advanced industrial control software package for the use of MPC.
The prediction and control horizons are set to , and the initial operating conditions are as follows:and the references in (8) are . The tuning parameters for the considered MPC in (8) are and , where
For illustration purpose, the weighting matrix is simplified as follows:
In our experiment, according to the requirements of the actual system for these indices, the range of variation is divided into performance intervals, and one sample is selected in each interval, giving a total of samples as the training set. Then, groups are randomly selected from the remaining samples as the test set. The parameters of the supervised machine learning are set as follows:
The root mean square error (RMSE), which reflects the prediction stability of a network, is introduced to evaluate the prediction performance of the networks:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2},$$

in which $y_i$ is the actual output, $\hat{y}_i$ is the reference output of the network, and $N$ is the number of samples. In our experiment, the training accuracy of the RBF network and BP network is 0.001, and the error curves from the testing are shown in Figures 7–8. The RMSE of the RBF network and BP network is 0.8073 and 0.0951, respectively. Note that although the obtained error value of the RBF network is a bit high, such accuracy is acceptable for the robust tuning problem at hand because, given the high level of model uncertainty considered in this work, an average error of 0.1448 for worst-case overshoot and worst-case settling time would not affect the overall tuning performance from a robust control point of view. Furthermore, as the number of data samples of the industrial control system is normally limited, it is reasonable that the construction of the network is stopped once the accuracy meets the design requirement. As for the BP network, since the relationship between the robust indices and performance label is relatively simple, the accuracy becomes higher as shown in Figure 7.
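As a small worked example of the RMSE used above, the following Python helper computes it for two arrays of equal shape. The function name and the sample values are illustrative and do not correspond to the networks or data in the experiment.

```python
import numpy as np

def rmse(actual, reference):
    """Root mean square error between two arrays of the same shape."""
    actual = np.asarray(actual, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((actual - reference) ** 2)))

# Errors of 3 and 4 give sqrt((9 + 16) / 2), roughly 3.54.
example_error = rmse([0.0, 0.0], [3.0, 4.0])
```

Because the errors are squared before averaging, a few large deviations dominate the RMSE, which is one reason it can look high even when most test samples are predicted well.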
Then, the effectiveness of the proposed robust time-domain performance calculation method is tested, and the results are shown in Figure 9, in which the red curves indicate the worst-case performance obtained by the RBF network from Algorithm 1, and the blue curves indicate the outputs generated with system parameters randomly selected in the region shown in equation (13). More specifically, the robust time-domain performance from the brute-force search is , while that from the proposed network is .
Now, we test the proposed tuning method. Figure 10 shows the change of the performance label while Algorithm 2 is searching for the optimal MPC parameters. As the PSO iterations proceed, the label value decreases until it converges. The solution is 0.3426, the corresponding tuning results are and , and the optimization time is 23 s. Figure 11 shows the optimal control parameters obtained by the brute-force search. The global optimal solution is and , the corresponding label value is 0.3450, and the optimization time is . Compared with the brute-force search method, the tuning method based on machine learning requires only of the running time while still obtaining a similar tuning result, which verifies the effectiveness of the method described in this paper.
In order to further verify the effectiveness of the method proposed in this paper, we selected three groups of controller parameters according to tuning guides utilized in industry and compared them with those obtained by Algorithm 3, and the tuning parameters of each case are shown in Table 1. Assume that the actual parameters of the system are , , , , , and , which are different from the nominal system in equation (14).
The corresponding performance indices and corresponding of each group of controller parameters are shown in Table 2. It can be seen that the controller parameters corresponding to group (d), that is, the parameters obtained through the algorithm described in this paper, have the best control effect for the uncertain system.
As far as we know, this is the first work to use machine learning to solve the MPC tuning problem considering parametric uncertainty and robust time-domain indices. Therefore, we can only compare with the most similar literature,  of this manuscript, which is a machine learning-based MPC tuning method developed for systems with no model uncertainty. According to , the MPC parameters are and when the controller is tuned based on the nominal system without considering the model uncertainty. Assume that the actual model parameters of the system are as follows: and the nominal system is expressed by equation (19). The two groups of controller parameters are applied to the system, respectively, and the system outputs are shown in Figure 13. The black curve represents the system output under the control of and , and its dynamic time-domain performance indices are the overshoot and settling time of the two outputs of the system, i.e., . The red curve represents the system output under the controller parameters and , which are tuned based on the worst-case time-domain performance indices of the uncertain system discussed in this paper, and its dynamic time-domain performance indices are . It can be seen that the method proposed in this paper greatly enhances the robustness of MPC parameter tuning for uncertain systems.
In this paper, an artificial neural network-based model predictive control (MPC) parameter tuning method for uncertain systems is proposed, which effectively addresses the low control accuracy caused by model mismatch and external disturbances and thereby improves the robustness of the controlled system. To achieve this objective, a novel method for computing the worst-case output performance in the time domain is developed through machine learning, and the potentially conflicting time-domain robust performance indices are transformed into a scalar performance label via a neural network built from expert experience. At the same time, the parallel search ability of PSO is exploited to search for the optimal tuning parameters efficiently. Finally, the method is verified on the pH regulation process of a sewage treatment system.
At the present stage, the potential shortcoming of the method is that the considered two neural networks can only be obtained offline, and for some industrial systems, it is more desirable that the offline training stage can be avoided and the learning can be realized in an online manner. Thus, the main future research direction is to employ the online learning method to develop an efficient robust tuning algorithm considering model uncertainty as well as robust time-domain performance.
No data were used to support this study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
This work was supported by NSFC of China (Grant no. 61903291), Special Scientific Research Project of Shaanxi Provincial Department of Education (19JK0459), and Basic Research Plan in Shaanxi Province of China (2020JQ-683).
X. Yu-Geng, L. De-Wei, and L. Shu, “Model predictive control—status and challenges,” Acta Automatica Sinica, vol. 39, no. 3, pp. 222–236, 2013.
K. Han, J. Zhao, and J. Qian, “A novel robust tuning strategy for model predictive control,” in Proceedings of the 2006 6th World Congress on Intelligent Control and Automation, vol. 2, pp. 6406–6410, IEEE, Dalian, China, June 2006.
G. S. Sankar, R. C. Shekhar, C. Manzie, T. Sano, and H. Nakada, “Fast calibration of a robust model predictive controller for diesel engine airpath,” IEEE Transactions on Control Systems Technology, vol. 28, no. 4, pp. 1510–1519, 2019.
G. N. Júnior, M. Martins, and R. Kalid, “A PSO-based optimal tuning strategy for constrained multivariable predictive controllers with model uncertainty,” ISA Transactions, vol. 53, no. 2, pp. 560–567, 2014.
H. Moumouh, N. Langlois, and M. Haddad, “A novel tuning approach for MPC parameters based on artificial neural network,” in Proceedings of the 2019 IEEE 15th International Conference on Control and Automation (ICCA), pp. 1638–1643, IEEE, Edinburgh, Scotland, July 2019.
L. A. Cheng, A. Dw, A. Yz, and B. Xm, “Model predictive control for path following and roll stabilization of marine vessels based on neurodynamic optimization,” Ocean Engineering, vol. 217, Article ID 107524, 2020.