Research Article | Open Access
Quan Tu, Yingjiao Rong, Jing Chen, "Parameter Identification of ARX Models Based on Modified Momentum Gradient Descent Algorithm", Complexity, vol. 2020, Article ID 9537075, 11 pages, 2020. https://doi.org/10.1155/2020/9537075
Parameter Identification of ARX Models Based on Modified Momentum Gradient Descent Algorithm
The parameter estimation problem of the ARX model is studied in this paper. First, some traditional identification algorithms are briefly introduced, and then a new parameter estimation algorithm—the modified momentum gradient descent algorithm—is developed. Two gradient directions with their corresponding step sizes are derived in each iteration. Compared with the traditional parameter identification algorithms, the modified momentum gradient descent algorithm has a faster convergence rate. A simulation example shows that the proposed algorithm is effective.
1. Introduction
There are many identification algorithms that can estimate the parameters of linear and nonlinear models, such as the coupled identification algorithms [1, 2], the filtered identification algorithms [3, 4], and the hierarchical identification algorithms [5–7]. The autoregressive exogenous (ARX) model extends the traditional autoregressive model by adding measurable external inputs at past sampling instants to generate the output. Such a model is widely used in engineering practice. For example, Naveros et al. used the ARX model to identify physical parameters of walls [8], Qin et al. applied the ARX model to control a magnetic levitation ball system [9], and Haddouche et al. utilized the ARX model to control a gas conditioning tower [10]. Since robust controller design often assumes that the structure of the system is given a priori [11–13], system identification plays an important role in control engineering. Its basic idea is to use identification algorithms to determine a mathematical model [14–17] by which the behavior of the system can be predicted.
The gradient descent algorithm is commonly used for ARX model identification. It requires little computation but converges slowly [18, 19]. Each iteration of the gradient descent algorithm consists of two steps: the first is to determine a search direction, namely the negative gradient, and the second is to determine a suitable step size along that direction. The least squares algorithm is another widely used method in system identification and converges faster [20–26]. However, the least squares algorithm is computationally heavy and requires solving the equation obtained by setting the derivative of the cost function to zero. Therefore, it is inefficient for some models with complex nonlinear structures.
To determine the optimal step size of the gradient descent algorithm, the root of a higher-order equation needs to be calculated, which is difficult and sometimes impossible. Fortunately, the stochastic gradient (SG) algorithms [27, 28] avoid the root calculation by updating the parameters at each sampling instant with only one set of input-output data. The SG algorithm is widely used in engineering practice because of its simple structure. However, because only one set of data is used at each sampling instant, the convergence rate of the SG algorithm is slow. To improve the convergence rate, Ding et al. proposed a multi-innovation stochastic gradient algorithm and a multi-innovation least squares algorithm for linear regression models [29, 30], which converge quickly. The conjugate gradient descent method also converges faster than the gradient descent algorithm, but it is only available for offline identification [31–34]. Inspired by the conjugate gradient descent algorithm, this paper proposes a modified momentum gradient descent algorithm, which has a quicker convergence rate and requires no root calculation.
The remainder of this paper is organized as follows. Section 2 introduces the ARX model and the traditional SG algorithm. The multi-innovation stochastic gradient algorithm is presented in Section 3. In Section 4, a modified momentum gradient descent algorithm is developed. A simulation example is given in Section 5. Finally, the conclusions and future directions are summarized in Section 6.
2. Stochastic Gradient Descent Algorithm
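Consider the following ARX model:
$$A(z)y(t) = B(z)u(t) + v(t),$$
where $y(t)$ is the output, $u(t)$ is the input, $v(t)$ is the noise, and $A(z)$ and $B(z)$ are polynomials in the unit backward shift operator $z^{-1}$:
$$A(z) = 1 + a_1 z^{-1} + \cdots + a_n z^{-n}, \quad B(z) = b_1 z^{-1} + \cdots + b_m z^{-m}.$$
Let $\theta$ be the true value and $\hat{\theta}(t)$ be the estimated one:
$$\theta = [a_1, \ldots, a_n, b_1, \ldots, b_m]^{T} \in \mathbb{R}^{n+m},$$
and $\varphi(t)$ is the information vector:
$$\varphi(t) = [-y(t-1), \ldots, -y(t-n), u(t-1), \ldots, u(t-m)]^{T},$$
so that the model can be written as $y(t) = \varphi^{T}(t)\theta + v(t)$.
Define the cost function as follows:
$$J(\theta) = \frac{1}{2}\left[y(t) - \varphi^{T}(t)\theta\right]^{2}.$$
To obtain the minimum value of $J(\theta)$, let the iteration function be
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \mu(t)\varphi(t)e(t), \quad e(t) = y(t) - \varphi^{T}(t)\hat{\theta}(t-1),$$
where $\varphi(t)e(t)$ is the negative gradient of $J$ at $\hat{\theta}(t-1)$ and $\mu(t)$ is the step size.
In order to get the minimum value of $J(\hat{\theta}(t))$, use
$$J(\hat{\theta}(t)) = \frac{1}{2}\left[1 - \mu(t)\|\varphi(t)\|^{2}\right]^{2}e^{2}(t)$$
and let $\partial J(\hat{\theta}(t))/\partial\mu(t) = 0$, which gives $\mu(t) = 1/\|\varphi(t)\|^{2}$. The steepest descent algorithm can be obtained:
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \frac{\varphi(t)}{\|\varphi(t)\|^{2}}\,e(t).$$
Remark 1. When $\hat{\theta}(t)$ is close to the true value, the calculated step size would be imprecise, which causes the estimation error to fluctuate. Therefore, the steepest descent algorithm is inefficient.
The SG algorithm below deals with this problem:
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \frac{\varphi(t)}{r(t)}\,e(t), \quad r(t) = r(t-1) + \|\varphi(t)\|^{2}, \quad r(0) = 1.$$
Remark 2. The step size $1/r(t)$ decreases as time increases. When $\hat{\theta}(t)$ is close to the true value, the smaller step size reduces the fluctuation dramatically.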
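The recursions of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`simulate_arx`, `projection_step`, `sg_identify`), the coefficients passed to the simulator, the input range, and the initial value `r0 = 1` are all hypothetical choices made for the example.

```python
import math
import random


def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def simulate_arx(a, b, num_samples, noise_std=0.0, seed=0):
    """Simulate A(z)y(t) = B(z)u(t) + v(t); return information vectors and outputs.

    a = [a_1..a_n] and b = [b_1..b_m] are hypothetical coefficients (n, m >= 1).
    The input is uniform on [-1, 1] and the noise is zero-mean Gaussian.
    """
    rng = random.Random(seed)
    theta = list(a) + list(b)
    y_hist = [0.0] * len(a)  # y(t-1), ..., y(t-n)
    u_hist = [0.0] * len(b)  # u(t-1), ..., u(t-m)
    phis, ys = [], []
    for _ in range(num_samples):
        phi = [-y for y in y_hist] + list(u_hist)
        y_t = dot(phi, theta) + rng.gauss(0.0, noise_std)
        phis.append(phi)
        ys.append(y_t)
        u_t = rng.uniform(-1.0, 1.0)
        y_hist = [y_t] + y_hist[:-1]
        u_hist = [u_t] + u_hist[:-1]
    return phis, ys


def projection_step(theta_hat, phi, y):
    """One steepest descent step with the optimal step size 1 / ||phi||^2."""
    e = y - dot(phi, theta_hat)
    norm2 = dot(phi, phi)
    if norm2 == 0.0:  # no information in this sample, keep the estimate
        return list(theta_hat)
    return [th + p * e / norm2 for th, p in zip(theta_hat, phi)]


def sg_identify(phis, ys, r0=1.0):
    """Stochastic gradient recursion with step size 1 / r(t)."""
    theta_hat = [0.0] * len(phis[0])
    r = r0
    for phi, y in zip(phis, ys):
        e = y - dot(phi, theta_hat)   # innovation e(t)
        r += dot(phi, phi)            # r(t) = r(t-1) + ||phi(t)||^2
        theta_hat = [th + p * e / r for th, p in zip(theta_hat, phi)]
    return theta_hat
```

For noise-free data each SG step cannot increase the parameter error norm, because the step size `1 / r(t)` is always smaller than the optimal `1 / ||phi(t)||^2`; this is also why SG is slow compared with the algorithms of the following sections.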
3. Two-Innovation Stochastic Gradient (TI-SG) Descent Algorithm
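Because of the slow convergence rate of the SG algorithm, Ding proposed a multi-innovation stochastic gradient (MI-SG) algorithm in [29]. As a special case of the MI-SG algorithm, when two sets of input-output data are used in each iteration, we term it the two-innovation stochastic gradient (TI-SG) algorithm.
For the ARX model, two sets of input-output data are collected in each iteration as follows:
$$\Phi(t) = [\varphi(t), \varphi(t-1)], \quad Y(t) = [y(t), y(t-1)]^{T}.$$
Establishing the following two functions $J_1(\theta)$ and $J_2(\theta)$, we get
$$J_1(\theta) = \frac{1}{2}\left[y(t) - \varphi^{T}(t)\theta\right]^{2}, \quad J_2(\theta) = \frac{1}{2}\left[y(t-1) - \varphi^{T}(t-1)\theta\right]^{2}.$$
We can calculate the negative gradient directions $d_1(t)$ and $d_2(t)$ at $\hat{\theta}(t-1)$, respectively:
$$d_1(t) = \varphi(t)\left[y(t) - \varphi^{T}(t)\hat{\theta}(t-1)\right], \quad d_2(t) = \varphi(t-1)\left[y(t-1) - \varphi^{T}(t-1)\hat{\theta}(t-1)\right].$$
The cost function is established as follows:
$$J(\theta) = J_1(\theta) + J_2(\theta) = \frac{1}{2}\left\|Y(t) - \Phi^{T}(t)\theta\right\|^{2}.$$
Let the iteration function be
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \mu(t)\left[d_1(t) + d_2(t)\right] = \hat{\theta}(t-1) + \mu(t)\Phi(t)E(t), \quad E(t) = Y(t) - \Phi^{T}(t)\hat{\theta}(t-1).$$
Update the parameters $\hat{\theta}(t)$, then the cost function is
$$J(\hat{\theta}(t)) = \frac{1}{2}\left\|E(t) - \mu(t)\Phi^{T}(t)\Phi(t)E(t)\right\|^{2}.$$
There are two ways to calculate the step size $\mu(t)$:
(1) The two-innovation stochastic gradient descent algorithm has the same step size as that in the SG algorithm. Let the initial value of the step size be 0. The TI-SG algorithm can be designed as
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \frac{\Phi(t)E(t)}{r(t)}, \quad r(t) = r(t-1) + \|\varphi(t)\|^{2}.$$
(2) The other method is to calculate the optimal step size, which is called the modified two-innovation stochastic gradient (MT-SG) descent algorithm.
Let $\partial J(\hat{\theta}(t))/\partial\mu(t)$ equal 0, then
$$\mu(t) = \frac{\|\Phi(t)E(t)\|^{2}}{\|\Phi^{T}(t)\Phi(t)E(t)\|^{2}}.$$
The MT-SG algorithm can be designed as
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \frac{\|\Phi(t)E(t)\|^{2}}{\|\Phi^{T}(t)\Phi(t)E(t)\|^{2}}\,\Phi(t)E(t).$$
Remark 3. The traditional two-innovation algorithm and the modified two-innovation algorithm use two gradients and assume that the two gradient directions have the same step size. Although this reduces the computational effort, it is not optimal. Because each gradient direction plays a different role in estimating the parameters, it is necessary to consider assigning a different weight to each gradient.
Remark 4. Compared with the traditional two-innovation method, the modified two-innovation method calculates the optimal step size at each sampling instant. Therefore, the modified two-innovation algorithm has a faster convergence rate but a heavier computational load.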
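The difference between the two step-size choices can be illustrated with one update of each kind. The sketch below is written under assumptions: the function names are hypothetical, the TI-SG normalizer is taken to grow by the squared norm of the newest information vector, and the MT-SG step size is the minimizer of the summed two-point cost along the common direction.

```python
def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def two_point_cost(theta, phi_new, y_new, phi_old, y_old):
    """Summed cost over the two most recent data sets."""
    e_new = y_new - dot(phi_new, theta)
    e_old = y_old - dot(phi_old, theta)
    return 0.5 * (e_new ** 2 + e_old ** 2)


def common_direction(theta_hat, phi_new, y_new, phi_old, y_old):
    """Sum of the two negative gradients evaluated at theta_hat."""
    e_new = y_new - dot(phi_new, theta_hat)
    e_old = y_old - dot(phi_old, theta_hat)
    return [pn * e_new + po * e_old for pn, po in zip(phi_new, phi_old)]


def tisg_step(theta_hat, r_prev, phi_new, y_new, phi_old, y_old):
    """TI-SG update with an SG-style step size 1 / r (assumed normalizer)."""
    g = common_direction(theta_hat, phi_new, y_new, phi_old, y_old)
    r = r_prev + dot(phi_new, phi_new)
    return [th + gi / r for th, gi in zip(theta_hat, g)], r


def mtsg_step(theta_hat, phi_new, y_new, phi_old, y_old):
    """MT-SG update: one optimal step size shared by both gradients."""
    g = common_direction(theta_hat, phi_new, y_new, phi_old, y_old)
    # h holds the change of the two residuals per unit step along g
    h = [dot(phi_new, g), dot(phi_old, g)]
    denom = dot(h, h)
    if denom == 0.0:  # direction carries no information, keep the estimate
        return list(theta_hat), 0.0
    mu = dot(g, g) / denom
    return [th + mu * gi for th, gi in zip(theta_hat, g)], mu
```

Because the cost is quadratic in the step size, any other step along the same direction gives a cost at least as large as the MT-SG one, which is the sense in which the step is optimal. The price is the extra inner products needed at every sampling instant.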
4. Modified Momentum Gradient Descent Algorithm (MMG)
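Before introducing the modified momentum gradient descent algorithm, we first introduce the conjugate gradient descent algorithm.
Assume that we have collected $L$ sets of input-output data. The collected information vectors and outputs are $\Phi$ and $Y$, respectively,
$$\Phi = [\varphi(1), \varphi(2), \ldots, \varphi(L)]^{T}, \quad Y = [y(1), y(2), \ldots, y(L)]^{T}.$$
Set up the cost function as follows:
$$J(\theta) = \frac{1}{2}\left\|Y - \Phi\theta\right\|^{2}.$$
To calculate the minimum value of $J(\theta)$, simply make $\partial J(\theta)/\partial\theta = 0$:
$$\Phi^{T}\Phi\theta = \Phi^{T}Y.$$
Let $A = \Phi^{T}\Phi$ and $b = \Phi^{T}Y$. When the number of rows $L$ of $\Phi$ is greater than $n+m$ and $\Phi$ has full column rank, $A$ is a symmetric positive definite matrix, with $A \in \mathbb{R}^{(n+m)\times(n+m)}$ and $b \in \mathbb{R}^{n+m}$.
Using the conjugate gradient descent method to solve the matrix equation $A\theta = b$, let $r_k = b - A\theta_k$, where $r_k$ is the current negative gradient direction. Combine the previous iteration direction $d_{k-1}$ with the current negative gradient direction $r_k$ to form the new iteration direction $d_k$, which is $d_k = r_k + \beta_k d_{k-1}$. Making $d_k$ and $d_{k-1}$ conjugate about $A$, that is, $d_k^{T}Ad_{k-1} = 0$, we have
$$\beta_k = -\frac{r_k^{T}Ad_{k-1}}{d_{k-1}^{T}Ad_{k-1}}.$$
Let the iteration function be
$$\theta_{k+1} = \theta_k + \alpha_k d_k,$$
where $\alpha_k$ is the step size and $d_k$ is the iteration direction, then
$$J(\theta_{k+1}) = \frac{1}{2}\left\|Y - \Phi(\theta_k + \alpha_k d_k)\right\|^{2}.$$
Calculating the minimum value of $J(\theta_{k+1})$ and letting
$$\frac{\partial J(\theta_{k+1})}{\partial\alpha_k} = -d_k^{T}\left(r_k - \alpha_k Ad_k\right) = 0$$
yield
$$\alpha_k = \frac{d_k^{T}r_k}{d_k^{T}Ad_k}.$$
The conjugate gradient descent algorithm can be designed as
$$r_k = b - A\theta_k, \quad d_k = r_k - \frac{r_k^{T}Ad_{k-1}}{d_{k-1}^{T}Ad_{k-1}}\,d_{k-1}, \quad \theta_{k+1} = \theta_k + \frac{d_k^{T}r_k}{d_k^{T}Ad_k}\,d_k,$$
with $d_0 = r_0$.
Remark 5. Here $r_k$ is the negative gradient direction at the current position and $d_{k-1}$ is the direction of the last iteration. The current iteration direction $d_k$ is obtained from $r_k$ and $d_{k-1}$. Compared with the traditional gradient descent method, this method has a faster convergence rate but a heavier computational load.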
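The batch (offline) character of the conjugate gradient method is easiest to see in code: all data are first condensed into the normal equations, and only then does the iteration start. The following is a minimal sketch assuming the stacked data matrix has full column rank; the function names and the stopping tolerance are illustrative choices, not taken from the paper.

```python
def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def mat_vec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in A]


def normal_equations(Phi, Y):
    """Build A = Phi^T Phi and b = Phi^T Y from stacked data (rows of Phi)."""
    cols = list(zip(*Phi))
    A = [[dot(ci, cj) for cj in cols] for ci in cols]
    b = [dot(ci, Y) for ci in cols]
    return A, b


def conjugate_gradient(A, b, num_iters=None, tol=1e-24):
    """Solve A theta = b for symmetric positive definite A, starting from 0."""
    n = len(b)
    theta = [0.0] * n
    r = list(b)   # negative gradient b - A*theta at theta = 0
    d = list(r)   # first direction is the negative gradient itself
    for _ in range(num_iters or n):
        if dot(r, r) <= tol:
            break
        Ad = mat_vec(A, d)
        dAd = dot(d, Ad)
        alpha = dot(d, r) / dAd                      # optimal step along d
        theta = [t + alpha * di for t, di in zip(theta, d)]
        r = [bi - ai for bi, ai in zip(b, mat_vec(A, theta))]
        beta = -dot(r, Ad) / dAd                     # enforces d_new^T A d = 0
        d = [ri + beta * di for ri, di in zip(r, d)]
    return theta
```

In exact arithmetic the loop terminates after at most as many iterations as there are parameters, which is why the default iteration count is the dimension of `b`.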
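Inspired by the conjugate gradient descent method, the modified momentum gradient descent algorithm is proposed. Its basic idea is to use two gradient directions in each iteration (sampling instant) and then to assign a different step size to each direction.
When the TI-SG algorithm is used, one set of data is shared by two neighbouring sampling instants, which makes the step sizes unsolvable. To overcome this difficulty, a new method is developed. For the ARX model, collect two sets of information vectors and two outputs in each iteration, denoted $\varphi_1(t)$, $\varphi_2(t)$, $y_1(t)$, and $y_2(t)$, so that no data set is shared between neighbouring iterations.
Establish two cost functions $J_1(\theta)$ and $J_2(\theta)$ as follows:
$$J_i(\theta) = \frac{1}{2}\left[y_i(t) - \varphi_i^{T}(t)\theta\right]^{2}, \quad i = 1, 2.$$
Using $\hat{\theta}(t-1)$ to calculate the negative gradient directions yields
$$d_i(t) = \varphi_i(t)e_i(t), \quad e_i(t) = y_i(t) - \varphi_i^{T}(t)\hat{\theta}(t-1), \quad i = 1, 2.$$
Let the iteration function be
$$\hat{\theta}(t) = \hat{\theta}(t-1) + \mu_1(t)d_1(t) + \mu_2(t)d_2(t).$$
Then, the cost functions are
$$J_i(\hat{\theta}(t)) = \frac{1}{2}\left[e_i(t) - \mu_1(t)\varphi_i^{T}(t)d_1(t) - \mu_2(t)\varphi_i^{T}(t)d_2(t)\right]^{2}, \quad i = 1, 2.$$
Let $\partial J_1/\partial\mu_1$, $\partial J_1/\partial\mu_2$, $\partial J_2/\partial\mu_1$, and $\partial J_2/\partial\mu_2$ all be equal to 0, then
$$\mu_1(t)\varphi_1^{T}(t)d_1(t) + \mu_2(t)\varphi_1^{T}(t)d_2(t) = e_1(t), \quad \mu_1(t)\varphi_2^{T}(t)d_1(t) + \mu_2(t)\varphi_2^{T}(t)d_2(t) = e_2(t).$$
Let $G(t) = \left[\varphi_i^{T}(t)d_j(t)\right]_{i,j=1,2}$ and $E(t) = [e_1(t), e_2(t)]^{T}$, we have
$$[\mu_1(t), \mu_2(t)]^{T} = G^{-1}(t)E(t).$$
The MMG algorithm is listed as follows:
$$\hat{\theta}(t) = \hat{\theta}(t-1) + [d_1(t), d_2(t)]\,G^{-1}(t)E(t).$$
The steps of the MMG algorithm are summarized in Algorithm 1.
Remark 6. In each iteration, the MMG algorithm uses two directions and assigns the optimal step size to each direction. Therefore, it has a quicker convergence rate. In addition, some iterative algorithms [35–38] and recursive algorithms [39–42] can be extended to study the parameter identification of the ARX models in this paper.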
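The idea of assigning a separate step size to each of two gradient directions can be sketched as follows. This is a hedged illustration, not the authors' Algorithm 1: it assumes that the two step sizes are chosen so that both current residuals vanish, that consecutive iterations use non-overlapping pairs of data, and that a single-direction optimal step is an acceptable safeguard when the 2-by-2 system is singular. The names `mmg_step`, `mmg_identify`, and the threshold `eps` are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def mmg_step(theta_hat, phi1, y1, phi2, y2, eps=1e-12):
    """One update along two negative gradients, each with its own step size."""
    e1 = y1 - dot(phi1, theta_hat)
    e2 = y2 - dot(phi2, theta_hat)
    d1 = [p * e1 for p in phi1]  # negative gradient of the first cost
    d2 = [p * e2 for p in phi2]  # negative gradient of the second cost
    # Linear system G [mu1, mu2]^T = [e1, e2]^T that zeroes both residuals
    g11, g12 = dot(phi1, d1), dot(phi1, d2)
    g21, g22 = dot(phi2, d1), dot(phi2, d2)
    det = g11 * g22 - g12 * g21
    if abs(det) < eps:
        # Safeguard (an assumption of this sketch; the threshold is absolute
        # and therefore scale dependent): optimal step on the larger residual
        phi, e = (phi1, e1) if abs(e1) >= abs(e2) else (phi2, e2)
        norm2 = dot(phi, phi)
        if norm2 == 0.0:
            return list(theta_hat)
        return [th + p * e / norm2 for th, p in zip(theta_hat, phi)]
    mu1 = (e1 * g22 - g12 * e2) / det  # Cramer's rule
    mu2 = (g11 * e2 - g21 * e1) / det
    return [th + mu1 * a + mu2 * c for th, a, c in zip(theta_hat, d1, d2)]


def mmg_identify(phis, ys):
    """Run mmg_step over non-overlapping pairs of consecutive data sets."""
    theta_hat = [0.0] * len(phis[0])
    for k in range(0, len(phis) - 1, 2):
        theta_hat = mmg_step(theta_hat, phis[k + 1], ys[k + 1], phis[k], ys[k])
    return theta_hat
```

Note that an individual step size may be negative; only the combined update is required to reduce the two residuals. The singular case arises, for example, when one residual is already zero, which is exactly what happens if a data set is reused in two neighbouring iterations.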
5. Simulation Example
Consider the following ARX model:
The input data is a random sequence with a uniform distribution on , and is Gaussian white noise with . The simulation data are shown in Figure 1.
The relative errors of each element in the parameter vector by using these four algorithms are shown in Figure 3 (, 50, and 100).
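A relative error of the kind plotted for each parameter can be computed as in the sketch below. The exact definition used for the figures is not recoverable from the text, so both an element-wise version and a norm-based version are shown as assumptions; the function names are hypothetical.

```python
import math


def relative_errors(theta_hat, theta_true):
    """Element-wise relative error |estimate - true| / |true|."""
    return [abs(h - t) / abs(t) for h, t in zip(theta_hat, theta_true)]


def relative_error_norm(theta_hat, theta_true):
    """Relative error of the whole parameter vector in the Euclidean norm."""
    num = math.sqrt(sum((h - t) ** 2 for h, t in zip(theta_hat, theta_true)))
    den = math.sqrt(sum(t ** 2 for t in theta_true))
    return num / den
```

The element-wise version requires every true parameter to be nonzero; the norm-based version only requires a nonzero parameter vector.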
Select 100 new data sets based on the true model, and use the models estimated by the SG, TI-SG, MT-SG, and MMG algorithms to generate the predicted outputs, respectively. The errors between the true outputs and the predicted outputs are shown in Figure 4.
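The validation step can be sketched as below, assuming that the predicted outputs are one-step-ahead predictions, that is, the inner product of each new information vector with the estimated parameter vector. The text does not say whether one-step-ahead or free-run prediction was used, so this choice and the function names are assumptions.

```python
import math


def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def prediction_errors(phis, ys, theta_hat):
    """Errors between measured outputs and one-step-ahead predictions."""
    return [y - dot(phi, theta_hat) for phi, y in zip(phis, ys)]


def rms(values):
    """Root mean square of a non-empty list, used to summarize the errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

Comparing `rms(prediction_errors(...))` across the estimated models gives a single number per algorithm that complements a plot of the individual errors.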
Finally, a Monte Carlo experiment is performed by using the MMG algorithm with 100 sets of noise, and the results are shown in Figure 5.
The following conclusions can be obtained:
(1) It can be seen from Figures 2 and 3 and Tables 2 and 3 that the MT-SG algorithm converges significantly faster than the original TI-SG algorithm.
(2) From Figures 2 and 3 and Tables 1–4, we can see that the MMG algorithm has the fastest convergence rate among the four algorithms.
(3) Figure 4 demonstrates that the model estimated by the MMG algorithm is the most accurate one among the four estimated models.
(4) Figure 5 shows that the MMG algorithm is robust to the noise.
6. Conclusions
This paper proposes an improved gradient descent algorithm for ARX models inspired by the conjugate gradient descent method. Since two gradient directions and their two corresponding step sizes are involved in each iteration, the proposed algorithm has a quicker convergence rate. The simulation example shows the effectiveness of the proposed algorithm. This algorithm increases the convergence rate and does not require root calculation. It can therefore be combined with other identification techniques [43–46] to study the parameter estimation of linear and nonlinear stochastic systems with colored noises [47–50] and can be extended to other fields [51–54], such as signal modeling, parameter identification, information processing, and engineering application systems [55–57].
Although the MMG algorithm is expected to be a powerful tool for parameter identification, the analysis of its convergence properties remains an open and challenging problem.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61973137), the Fundamental Research Funds for the Central Universities (No. JUSRP22016), and the Funds of the Science and Technology on Near-Surface Detection Laboratory (No. TCGZ2019A001).
References
- F. Ding, “Coupled-least-squares identification for multivariable systems,” IET Control Theory & Applications, vol. 7, no. 1, pp. 68–79, 2013.
- F. Ding, G. Liu, and X. P. Liu, “Partially coupled stochastic gradient identification methods for non-uniformly sampled systems,” IEEE Transactions on Automatic Control, vol. 55, no. 8, pp. 1976–1981, 2010.
- F. Ding, Y. Wang, and J. Ding, “Recursive least squares parameter identification algorithms for systems with colored noise using the filtering technique and the auxilary model,” Digital Signal Processing, vol. 37, pp. 100–108, 2015.
- Y. Wang and F. Ding, “Novel data filtering based parameter identification for multiple-input multiple-output systems using the auxiliary model,” Automatica, vol. 71, pp. 308–313, 2016.
- J. Ding, F. Ding, X. P. Liu, and G. Liu, “Hierarchical least squares identification for linear SISO systems with dual-rate sampled-data,” IEEE Transactions on Automatic Control, vol. 56, no. 11, pp. 2677–2683, 2011.
- F. Ding, “Hierarchical multi-innovation stochastic gradient algorithm for Hammerstein nonlinear system modeling,” Applied Mathematical Modelling, vol. 37, no. 4, pp. 1694–1704, 2013.
- F. Ding, System Identification-Iterative Search Principle and Identification Methods, Science Press, Beijing, China, 2018.
- I. Naveros, C. Ghiaus, D. P. Ruíz, and S. Castaño, “Physical parameters identification of walls using ARX models obtained by deduction,” Energy and Buildings, vol. 108, no. 12, pp. 317–329, 2015.
- Y. Qin, H. Peng, F. Zhou, X. Zeng, and J. Wu, “Nonlinear modeling and control approach to magnetic levitation ball system using functional weight RBF network-based state-dependent ARX model,” Journal of the Franklin Institute, vol. 352, no. 10, pp. 4309–4338, 2015.
- R. Haddouche, B. Chetate, and M. S. Boumedine, “Neural network ARX model for gas conditioning tower,” International Journal of Modelling and Simulation, vol. 39, no. 3, pp. 166–177, 2019.
- J. Na, Z. Yang, S. Kamal, L. Hu, W. Wang, and Y. Zhou, “Bio-inspired learning and adaptation for optimization and control of complex systems,” Complexity, vol. 2019, Article ID 9325364, 3 pages, 2019.
- J. Na, Y. P. Li, Y. B. Huang, G. Gao, and Q. Chen, “Output feedback control of uncertain hydraulic servo systems,” IEEE Transactions on Industrial Electronics, vol. 67, no. 1, pp. 490–500, 2019.
- J. Na, B. Jing, Y. Huang, G. Gao, and C. Zhang, “Unknown system dynamics estimator for motion control of nonlinear robotic systems,” IEEE Transactions on Industrial Electronics, vol. 67, no. 5, pp. 3850–3859, 2020.
- Q. M. Zhu, “A back propagation algorithm to estimate the parameters of non-linear dynamic rational models,” Applied Mathematical Modelling, vol. 27, no. 3, pp. 169–187, 2003.
- Q. Zhu, D. Yu, and D. Zhao, “An enhanced linear Kalman filter (EnLKF) algorithm for parameter estimation of nonlinear rational models,” International Journal of Systems Science, vol. 48, no. 3, pp. 451–461, 2017.
- D. Wang, L. Mao, and F. Ding, “Recasted models-based hierarchical extended stochastic gradient method for MIMO nonlinear systems,” IET Control Theory & Applications, vol. 11, no. 4, pp. 476–485, 2017.
- D. Wang, S. Zhang, M. Gan, and J. Qiu, “A novel EM identification method for Hammerstein systems with missing output data,” IEEE Transactions on Industrial Informatics, vol. 16, no. 4, pp. 2500–2508, 2020.
- J. Zhang, Q. Zhu, and Y. Li, “Convergence time calculation for supertwisting algorithm and application for nonaffine nonlinear systems,” Complexity, vol. 2019, Article ID 6235190, 15 pages, 2019.
- J. Zhang, Q. Zhu, Y. Li, and X. L. Wu, “Homeomorphism mapping based neural networks for finite time constraint control of a class of nonaffine pure-feedback nonlinear systems,” Complexity, vol. 2019, Article ID 9053858, 11 pages, 2019.
- D. Wang, L. Li, Y. Ji, and Y. Yan, “Model recovery for Hammerstein systems using the auxiliary model based orthogonal matching pursuit method,” Applied Mathematical Modelling, vol. 54, pp. 537–550, 2018.
- F. Ding, F. Wang, L. Xu, T. Hayat, and A. Alsaedi, “Parameter estimation for pseudo-linear systems using the auxiliary model and the decomposition technique,” IET Control Theory & Applications, vol. 11, no. 3, pp. 390–400, 2017.
- F. Ding, F. Wang, L. Xu, and M. Wu, “Decomposition based least squares iterative identification algorithm for multivariate pseudo-linear ARMA systems using the data filtering,” Journal of the Franklin Institute, vol. 354, no. 3, pp. 1321–1339, 2017.
- Y. Ji, X. Jiang, and L. Wan, “Hierarchical least squares parameter estimation algorithm for two-input Hammerstein finite impulse response systems,” Journal of the Franklin Institute, vol. 357, no. 8, pp. 5019–5032, 2020.
- M. H. Li and X. M. Liu, “Maximum likelihood least squares based iterative estimation for a class of bilinear systems using the data filtering technique,” International Journal of Control Automation and Systems, vol. 18, no. 6, pp. 1581–1592, 2020.
- F. Ding, “Two-stage least squares based iterative estimation algorithm for CARARMA system modeling,” Applied Mathematical Modelling, vol. 37, no. 7, pp. 4798–4808, 2013.
- F. Ding, “Decomposition based fast least squares algorithm for output error systems,” Signal Processing, vol. 93, no. 5, pp. 1235–1242, 2013.
- A. Jentzen and P. V. Wurstemberger, “Lower error bounds for the stochastic gradient descent optimization algorithm: sharp convergence rates for slowly and fast decaying learning rates,” Journal of Complexity, vol. 57, Article ID 101438, 2019.
- D. Wang, Y. Yan, Y. Liu, and J. Ding, “Model recovery for Hammerstein systems using the hierarchical orthogonal matching pursuit method,” Journal of Computational and Applied Mathematics, vol. 345, pp. 135–145, 2019.
- F. Ding and T. Chen, “Performance analysis of multi-innovation gradient type identification methods,” Automatica, vol. 43, no. 1, pp. 1–14, 2007.
- F. Ding, X. P. Liu, and G. Liu, “Multi-innovation least squares identification for linear and pseudo-linear regression models,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 3, pp. 767–778, 2010.
- O. Kyohei, F. Susumu, and S. Shota, “Impact of novel incorporation of CT-based segment mapping into a conjugated gradient algorithm on bone SPECT imaging: fundamental characteristics of a context-specific reconstruction method,” Asia Oceania Journal of Nuclear Medicine & Biology, vol. 7, no. 1, pp. 49–57, 2019.
- I. S. Armstrong, “Spatial dependence of activity concentration recovery for a conjugate gradient (Siemens xSPECT) algorithm using manufacturer-defined reconstruction presets,” Nuclear Medicine Communications, vol. 40, no. 3, pp. 287–293, 2019.
- A. R. Heravi and G. A. Hodtani, “A new correntropy-based conjugate gradient backpropagation algorithm for improving training in neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 12, pp. 6252–6263, 2019.
- J. Hu and C. Ma, “Conjugate gradient least squares algorithm for solving the generalized coupled Sylvester-conjugate matrix equations,” Applied Mathematics and Computation, vol. 334, pp. 174–191, 2018.
- F. Ding, Y. Liu, and B. Bao, “Gradient-based and least-squares-based iterative estimation algorithms for multi-input multi-output systems,” Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering, vol. 226, no. 1, pp. 43–55, 2012.
- F. Ding, X. Liu, and J. Chu, “Gradient-based and least-squares-based iterative algorithms for Hammerstein systems using the hierarchical identification principle,” IET Control Theory & Applications, vol. 7, no. 2, pp. 176–184, 2013.
- L. Xu, L. Chen, and W. Xiong, “Parameter estimation and controller design for dynamic systems from the step responses based on the Newton iteration,” Nonlinear Dynamics, vol. 79, no. 3, pp. 2155–2163, 2015.
- L. Xu, “The damping iterative parameter identification method for dynamical systems based on the sine signal measurement,” Signal Processing, vol. 120, pp. 660–667, 2016.
- L. Xu, “The parameter estimation algorithms based on the dynamical response measurement data,” Advances in Mechanical Engineering, vol. 9, no. 11, 2017.
- L. Xu, W. Xiong, A. Alsaedi, and T. Hayat, “Hierarchical parameter estimation for the frequency response based on the dynamical window data,” International Journal of Control, Automation and Systems, vol. 16, no. 4, pp. 1756–1764, 2018.
- F. Ding, “Combined state and least squares parameter estimation algorithms for dynamic systems,” Applied Mathematical Modelling, vol. 38, no. 1, pp. 403–412, 2014.
- X. Zhang and F. Ding, “Hierarchical parameter and state estimation for bilinear systems,” International Journal of Systems Science, vol. 51, no. 2, pp. 275–290, 2020.
- J. Pan, X. Jiang, X. Wan, and W. Ding, “A filtering based multi-innovation extended stochastic gradient algorithm for multivariable control systems,” International Journal of Control, Automation and Systems, vol. 15, no. 3, pp. 1189–1197, 2017.
- H. Ma, J. Pan, F. Ding, L. Xu, and W. Ding, “Partially-coupled least squares based iterative parameter estimation for multi-variable output-error-like autoregressive moving average systems,” IET Control Theory & Applications, vol. 13, no. 18, pp. 3040–3051, 2019.
- M. Li, X. Liu, and F. Ding, “The filtering-based maximum likelihood iterative estimation algorithms for a special class of nonlinear systems with autoregressive moving average noise using the hierarchical identification principle,” International Journal of Adaptive Control and Signal Processing, vol. 33, no. 7, pp. 1189–1211, 2019.
- Y. Zhang, X. Li, G. Zhao, B. Lu, and C. C. Cavalcante, “Signal reconstruction of compressed sensing based on alternating direction method of multipliers,” Circuits, Systems, and Signal Processing, vol. 39, no. 1, pp. 307–323, 2020.
- J. Pan, W. Li, and H. Zhang, “Control algorithms of magnetic suspension systems based on the improved double exponential reaching law of sliding mode control,” International Journal of Control, Automation and Systems, vol. 16, no. 6, pp. 2878–2887, 2018.
- Y. Chang, G. Zhai, B. Fu, and L. Xiong, “Quadratic stabilization of switched uncertain linear systems: a convex combination approach,” IEEE/CAA Journal of Automatica Sinica, vol. 6, no. 5, pp. 1116–1126, 2019.
- L. Tang, G. J. Liu, M. Yang, F. Li, F. Ye, and C. Li, “Joint design and torque feedback experiment of rehabilitation robot,” Advances in Mechanical Engineering, vol. 12, no. 5, 2020.
- T. Wu, F. Ye, Y. Su, Y. Wang, and S. Riffat, “Coordinated control strategy of DC microgrid with hybrid energy storage system to smooth power output fluctuation,” International Journal of Low-Carbon Technologies, vol. 15, no. 1, pp. 46–54, 2020.
- Y. Zhang, M. Huang, T. Wu, and F. Ji, “Reconfigurable equilibrium circuit with additional power supply,” International Journal of Low-Carbon Technologies, vol. 15, no. 1, pp. 106–111, 2020.
- L. Wang, H. Liu, L. Dai, and Y. Liu, “Novel method for identifying fault location of mixed lines,” Energies, vol. 11, no. 6, p. 1529, 2018.
- H. Liu, Q. Zou, and Z. Zhang, “Energy disaggregation of appliances consumptions using ham approach,” IEEE Access, vol. 7, pp. 185977–185990, 2019.
- N. Zhao, Y.-C. Liang, and Y. Pei, “Dynamic contract incentive mechanism for cooperative wireless networks,” IEEE Transactions on Vehicular Technology, vol. 67, no. 11, pp. 10970–10982, 2018.
- X. Zhao, Z. Lin, B. Fu, L. He, and N. Fang, “Research on automatic generation control with wind power participation based on predictive optimal 2-degree-of-freedom PID strategy for multi-area interconnected power system,” Energies, vol. 11, no. 12, p. 3325, 2018.
- F. Ding, X. Zhang, and L. Xu, “The innovation algorithms for multivariable state-space models,” International Journal of Adaptive Control and Signal Processing, vol. 33, no. 11, pp. 1601–1618, 2019.
- F. Ding, L. Lv, J. Pan, X. Wan, and X.-B. Jin, “Two-stage gradient-based iterative estimation methods for controlled autoregressive systems using the measurement data,” International Journal of Control, Automation and Systems, vol. 18, no. 4, pp. 886–896, 2020.
Copyright © 2020 Quan Tu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.