Mathematical Problems in Engineering
Volume 2014, Article ID 572092, 9 pages
Research Article

Iterative Selection of Unknown Weights in Direct Weight Optimization Identification

1School of Mechanical and Electronic Engineering, Jingdezhen Ceramic Institute, Jingdezhen 333403, China
2Faculty of Information Systems and Technologies, University of Donja Gorica, 81101 Podgorica, Montenegro

Received 5 December 2013; Revised 1 April 2014; Accepted 2 April 2014; Published 8 May 2014

Academic Editor: Rongni Yang

Copyright © 2014 Xiao Xuan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Abstract

For direct weight optimization identification of nonlinear systems, we add linear terms in the input sequences to the original linear affine function in order to better approximate the nonlinearity. Two classes of unknown weights then appear in the extended function. This paper derives in detail how to choose these unknown weights, from theoretical analysis and from engineering practice, respectively, and clarifies the role that each class of weights plays. The theoretical analysis shows that the added weights play an auxiliary role in the whole process of approximating the nonlinear system. The practical analysis shows how to transform the complex optimization problem into a standard quadratic programming problem, which can then be solved by the basic interior point method. Finally, simulation results confirm the efficiency and feasibility of the proposed strategies.

1. Introduction

The theory of system identification can be divided into linear and nonlinear system identification. In the classical reference [1], the identification of linear systems is discussed in the time domain; there, the system identification procedure is divided into four steps, and the accuracy of various identification algorithms is analyzed in a probabilistic framework. The time domain identification is extended to the frequency domain in [2]. For nonlinear system identification, [3] points out that a nonlinear system can be approximately regarded as a linear term plus a distortion term, where the distortion term contains all the nonlinear characteristics of the system. In [4], many special nonlinear systems are studied, for example, the Wiener system and the Hammerstein system. Various identification methods have been proposed to solve these nonlinear system identification problems, such as the minimum probability method, the covariance instrumental variable method, and the blind maximum likelihood method. The most practical method for identifying a nonlinear system is the basis function method. After a group of basis functions is selected a priori, the nonlinear system is approximately expanded in these basis functions. To attain the required accuracy, the approximation error between the expansion and the nonlinear system is driven to zero by adjusting the unknown weight of each basis function. In [5], the construction of orthonormal basis functions from prior poles of the denominator is given.

Based on the idea of adjusting unknown weights to improve the approximation accuracy of a basis function expansion, a new nonlinear system identification method, direct weight optimization, was proposed in [6]. Its core idea is to first select an estimator that is linear in the observed output data of the nonlinear system, with the adjustable weights contained in this linear affine expression. When disturbance noise exists, an optimization problem is obtained by minimizing the approximation error. The optimal weights are derived in theory through the classical KKT optimality condition. In [7], the basic idea of direct weight optimization is applied to identify each weight in a piecewise affine system. In [8], the effect of perturbations on direct weight optimization is analyzed; it is pointed out that when the perturbation range of one parameter tends to infinity, the solution can be expressed as a piecewise linear solution path.

Building on these references, we collect not only the observed output sequences but also the input sequences. Because the input sequences can be designed freely, both sequences are known as prior information. We therefore include the observed output and input sequences in the linear affine function simultaneously, so that there are two kinds of unknown weights, one for the input sequences and one for the observed output sequences. Compared with [3], many unknown weights corresponding to the input sequences are added. These weights not only alleviate the dependence on the weights of the observed output sequences alone but also avoid the negative effect of perturbations. After adding the linear terms in the input sequences, the expected mean square error is adopted as the criterion function for selecting the unknown weights. The contribution of this paper is to deduce the selection strategy for these weights from theory and from engineering practice, respectively. We obtain the unknown weights using the KKT sufficient and necessary optimality condition and find that the second class of unknown weights, corresponding to the observed output sequences, is easy to obtain. The concrete expressions of the second class of weights do not depend on the first class of weights, which correspond to the input sequences. The whole selection process shows that the second class of weights plays the key role and the first class plays an auxiliary role. This auxiliary effect of the first class of weights, however, should not be neglected.

This paper is organized as follows. In Section 2, we describe the problem discussed in this paper. In Section 3, we propose to add the input sequences to the linear affine function and derive an upper bound on the objective function. In Section 4, we derive the two kinds of unknown weights by using the optimality KKT condition from [9]. In Section 5, the interior point algorithm is applied to solve a quadratic programming problem to get the unknown weights. The convergence of the two methods is analyzed in Section 6. In Section 7, numerical simulation results are given to validate the efficiency. Finally, the conclusions are drawn in Section 8.

2. Problem Description

Given the observed data from the nonlinear system, where is an unknown nonlinear system which needs to be identified, is called the regression vector, and is an independent zero-mean stochastic white noise with variance . When the regression vector is chosen in the following form, the nonlinear system is called an exogenous input model: Suppose a linear affine function is used to approximate the nonlinear system as follows: In (3), a linear term composed of terms of the input sequence is added, so more unknown weights must be identified. Because the approximation performance depends strongly on the unknown weights, the main goal of this paper is to determine a parameter vector consisting of the unknown weights:
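For orientation, the setup described above can be written out in the notation commonly used in the direct weight optimization literature. Every symbol below ($f$, $\varphi(t)$, $e(t)$, $\sigma^2$, $n_a$, $n_b$, $N$, $w_t$, $v_t$) is an assumption made for this sketch and may differ from the paper's own notation; in particular, the form of the added input term is only a guess at the structure the text describes.

```latex
% Assumed notation for illustration; not copied from the paper's equations (1)-(3).
\begin{align}
y(t) &= f(\varphi(t)) + e(t), \qquad \mathbb{E}\,e(t) = 0, \quad \mathbb{E}\,e^{2}(t) = \sigma^{2}, \\
\varphi(t) &= \big[\, y(t-1), \dots, y(t-n_a),\; u(t-1), \dots, u(t-n_b) \,\big]^{T}, \\
\hat f(\varphi^{*}) &= \sum_{t=1}^{N} w_t\, y(t) + \sum_{t=1}^{N} v_t\, u(t).
\end{align}
```

In this reading, $w_t$ are the weights on the observed outputs and $v_t$ are the additional weights on the inputs.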

3. Direct Weight Optimization Identification

As the nonlinear system is approximated by the linear affine function , we want to find a linear affine function at an arbitrarily given point . The approximation accuracy depends on the weights and . The most commonly used criterion function is the mean square error: Substituting (3) into (5), we obtain Substituting (1) into (6), the objective function is expanded to the following expression: To simplify the description, we introduce the notation . Adding and subtracting the same two terms leaves the equality unchanged. Consider In (8), the squared term is called the squared bias term and the last term is the variance error term caused by the unmodeled factor. From (8), we see that the bias term can be arbitrarily large unless we impose two constraints on the unknown weights : Under (9), the objective function can be simplified to the following expression: Expanding the nonlinear system in a Taylor series around gives Assume that the nonlinear function satisfies the following Lipschitz condition: where is a constant. Combining the above three formulas, we obtain an upper bound on the mean square error (10). Consider
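As a reference point, the same chain of steps for the original output-only estimator of Roll, Nazin, and Ljung can be sketched as follows. The notation is assumed ($w_t$ for the output weights, $\tilde\varphi_t = \varphi(t) - \varphi^{*}$, and $L$ for the Lipschitz constant of the gradient of $f$), and the sketch deliberately omits the input-weight terms added in this paper, so it should not be read as the paper's bound (13).

```latex
% Output-only case, assumed notation; the paper's version carries extra input-weight terms.
\begin{align}
\mathbb{E}\big(\hat f(\varphi^{*}) - f(\varphi^{*})\big)^{2}
  &= \Big(\sum_{t=1}^{N} w_t f(\varphi(t)) - f(\varphi^{*})\Big)^{2}
     + \sigma^{2} \sum_{t=1}^{N} w_t^{2}, \\
\sum_{t=1}^{N} w_t &= 1, \qquad \sum_{t=1}^{N} w_t\, \tilde\varphi_t = 0, \\
\big| f(\varphi(t)) - f(\varphi^{*}) - \nabla f(\varphi^{*})^{T} \tilde\varphi_t \big|
  &\le \frac{L}{2}\, \|\tilde\varphi_t\|^{2}, \\
\mathbb{E}\big(\hat f(\varphi^{*}) - f(\varphi^{*})\big)^{2}
  &\le \Big(\frac{L}{2} \sum_{t=1}^{N} |w_t|\, \|\tilde\varphi_t\|^{2}\Big)^{2}
     + \sigma^{2} \sum_{t=1}^{N} w_t^{2}.
\end{align}
```

The two equality constraints are what remove the unknown values $f(\varphi^{*})$ and $\nabla f(\varphi^{*})$ from the bias term, which is why the bias would otherwise be unbounded.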

Minimizing the expected mean square error can be replaced by minimizing the upper bound on the right side of (13). Hence, we obtain the following optimization problem:

Because an additional term exists in (14), the optimization problem considered in this paper is more complex.
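To make the trade-off in such a bound concrete, the following minimal sketch evaluates an upper bound of the form $(\tfrac{L}{2}\sum_t |w_t| d_t^2)^2 + \sigma^2 \sum_t w_t^2$ for candidate weights in the simpler output-only, scalar-regressor case. The function names, the data, and the constants are all hypothetical, and the paper's additional input-weight term is not modeled.

```python
def dwo_upper_bound(weights, offsets, lipschitz, noise_var):
    """Squared worst-case bias plus variance for scalar regressor offsets d_t."""
    bias = 0.5 * lipschitz * sum(abs(w) * d * d for w, d in zip(weights, offsets))
    variance = noise_var * sum(w * w for w in weights)
    return bias ** 2 + variance


def is_feasible(weights, offsets, tol=1e-9):
    """Check the two equality constraints: sum w_t = 1 and sum w_t * d_t = 0."""
    sum_ok = abs(sum(weights) - 1.0) <= tol
    moment_ok = abs(sum(w * d for w, d in zip(weights, offsets))) <= tol
    return sum_ok and moment_ok


# Hypothetical data: five regressor offsets placed symmetrically around the target point.
offsets = [-0.2, -0.1, 0.0, 0.1, 0.2]
uniform = [0.2] * 5                        # plain average of all samples
narrow = [0.0, 1 / 3, 1 / 3, 1 / 3, 0.0]   # average of the three closest samples

for name, w in (("uniform", uniform), ("narrow", narrow)):
    bound = dwo_upper_bound(w, offsets, lipschitz=4.0, noise_var=0.01)
    print(name, is_feasible(w, offsets), bound)
```

Weights concentrated near the target point lower the bias term but raise the variance term; the optimization problem balances the two.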

4. Optimality KKT Sufficient and Necessary Condition

Notice that (14) contains some absolute value operations. Slack variables , are introduced to eliminate the absolute values as follows: Using these slack variables , in (14), the optimization problem can be formulated as
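The slack-variable device is standard and can be illustrated on the simpler output-only bound. The notation ($s_t$, $w_t$, $\tilde\varphi_t$, $L$, $\sigma^{2}$) is assumed for this sketch, and the paper's problem (16) contains further terms and slack variables for the input weights that are not shown here.

```latex
% Assumed notation; both problems share the equality constraints on w.
\begin{align}
&\min_{w}\ \Big(\frac{L}{2} \sum_{t} |w_t|\, \|\tilde\varphi_t\|^{2}\Big)^{2}
   + \sigma^{2} \sum_{t} w_t^{2}
 \quad \text{s.t.} \quad \sum_{t} w_t = 1, \ \ \sum_{t} w_t\, \tilde\varphi_t = 0 \\
\Longleftrightarrow\quad
&\min_{w,\,s}\ \Big(\frac{L}{2} \sum_{t} s_t\, \|\tilde\varphi_t\|^{2}\Big)^{2}
   + \sigma^{2} \sum_{t} w_t^{2}
 \quad \text{s.t.} \quad -s_t \le w_t \le s_t, \ \ \sum_{t} w_t = 1, \ \ \sum_{t} w_t\, \tilde\varphi_t = 0 .
\end{align}
```

Since $s_t \ge |w_t| \ge 0$ and the objective is nondecreasing in each $s_t$, an optimal solution can always be taken with $s_t = |w_t|$, so the two problems have the same optimal value.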

Now, the next problem is to solve the optimization problem (16). Applying the optimality KKT sufficient and necessary condition to (16), the Lagrangian function is written as where and are the Lagrangian multipliers corresponding to the equality constraints and and are the Lagrangian multiplier vectors corresponding to the inequality constraints.

From the optimality KKT condition, we find the equality relations for the optimal solution as follows: By analyzing the subformulas in (20), we find several implicit optimality equalities: From the first subformula in (20), we see that . Further, if in the ninth subformula in (20), then we see that . The ninth subformula holds even when , so from the first subformula we derive that In the second subformula in (20), if , it implies that If the eighth subformula in (20) holds, we set and, from the first subformula, we see that .

When all the equalities hold, all unknown weights of the input sequences are equal to zero. Combining the two cases and , we obtain that Substituting (24) into each subformula in (20), every subformula in (20) can be simplified: The equality relations represented by the fourth and fifth subformulas in (25) are completely implied by the constructed Lagrangian function. Substituting the third subformula into the second subformula, we get When , from the seventh subformula in (25), we get .

If the seventh subformula holds, let . Substituting in the first subformula, we get Substituting the above equality into (26), the following equality holds: When considering , we get Collecting the above equality relations, we get All of the above shows how to solve for the unknown weights . Substituting (28) into the third subformula in (25), we see that The following three equations are established: From (32), we can see that Then,

Generally, when considered in the complex domain, it is easy to get that as represents the amplitude of the input excitation signal. When this amplitude is chosen to be constant , then (35) implies

In the linear algebra from [10], the commonly used selection method is to impose a constraint on the unknown weights in order to guarantee uniqueness: To eliminate the absolute value in (36), assume that the former weights are positive and the latter weights are negative. Thus, we get From the singular (degenerate) linear equation (38), we get a group of unknown weight sequences by selecting free variables.

5. Solve the Unknown Weights Iteratively

To solve for the unknown weights iteratively from a practical point of view, suppose in (16), and there exist three kinds of decision variables: , , .

For convenience, introduce a column vector whose dimension is . Consider Writing the inequality constraints in (16) in matrix product form, where is an zero vector, and denoting the left-hand side of the above equality as matrix , is . The inequality constraints can be simplified to Similarly, the two equality constraints can be written in matrix product form as follows: where 0 is a zero vector, and denoting the left-hand side of the above equality as matrix , is . The equality constraints can be simplified to It is obvious that the second term of the objective function can be rewritten as Furthermore, the expression in the brackets of the objective function can be rewritten as Squaring (45), we get Combining (41), (43), (44), and (46), a new optimization problem is obtained, in which the new objective function (47) is a quadratic function of the decision variable . Also, the inequality and equality constraints are linear functions of . Hence, (47) is a quadratic programming problem. The interior point method is applied to solve it.
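The stacking step can be sketched for the simpler output-only case with scalar regressors, where the decision vector is $x = [w_1, \dots, w_N, s_1, \dots, s_N]$ and the objective $(\tfrac{L}{2}\sum_t s_t d_t^2)^2 + \sigma^2 \sum_t w_t^2$ becomes $x^T H x$. This is an illustrative sketch: `build_qp`, `quad_form`, the data, and the block layout are assumptions, and the paper's vector also carries the input weights.

```python
def build_qp(offsets, lipschitz, noise_var):
    """Assemble H, A_ineq, A_eq, b_eq for x = [w, s] (output-only DWO sketch)."""
    n = len(offsets)
    size = 2 * n
    d = [o * o for o in offsets]
    # Objective x^T H x: variance block on w, rank-one bias block on s.
    H = [[0.0] * size for _ in range(size)]
    for i in range(n):
        H[i][i] = noise_var
        for j in range(n):
            H[n + i][n + j] = 0.25 * lipschitz ** 2 * d[i] * d[j]
    # Inequalities A_ineq x <= 0 encode  w_t - s_t <= 0  and  -w_t - s_t <= 0.
    A_ineq = []
    for i in range(n):
        upper = [0.0] * size
        upper[i], upper[n + i] = 1.0, -1.0
        lower = [0.0] * size
        lower[i], lower[n + i] = -1.0, -1.0
        A_ineq.extend([upper, lower])
    # Equalities A_eq x = b_eq encode  sum w_t = 1  and  sum w_t d_t = 0.
    A_eq = [[1.0] * n + [0.0] * n, list(offsets) + [0.0] * n]
    b_eq = [1.0, 0.0]
    return H, A_ineq, A_eq, b_eq


def quad_form(H, x):
    """Evaluate x^T H x."""
    return sum(x[i] * H[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))


# Hypothetical data: uniform weights with slacks s_t = |w_t|.
offsets = [-0.2, -0.1, 0.0, 0.1, 0.2]
H, A_ineq, A_eq, b_eq = build_qp(offsets, lipschitz=4.0, noise_var=0.01)
x = [0.2] * 5 + [0.2] * 5
print(quad_form(H, x))
```

Any off-the-shelf quadratic programming solver could then be applied to these matrices; which solver to use is left open here.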

Defining the Lagrangian function according to (47), Setting the partial derivative with respect to to zero, we get the equality Introducing a slack variable to eliminate the inequality constraint , we rewrite (49) Suppose that the matrix comprised by (50) is where The constrained minimum is found by updating the unknown vector iteratively. This minimum is a stationary point of the Lagrangian function. During the minimization, a new iterate is obtained by adding a correction term to the current estimate. When applying the constrained Gauss-Newton method, the correction term must satisfy the following equality: At time , the new iterate is defined as the vector where the step length along the search direction must satisfy the following inequality: The search direction is determined by (53). We may add a Levenberg-Marquardt parameter based on in order to avoid singularity. It changes the top left block of the left-hand matrix in (53) to the matrix . This guarantees that the inverse matrix exists and is bounded.
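The flavor of this update can be sketched on an equality-constrained quadratic program, where each iteration solves a linear system whose top left block is damped by a Levenberg-Marquardt parameter. This is a simplified illustration, not the paper's algorithm: it ignores the inequality constraints and their slack variables, takes full steps (step length 1), and all names (`solve_linear`, `damped_kkt_step`, `mu`) and the test problem are hypothetical.

```python
def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z


def damped_kkt_step(H, c, A, b, x, mu):
    """Solve [[H + mu*I, A^T], [A, 0]] [dx; lam] = [-(H x + c); b - A x]."""
    n, m = len(x), len(b)
    K = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            K[i][j] = H[i][j] + (mu if i == j else 0.0)
    for r in range(m):
        for j in range(n):
            K[n + r][j] = A[r][j]
            K[j][n + r] = A[r][j]
    grad = [sum(H[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
    resid = [b[r] - sum(A[r][j] * x[j] for j in range(n)) for r in range(m)]
    sol = solve_linear(K, [-g for g in grad] + resid)
    return sol[:n], sol[n:]


# Hypothetical problem: minimize x1^2 + x2^2 - 2*x1 - 5*x2 subject to x1 + x2 = 1.
H = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -5.0]
A = [[1.0, 1.0]]
b = [1.0]
x = [0.0, 0.0]
for _ in range(50):
    dx, lam = damped_kkt_step(H, c, A, b, x, mu=0.1)
    x = [xi + di for xi, di in zip(x, dx)]  # full step; a line search would scale dx
print(x)
```

With `mu = 0` a single step solves this small problem exactly; a positive `mu` trades that for a system matrix that stays invertible even when `H` is singular, at the cost of more iterations.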

6. Algorithms Analysis

Now, we analyze the convergence of the two algorithms (20) and (54), respectively. From Sections 4 and 5, we see that the solution of (20) is derived from the optimality KKT sufficient and necessary condition and the solution of (54) is an iterative solution.

According to the optimality KKT necessary and sufficient condition which is similar to [11], the convergence of the algorithm used to identify the unknown weights is given.

Theorem 1. Assume that is a solution of the quadratic programming problem (47) which satisfies the optimality KKT necessary and sufficient condition (20). If the matrix is positive semidefinite for some Lagrangian multipliers and , then is a global solution of the quadratic programming problem (47).

Proof. If is any other feasible point for (47), we have that for all . Hence, using the optimality KKT necessary and sufficient condition, we have that By elementary manipulation, we find that where the first inequality follows from (56) and the second inequality follows from the positive semidefiniteness of . We have shown that for any feasible , so is a global solution.
Theorem 1 tells us that if a solution satisfying all the equalities in (20) can be found, then it is a global solution of the original quadratic programming problem.
When the interior point algorithm is applied to solve (47) iteratively, the following convergence result can be established.
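The argument behind Theorem 1 can be written out for a generic convex quadratic program, following the standard textbook reasoning. The notation below ($G$, $c$, $a_i$, $b_i$, $\lambda_i^{*}$, and the index sets $\mathcal{E}$, $\mathcal{I}$ of equality and inequality constraints) is assumed for this sketch and need not match the symbols used in (47).

```latex
% Generic convex QP with assumed notation; x is any feasible point, x^* a KKT point.
\begin{align}
q(x) &= \tfrac{1}{2} x^{T} G x + c^{T} x, \qquad
  a_i^{T} x = b_i \ (i \in \mathcal{E}), \quad a_i^{T} x \ge b_i \ (i \in \mathcal{I}), \\
G x^{*} + c &= \sum_{i \in \mathcal{E} \cup \mathcal{I}} \lambda_i^{*} a_i, \qquad
  \lambda_i^{*} \ge 0, \ \ \lambda_i^{*} \big(a_i^{T} x^{*} - b_i\big) = 0 \ \ (i \in \mathcal{I}), \\
(x - x^{*})^{T} (G x^{*} + c)
  &= \sum_{i \in \mathcal{E}} \lambda_i^{*} a_i^{T} (x - x^{*})
   + \sum_{i \in \mathcal{I}} \lambda_i^{*} \big(a_i^{T} x - b_i\big) \ \ge\ 0, \\
q(x) &= q(x^{*}) + (x - x^{*})^{T} (G x^{*} + c)
   + \tfrac{1}{2} (x - x^{*})^{T} G\, (x - x^{*}) \ \ge\ q(x^{*}).
\end{align}
```

The equality-constraint sum vanishes because both points are feasible, the inequality-constraint sum is nonnegative by complementarity and feasibility of $x$, and the last quadratic term is nonnegative because $G$ is positive semidefinite.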

Theorem 2. Suppose that the quadratic function and the linear functions , are all twice continuously differentiable in a neighborhood of a regular stationary point with associated multipliers ,.
Suppose also that the functions , used to set the value of satisfy , and are continuous at . Then, there exists a neighborhood of such that, if the first iterate lies in this neighborhood, the above interior point algorithm is well defined and generates iteratively by (54) a sequence converging superlinearly to .

Proof. Simplifying (53) to emphasize the iteration number, the linear system (53) can be written as If is in some neighborhood of the regular stationary point , with associated multipliers , satisfies .
Furthermore, is nonsingular and has a bounded inverse on that neighborhood. With the notation and with the objective and constraint functions all twice continuously differentiable, we have Using and taking norms, we get where is a positive constant. Since is continuous at and the last estimate gives , it implies the superlinear convergence of to and

7. Simulation Example

As the nonlinear system can be approximated by a linear affine function using direct weight optimization method, we apply this idea to approximate the Stribeck nonlinear friction which appears in the flight simulation turntable system.

The Stribeck nonlinear friction model is described as where is the maximum static friction force, is the Coulomb friction force, is the viscous friction coefficient, and is the critical Stribeck speed. Let us regard in (56) as in (1) and apply the new linear affine function to approximate the Stribeck nonlinear friction model as follows: where is treated as the input signal. We minimize the performance function (10) to obtain the unknown parameter vector . The interior point algorithm is applied to solve it, and the number of is selected by trial and error. When is increased to some fixed value, we check whether the performance index function still changes appreciably. If it does not, this fixed value is taken as the number of . Next, we carry out some simulations on the Stribeck nonlinear friction.
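For readers who want to experiment, the sketch below implements one common form of the Stribeck curve, $F(v) = [F_c + (F_m - F_c)\, e^{-(v/v_s)^2}]\,\mathrm{sgn}(v) + k_v v$. This particular expression is an assumption, since the paper's own formula is not reproduced here, and all parameter values are hypothetical placeholders rather than values from the simulation.

```python
import math


def stribeck_friction(v, f_max=1.5, f_coulomb=1.0, k_viscous=0.4, v_stribeck=0.05):
    """Friction force at speed v for an assumed exponential Stribeck curve.

    f_max: maximum static friction force, f_coulomb: Coulomb friction force,
    k_viscous: viscous friction coefficient, v_stribeck: critical Stribeck speed.
    """
    if v == 0.0:
        return 0.0  # the static (sticking) regime is not modeled in this sketch
    sign = 1.0 if v > 0 else -1.0
    decay = math.exp(-((v / v_stribeck) ** 2))
    return (f_coulomb + (f_max - f_coulomb) * decay) * sign + k_viscous * v


# Sample the curve at a few speeds to see the drop from static to Coulomb friction.
for speed in (0.001, 0.05, 0.2, 1.0):
    print(speed, stribeck_friction(speed))
```

The steep drop near zero speed is the strongly nonlinear region; at higher speeds the curve is close to the affine function $F_c + k_v v$, which is consistent with an affine approximation working better there.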

In Figure 1, we plot the friction force against the speed under a sine position input signal. We compare the true nonlinear friction with the proposed method and the classical method. The black curve represents the true nonlinear friction force, the green curve represents the linear affine approximation obtained by our method, and the red curve represents the curve obtained by the method of [3]. At low speed, the difference between the curves is obvious. As the speed increases, the black and green curves coincide, while the red curve starts to deviate from the black curve. This means that the linear affine friction force derived by our method matches the true nonlinear friction force at higher speeds. Hence, if the speed is sufficiently high, the linear affine friction force of this paper can be used in place of the true nonlinear friction force. The classical method needs more time to approximate the true nonlinear friction force.

Figure 1: The relations between the friction force and the speed under sine position input signal.

In Figure 2, we plot the friction force against the speed under a slope position input signal in the flight simulation turntable. From Figure 2, we see that, from the beginning, the linear affine function derived by our method closely approximates the nonlinear friction force with little oscillation. For the classical method, the error is high even at the beginning, and the curve oscillates much more during the approximation process.

Figure 2: The relations between the friction force and the speed under slope position input signal.

We plot the crawl phenomenon under a slope position input signal in Figure 3. In Figure 3, each output of the nonlinear friction model consists of irregular curves, and each output of the linear affine function model consists of piecewise lines. The approximation uses these piecewise lines to approximate the irregular curve over different time periods. In every time period, the approximation error is defined as the deviation between the line and the corresponding curve. At the beginning, this deviation is larger. As time goes on, the lines come close to the curve and the approximation error becomes small.

Figure 3: The crawl phenomenon under slope position input signal.

8. Conclusion

This paper derives how to choose the unknown weights in the improved direct weight optimization method, from theory and from engineering practice, respectively. Because the input sequences should be designed to sufficiently excite the nonlinear system, optimal input signal design is left for future research.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgments

This work was supported by grants from the National Science Foundation of China (no. 31260273), the China-Montenegro Intergovernmental S&T Cooperation, and the Jiangxi Provincial Foundation for Leaders of Disciplines in Science (20113BCB22008).


References

  1. L. Ljung, System Identification: Theory for the User, Prentice Hall, Upper Saddle River, NJ, USA, 1999.
  2. S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.
  3. J. Roll, A. Nazin, and L. Ljung, “Nonlinear system identification via direct weight optimization,” Automatica, vol. 41, no. 3, pp. 475–490, 2005.
  4. J. Roll, A. Bemporad, and L. Ljung, “Identification of piecewise affine systems via mixed-integer programming,” Automatica, vol. 40, no. 1, pp. 37–50, 2004.
  5. J. Roll, “Piecewise linear solution paths with application to direct weight optimization,” Automatica, vol. 44, no. 11, pp. 2745–2753, 2008.
  6. S. Paoletti, J. Roll, A. Garulli, and A. Vicino, “On the input-output representation of piecewise affine state space models,” IEEE Transactions on Automatic Control, vol. 55, no. 1, pp. 60–73, 2010.
  7. E.-W. Bai and Y. Liu, “Recursive direct weight optimization in nonlinear system identification: a minimal probability approach,” IEEE Transactions on Automatic Control, vol. 52, no. 7, pp. 1218–1231, 2007.
  8. R. Pintelon and J. Schoukens, System Identification: A Frequency Domain Approach, IEEE Press, New York, NY, USA, 2001.
  9. J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research and Financial Engineering, Springer, New York, NY, USA, 2nd edition, 2006.
  10. M. N. Zeilinger, C. N. Jones, and M. Morari, “Real-time suboptimal model predictive control using a combination of explicit MPC and online optimization,” IEEE Transactions on Automatic Control, vol. 56, no. 7, pp. 1524–1534, 2011.
  11. A. Van Mulders, J. Schoukens, and L. Vanbeylen, “Identification of systems with localised nonlinearity: from state-space to block-structured models,” Automatica, vol. 49, no. 5, pp. 1392–1396, 2013.