Abstract
We propose a new type of neural adaptive control based on dynamic neural networks. For a class of unknown nonlinear systems, a feedback linearization controller based on a neural identifier is first designed. Dead-zone and projection techniques are applied to ensure the stability of the neural identification. Four types of compensator are then presented, and the stability of the closed-loop system is proven.
1. Introduction
Feedback control of nonlinear systems is a major challenge for engineers, especially when complete model information is not available. A reasonable solution is to identify the nonlinear system and then design an adaptive feedback controller based on the identifier. Neural networks appear to be a very effective tool for identifying complex nonlinear systems when complete model information is unavailable, or even when the controlled plant is treated as a “black box”.
Neuroidentifiers can be classified as static (feedforward) or dynamic (recurrent) [1]. Most publications on nonlinear system identification use static networks, for example multilayer perceptrons, which approximate the nonlinear function on the right-hand side of the dynamic model equations [2]. The main drawback of these networks is that the weight updating uses information on the local data structure (local optima) and the function approximation is sensitive to the training data [3]. Dynamic neural networks can overcome this disadvantage and behave adequately in the presence of unmodeled dynamics because their structure incorporates feedback [4–6].
Neurocontrol appears to be a very useful tool for unknown systems because it is model-free, that is, the controller does not depend on a model of the plant. Many kinds of neurocontrol have been proposed in recent years. For example, supervised neurocontrol [7] is able to clone human actions: the neural network inputs correspond to sensory information perceived by the human, and the outputs correspond to the human control actions. Direct inverse control [1] uses an inverse model of the plant cascaded with the plant, so the composed system results in an identity map between the desired response and the plant response, but the absence of feedback reduces its robustness. Internal model neurocontrol [8] uses forward and inverse models within the feedback loop. Adaptive neurocontrol has two kinds of structure: indirect and direct adaptive control. Direct neuroadaptive control realizes the controller directly with a neural network [1]. The indirect method combines a neural network identifier with adaptive control, and the controller is derived from the online identification [5].
In this paper we extend our previous results in [9, 10]. In [9], the neurocontrol was derived by the gradient principle, so the neural control is locally optimal. No restriction is needed, because the controller does not include the inverse of the weights. In [10], we assumed that the inverse of the weights exists, so the standard learning law was used. The main contributions of this paper are as follows: (1) a special weight updating law is proposed to ensure the existence of the neurocontrol; (2) four different robust compensators are proposed. By means of a Lyapunov-like analysis, we derive stability conditions for the neuroidentifier and the adaptive controller. We show that the neuroidentifier-based adaptive control is effective for a large class of unknown nonlinear systems.
2. Neuroidentifier
The controlled nonlinear plant is given as where is unknown vector function. In order to realize indirect neural control, a parallel neural identifier is used as in [9, 10] (in [5] the series-parallel structure is used): where is the state of the neural network, are the weight matrices, is a stable matrix. The vector functions, is a diagonal matrix. Function is selected as ., for example may be linear saturation function, The elements of the weight matrices are selected as monotone increasing functions, a typical presentation is sigmoid function: where . In order to avoid , we select
Remark 1. The dynamic neural network (2) has been discussed by many authors, for example [4, 5, 9, 10]. It can be seen that the Hopfield model is a special case of this network with and and are the resistance and capacitance at the th node of the network, respectively.
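To make the structure of such an identifier concrete, the following is a minimal forward-Euler sketch of one step of a dynamic neural network. It assumes the commonly used form x̂' = A x̂ + W1 σ(x̂) + W2 φ(x̂) γ(u) with a diagonal stable A, sigmoid activations, and a linear saturation for γ; this form, the sigmoid parameters, and the function names `sigmoid`, `saturate`, and `network_step` are illustrative assumptions and are not taken from network (2) itself.

```python
import math

def sigmoid(z, a=1.0, b=1.0, c=0.0):
    """Generic sigmoid a / (1 + exp(-b z)) - c; a, b, c are illustrative values."""
    return a / (1.0 + math.exp(-b * z)) - c

def saturate(u, limit=1.0):
    """Linear saturation, used here as the bounded input function gamma."""
    return max(-limit, min(limit, u))

def network_step(x_hat, u, a_diag, w1, w2, dt):
    """One Euler step of x_hat' = A x_hat + W1 sigma(x_hat) + W2 phi(x_hat) gamma(u).

    A is assumed diagonal (a_diag holds its negative diagonal entries) and
    phi is treated as a diagonal matrix stored as a list.
    """
    n = len(x_hat)
    sigma = [sigmoid(v, a=2.0, c=1.0) for v in x_hat]  # values in (-1, 1)
    phi = [sigmoid(v, c=-0.5) for v in x_hat]          # values in (0.5, 1.5), never zero
    gamma = [saturate(v) for v in u]
    x_next = []
    for i in range(n):
        drift = a_diag[i] * x_hat[i]
        drift += sum(w1[i][j] * sigma[j] for j in range(n))
        drift += sum(w2[i][j] * phi[j] * gamma[j] for j in range(n))
        x_next.append(x_hat[i] + dt * drift)
    return x_next

# Usage: with bounded activations and a stable A the state stays bounded.
state = [0.0, 0.0]
for _ in range(2000):
    state = network_step(state, [5.0, -5.0], [-1.0, -2.0],
                         [[0.5, -0.3], [0.2, 0.4]], [[1.0, 0.0], [0.0, 1.0]], 0.01)
```

Shifting the sigmoid used for φ so that it never reaches zero is one simple way to keep the input channel from vanishing, which matters when the controller later needs an inverse.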
Let us define the identification error as Generally, the dynamic neural network (2) cannot follow the nonlinear system (1) exactly. The nonlinear system may be written as where and are initial matrices of and and are matrices known a priori, and the vector function can be regarded as modeling error and disturbances. Because and are chosen as sigmoid functions, they satisfy the following Lipschitz property: where , , , , and are known positive constant matrices. The error dynamics are obtained from (2) and (7): where . As in [4, 5, 9, 10], we assume the modeling error is bounded. (A1) The unmodeled dynamic satisfies is a known positive constant matrix.
If we define and the matrices and are selected to fulfill the following conditions: the pair () is controllable, the pair () is observable, and the local frequency condition [9] is satisfied: then the following assumption can be established. (A2) There exist a stable matrix and a strictly positive definite matrix such that the matrix Riccati equation: has a positive solution .
This condition is easily fulfilled if we select as a stable diagonal matrix. The following theorem states the learning procedure of the neuroidentifier.
Theorem 2. Subject to assumptions A1 and A2 being satisfied, if the weights and are updated as where is the solution of the Riccati equation (14), are projection functions which are defined as where the “condition” is or is a positive constant, and is a dead-zone function, then the weight matrices and identification error remain bounded, that is, for any the identification error fulfills the following tracking performance: where is the condition number of defined as .
Proof. Select a Lyapunov function as
where is a positive definite matrix. According to (10), the derivative is
Since is a scalar, using (9) and the matrix inequality
where are any matrices and is any positive definite matrix, we obtain
In view of the matrix inequality (22) and (A1),
So we have
Since and , if we use (A2), we have
(I) If , using the updating law (15) we can conclude that
(a) if or , (b) if and is bounded. Integrating (27) from 0 up to yields
Because , we have
where is the condition number of
(II) If , the weights become constants, and remains bounded. And
From (I) and (II), is bounded, so (18) is realized. From (20) and , we know . Using (30) and (31), (19) is obtained. The theorem is proved.
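The matrix inequality invoked in the proof can be checked numerically. The sketch below assumes it has the standard form XᵀY + YᵀX ≤ XᵀΛ⁻¹X + YᵀΛY for any matrices X, Y and any positive definite Λ, since the displayed equation is not reproduced here. Λ is restricted to a diagonal matrix for simplicity, and the helper names are hypothetical.

```python
import random

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def combine(a, b, sign):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inequality_gap(x, y, lam_diag):
    """Return X^T L^-1 X + Y^T L Y - X^T Y - Y^T X for L = diag(lam_diag) > 0."""
    n = len(lam_diag)
    lam = [[lam_diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    lam_inv = [[1.0 / lam_diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    xt, yt = transpose(x), transpose(y)
    bound = combine(matmul(matmul(xt, lam_inv), x), matmul(matmul(yt, lam), y), 1.0)
    cross = combine(matmul(xt, y), matmul(yt, x), 1.0)
    return combine(bound, cross, -1.0)

def quadratic_form(m, z):
    """z^T M z; nonnegative for every z when M is positive semidefinite."""
    return sum(z[i] * m[i][j] * z[j] for i in range(len(z)) for j in range(len(z)))

# Usage: the gap matrix should be positive semidefinite for random data.
rng = random.Random(0)
x = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
y = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
gap = inequality_gap(x, y, [0.5, 3.0])
```

The gap equals (Λ^(-1/2)X − Λ^(1/2)Y)ᵀ(Λ^(-1/2)X − Λ^(1/2)Y), which is why it can never be negative definite.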
Remark 3. The weight update law (15) uses two techniques. The dead-zone is applied to overcome the robustness problem caused by the unmodeled dynamic . In the presence of disturbances or unmodeled dynamics, adaptive procedures may easily become unstable. The lack of robustness of parameter identification was demonstrated in [11] and became a hot issue in the 1980s. The dead-zone method is one simple and effective tool. The second technique is the projection approach, which guarantees that the parameters remain within a constrained region and does not alter the properties of the adaptive law established without projection [12]. The projection approach proposed in this paper is explained in Figure 1. We hope to force inside the ball of center and radius . If , we use the normal gradient algorithm. When is on the ball, and the vector points either inside or along the ball, that is, , we also keep this algorithm. If , so , are directed toward the inside of the ball, that is, will never leave the ball. Since ,.
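The two techniques in this remark can be sketched in a few lines. The code below is an illustration under assumptions, not the update law (15): it uses a generic on/off dead zone and the standard gradient projection onto a ball (removing the outward radial component of the update on the boundary), applied to a weight vector rather than a matrix. The names `dead_zone`, `project_update`, and `update_weights`, and the threshold argument, are hypothetical.

```python
import math

def dead_zone(error_norm, threshold):
    """Return 1.0 when the error is large enough to trust, else 0.0 (learning off)."""
    return 1.0 if error_norm > threshold else 0.0

def project_update(w, dw, center, radius):
    """Keep dw if w is inside the ball or dw does not point outward;
    otherwise remove the outward radial component of dw."""
    offset = [wi - ci for wi, ci in zip(w, center)]
    dist = math.sqrt(sum(o * o for o in offset))
    outward = sum(o * d for o, d in zip(offset, dw))
    if dist < radius or outward <= 0.0:
        return list(dw)
    scale = outward / (dist * dist)
    return [d - scale * o for d, o in zip(dw, offset)]

def update_weights(w, gradient_step, error_norm, threshold, center, radius, dt):
    """One Euler step of a dead-zone gated, projected gradient update."""
    s = dead_zone(error_norm, threshold)
    dw = project_update(w, [s * g for g in gradient_step], center, radius)
    return [wi + dt * di for wi, di in zip(w, dw)]

# Usage: on the boundary, an outward update is turned into a tangential one.
tangential = project_update([1.0, 0.0], [2.0, 3.0], [0.0, 0.0], 1.0)
```

In continuous time the projected update keeps the weights in the ball exactly; with a discrete Euler step the tangential move can still overshoot the boundary slightly, so a small step size or a final renormalization may be needed in practice.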
Remark 4. Figure 1 and (7) show that the initial conditions of the weights influence identification accuracy. In order to find good initial weights, we design an offline method. From the above theorem, we know the weights will converge to a zone. We use any initial weights, and , after , the identification error should become smaller, that is, and are better than and . We use the following steps to find the initial weights.
(1) Start from any initial value for .
(2) Do identification until training time arrives .
(3) If the , let as a new and , and go to step 2 to repeat the identification process.
(4) If the , stop this offline identification; now are the final initial weights.
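The offline search described in this remark can be sketched as a simple loop. The sketch assumes that the test in steps (3) and (4) is whether the identification error at the end of a training run has decreased, which is consistent with how the procedure is used in the simulation section. The function `find_initial_weights` and the toy `identify` callback are hypothetical stand-ins for a full identification run.

```python
def find_initial_weights(identify, w_start, max_loops=50):
    """Repeat identification runs, restarting from the final weights of the
    previous run, until the end-of-run error stops decreasing.

    identify(w) must return (final_weights, final_error) for one run.
    """
    best_w, best_err = w_start, float("inf")
    w = w_start
    for _ in range(max_loops):
        w_next, err = identify(w)
        if err < best_err:
            # Step (3): the run improved, so its final weights become the new start.
            best_w, best_err = w_next, err
            w = w_next
        else:
            # Step (4): no further improvement, keep the best weights found.
            break
    return best_w, best_err

# Toy stand-in for an identification run: weights move halfway to 1.0 and the
# error cannot drop below a floor of 0.05 (mimicking a dead zone).
def toy_identify(w):
    w_new = 0.5 * w + 0.5
    return w_new, max(abs(w_new - 1.0), 0.05)

w_init, err_init = find_initial_weights(toy_identify, 0.0)
```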
Remark 5. Since the updating rate is (), and can be selected as any positive definite matrix, the learning process of the dynamic neural network (15) is independent of the solution of the Riccati equation (14).
Remark 6. Let us notice that the upper bound (19) turns out to be “sharp”, that is, in the case of not having any uncertainties (exactly matching case: ) we obtain and, hence, from which, for this special situation, the asymptotic stability property () follows. In general, only the asymptotic stability “in average” is guaranteed, because the dead-zone parameter can never be set to zero.
3. Robust Adaptive Controller Based on Neuro Identifier
From (7) we know that the nonlinear system (1) may be modeled as
Equation (33) can be rewritten as where If the update law of and is (15), and are bounded. Using assumption (A1), is bounded as .
The objective of adaptive control is to force the nonlinear system (1) to follow an optimal trajectory which is assumed to be smooth enough. This trajectory is regarded as a solution of a nonlinear reference model: with a fixed initial condition. If the trajectory has points of discontinuity at some fixed moments, we can use any approximating trajectory which is smooth. In the case of the regulation problem , , is constant. Let us define the state trajectory error as From (34) and (36) we have Let us select the control action in the linear form where is the direct control part and is a compensation of the unmodeled dynamic . As , and are available, we can select as Because in (5) is different from zero, and by the projection approach in Theorem 2. Substituting (39) and (40) into (38), we have So the error equation is Four robust algorithms may be applied to compensate .
(A) Exact Compensation
From (7) and (2) we have
If is available, we can select as , that is,
So, the ODE which describes the state trajectory error is
Because is stable, is globally asymptotically stable.
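As a small illustration of this conclusion, the sketch below integrates a linear error equation of the form Δ' = AΔ with forward Euler. The form of the error equation, the diagonal matrix A = diag(-2, -5), and the function name `simulate_error` are assumptions made for the example; with exact compensation the remaining dynamics are driven only by the stable matrix, so the error decays to zero from any initial value.

```python
def simulate_error(a_diag, e0, dt=1e-3, steps=5000):
    """Forward-Euler integration of e' = A e for a diagonal matrix A."""
    e = list(e0)
    for _ in range(steps):
        e = [ei + dt * ai * ei for ei, ai in zip(e, a_diag)]
    return e

# Usage: both error components decay over 5 simulated seconds.
final_error = simulate_error([-2.0, -5.0], [1.0, -1.0])
```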
(B) An Approximate Method
If is not available, an approximate method may be used as
where , is the differential approximation error. Let us select the compensator as
So , and (44) becomes
Define the Lyapunov-like function as
The time derivative of (49) is
can be estimated as
where is any positive definite matrix. So (50) becomes
where is any positive definite matrix. Because is stable, there exist and such that the matrix Riccati equation:
has a positive solution . Defining the following seminorms:
where is the given weighting matrix, the state trajectory tracking can be formulated as the following optimization problem:
Note that
based on the dynamic neural network (2), the control law (47) makes the trajectory tracking error satisfy the following property:
A suitable selection of and ensures that the Riccati equation (53) has a positive solution and makes small enough if is small enough.
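The role of the differential approximation error in approach (B) can be illustrated with a backward difference. The sketch assumes the state derivative is approximated from the current and a delayed sample, (x(t) − x(t − τ))/τ; the function name `backward_difference` and the sine test signal are illustrative choices and are not taken from the approximation used above.

```python
import math

def backward_difference(x_now, x_prev, tau):
    """Approximate the state derivative from two samples taken tau apart."""
    return [(a - b) / tau for a, b in zip(x_now, x_prev)]

def approximation_error(t, tau):
    """Error of the backward difference for x(t) = sin(t), whose derivative is cos(t)."""
    estimate = backward_difference([math.sin(t)], [math.sin(t - tau)], tau)[0]
    return abs(estimate - math.cos(t))

# Usage: the error shrinks roughly in proportion to the delay tau.
coarse = approximation_error(1.0, 0.01)
fine = approximation_error(1.0, 0.001)
```

For a smooth signal the error is bounded by τ/2 times the largest second derivative, so a shorter delay tightens the bound at the cost of amplifying measurement noise.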
(C) Sliding Mode Compensation
If is not available, the sliding mode technique may be applied. Let us define the Lyapunov-like function as
where is a solution of the Lyapunov equation:
Using (41), its time derivative is
According to the sliding mode technique, we may select as
where is a positive constant,
Substituting (59) and (61) into (60) gives
If we select
where is defined as in (35), then . So,
(D) Local Optimal Control
If is not available and is not approximated as in (B), then, in order to analyze the tracking error stability, we introduce the following Lyapunov function:
Using (41), its time derivative is
can be estimated as
Substituting (68) in (67), adding and subtracting the term and with and , we obtain
Because is stable, there exist and such that the matrix Riccati equation:
So (69) is
where
We reformulate (71) as
Then, integrating each term from 0 to , dividing each term by , and taking the limit, for , of the supremum of these integrals, we obtain
In view of the definitions of the seminorms (55), we have
It fixes a tolerance level for the trajectory-tracking error. So, the control goal now is to minimize and . To minimize , we should minimize . From (13), if we select to make (70) have a solution, we can choose the minimal as
To minimize , we assume that, at the given (positive), and are already realized and do not depend on . We call the locally optimal control, because it is calculated based only on “local” information. The solution of this optimization problem is given by
It is a typical quadratic programming problem. Without restriction, is selected according to the linear squares optimal control law:
Remark 7. Approaches (A) and (C) are exact compensations of ; Approach (A) needs the information of . Because Approach (C) uses sliding mode control inserted in the closed-loop system, chattering occurs in the control input, which may excite unmodeled high-frequency dynamics. To eliminate chattering, a boundary layer compensator can be used; it offers a continuous approximation to the discontinuous sliding mode control law inside the boundary layer and guarantees that the output tracking error stays within any neighborhood of the origin [13].
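The chattering effect and the boundary layer remedy mentioned in this remark can be seen in a scalar toy problem. The sketch below assumes an error equation e' = a e + u + d(t) with a bounded disturbance d and compares a discontinuous compensator u = -k sgn(e) with its saturated version u = -k sat(e/φ). The plant, the gains, the boundary layer width φ = 0.05, and all function names are illustrative assumptions rather than the compensator (61).

```python
import math

def sign(v):
    """Signum function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)

def saturation(v):
    """Continuous replacement for sign: linear inside [-1, 1], clipped outside."""
    return max(-1.0, min(1.0, v))

def simulate(compensator, k=1.0, a=-1.0, dt=1e-3, steps=5000):
    """Euler simulation of e' = a e + u + d(t) with u = -k * compensator(e).

    Returns the final absolute error and the number of sign switches of u.
    """
    e, prev_sign, switches = 1.0, 0, 0
    for i in range(steps):
        u = -k * compensator(e)
        s = sign(u)
        if s != 0 and prev_sign != 0 and s != prev_sign:
            switches += 1
        if s != 0:
            prev_sign = s
        disturbance = 0.5 * math.sin(3.0 * i * dt)  # bounded by 0.5 < k
        e += dt * (a * e + u + disturbance)
    return abs(e), switches

# Discontinuous law versus boundary layer law with width 0.05.
final_sgn, switches_sgn = simulate(sign)
final_sat, switches_sat = simulate(lambda e: saturation(e / 0.05))
```

Both laws keep the error small in this toy setting, but the discontinuous one switches the control sign far more often, which is the chattering that the boundary layer is meant to remove. The price is that the error is only guaranteed to stay inside the layer rather than converge exactly to zero.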
Finally, we give the following design steps for the robust neurocontrollers proposed in this paper.
(1) According to the dimension of the plant (1), design a neural network identifier (2) which has the same dimension as the plant. In (2), can be selected as a stable matrix. will influence the dynamic response of the neural network. The bigger eigenvalues of will make the neural network slower. The initial conditions for and are obtained as in Remark 4.
(2) Do online identification. The learning algorithm is (15) with the dead zone in Theorem 2. Assuming the upper bound of the modeling error is known, we can give a value for . is chosen such that the Riccati equation (14) has a positive definite solution; can be selected as any positive definite matrix because is an arbitrary positive definite matrix. The updating rate in the learning algorithm (15) is , and can be selected as any positive definite matrix, so the learning process is independent of the solution of the Riccati equation (14). The larger is selected, the faster the neuroidentifier converges.
(3) Use the robust control (39) and one of the compensators (43), (47), (61), and (78).
4. Simulation
In this section, a two-link robot manipulator is used to illustrate the proposed approach. Its dynamics can be expressed as follows [14]: where consists of the joint variables, denotes the link velocities, is the generalized forces, is the inertia matrix, is the centripetal-Coriolis matrix, is the gravity vector, and is the friction vector. represents the positive definite inertia matrix. If we define is joint position, is joint velocity of the link, , (79) can be rewritten in state space form [15]: where is the control input, Equation (80) can also be rewritten as So the dynamics of the two-link robot (79) are in the form of (1) with The values of the parameters are listed below: , , , , . Let us define , and ; the neural network for control is represented as We select where . We used Remark 4 to obtain suitable and , starting from random values, . After 2 loops, does not decrease, so we take the and as the new and . For the update laws (15), we select , , . If we select the generalized forces as
Now we check the neurocontrol. We assume the robot is changed at ; after that , , and the friction becomes a disturbance as is a positive constant. We compare the neurocontrol with a PD control as where is a square wave. So .
The neurocontrol is (39). is selected to compensate the unmodeled dynamics. Since is unknown, method (A), exact compensation, cannot be used.
(B) . The link velocity is measurable, as in (43), The results are shown in Figures 2 and 3.
(C) If is not available, the sliding mode technique may be applied. We select as in (61). The results are shown in Figures 4 and 5.
(D) . We select , , ; the solution of the following Riccati equation: is . Without restriction on , the linear squares optimal control law is The results of the locally optimal compensation are shown in Figures 6 and 7.
We find that the neurocontrol remains robust and effective when the robot is changed.
5. Conclusion
By means of Lyapunov analysis, we establish bounds for both the identifier and the adaptive controller. The main contribution of our paper is that we give four different compensation methods and prove the stability of the neural controllers.