Journal of Control Science and Engineering
Volume 2012 (2012), Article ID 867178, 9 pages
doi:10.1155/2012/867178
Research Article

Robust Adaptive Control via Neural Linearization and Compensation

Departamento de Control Automatico, CINVESTAV-IPN, Av. IPN 2508, 07360 Mexico City, DF, Mexico

Received 6 October 2011; Revised 4 January 2012; Accepted 5 January 2012

Academic Editor: Isaac Chairez

Copyright © 2012 Roberto Carmona Rodríguez and Wen Yu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We propose a new type of neural adaptive control via dynamic neural networks. For a class of unknown nonlinear systems, a neural identifier-based feedback linearization controller is first used. Dead-zone and projection techniques are applied to assure the stability of the neural identification. Then four types of compensator are addressed. The stability of the closed-loop system is also proven.

1. Introduction

Feedback control of nonlinear systems is a big challenge for engineers, especially when complete model information is not available. A reasonable solution is to identify the nonlinear system first; an adaptive feedback controller can then be designed based on the identifier. Neural networks seem to be a very effective tool for identifying complex nonlinear systems when we have no complete model information or even treat the controlled plant as a “black box”.

Neuroidentifiers can be classified as static (feedforward) or dynamic (recurrent) [1]. Most publications on nonlinear system identification use static networks, for example multilayer perceptrons, which approximate the nonlinear function on the right-hand side of the dynamic model equations [2]. The main drawback of these networks is that the weight updating uses information on the local data structure (local optima), and the function approximation is sensitive to the training data [3]. Dynamic neural networks can overcome this disadvantage and behave adequately in the presence of unmodeled dynamics, because their structure incorporates feedback [4–6].

Neurocontrol seems to be a very useful tool for unknown systems because it is model-free, that is, the controller does not depend on a model of the plant. Many kinds of neurocontrol have been proposed in recent years. For example, supervised neurocontrol [7] is able to clone human actions: the neural network inputs correspond to the sensory information perceived by the human, and the outputs correspond to the human control actions. Direct inverse control [1] cascades an inverse model of the plant with the plant, so that the composed system is an identity map between the desired response and the plant response, but the absence of feedback impairs its robustness. Internal model neurocontrol [8] places forward and inverse models within the feedback loop. Adaptive neurocontrol has two kinds of structure: direct and indirect. Direct adaptive neurocontrol realizes the controller directly by a neural network [1]. The indirect method combines a neural network identifier with adaptive control; the controller is derived from the online identification [5].

In this paper we extend our previous results in [9, 10]. In [9], the neurocontrol was derived by the gradient principle, so the neural control is locally optimal. No restriction is needed there, because the controller does not include the inverse of the weights. In [10], we assumed that the inverse of the weights exists, so the learning law was the normal one. The main contributions of this paper are as follows: (1) a special weight updating law is proposed to assure the existence of the neurocontrol; (2) four different robust compensators are proposed. By means of a Lyapunov-like analysis, we derive stability conditions for the neuroidentifier and the adaptive controller. We show that the neuroidentifier-based adaptive control is effective for a large class of unknown nonlinear systems.

2. Neuroidentifier

The controlled nonlinear plant is given as
$$\dot{x}_t = f(x_t, u_t, t), \quad x_t \in \Re^n,\ u_t \in \Re^n, \tag{1}$$
where $f(x_t)$ is an unknown vector function. In order to realize indirect neural control, a parallel neural identifier is used as in [9, 10] (in [5] the series-parallel structure is used):
$$\dot{\hat{x}}_t = A\hat{x}_t + W_{1,t}\,\sigma(\hat{x}_t) + W_{2,t}\,\phi(\hat{x}_t)\,\gamma(u_t), \tag{2}$$
where $\hat{x}_t \in \Re^n$ is the state of the neural network, $W_{1,t}, W_{2,t} \in \Re^{n\times n}$ are the weight matrices, and $A \in \Re^{n\times n}$ is a stable matrix. $\sigma(\cdot) \in \Re^n$ is a vector function and $\phi(\cdot) \in \Re^{n\times n}$ is a diagonal matrix. The function $\gamma(\cdot)$ is selected such that $\|\gamma(u_t)\|^2 \le \bar{u}$; for example, $\gamma(\cdot)$ may be a linear saturation function,
$$\gamma(u_t) = \begin{cases} u_t, & \text{if } |u_t| < b, \\ \bar{u}, & \text{if } |u_t| \ge b. \end{cases} \tag{3}$$
The elements of $\sigma(\cdot)$ and $\phi(\cdot)$ are selected as monotone increasing functions; a typical choice is the sigmoid function
$$\sigma_i(\hat{x}_t) = \frac{a_i}{1 + e^{-b_i \hat{x}_t}} - c_i, \tag{4}$$
where $a_i, b_i, c_i > 0$. In order to avoid $\phi(\hat{x}_t) = 0$, we select
$$\phi_i(\hat{x}_t) = \frac{a_i}{1 + e^{-b_i \hat{x}_t}} + c_i. \tag{5}$$
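The activation functions (4) and (5) and a saturation in the spirit of (3) can be sketched in a few lines of Python. This is only an illustration: the default values $a_i = b_i = 2$, $c_i = 0.5$ are those used later in the simulation section, while the function names and the symmetric clipping in `gamma_sat` are assumptions made here.

```python
import math

def sigma(x, a=2.0, b=2.0, c=0.5):
    """Shifted sigmoid of the form (4): a / (1 + exp(-b*x)) - c."""
    return a / (1.0 + math.exp(-b * x)) - c

def phi(x, a=2.0, b=2.0, c=0.5):
    """Shifted sigmoid of the form (5): a / (1 + exp(-b*x)) + c, always above c."""
    return a / (1.0 + math.exp(-b * x)) + c

def gamma_sat(u, b=1.0):
    """Linear saturation (assumed symmetric): pass u through, clip it to [-b, b]."""
    return max(-b, min(b, u))

# phi stays strictly positive on a grid of test points, so it can be inverted.
grid = [k / 10.0 for k in range(-50, 51)]
phi_min = min(phi(x) for x in grid)
```

Because the sigmoid term in `phi` is positive and $c_i > 0$, `phi_min` stays above `c`, which is the property the controller relies on when it divides by $\phi(\hat{x}_t)$.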

Remark 1. The dynamic neural network (2) has been discussed by many authors, for example [4, 5, 9, 10]. It can be seen that the Hopfield model is a special case of this network with $A = \mathrm{diag}\{a_i\}$, $a_i = -1/(R_i C_i)$, $R_i > 0$ and $C_i > 0$. $R_i$ and $C_i$ are the resistance and capacitance at the $i$th node of the network, respectively.

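Let us define the identification error as
$$\Delta_t = \hat{x}_t - x_t. \tag{6}$$
Generally, the dynamic neural network (2) cannot follow the nonlinear system (1) exactly. The nonlinear system may be written as
$$\dot{x}_t = A x_t + W_1^0\,\sigma(x_t) + W_2^0\,\phi(x_t)\,\gamma(u_t) - \tilde{f}_t, \tag{7}$$
where $W_1^0$ and $W_2^0$ are the initial matrices of $W_{1,t}$ and $W_{2,t}$,
$$W_1^0 \Lambda_1^{-1} W_1^{0T} \le \bar{W}_1, \qquad W_2^0 \Lambda_2^{-1} W_2^{0T} \le \bar{W}_2. \tag{8}$$
$\bar{W}_1$ and $\bar{W}_2$ are a priori known matrices, and the vector function $\tilde{f}_t$ can be regarded as modelling error and disturbances. Because $\sigma(\cdot)$ and $\phi(\cdot)$ are chosen as sigmoid functions, they clearly satisfy the following Lipschitz property:
$$\tilde{\sigma}^T \Lambda_1 \tilde{\sigma} \le \Delta_t^T D_\sigma \Delta_t, \qquad \left(\tilde{\phi}_t\,\gamma(u_t)\right)^T \Lambda_2 \left(\tilde{\phi}_t\,\gamma(u_t)\right) \le \bar{u}\,\Delta_t^T D_\phi \Delta_t, \tag{9}$$
where $\tilde{\sigma} = \sigma(\hat{x}_t) - \sigma(x_t)$, $\tilde{\phi} = \phi(\hat{x}_t) - \phi(x_t)$, and $\Lambda_1$, $\Lambda_2$, $D_\sigma$, and $D_\phi$ are known positive constant matrices. The error dynamics are obtained from (2) and (7):
$$\dot{\Delta}_t = A\Delta_t + \tilde{W}_{1,t}\,\sigma(\hat{x}_t) + \tilde{W}_{2,t}\,\phi(\hat{x}_t)\,\gamma(u_t) + W_1^0\,\tilde{\sigma} + W_2^0\,\tilde{\phi}\,\gamma(u_t) + \tilde{f}_t, \tag{10}$$
where $\tilde{W}_{1,t} = W_{1,t} - W_1^0$, $\tilde{W}_{2,t} = W_{2,t} - W_2^0$. As in [4, 5, 9, 10], we assume the modeling error is bounded.

(A1) The unmodeled dynamics $\tilde{f}$ satisfy
$$\tilde{f}_t^T \Lambda_f^{-1} \tilde{f}_t \le \bar{\eta}, \tag{11}$$
where $\Lambda_f$ is a known positive constant matrix.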

If we define
$$R = \bar{W}_1 + \bar{W}_2 + \Lambda_f, \qquad Q = D_\sigma + \bar{u} D_\phi + Q_0, \tag{12}$$
and the matrices $A$ and $Q_0$ are selected to fulfill the following conditions:
(1) the pair $(A, R^{1/2})$ is controllable and the pair $(Q^{1/2}, A)$ is observable,
(2) the local frequency condition [9] is satisfied:
$$A^T R^{-1} A - Q \ge \frac{1}{4}\left[A^T R^{-1} - R^{-1} A\right] R \left[A^T R^{-1} - R^{-1} A\right]^T, \tag{13}$$
then the following assumption can be established.

(A2) There exist a stable matrix $A$ and a strictly positive definite matrix $Q_0$ such that the matrix Riccati equation
$$A^T P + P A + P R P + Q = 0 \tag{14}$$
has a positive solution $P = P^T > 0$.

This condition is easily fulfilled if we select $A$ as a stable diagonal matrix. The next theorem states the learning procedure of the neuroidentifier.
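To see why a stable diagonal $A$ makes (A2) easy to satisfy, consider the scalar case of (14), $2ap + rp^2 + q = 0$ with $a < 0$, $r > 0$, $q > 0$. A positive root exists exactly when $a^2 \ge rq$. The following minimal sketch computes it; the function name and the numerical values are illustrative assumptions, not part of the original design.

```python
import math

def scalar_riccati(a, r, q):
    """Smallest positive root p of 2*a*p + r*p**2 + q = 0, or None if none exists."""
    disc = a * a - r * q                 # quarter of the usual discriminant
    if a >= 0 or r <= 0 or q <= 0 or disc < 0:
        return None
    return (-a - math.sqrt(disc)) / r

# Illustrative numbers only: a stable pole at -1.5 with r = 4.5 and q = 0.5.
p = scalar_riccati(a=-1.5, r=4.5, q=0.5)
```

For a diagonal $A$, $R$, and $Q$ the matrix equation decouples into such scalar equations, so making each $|a_i|$ large enough relative to $\sqrt{r_i q_i}$ is sufficient in that special case.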

Theorem 2. Subject to assumptions A1 and A2 being satisfied, if the weights $W_{1,t}$ and $W_{2,t}$ are updated as
$$\dot{W}_{1,t} = -s_t K_1 P \Delta_t\,\sigma^T(\hat{x}_t), \qquad \dot{W}_{2,t} = -s_t \Pr\!\left[K_2 P\,\phi(\hat{x}_t)\,\gamma(u_t)\,\Delta_t^T\right], \tag{15}$$
where $K_1, K_2 > 0$, $P$ is the solution of the Riccati equation (14), and $\Pr[\omega]$ is the projection function defined, with $\omega = K_2 P\,\phi(\hat{x}_t)\,\gamma(u_t)\,\Delta_t^T$, as
$$\Pr[\omega] = \begin{cases} \omega, & \text{condition}, \\[4pt] \omega + \dfrac{\|\tilde{W}_{2,t}\|^2}{\mathrm{tr}\!\left(\tilde{W}_{2,t}^T (K_2 P) \tilde{W}_{2,t}\right)}\,\omega, & \text{otherwise}, \end{cases} \tag{16}$$
where the “condition” is $\|\tilde{W}_{2,t}\| < r$ or $[\|\tilde{W}_{2,t}\| = r$ and $\mathrm{tr}(-\omega \tilde{W}_{2,t}) \le 0]$, $r < \|W_2^0\|$ is a positive constant, and $s_t$ is the dead-zone function
$$s_t = \begin{cases} 1, & \text{if } \|\Delta_t\|^2 > \lambda_{\min}^{-1}(Q_0)\,\bar{\eta}, \\ 0, & \text{otherwise}, \end{cases} \tag{17}$$
then the weight matrices and the identification error remain bounded, that is,
$$\Delta_t \in L_\infty, \qquad W_{1,t} \in L_\infty, \qquad W_{2,t} \in L_\infty, \tag{18}$$
and for any $T > 0$ the identification error fulfills the following tracking performance:
$$\frac{1}{T}\int_0^T \|\Delta_t\|^2_{Q_0}\,dt \le \kappa\,\bar{\eta} + \frac{\Delta_0^T P \Delta_0}{T}, \tag{19}$$
where $\kappa$ is the condition number of $Q_0$, defined as $\kappa = \lambda_{\max}(Q_0)/\lambda_{\min}(Q_0)$.
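As a rough illustration of how the dead zone (17) gates the first update law in (15), the sketch below performs one explicit-Euler step of the $W_{1,t}$ update. The Euler discretization, the step size `dt`, the function names, and the use of a single matrix `gain` standing in for the product $K_1 P$ are all assumptions made for this example; the projection applied to $W_{2,t}$ is not covered.

```python
def dead_zone(delta, eta_bar, lam_min_q0):
    """s_t as in (17): 1 when ||delta||^2 exceeds eta_bar / lambda_min(Q0), else 0."""
    norm_sq = sum(d * d for d in delta)
    return 1.0 if norm_sq > eta_bar / lam_min_q0 else 0.0

def w1_step(W1, delta, sig, gain, s, dt):
    """One Euler step of dW1/dt = -s * gain * delta * sig^T (gain stands in for K1*P)."""
    n = len(delta)
    g_delta = [sum(gain[i][k] * delta[k] for k in range(n)) for i in range(n)]
    return [[W1[i][j] - dt * s * g_delta[i] * sig[j] for j in range(n)]
            for i in range(n)]

# Example: a large error switches learning on, a small one freezes the weights.
W1 = [[0.0, 0.0], [0.0, 0.0]]
gain = [[5.0, 0.0], [0.0, 2.0]]
s_on = dead_zone([1.0, 0.0], eta_bar=0.1, lam_min_q0=1.0)
s_off = dead_zone([0.1, 0.1], eta_bar=0.1, lam_min_q0=1.0)
W1_updated = w1_step(W1, [1.0, 0.0], [1.0, 2.0], gain, s_on, dt=0.1)
W1_frozen = w1_step(W1, [0.1, 0.1], [1.0, 2.0], gain, s_off, dt=0.1)
```

Inside the dead zone the weights stay constant, which is what prevents the disturbance term from driving parameter drift.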

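Proof. Select a Lyapunov function as
$$V_t = \Delta_t^T P \Delta_t + \mathrm{tr}\!\left[\tilde{W}_{1,t}^T K_1^{-1} \tilde{W}_{1,t}\right] + \mathrm{tr}\!\left[\tilde{W}_{2,t}^T K_2^{-1} \tilde{W}_{2,t}\right], \tag{20}$$
where $P \in \Re^{n\times n}$ is a positive definite matrix. According to (10), the derivative is
$$\begin{aligned} \dot{V}_t ={}& \Delta_t^T\left(PA + A^T P\right)\Delta_t + 2\Delta_t^T P \tilde{W}_{1,t}\,\sigma(\hat{x}_t) + 2\Delta_t^T P \tilde{W}_{2,t}\,\phi(\hat{x}_t)\,\gamma(u_t) + 2\Delta_t^T P \tilde{f}_t \\ &+ 2\Delta_t^T P\left[W_1^0\,\tilde{\sigma} + W_2^0\,\tilde{\phi}\,\gamma(u_t)\right] + 2\,\mathrm{tr}\!\left[\dot{\tilde{W}}_{1,t}^T K_1^{-1} \tilde{W}_{1,t}\right] + 2\,\mathrm{tr}\!\left[\dot{\tilde{W}}_{2,t}^T K_2^{-1} \tilde{W}_{2,t}\right]. \end{aligned} \tag{21}$$
Since $\Delta_t^T P W_1^0 \tilde{\sigma}_t$ is a scalar, using (9) and the matrix inequality
$$X^T Y + \left(X^T Y\right)^T \le X^T \Lambda^{-1} X + Y^T \Lambda Y, \tag{22}$$
where $X, Y \in \Re^{n\times k}$ are any matrices and $\Lambda$ is any positive definite matrix, we obtain
$$\begin{aligned} 2\Delta_t^T P W_1^0 \tilde{\sigma}_t &\le \Delta_t^T P W_1^0 \Lambda_1^{-1} W_1^{0T} P \Delta_t + \tilde{\sigma}_t^T \Lambda_1 \tilde{\sigma}_t \le \Delta_t^T\left(P \bar{W}_1 P + D_\sigma\right)\Delta_t, \\ 2\Delta_t^T P W_2^0 \tilde{\phi}_t\,\gamma(u_t) &\le \Delta_t^T\left(P \bar{W}_2 P + \bar{u} D_\phi\right)\Delta_t. \end{aligned} \tag{23}$$
In view of the matrix inequality (22) and (A1),
$$2\Delta_t^T P \tilde{f}_t \le \Delta_t^T P \Lambda_f P \Delta_t + \bar{\eta}. \tag{24}$$
So we have
$$\begin{aligned} \dot{V}_t \le{}& \Delta_t^T\left[PA + A^T P + P\left(\bar{W}_1 + \bar{W}_2 + \Lambda_f\right)P + \left(D_\sigma + \bar{u} D_\phi + Q_0\right)\right]\Delta_t \\ &+ 2\,\mathrm{tr}\!\left[\dot{\tilde{W}}_{1,t}^T K_1^{-1} \tilde{W}_{1,t}\right] + 2\Delta_t^T P \tilde{W}_{1,t}\,\sigma(\hat{x}_t) + \bar{\eta} - \Delta_t^T Q_0 \Delta_t \\ &+ 2\,\mathrm{tr}\!\left[\dot{\tilde{W}}_{2,t}^T K_2^{-1} \tilde{W}_{2,t}\right] + 2\Delta_t^T P \tilde{W}_{2,t}\,\phi(\hat{x}_t)\,\gamma(u_t). \end{aligned} \tag{25}$$
Since $\dot{\tilde{W}}_{1,t} = \dot{W}_{1,t}$ and $\dot{\tilde{W}}_{2,t} = \dot{W}_{2,t}$, if we use (A2), we have
$$\dot{V}_t \le 2\,\mathrm{tr}\!\left[\left(K_1^{-1}\dot{W}_{1,t}^T + P\Delta_t\,\sigma^T(\hat{x}_t)\right)\tilde{W}_{1,t}\right] + \bar{\eta} - \Delta_t^T Q_0 \Delta_t + 2\,\mathrm{tr}\!\left[\left(K_2^{-1}\dot{W}_{2,t} + P\,\phi(\hat{x}_t)\,\gamma(u_t)\,\Delta_t^T\right)\tilde{W}_{2,t}\right]. \tag{26}$$

(I) If $\|\Delta_t\|^2 > \lambda_{\min}^{-1}(Q_0)\,\bar{\eta}$, using the updating law (15) we can conclude that
$$\dot{V}_t \le 2\,\mathrm{tr}\!\left[\left(-\Pr\!\left[P\,\phi(\hat{x}_t)\,\gamma(u_t)\,\Delta_t^T\right] + P\,\phi(\hat{x}_t)\,\gamma(u_t)\,\Delta_t^T\right)\tilde{W}_{2,t}\right] - \Delta_t^T Q_0 \Delta_t + \bar{\eta}. \tag{27}$$
(a) If $\|\tilde{W}_{2,t}\| < r$ or $[\|\tilde{W}_{2,t}\| = r$ and $\mathrm{tr}(-\omega \tilde{W}_{2,t}) \le 0]$, then $\dot{V}_t \le -\lambda_{\min}(Q_0)\,\|\Delta_t\|^2 + \bar{\eta} < 0$.
(b) If $\|\tilde{W}_{2,t}\| = r$ and $\mathrm{tr}(-\omega \tilde{W}_{2,t}) > 0$, then
$$\dot{V}_t \le -2\,\mathrm{tr}\!\left[K_2 P\,\frac{\|\tilde{W}_{2,t}\|^2}{\mathrm{tr}\!\left(\tilde{W}_{2,t}^T (K_2 P) \tilde{W}_{2,t}\right)}\,\omega\,\tilde{W}_{2,t}\right] - \Delta_t^T Q_0 \Delta_t + \bar{\eta} \le -\Delta_t^T Q_0 \Delta_t + \bar{\eta} < 0. \tag{28}$$
Hence $V_t$ is bounded. Integrating (27) from $0$ up to $T$ yields
$$V_T - V_0 \le -\int_0^T \Delta_t^T Q_0 \Delta_t\,dt + \bar{\eta}\,T. \tag{29}$$
Because $\kappa \ge 1$, we have
$$\int_0^T \Delta_t^T Q_0 \Delta_t\,dt \le V_0 - V_T + \bar{\eta}\,T \le V_0 + \bar{\eta}\,T \le V_0 + \kappa\,\bar{\eta}\,T, \tag{30}$$
where $\kappa$ is the condition number of $Q_0$.

(II) If $\|\Delta_t\|^2 \le \lambda_{\min}^{-1}(Q_0)\,\bar{\eta}$, the weights become constant and $V_t$ remains bounded. Moreover,
$$\int_0^T \Delta_t^T Q_0 \Delta_t\,dt \le \int_0^T \lambda_{\max}(Q_0)\,\|\Delta_t\|^2\,dt \le \frac{\lambda_{\max}(Q_0)}{\lambda_{\min}(Q_0)}\,\bar{\eta}\,T \le V_0 + \kappa\,\bar{\eta}\,T. \tag{31}$$

From (I) and (II), $V_t$ is bounded, so (18) is realized. From (20) and $\tilde{W}_{1,t} = W_{1,t} - W_1^0$, $\tilde{W}_{2,t} = W_{2,t} - W_2^0$, which vanish at $t = 0$, we know $V_0 = \Delta_0^T P \Delta_0$. Using (30) and (31), (19) is obtained. The theorem is proved.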

Remark 3. The weight update law (15) uses two techniques. The dead zone $s_t$ is applied to overcome the robustness problem caused by the unmodeled dynamics $\tilde{f}_t$. In the presence of disturbances or unmodeled dynamics, adaptive procedures may easily become unstable. The lack of robustness of parameter identification was demonstrated in [11] and became a hot issue in the 1980s. The dead-zone method is one simple and effective remedy. The second technique is the projection approach, which guarantees that the parameters remain within a constrained region and does not alter the properties of the adaptive law established without projection [12]. The projection approach proposed in this paper is explained in Figure 1. We want to keep $W_{2,t}$ inside the ball of center $W_2^0$ and radius $r$. If $\|\tilde{W}_{2,t}\| < r$, we use the normal gradient algorithm. When $W_{2,t} - W_2^0$ is on the ball and the vector $\dot{W}_{2,t}$ points either inside or along the ball, that is, $(d/dt)\|\tilde{W}_{2,t}\|^2 = 2\,\mathrm{tr}(-\omega \tilde{W}_{2,t}) \le 0$, we also keep this algorithm. If $\mathrm{tr}(-\omega \tilde{W}_{2,t}) > 0$, then $\mathrm{tr}\!\left[-\left(\omega + \left(\|\tilde{W}_{2,t}\|^2 / \mathrm{tr}(\tilde{W}_{2,t}^T (K_2 P) \tilde{W}_{2,t})\right)\omega\right)\tilde{W}_{2,t}\right] < 0$, so $(d/dt)\|\tilde{W}_{2,t}\|^2 < 0$ and $W_{2,t}$ is directed toward the inside of the ball, that is, $W_{2,t}$ will never leave the ball. Since $r < \|W_2^0\|$, $W_{2,t} \neq 0$.

Figure 1: Projection algorithm.

Remark 4. Figure 1 and (7) show that the initial conditions of the weights influence the identification accuracy. In order to find good initial weights, we design an offline method. From the above theorem, we know that the weights converge to a zone. Starting from any initial weights $W_1^0$ and $W_2^0$, after a training time $T_0$ the identification error should become smaller, that is, $W_{1,T_0}$ and $W_{2,T_0}$ are better than $W_1^0$ and $W_2^0$. We use the following steps to find the initial weights.
(1) Start from any initial values $W_1^0 = W_{1,0}$, $W_2^0 = W_{2,0}$.
(2) Run the identification until the training time reaches $T_0$.
(3) If $\|\Delta(T_0)\| < \|\Delta(0)\|$, take $W_{1,T_0}$, $W_{2,T_0}$ as the new $W_1^0$ and $W_2^0$, and go to step (2) to repeat the identification process.
(4) If $\|\Delta(T_0)\| \ge \|\Delta(0)\|$, stop the offline identification; $W_{1,T_0}$, $W_{2,T_0}$ are now the final initial weights.
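The four steps of Remark 4 can be sketched as a simple loop. Here `run_identification` is a hypothetical callback, not something defined in this paper: it is assumed to run the identifier for a time $T_0$ from the given weights and return the final weights together with the error norms at the start and at the end of the run. The `max_loops` guard is also an addition for safety.

```python
def find_initial_weights(W1, W2, run_identification, max_loops=50):
    """Repeat identification runs of length T0 until the final error stops decreasing.

    run_identification(W1, W2) is a hypothetical callback returning
    (W1_T0, W2_T0, err_0, err_T0).
    """
    for _ in range(max_loops):
        W1_new, W2_new, err_0, err_T0 = run_identification(W1, W2)
        W1, W2 = W1_new, W2_new      # steps (3) and (4) both keep the latest weights
        if err_T0 >= err_0:          # step (4): no further improvement, stop
            break
    return W1, W2

# Toy stand-in for the identifier: scalar "weights" that move by 1 per run,
# with an error that stops improving once the weight passes 2.5.
calls = []
def toy_run(w1, w2):
    calls.append(w1)
    return w1 + 1, w2 + 1, abs(w1 - 2.5), abs(w1 + 1 - 2.5)

result = find_initial_weights(0, 0, toy_run)
```

In the toy run the loop stops on the third pass, when the error at the end of the run is no longer smaller than at its start.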

Remark 5. Since the updating rate is $K_i P$ ($i = 1, 2$), and $K_i$ can be selected as any positive matrix, the learning process of the dynamic neural network (15) is free of the solution of the Riccati equation (14).

Remark 6. Let us notice that the upper bound (19) turns out to be “sharp”, that is, in the case of no uncertainties (exactly matching case: $\tilde{f} = 0$) we obtain $\bar{\eta} = 0$ and, hence,
$$\limsup_{T\to\infty} \frac{1}{T}\int_0^T \|\Delta_t\|^2_{Q_0}\,dt = 0, \tag{32}$$
from which, for this special situation, the asymptotic stability property ($\|\Delta_t\| \to 0$ as $t \to \infty$) follows. In general, only asymptotic stability “in average” is guaranteed, because the dead-zone parameter $\bar{\eta}$ can never be set to zero.

3. Robust Adaptive Controller Based on Neuro Identifier

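From (7) we know that the nonlinear system (1) may be modeled as
$$\begin{aligned} \dot{x}_t &= A x_t + W_1^0\,\sigma(x_t) + W_2^0\,\phi(x_t)\,\gamma(u_t) + \tilde{f} \\ &= A x_t + W_{1,t}\,\sigma(\hat{x}_t) + W_{2,t}\,\phi(x_t)\,\gamma(u_t) + \tilde{f} + \tilde{W}_{1,t}\,\sigma(\hat{x}_t) + \tilde{W}_{2,t}\,\phi(x_t)\,\gamma(u_t) + W_1^0\,\tilde{\sigma}_t + W_2^0\,\tilde{\phi}\,\gamma(u_t). \end{aligned} \tag{33}$$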

Equation (33) can be rewritten as
$$\dot{x}_t = A x_t + W_{1,t}\,\sigma(\hat{x}_t) + W_{2,t}\,\phi(x_t)\,\gamma(u_t) + d_t, \tag{34}$$
where
$$d_t = \tilde{f} + \tilde{W}_{1,t}\,\sigma(\hat{x}_t) + \tilde{W}_{2,t}\,\phi(x_t)\,\gamma(u_t) + W_1^0\,\tilde{\sigma}_t + W_2^0\,\tilde{\phi}\,\gamma(u_t). \tag{35}$$
If the update law of $W_{1,t}$ and $W_{2,t}$ is (15), then $W_{1,t}$ and $W_{2,t}$ are bounded. Using assumption (A1), $d_t$ is bounded, with $\bar{d} = \sup_t \|d_t\|$.

The objective of adaptive control is to force the nonlinear system (1) to follow an optimal trajectory $x_t^* \in \Re^r$, which is assumed to be smooth enough. This trajectory is regarded as a solution of a nonlinear reference model:
$$\dot{x}_t^* = \varphi(x_t^*, t), \tag{36}$$
with a fixed initial condition. If the trajectory has points of discontinuity at some fixed moments, we can use any smooth approximating trajectory. In the case of the regulation problem, $\varphi(x_t^*, t) = 0$ and $x^*(0) = c$, where $c$ is constant. Let us define the state trajectory error as
$$\Delta_t^* = x_t - x_t^*. \tag{37}$$
From (34) and (36) we have
$$\dot{\Delta}_t^* = A x_t + W_{1,t}\,\sigma(\hat{x}_t) + W_{2,t}\,\phi(x_t)\,\gamma(u_t) + d_t - \varphi(x_t^*, t). \tag{38}$$
Let us select the control action $\gamma(u_t)$ in the linear form
$$\gamma(u_t) = U_{1,t} + \left[W_{2,t}\,\phi(\hat{x}_t)\right]^{-1} U_{2,t}, \tag{39}$$
where $U_{1,t} \in \Re^n$ is the direct control part and $U_{2,t} \in \Re^n$ is a compensation of the unmodeled dynamics $d_t$. As $\varphi(x_t^*, t)$, $x_t^*$, $W_{1,t}\,\sigma(\hat{x}_t)$, and $W_{2,t}\,\phi(\hat{x}_t)$ are available, we can select $U_{1,t}$ as
$$U_{1,t} = \left[W_{2,t}\,\phi(\hat{x}_t)\right]^{-1}\left[\varphi(x_t^*, t) - A x_t^* - W_{1,t}\,\sigma(\hat{x}_t)\right]. \tag{40}$$
The inverse exists because $\phi(\hat{x}_t)$ in (5) is different from zero and $W_{2,t} \neq 0$ by the projection approach in Theorem 2. Substituting (39) and (40) into (38), the error equation becomes
$$\dot{\Delta}_t^* = A \Delta_t^* + U_{2,t} + d_t. \tag{41}$$
Four robust algorithms may be applied to compensate $d_t$.
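The cancellation behind (39) to (41) can be checked numerically in the scalar case. The sketch below is an illustration under explicit assumptions: all numerical values are arbitrary, the activations use $a_i = b_i = 2$, $c_i = 0.5$ as in the simulation section, and the input gain in the plant model is evaluated at $\hat{x}_t$ so that the linearizing term cancels exactly.

```python
import math

def sigma(x):
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 0.5   # form (4)

def phi(x):
    return 2.0 / (1.0 + math.exp(-2.0 * x)) + 0.5   # form (5), never zero

def control(x_ref, dx_ref, x_hat, a, w1, w2, u2):
    """Scalar version of (39)-(40): gamma(u) = U1 + U2 / (w2 * phi(x_hat))."""
    g = w2 * phi(x_hat)                  # nonzero as long as w2 != 0
    u1 = (dx_ref - a * x_ref - w1 * sigma(x_hat)) / g
    return u1 + u2 / g

# Arbitrary illustrative values (assumptions, not taken from the paper).
a, w1, w2 = -1.5, 0.5, 3.0
x, x_hat, x_ref, dx_ref, u2 = 0.7, 0.6, 1.0, 0.2, -0.3

gamma_u = control(x_ref, dx_ref, x_hat, a, w1, w2, u2)
# Error rate from the modeled dynamics (without d_t) ...
error_rate = a * x + w1 * sigma(x_hat) + w2 * phi(x_hat) * gamma_u - dx_ref
# ... should match the linear form a * (x - x_ref) + U2 of (41).
linear_form = a * (x - x_ref) + u2
```

The two quantities agree up to rounding, which shows that after linearization only the stable linear part, the compensator $U_{2,t}$, and the lumped term $d_t$ remain.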

(A) Exact Compensation
From (7) and (2) we have
$$d_t = \left(\dot{x}_t - \dot{\hat{x}}_t\right) - A\left(x_t - \hat{x}_t\right). \tag{42}$$
If $\dot{x}_t$ is available, we can select $U_{2,t}$ as $U^a_{2,t} = -d_t$, that is,
$$U^a_{2,t} = A\left(x_t - \hat{x}_t\right) - \left(\dot{x}_t - \dot{\hat{x}}_t\right). \tag{43}$$
So, the ODE which describes the state trajectory error is
$$\dot{\Delta}_t^* = A \Delta_t^*. \tag{44}$$
Because $A$ is stable, $\Delta_t^*$ is globally asymptotically stable:
$$\lim_{t\to\infty} \Delta_t^* = 0. \tag{45}$$

(B) An Approximate Method
If $\dot{x}_t$ is not available, an approximate method may be used:
$$\dot{x}_t = \frac{x_t - x_{t-\tau}}{\tau} + \delta_t, \tag{46}$$
where $\delta_t > 0$ is the differential approximation error. Let us select the compensator as
$$U^b_{2,t} = A\left(x_t - \hat{x}_t\right) - \left(\frac{x_t - x_{t-\tau}}{\tau} - \dot{\hat{x}}_t\right). \tag{47}$$
So $U^b_{2,t} = U^a_{2,t} + \delta_t$, and (44) becomes
$$\dot{\Delta}_t^* = A \Delta_t^* + \delta_t. \tag{48}$$
Define the Lyapunov-like function as
$$V_t = \Delta_t^{*T} P_2 \Delta_t^*, \qquad P_2 = P_2^T > 0. \tag{49}$$
The time derivative of (49) is
$$\dot{V}_t = \Delta_t^{*T}\left(A^T P_2 + P_2 A\right)\Delta_t^* + 2\Delta_t^{*T} P_2 \delta_t, \tag{50}$$
where $2\Delta_t^{*T} P_2 \delta_t$ can be estimated as
$$2\Delta_t^{*T} P_2 \delta_t \le \Delta_t^{*T} P_2 \Lambda P_2 \Delta_t^* + \delta_t^T \Lambda^{-1} \delta_t, \tag{51}$$
where $\Lambda$ is any positive definite matrix. So (50) becomes
$$\dot{V}_t \le \Delta_t^{*T}\left(A^T P_2 + P_2 A + P_2 \Lambda P_2 + Q_2\right)\Delta_t^* + \delta_t^T \Lambda^{-1} \delta_t - \Delta_t^{*T} Q_2 \Delta_t^*, \tag{52}$$
where $Q_2$ is any positive definite matrix. Because $A$ is stable, there exist $\Lambda$ and $Q_2$ such that the matrix Riccati equation
$$A^T P_2 + P_2 A + P_2 \Lambda P_2 + Q_2 = 0 \tag{53}$$
has a positive solution $P_2 = P_2^T > 0$. Defining the seminorm
$$\|\Delta_t^*\|^2_{Q_2} = \overline{\lim_{T\to\infty}}\ \frac{1}{T}\int_0^T \Delta_t^{*T} Q_2 \Delta_t^*\,dt, \tag{54}$$
where $Q_2 = Q_2^T > 0$ is the given weighting matrix, the state trajectory tracking can be formulated as the following optimization problem:
$$J_{\min} = \min_{u_t} J, \qquad J = \|x_t - x_t^*\|^2_{Q_2}. \tag{55}$$
Note that
$$\lim_{T\to\infty} \frac{1}{T}\,\Delta_0^{*T} P_2 \Delta_0^* = 0. \tag{56}$$
Based on the dynamic neural network (2), the control law (47) makes the trajectory tracking error satisfy the following property:
$$\|\Delta_t^*\|^2_{Q_2} \le \|\delta_t\|^2_{\Lambda^{-1}}. \tag{57}$$
A suitable selection of $\Lambda$ and $Q_2$ ensures that the Riccati equation (53) has a positive solution and makes $\|\Delta_t^*\|^2_{Q_2}$ small enough if $\tau$ is small enough.
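The role of $\tau$ in the backward difference (46) can be illustrated with a signal whose derivative is known. The sketch below uses $x(t) = \sin t$ as an assumed test signal and records the approximation error for three delays; the function name and the sample values are illustrative only.

```python
import math

def backward_difference(x_now, x_prev, tau):
    """Estimate of dx/dt from two samples tau apart, i.e. (46) without delta_t."""
    return (x_now - x_prev) / tau

# Test signal x(t) = sin(t), whose exact derivative is cos(t).
t = 1.0
errors = []
for tau in (0.1, 0.01, 0.001):
    estimate = backward_difference(math.sin(t), math.sin(t - tau), tau)
    errors.append(abs(math.cos(t) - estimate))   # size of delta_t for this tau
```

For a smooth signal the error shrinks roughly in proportion to $\tau$, which is why the bound (57) can be made small by choosing a small delay.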

(C) Sliding Mode Compensation
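If $\dot{x}_t$ is not available, the sliding mode technique may be applied. Let us define the Lyapunov-like function as
$$V_t = \Delta_t^{*T} P_3 \Delta_t^*, \tag{58}$$
where $P_3$ is a solution of the Lyapunov equation
$$A^T P_3 + P_3 A = -I. \tag{59}$$
Using (41), its time derivative is
$$\dot{V}_t = \Delta_t^{*T}\left(A^T P_3 + P_3 A\right)\Delta_t^* + 2\Delta_t^{*T} P_3 U_{2,t} + 2\Delta_t^{*T} P_3 d_t. \tag{60}$$
According to the sliding mode technique, we may select $U_{2,t}$ as
$$U^c_{2,t} = -k P_3^{-1}\,\mathrm{sgn}(\Delta_t^*), \qquad k > 0, \tag{61}$$
where $k$ is a positive constant and
$$\mathrm{sgn}(\Delta_t^*) = \begin{cases} 1, & \Delta_t^* > 0, \\ 0, & \Delta_t^* = 0, \\ -1, & \Delta_t^* < 0, \end{cases} \qquad \mathrm{sgn}(\Delta_t^*) = \left[\mathrm{sgn}(\Delta_{1,t}^*), \ldots, \mathrm{sgn}(\Delta_{n,t}^*)\right]^T \in \Re^n. \tag{62}$$
Substituting (59) and (61) into (60) gives
$$\begin{aligned} \dot{V}_t &= -\|\Delta_t^*\|^2 - 2k\|\Delta_t^*\| + 2\Delta_t^{*T} P_3 d_t \\ &\le -\|\Delta_t^*\|^2 - 2k\|\Delta_t^*\| + 2\lambda_{\max}(P_3)\,\|\Delta_t^*\|\,\|d_t\| \\ &= -\|\Delta_t^*\|^2 - 2\|\Delta_t^*\|\left(k - \lambda_{\max}(P_3)\,\|d_t\|\right). \end{aligned} \tag{63}$$
If we select
$$k > \lambda_{\max}(P_3)\,\bar{d}, \tag{64}$$
where $\bar{d}$ is defined after (35), then $\dot{V}_t < 0$. So,
$$\lim_{t\to\infty} \Delta_t^* = 0. \tag{65}$$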
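A scalar simulation gives a feel for the sliding mode compensator (61) and the gain condition (64). Everything numerical below is an assumption chosen for illustration: the pole $a = -1$, the sinusoidal disturbance of amplitude 0.8, the gain $k = 1$, and the explicit-Euler integration with step `dt`.

```python
import math

def sgn(v):
    """Sign function as in (62): 1, 0 or -1."""
    return (v > 0) - (v < 0)

def simulate(delta0=1.0, a=-1.0, k=1.0, d_amp=0.8, dt=1e-3, steps=5000):
    """Euler simulation of the scalar error equation d(delta)/dt = a*delta + U2 + d_t."""
    p3 = -1.0 / (2.0 * a)            # scalar Lyapunov equation 2*a*p3 = -1
    assert k > p3 * d_amp            # gain condition (64) with d_bar = d_amp
    delta = delta0
    for i in range(steps):
        d_t = d_amp * math.sin(i * dt)       # bounded disturbance (assumed)
        u2 = -k / p3 * sgn(delta)            # compensator (61)
        delta += dt * (a * delta + u2 + d_t)
    return delta

final_error = simulate()
```

In discrete time the error does not settle exactly at zero; it chatters in a band whose width scales with the step size, which is the discrete counterpart of the chattering discussed in Remark 7.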

(D) Local Optimal Control
Consider now the case in which $\dot{x}_t$ is not available and is not approximated as in (B). In order to analyze the tracking error stability, we introduce the following Lyapunov function:
$$V_t(\Delta_t^*) = \Delta_t^{*T} P_4 \Delta_t^*, \qquad P_4 = P_4^T > 0. \tag{66}$$
Using (41), its time derivative is
$$\dot{V}_t = \Delta_t^{*T}\left(A^T P_4 + P_4 A\right)\Delta_t^* + 2\Delta_t^{*T} P_4 U_{2,t} + 2\Delta_t^{*T} P_4 d_t, \tag{67}$$
where $2\Delta_t^{*T} P_4 d_t$ can be estimated as
$$2\Delta_t^{*T} P_4 d_t \le \Delta_t^{*T} P_4 \Lambda_4 P_4 \Delta_t^* + d_t^T \Lambda_4^{-1} d_t. \tag{68}$$
Substituting (68) in (67), and adding and subtracting the terms $\Delta_t^{*T} Q_4 \Delta_t^*$ and $U^{dT}_{2,t} R_4 U^d_{2,t}$ with $Q_4 = Q_4^T > 0$ and $R_4 = R_4^T > 0$, we formulate
$$\begin{aligned} \dot{V}_t \le{}& \Delta_t^{*T}\left(A^T P_4 + P_4 A + P_4 \Lambda_4 P_4 + Q_4\right)\Delta_t^* + 2\Delta_t^{*T} P_4 U^d_{2,t} + U^{dT}_{2,t} R_4 U^d_{2,t} \\ &+ d_t^T \Lambda_4^{-1} d_t - \Delta_t^{*T} Q_4 \Delta_t^* - U^{dT}_{2,t} R_4 U^d_{2,t}. \end{aligned} \tag{69}$$
Because $A$ is stable, there exist $\Lambda_4$ and $Q_4$ such that the matrix Riccati equation
$$A^T P_4 + P_4 A + P_4 \Lambda_4 P_4 + Q_4 = 0 \tag{70}$$
has a positive solution. So (69) becomes
$$\dot{V}_t \le -\left(\|\Delta_t^*\|^2_{Q_4} + \|U^d_{2,t}\|^2_{R_4}\right) + \Psi\!\left(U^d_{2,t}\right) + d_t^T \Lambda_4^{-1} d_t, \tag{71}$$
where
$$\Psi\!\left(U^d_{2,t}\right) = 2\Delta_t^{*T} P_4 U^d_{2,t} + U^{dT}_{2,t} R_4 U^d_{2,t}. \tag{72}$$
We reformulate (71) as
$$\|\Delta_t^*\|^2_{Q_4} + \|U^d_{2,t}\|^2_{R_4} \le \Psi\!\left(U^d_{2,t}\right) + d_t^T \Lambda_4^{-1} d_t - \dot{V}_t. \tag{73}$$
Then, integrating each term from $0$ to $T$, dividing each term by $T$, and taking the upper limit as $T \to \infty$, we obtain
$$\begin{aligned} &\overline{\lim_{T\to\infty}}\ \frac{1}{T}\int_0^T \Delta_t^{*T} Q_4 \Delta_t^*\,dt + \overline{\lim_{T\to\infty}}\ \frac{1}{T}\int_0^T U^{dT}_{2,t} R_4 U^d_{2,t}\,dt \\ &\quad \le \overline{\lim_{T\to\infty}}\ \frac{1}{T}\int_0^T d_t^T \Lambda_4^{-1} d_t\,dt + \overline{\lim_{T\to\infty}}\ \frac{1}{T}\int_0^T \Psi\!\left(U^d_{2,t}\right)dt + \overline{\lim_{T\to\infty}}\ \frac{1}{T}\int_0^T \left(-\dot{V}_t\right)dt. \end{aligned} \tag{74}$$
In view of the definition of the seminorm (54), we have
$$\|\Delta_t^*\|^2_{Q_4} + \|U^d_{2,t}\|^2_{R_4} \le \|d_t\|^2_{\Lambda_4^{-1}} + \overline{\lim_{T\to\infty}}\ \frac{1}{T}\int_0^T \Psi\!\left(U^d_{2,t}\right)dt. \tag{75}$$
This fixes a tolerance level for the trajectory-tracking error. So, the control goal now is to minimize $\Psi(U^d_{2,t})$ and $\|d_t\|^2_{\Lambda_4^{-1}}$. To minimize $\|d_t\|^2_{\Lambda_4^{-1}}$, we should minimize $\Lambda_4^{-1}$. From (13), if $Q_4$ is selected so that (70) has a solution, we can choose the minimal $\Lambda_4^{-1}$ as
$$\Lambda_4^{-1} = A^{-T} Q_4 A^{-1}. \tag{76}$$
To minimize $\Psi(U^d_{2,t})$, we assume that, at the given (positive) $t$, $x^*(t)$ and $\hat{x}(t)$ are already realized and do not depend on $U^d_{2,t}$. We call $U^{d*}_{2,t}(t)$ the locally optimal control, because it is calculated based only on “local” information. The optimization problem is
$$\min_{U^d_{2,t}} \Psi\!\left(U^d_{2,t}\right) = 2\Delta_t^{*T} P_4 U^d_{2,t} + U^{dT}_{2,t} R_4 U^d_{2,t}, \qquad \text{subject to } A_0\left(U_{1,t} + U^d_{2,t}\right) \le B_0. \tag{77}$$
This is a typical quadratic programming problem. Without the restriction, $U^d_{2,t}$ is selected according to the linear-quadratic optimal control law:
$$U^d_{2,t} = -2 R_4^{-1} P_4 \Delta_t^*. \tag{78}$$

Remark 7. Approaches (A) and (C) are exact compensations of $d_t$; Approach (A) needs the information of $\dot{x}_t$. Because Approach (C) inserts the sliding mode control $U^c_{2,t}$ into the closed-loop system, chattering occurs in the control input, which may excite unmodeled high-frequency dynamics. To eliminate chattering, a boundary layer compensator can be used. It offers a continuous approximation to the discontinuous sliding mode control law inside the boundary layer and guarantees that the output tracking error stays within any neighborhood of the origin [13].
Finally, we give the following design steps for the robust neurocontrollers proposed in this paper.
(1) According to the dimension of the plant (1), design a neural network identifier (2) which has the same dimension as the plant. In (2), $A$ can be selected as a stable matrix. $A$ will influence the dynamic response of the neural network. The bigger eigenvalues of $A$ will make the neural network slower. The initial conditions for $W_{1,t}$ and $W_{2,t}$ are obtained as in Remark 4.
(2) Do online identification. The learning algorithm is (15) with the dead zone in Theorem 2. Assuming that the upper bound of the modeling error is known, we can give a value for $\bar{\eta}$. $Q_0$ is chosen such that the Riccati equation (14) has a positive definite solution; $R$ can be selected as any positive definite matrix because $\Lambda_1^{-1}$ is an arbitrary positive definite matrix. The updating rate in the learning algorithm (15) is $K_1 P$, and $K_1$ can be selected as any positive definite matrix, so the learning process is free of the solution $P$ of the Riccati equation (14). The larger $K_1 P$ is selected, the faster the neuroidentifier converges.
(3) Use the robust control (39) and one of the compensators (43), (47), (61), and (78).
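The boundary-layer idea mentioned in Remark 7 is commonly realized by replacing the sign function with a saturation that is linear near the origin. The following is a minimal sketch of that standard substitution; the function name `sat` and the layer width `eps` are assumptions, and the specific compensator of [13] may differ in detail.

```python
def sat(v, eps):
    """Continuous stand-in for sgn(v): linear for |v| <= eps, +1 or -1 outside."""
    return max(-1.0, min(1.0, v / eps))

# Inside the layer the output is proportional to the error; outside it matches sgn.
inside = sat(0.05, eps=0.1)
outside_pos = sat(1.0, eps=0.1)
outside_neg = sat(-1.0, eps=0.1)
```

A wider layer gives a smoother control signal at the price of a larger residual tracking error, so `eps` trades chattering against accuracy.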

4. Simulation

In this section, a two-link robot manipulator is used to illustrate the proposed approach. Its dynamics can be expressed as follows [14]:
$$M(\theta)\,\ddot{\theta} + V(\theta, \dot{\theta})\,\dot{\theta} + G(\theta) + F_d(\dot{\theta}) = \tau, \tag{79}$$
where $\theta \in \Re^2$ consists of the joint variables, $\dot{\theta} \in \Re^2$ denotes the link velocities, $\tau$ is the vector of generalized forces, $M(\theta)$ is the positive definite inertia matrix, $V(\theta, \dot{\theta})$ is the centripetal-Coriolis matrix, $G(\theta)$ is the gravity vector, and $F_d(\dot{\theta})$ is the friction vector. If we define $x_1 = \theta = [\theta_1, \theta_2]$ as the joint position, $x_2 = \dot{\theta}$ as the joint velocity, and $x_t = [x_1, x_2]^T$, then (79) can be rewritten in state space form [15]:
$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = H(x_t, u_t), \tag{80}$$
where $u_t = \tau$ is the control input and
$$H(x_t, u_t) = -M(x_1)^{-1}\left[C(x_1, x_2)\,\dot{x}_1 + G(x_1) + F\dot{x}_1 + u_t\right]. \tag{81}$$
Equation (80) can also be rewritten as
$$\dot{x}_1 = \int_0^t H(x_\tau, u_\tau)\,d\tau + H(x_0, u_0). \tag{82}$$
So the dynamics of the two-link robot (79) are in the form of (1) with
$$f(x_t, u_t, t) = \int_0^t H(x_\tau, u_\tau)\,d\tau + H(x_0, u_0). \tag{83}$$
The values of the parameters are as follows: $m_1 = m_2 = 1.53$ kg, $l_1 = l_2 = 0.365$ m, $r_1 = r_2 = 0.1$, $v_1 = v_2 = 0.4$, $k_1 = k_2 = 0.8$. Let us define $\hat{x} = [\hat{\theta}_1, \hat{\theta}_2]^T$ and $u = [\tau_1, \tau_2]^T$; the neural network for control is represented as
$$\dot{\hat{x}} = A\hat{x} + W_{1,t}\,\sigma(\hat{x}_t) + W_{2,t}\,\phi(\hat{x})\,u. \tag{84}$$
We select $A = \begin{bmatrix} -1.5 & 0 \\ 0 & -1 \end{bmatrix}$, $\phi(\hat{x}_t) = \mathrm{diag}(\phi_1(\hat{x}_1), \phi_2(\hat{x}_2))$, $\sigma(\hat{x}_t) = [\sigma_1(\hat{x}_1), \sigma_2(\hat{x}_2)]^T$,
$$\sigma_i(\hat{x}_i) = \frac{2}{1 + e^{-2\hat{x}_i}} - \frac{1}{2}, \qquad \phi_i(\hat{x}_i) = \frac{2}{1 + e^{-2\hat{x}_i}} + \frac{1}{2}, \tag{85}$$
where $i = 1, 2$. We used Remark 4 to obtain suitable $W_1^0$ and $W_2^0$, starting from random values with $T_0 = 100$. After 2 loops, $\|\Delta(T_0)\|$ does not decrease, so we take $W_{1,300}$ and $W_{2,300}$ as the new $W_1^0 = \begin{bmatrix} 0.51 & 3.8 \\ 2.3 & 1.51 \end{bmatrix}$ and $W_2^0 = \begin{bmatrix} 3.12 & 2.78 \\ 5.52 & 4.021 \end{bmatrix}$. For the update laws (15), we select $\bar{\eta} = 0.1$, $r = 5$, $K_1 P = K_2 P = \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix}$.
The generalized forces are selected as
$$\tau_1 = 7\sin t, \qquad \tau_2 = 0. \tag{86}$$

Now we check the neurocontrol. We assume the robot is changed at $t = 480$; after that, $m_1 = m_2 = 3.5$ kg, $l_1 = l_2 = 0.5$ m, and the friction becomes a disturbance $D\sin((\pi/3)t)$, where $D$ is a positive constant. We compare the neurocontrol with a PD control,
$$\tau_{\mathrm{PD}} = -10\left(\theta - \theta^*\right) - 5\left(\dot{\theta} - \dot{\theta}^*\right), \tag{87}$$
where $\theta_1^* = 3$ and $\theta_2^*$ is a square wave. So $\varphi(\theta^*) = \dot{\theta}^* = 0$.
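For reference, the PD law (87) used as the comparison controller amounts to the following joint-by-joint computation. The gains 10 and 5 come from (87); the function name and the sample joint values are illustrative assumptions.

```python
def pd_torque(theta, theta_ref, dtheta, dtheta_ref, kp=10.0, kd=5.0):
    """PD law of the form (87), applied joint by joint."""
    return [-kp * (q - q_ref) - kd * (w - w_ref)
            for q, q_ref, w, w_ref in zip(theta, theta_ref, dtheta, dtheta_ref)]

# Example: joint 1 is 1 rad below its reference and moving at 0.5 rad/s.
tau_pd = pd_torque(theta=[2.0, 0.0], theta_ref=[3.0, 1.0],
                   dtheta=[0.5, 0.0], dtheta_ref=[0.0, 0.0])
```

With fixed gains and no model of the plant, this controller cannot adapt when the masses and link lengths change, which is the point of the comparison.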

The neurocontrol is (39):
$$\tau_{\mathrm{neuro}} = \left[W_{2,t}\,\phi(\hat{x})\right]^{+}\left[\varphi(x_t^*, t) - A x_t^* - W_{1,t}\,\sigma(\hat{x})\right] + \left[W_{2,t}\,\phi(\hat{x})\right]^{+} U_{2,t}. \tag{88}$$
$U_{2,t}$ is selected to compensate the unmodeled dynamics. Since $f$ is unknown, method (A), exact compensation, cannot be used.

(B) $D = 1$. The link velocity $\dot{\theta}$ is measurable; as in (43),
$$U_{2,t} = A\left(\theta - \hat{\theta}\right) - \left(\dot{\theta} - \dot{\hat{\theta}}\right). \tag{89}$$
The results are shown in Figures 2 and 3.

Figure 2: Tracking control of $\theta_1$ (method B).
Figure 3: Tracking control of $\theta_2$ (method B).

(C) $D = 0.3$. $\dot{\theta}$ is not available, so the sliding mode technique may be applied. We select $u_{2,t}$ as in (61):
$$u_{2,t} = -10 \times \mathrm{sgn}\left(\theta - \theta^*\right). \tag{90}$$
The results are shown in Figures 4 and 5.

Figure 4: Tracking control of $\theta_1$ (method C).
Figure 5: Tracking control of $\theta_2$ (method C).

(D) $D = 3$. We select $Q = 1/2$, $R = 1/20$, $\Lambda = 4.5$. The solution of the Riccati equation
$$A^T P + P A + P \Lambda P + Q = \dot{P}_t \tag{91}$$
is $P = \begin{bmatrix} 0.33 & 0 \\ 0 & 0.33 \end{bmatrix}$. Without the restriction on $\tau$, the linear-quadratic optimal control law is
$$u_{2,t} = -2R^{-1} P\left(\theta - \theta^*\right) = \begin{bmatrix} -20 & 0 \\ 0 & -20 \end{bmatrix}\left(\theta - \theta^*\right). \tag{92}$$
The results of local optimal compensation are shown in Figures 6 and 7.

Figure 6: Tracking control of $\theta_1$ (method D).
Figure 7: Tracking control of $\theta_2$ (method D).

The results show that the neurocontrol remains robust and effective after the robot is changed.

5. Conclusion

By means of Lyapunov analysis, we establish bounds for both the identifier and the adaptive controller. The main contribution of our paper is that we give four different compensation methods and prove the stability of the neural controllers.

References

  1. K. S. Narendra and K. Parthasarathy, “Identification and control for dynamic systems using neural networks,” IEEE Transactions on Neural Networks, vol. 1, pp. 4–27, 1990.
  2. S. Jagannathan and F. L. Lewis, “Identification of nonlinear dynamical systems using multilayered neural networks,” Automatica, vol. 32, no. 12, pp. 1707–1712, 1996.
  3. S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College, New York, NY, USA, 1994.
  4. E. B. Kosmatopoulos, M. M. Polycarpou, M. A. Christodoulou, and P. A. Ioannou, “High-order neural network structures for identification of dynamical systems,” IEEE Transactions on Neural Networks, vol. 6, no. 2, pp. 422–431, 1995.
  5. G. A. Rovithakis and M. A. Christodoulou, “Adaptive control of unknown plants using dynamical neural networks,” IEEE Transactions on Systems, Man and Cybernetics, vol. 24, no. 3, pp. 400–412, 1994.
  6. W. Yu and X. Li, “Some new results on system identification with dynamic neural networks,” IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 412–417, 2001.
  7. E. Grant and B. Zhang, “A neural net approach to supervised learning of pole placement,” in Proceedings of the IEEE Symposium on Intelligent Control, 1989.
  8. K. J. Hunt and D. Sbarbaro, “Neural networks for nonlinear internal model control,” IEE Proceedings D, Control Theory and Applications, vol. 138, no. 5, pp. 431–438, 1991.
  9. A. S. Poznyak, W. Yu, E. N. Sanchez, and J. P. Perez, “Nonlinear adaptive trajectory tracking using dynamic neural networks,” IEEE Transactions on Neural Networks, vol. 10, no. 6, pp. 1402–1411, 1999.
  10. W. Yu and A. S. Poznyak, “Indirect adaptive control via parallel dynamic neural networks,” IEE Proceedings Control Theory and Applications, vol. 146, no. 1, pp. 25–30, 1999.
  11. B. Egardt, Stability of Adaptive Controllers, vol. 20 of Lecture Notes in Control and Information Sciences, Springer, Berlin, Germany, 1979.
  12. P. A. Ioannou and J. Sun, Robust Adaptive Control, Prentice-Hall, Upper Saddle River, NJ, USA, 1996.
  13. M. J. Corless and G. Leitmann, “Continuous state feedback guaranteeing uniform ultimate boundedness for uncertain dynamic systems,” IEEE Transactions on Automatic Control, vol. 26, pp. 1139–1144, 1981.
  14. F. L. Lewis, A. Yeşildirek, and K. Liu, “Multilayer neural-net robot controller with guaranteed tracking performance,” IEEE Transactions on Neural Networks, vol. 7, no. 2, pp. 388–399, 1996.
  15. S. Nicosia and A. Tornambe, “High-gain observers in the state and parameter estimation of robots having elastic joints,” Systems & Control Letters, vol. 13, pp. 331–337, 1989.