Research Article | Open Access

Jiao-Jun Zhang, Hong-Sen Yan, "MTN Optimal Tracking Control of SISO Nonlinear Time-Varying Discrete-Time Systems without Mechanism Models", *Mathematical Problems in Engineering*, vol. 2018, Article ID 3219140, 19 pages, 2018. https://doi.org/10.1155/2018/3219140

# MTN Optimal Tracking Control of SISO Nonlinear Time-Varying Discrete-Time Systems without Mechanism Models

**Academic Editor:** Javier Moreno-Valenzuela

#### Abstract

Nonlinear time-varying systems without mechanism models are common in applications. They cannot be controlled directly by traditional control methods based on precise mathematical models, and intelligent control is unsuitable for real-time control due to its computational complexity. For this reason, a multidimensional Taylor network (MTN) based output tracking control scheme, which consists of two MTNs, one as an identifier and the other as a controller, is proposed for SISO nonlinear time-varying discrete-time systems with no mechanism models. An MTN identifier is constructed to build the offline model of the system, and a set of initial parameters for online learning of the identifier is obtained. Then, an ideal output signal is selected relative to the given reference signal. Based on the system identification model, Pontryagin’s minimum principle is applied to obtain the numerical solution of the optimal control law for the system relative to the given ideal output signal, with the corresponding optimal output taken as the desired output signal. An MTN controller is generated automatically to fit the numerical solution of the optimal control law using the conjugate gradient (CG) method, and a set of initial parameters for online learning of the controller is obtained. An adaptive back propagation (BP) algorithm is developed to adjust the parameters of the identifier and controller in real time, and the convergence of the proposed learning algorithm is verified. Simulation results show that the proposed scheme is valid.

#### 1. Introduction

Nonlinear time-varying systems without mechanism models are widespread in practical engineering applications. However, it is difficult to obtain a precise mathematical model of such a system due to the limitations of modeling theory, the influence of its internal structure and parameter variations, and external environmental disturbances. In addition, the state variables are not easy to determine, and state feedback control is inconvenient to realize physically because of the practical and economic limitations of measuring equipment in engineering practice. Output feedback control [1–6], which is of great theoretical and practical significance, is therefore used to address the control problem for nonlinear time-varying systems without mechanism models.

The nonlinear autoregressive moving average with exogenous inputs (NARMAX) model describes the input-output relationship of a nonlinear dynamic system, in which the system output is represented as a nonlinear functional expansion of its lagged inputs and outputs [7–9]. NARMAX has attracted considerable interest both in theory and in applications [10–13], especially in the field of black-box nonlinear modeling. It is also referred to as the time-varying NARMAX model in [14]. A neural network (NN) can approximate any continuous function with an arbitrary degree of accuracy over a compact set [15], and various kinds of NN have been used in system identification and control [7, 8, 16–24]. However, to satisfy the approximation requirement of a high-order uncertain system, both the number of hidden-layer neurons and the number of weight parameters needing online updating are large, so the learning time tends to be unacceptably long and real-time control is hardly realizable in practice. In addition, NN can only represent local dynamic characteristics when the state does not change much. However, the actual state change can be quite notable or even divergent. Thus NN cannot represent dynamic characteristics in the general sense. In fact, it cannot approximate an unstable system, as the artificial neurons are limited by the sign function, sigmoid function, and radial basis function, regardless of sample size or weight parameters. Furthermore, NN cannot approximate any continuous function to an arbitrary degree of accuracy if the sign function or the sigmoid function is removed. From the point of view of the frequency characteristics of the input signals, a signal can be viewed as the superposition of a large-valued signal with the lowest frequency and small-valued signals with high frequencies. Therefore, the small-valued signals with high frequencies tend to be limited or restrained by the sign function, sigmoid function, or radial basis function of the artificial neuron, as a result of which the output of NN may fail to track rapid changes of its input. According to the neural network function approximation theorem, a three-layer neural network can approximate any nonlinear real-valued continuous function defined on a closed bounded subset. However, the theorem does not ensure that the function is well approximated outside the closed bounded subset. Therefore, a general nonlinear dynamic system over the entire state space (i.e., not a bounded subset) is difficult to approximate with an arbitrary degree of accuracy. As a polynomial function tends to infinity, the multidimensional Taylor network (MTN, whose idea was proposed by Hong-Sen Yan in 2010 and whose realization was done by Bo Zhou) is good at approximating or representing a general nonlinear dynamic system. It is suitable for use as the identified dynamic model of the controlled plant, as it can represent a polynomial dynamic system accurately, is simple in that it involves only states and inputs, and can be easily analyzed and solved for optimal control in terms of the minimum principle. NN, by contrast, only approximates a polynomial dynamic system instead of representing it, and is too complicated to be analyzed or solved for optimal control [25]. In addition, NN contains the exponential function, which leads to computational complexity and poor real-time control performance on a single chip microcomputer (SCM) or embedded system. That makes us resort to MTN, in which only addition and multiplication are needed, its computational complexity being nearly that of the Taylor expansion of a single neuron in NN.

MTN, first presented in [26], can reflect the dynamic characteristics of a system without knowledge of its order or other prior information. It approximates any nonlinear function with arbitrarily high accuracy and has therefore been applied widely and successfully to time series prediction problems [27–32]. The idea of MTN optimal control was proposed by Hong-Sen Yan in 2010 [33]. The optimal adjustment control of SISO nonlinear time-invariant systems has been achieved by introducing the control input into MTN [34]. Asymptotic tracking and dynamic regulation of SISO nonlinear systems based on the discrete multidimensional Taylor network are considered in [35]. However, neither systems without mechanism models nor the time-varying characteristics of nonlinear discrete-time systems are considered there. MTN relies on the polynomial network for identification and control of the nonlinear system [10], in which the system considered is time-invariant and the learning algorithm is based on gradient descent with a constant learning factor, which leads to slow learning and convergence to local minima.

Due to the uncertainties of the external environment and the time-varying characteristics of a controlled plant [36, 37], the identifier and controller parameters need constant online updating during control, and the adjustments affect not only the control process but also the robustness of the controller. Therefore, a real-time self-tuning rule for the weight parameters of the identifier and controller is highly desirable. The backpropagation (BP) algorithm [38] is the most widely used learning algorithm for training multilayer neural networks. However, it suffers from drawbacks such as slow convergence and convergence to local optima. To raise the convergence speed, improvements of the BP algorithm [39–42] have been studied, with certain desirable results achieved. However, in the improved learning algorithms, the learning rate and momentum factor are taken as constants chosen randomly in the interval (−1,1). Other algorithms have also been developed to adjust the weight parameters of NN, such as the genetic algorithm (GA) [43], the particle swarm optimization (PSO) algorithm [44], hybrids of the two [45, 46], and the fuzzy logic approach [47]. It has been shown that the coefficients should not remain fixed but should be changed adaptively throughout the entire training process so as to produce better training results, which has led to the emergence of various schemes for adaptively adjusting the learning rate and momentum factor of BP algorithms [48, 49].

In this paper, an MTN-based output tracking real-time control scheme, which consists of two MTNs, one as the identifier and the other as the controller, is proposed for SISO nonlinear time-varying discrete-time systems without mechanism models. An MTN identifier is developed for offline modeling of the system, and a set of initial parameters for online learning of the identifier is obtained. An ideal output signal is then selected for the given reference signal. Pontryagin’s minimum principle is employed to obtain the numerical solution of the optimal control law of the system relative to the ideal output signal, with the corresponding optimal output called the desired output signal. An MTN controller is generated automatically to fit the numerical solution of the optimal control law via the CG method, and a set of initial parameters for online learning of the controller is obtained. Based on the above, a novel adaptive BP algorithm that adjusts both the learning rate and the momentum factor in real time is designed to further enhance the learning speed of the identifier and controller. Finally, the convergence of the novel adaptive BP algorithm is analyzed. Simulation results show that the proposed scheme is valid.

This paper is arranged as follows. Section 2 states the problem; Section 3 presents the identifier design; Section 4 describes the automatic generation of the controller; Section 5 covers the selection of the initial values of the online controller parameters; Section 6 addresses self-tuning of the controller parameters; Section 7 summarizes the algorithm steps of the MTN optimal control scheme; Section 8 presents the simulation study; and Section 9 concludes the paper.

#### 2. Problem Statement

Consider the following unknown SISO nonlinear time-varying discrete-time system described by the input-output difference equation:where is an unknown nonlinear scalar function, and are the output and input of the system, and and are the corresponding maximum delays.

The goals of the present study are as follows: (a) to design an offline identifier to build the system model based on the input-output data pairs ; (b) to design a real-time controller that allows the output of the system (1) to track the given reference signal as closely as possible.

The block diagram of control system (1) is shown in Figure 1, in which the multidimensional Taylor network identifier and controller are abbreviated as MTNI and MTNC, respectively.

#### 3. System Identification

It is known from [26] that MTN provides a good nonlinear function approximation approach, and in the system (1) can be approximated with arbitrary precision by MTN, using an appropriate learning algorithm. Let be the mapping relationship, and we obtain the MTN model (MTNI) of the system (1) as follows:where is the output of MTNI, is the weight coefficient vector of MTNI, and and are positive constants, , .

For convenience and without loss of generality, set and we get

Setting the weight coefficient vector of MTNI as allows us to rewrite the identification model (2) aswhere represents the total number of product items of the -ary function expanded into the approximate polynomial with powers, is the weight coefficient of the -th product item in the formula, denotes the power of the variable in the -th product item, and , where .
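The structure of such a polynomial model can be illustrated with a minimal Python sketch. It assumes that the product items are all monomials of total power at most m in n input variables, including a constant term, so that their number is the binomial coefficient C(n+m, m); whether this matches the paper's exact item count and ordering is an assumption. The names `monomial_exponents` and `mtn_output` are hypothetical and do not come from the paper.

```python
from itertools import product
from math import comb, prod

def monomial_exponents(n, m):
    """Exponent tuples (s_1, ..., s_n) with s_1 + ... + s_n <= m,
    ordered by total power (constant term first, then linear terms, ...)."""
    exps = [e for e in product(range(m + 1), repeat=n) if sum(e) <= m]
    return sorted(exps, key=lambda e: (sum(e), [-s for s in e]))

def mtn_output(weights, exps, z):
    """Weighted sum of product items: sum_t w_t * prod_i z_i ** s_{t,i}.
    Only additions and multiplications are involved."""
    return sum(w * prod(zi ** si for zi, si in zip(z, e))
               for w, e in zip(weights, exps))

# Three lagged inputs/outputs, powers up to 2 (illustrative values only).
exps = monomial_exponents(n=3, m=2)
print(len(exps), comb(3 + 2, 2))                    # both are 10
print(mtn_output([1] * len(exps), exps, (1, 2, 3)))  # 32
```

Because the model is linear in the weights, the same exponent list can be reused to build the regressor vector for both offline and online weight learning.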

The diagram of MTNI is shown in Figure 2.

To calculate and , the product items in (4) are rearranged as illustrated in Figure 3, i.e., the product items of the expansion are stored according to their powers. We use the symbol to denote the -th rectangle in which the product items with -th power are stored, and store the product items with -th power, which are obtained by adding 1 to the power of the -th element from the -th rectangle to the -th rectangle with -th power, into , and so on, until the product items with -th power, which are obtained by adding 1 to the power of the -th element in , are stored into , where and .

The calculation of and goes as follows.

Let represent the number of product items in , and we getwhere

Suppose that, in (4), from the second item, the -th product item corresponds to the -th product item in of Figure 3. For clarity, the power of the element is termed as , and represents the number of product items with the -th power from the -th to the -th rectangle. From Figure 3, it is known that

The initial values are set as follows:where and .

Based on the above, Figure 4 gives the diagram of system identification.

##### 3.1. Offline System Identification

The identification error can be defined aswhere .

The corresponding mean square error is

Substituting (4) and (9) into (10) yields

Assume

Equation (11) can be rewritten into

Calculate the partial derivative of with respect to the weight coefficient vector , i.e.,

Setting , and gives

Letting , the vector form of , and can be rewritten as

To obtain a precise model of system (1), the weight coefficient vector should be updated repeatedly using the observed input-output data pairs . A number of classical weight update laws have been proposed in the literature, such as least squares algorithms, various gradient-type algorithms [50, 51], and the least-mean-square (LMS) algorithm [52]. The gradient method is commonly adopted for parameter adjustment; that is, can be updated once in the negative gradient direction after each offline learning pass. Let represent the value of after the th training, and we obtain , where . However, the gradient path to the minimum point zigzags, because each search direction is perpendicular to the previous one. Fortunately, this problem can be solved effectively by employing the CG method, whereby the weight can be updated as follows:where , , , and the initial value is .
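As an illustration of CG-based offline weight fitting, the following sketch minimizes a mean square error that is quadratic in the weights, which holds when the model output is a weighted sum of fixed product items. It uses the Fletcher-Reeves form of the conjugate gradient method with an exact line search; the paper's own step-size and conjugacy coefficients are not reproduced here, so this choice, the function name `cg_least_squares`, and the toy data are all assumptions.

```python
def cg_least_squares(Phi, y, w0, iters=50, tol=1e-12):
    """Minimize (1/2N) * sum_k (y_k - Phi_k . w)^2 by conjugate gradient.

    Phi: list of regressor rows (product items evaluated at each sample).
    y:   list of measured outputs.  w0: initial weight list.
    """
    n, N = len(w0), len(y)
    # Quadratic form of the cost: gradient is A w - b.
    A = [[sum(Phi[k][i] * Phi[k][j] for k in range(N)) / N
          for j in range(n)] for i in range(n)]
    b = [sum(Phi[k][i] * y[k] for k in range(N)) / N for i in range(n)]

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    w = list(w0)
    g = [av - bi for av, bi in zip(matvec(w), b)]   # current gradient
    d = [-gi for gi in g]                           # first search direction
    for _ in range(iters):
        gg = dot(g, g)
        if gg < tol:
            break
        Ad = matvec(d)
        alpha = gg / dot(d, Ad)                     # exact line search
        w = [wi + alpha * di for wi, di in zip(w, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / gg               # Fletcher-Reeves coefficient
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return w

# Toy data generated from y = 2 + 3 z - z^2 with regressors [1, z, z^2].
zs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
Phi = [[1.0, z, z * z] for z in zs]
y = [2.0 + 3.0 * z - z * z for z in zs]
print(cg_least_squares(Phi, y, [0.0, 0.0, 0.0]))
```

For a quadratic cost with n weights, CG reaches the minimizer in at most n steps in exact arithmetic, which is the practical advantage over the zigzagging steepest-descent path described above.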

The block diagram of offline learning of MTNI (4) is shown in Figure 5.

##### 3.2. Real-Time Learning for the Weights of MTNI

For an unknown system, system identification is the process of building a mathematical model from observed input-output data pairs. Nonlinear time-varying system identification based on MTNI treats the connection weight coefficients of MTNI as time-varying parameters to be estimated and trains them online with a suitable learning algorithm, so that the plant and the model produce the same outputs for the same set of inputs. The weight coefficients need to be adjusted online to achieve a desirable real-time identification effect.

Set the performance function for MTNI aswhere , represents the identification error at time , and .

To obtain a better identification effect for the unknown nonlinear time-varying system, the weight coefficients of MTNI should be adjusted adaptively throughout the entire training process. A novel adaptive BP algorithm for adjusting both the learning rate and momentum factor adaptively [53–55] is proposed, and the weight parameters are updated, i.e.,wherewhere is the partial derivative of with respect to ; , , are constants, and , , ; is the angle between the current gradient and previous update , given by .
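The general idea of tying the learning rate and momentum factor to the angle between the current gradient and the previous update can be sketched as follows. This is a generic heuristic written for illustration, not the paper's rules (19)-(23), whose formulas are not reproduced here: the scaling laws, the constants `eta0`, `mu0`, and `gain`, and the function name `adaptive_step` are all assumptions.

```python
import math

def adaptive_step(w, grad, prev_dw, eta0=0.1, mu0=0.5, gain=0.5):
    """One weight update with angle-dependent learning rate and momentum.

    cos_t is the cosine of the angle between the descent direction (-grad)
    and the previous update. When they agree (cos_t near 1) the step is
    enlarged and momentum is used; when they disagree the step shrinks and
    momentum is switched off. (Illustrative rule, not the paper's.)
    """
    gn = math.sqrt(sum(g * g for g in grad))
    dn = math.sqrt(sum(d * d for d in prev_dw))
    if gn == 0.0 or dn == 0.0:
        cos_t = 0.0
    else:
        cos_t = -sum(g * d for g, d in zip(grad, prev_dw)) / (gn * dn)
    eta = eta0 * (1.0 + gain * cos_t)   # learning rate in [eta0/2, 3*eta0/2]
    mu = mu0 * max(cos_t, 0.0)          # momentum factor in [0, mu0]
    dw = [-eta * g + mu * d for g, d in zip(grad, prev_dw)]
    return [wi + di for wi, di in zip(w, dw)], dw

# Usage on the toy cost E(w) = 0.5 * (w1^2 + 10 * w2^2).
w, dw = [1.0, 1.0], [0.0, 0.0]
for _ in range(100):
    grad = [w[0], 10.0 * w[1]]
    w, dw = adaptive_step(w, grad, dw)
print(w)
```

With this rule the update is always a descent direction, since momentum is only applied when the previous update already points downhill; the actual convergence guarantee in the paper is the subject of Theorem 1.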

Theorem 1. *For any given set of weight coefficient vector , if is generated by the learning rules from (19) to (23), there exists .*

*Proof.* For the first case, i.e., , we haveAs a matter of fact, there exist and when and , as a result of which holds.

For the second case, i.e., , we get , and , thus,As revealed by the above two cases, holds.

That completes the proof of Theorem 1.

#### 4. Controller

The control objective is to find a control input that enables the system output to track the given reference signal as closely as possible in real time. In this section, we consider the controller MTNC generated automatically as follows:where is the output of MTNC, i.e., the input of the system (1), is the tracking error at time , and are the maximum delays of the output and input of MTNC, and and are positive constants, , .

For convenience, without loss of generality, denote , and we have

As is known from [26], there exists a group of weight coefficient vectors , and thus, the input can be rewritten aswhere represents the total number of product items for the -ary function expanded into the approximate polynomial with powers, denotes the weight coefficient of the -th product item, is the power of the variable in the -th product item, and , where .

The diagram of MTNC is shown in Figure 6.

To calculate and , the product items in (29) are rearranged as illustrated in Figure 7, i.e., the product items of the expansion are stored according to their powers. We use the symbol to denote the -th rectangle in which the product items with -th power are stored, and store the product items with -th power, which are obtained by adding 1 to the power of the -th element from the -th rectangle to the -th rectangle with -th power, into , and so on, until the product items with -th power, which are obtained by adding 1 to the power of the -th element in , are stored into , where and .

The calculation of and goes as follows.

Let represent the number of product items in , and we getwhere

Suppose that, in (29), from the second item, the -th product item corresponds to the -th product item in of Figure 7. For clarity, the power of the element is termed as , and represents the number of product items with the -th power from the -th to the -th rectangle. From Figure 7, it is known that

The initial values are set as follows:where , and .

#### 5. Initial Weight Values of MTNC

The convergence speed is influenced by the selection of the initial weight values of the controller, yet random choice of the network parameters is the most common practice in network training. To enhance the convergence speed and avoid falling into a local minimum, the initial weight values are selected in two steps. The first step is to transform the offline model (4) of the system into an extended state space description through variable substitution, select an ideal output signal relative to the given reference signal , and employ Pontryagin’s minimum principle to obtain the numerical solution of the optimal control law of the system relative to the ideal output signal , with the corresponding optimal output called the desired output signal . In the second step, a set of parameter values is given randomly in the interval ), and the CG method is applied for MTNC offline training to approximate the optimal control law . A set of weight values is then obtained as the initial values for online training of MTNC, where . The specific steps are as follows.

##### 5.1. Optimal Control Law

Based on the identification model (4) obtained offline for the system (1), can be substituted by . For convenience, setthen, we obtain the extended state space description form with the following variable substitution:

Consider the following optimal control problem [56]:where satisfies the constraint conditions (35) and (36).

Introduce the Hamiltonian equation:where and satisfy the following conditions:

If the control vector is constrained, the Hamiltonian function takes its extreme value on the optimal control sequence by the minimum principle; i.e., takes its extreme value along the optimal trajectory and the optimal control law , that is,where is a bounded closed set.

If the control vector is not constrained, Hamiltonian function takes extreme value from the whole control space , the extreme condition beingand .

A given control sequence can be improved by repeated iteration in the direction that decreases the gradient of the Hamiltonian function, until the necessary condition (42) is satisfied. Then we obtain the numerical solution of the optimal control law , where . For convenience, let , and the calculation steps are as follows.

*Algorithm 2.*

*Step 1*. Set any given series of control sequence , where is the number of iterations, and the initial value is set as , and .

*Step 2*. Solve the state variable sequentially by formula (35) based on and the initial condition , where .

*Step 3*. Calculate , which is the gradient of with respect to in the control sequence , and set , where .

*Step 4*. Calculate . If , stop, or else, revise the control vector: , i.e., , where is a given value, is a fixed step size, and .

*Step 5*. Let , and return to Step 2.
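The iteration in Algorithm 2 can be sketched in Python for a scalar toy problem. Everything specific here is an assumption made for illustration: the polynomial model `f`, the quadratic cost with control weight `RHO`, the horizon, the step size, and all function names. The sketch follows the same pattern as the steps above: roll the state forward, run the costate backward, form the gradient of the Hamiltonian with respect to the control, and take a fixed-size step until the gradient norm is small.

```python
RHO = 0.1  # assumed weight on control effort

def f(x, u):
    """Hypothetical identified model (polynomial in state and input)."""
    return 0.5 * x + 0.05 * x * x + u

def f_x(x, u):
    return 0.5 + 0.1 * x

def f_u(x, u):
    return 1.0

def rollout(x0, u):
    """Forward pass: solve the state sequence from the control sequence."""
    x = [x0]
    for uk in u:
        x.append(f(x[-1], uk))
    return x

def cost(x0, u, r):
    """J = sum_k [0.5 (x_k - r_k)^2 + 0.5 RHO u_k^2] + 0.5 (x_N - r_N)^2."""
    x, N = rollout(x0, u), len(u)
    run = sum(0.5 * (x[k] - r[k]) ** 2 + 0.5 * RHO * u[k] ** 2 for k in range(N))
    return run + 0.5 * (x[N] - r[N]) ** 2

def hamiltonian_gradient(x0, u, r):
    """dH_k/du_k for H_k = L(x_k, u_k) + lam_{k+1} * f(x_k, u_k)."""
    N = len(u)
    x = rollout(x0, u)
    lam = [0.0] * (N + 1)
    lam[N] = x[N] - r[N]                        # terminal costate
    for k in range(N - 1, -1, -1):              # backward costate recursion
        lam[k] = (x[k] - r[k]) + lam[k + 1] * f_x(x[k], u[k])
    return [RHO * u[k] + lam[k + 1] * f_u(x[k], u[k]) for k in range(N)]

def solve(x0, r, step=0.1, eps=1e-6, max_iter=5000):
    """Fixed-step gradient iteration on the control sequence."""
    u = [0.0] * (len(r) - 1)
    for _ in range(max_iter):
        g = hamiltonian_gradient(x0, u, r)
        if sum(gk * gk for gk in g) ** 0.5 < eps:   # stopping test on gradient norm
            break
        u = [uk - step * gk for uk, gk in zip(u, g)]
    return u

r = [1.0] * 11                  # ideal output over a horizon of 10 steps
u_opt = solve(0.0, r)
print(cost(0.0, [0.0] * 10, r), cost(0.0, u_opt, r))
```

The resulting control sequence plays the role of the numerical optimal control law that the controller network is later trained to reproduce; a fixed step size keeps the sketch simple, at the cost of slower convergence than a line search would give.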

##### 5.2. Initial Weight Values of MTNC

MTNC is generated automatically to approximate the numerical solution of the optimal control law for the offline model (4) of the system (1) relative to the ideal output signal . The weight coefficients of MTNC are obtainable by offline learning, and the block diagram of offline learning for MTNC (29) is shown in Figure 8.

The initial weight values of MTNC are secured offline by the CG method as follows.

Define the approximation error aswhere is the output of the controller MTNC at time .

The corresponding mean square error is

Substituting (43) and (29) into (44) yields

Setand formula (45) can be rewritten as

Calculate the partial derivative of with respect to the weight coefficient vector :

Let , , and , and we have

Setting enables us to get the vector form of , and as follows:

For the given numerical solution of the optimal control, the weight coefficient vector can be updated once in the negative gradient direction after each learning pass. Let represent the value of after the th iterative training; then we have , where . However, the gradient path to the minimum point zigzags, because each search direction is perpendicular to the previous one. Fortunately, this problem can be solved effectively by employing the CG method, whereby the weight can be updated as follows:where , , , and the initial value is defined as .

#### 6. Real-Time Learning for Weights of MTNC

Due to the real-time modeling error for the unknown nonlinear time-varying system and the uncertainties existing in practical applications, the controller with fixed weight coefficients cannot ensure the lasting robust performance of the system. Therefore, it is required that the controller be capable of adjusting automatically for real-time control. Similar to the real-time learning algorithm for MTNI, a novel adaptive BP algorithm is proposed here for MTNC real-time training [53–55].

The performance function is defined aswhere , represents the practical tracking error at time .

The weight coefficients of MTNC are updated according to (53)-(57):where is the partial derivative of with respect to ; , , are constants, and , , ; is the angle between the current gradient and the previous update , given by .

As the system considered is unknown, the output of the actual system can be replaced with the output of the identification model (4), that is,where , , , , , , and have the same meanings as in (4), and .

Thus,

In (61), the first part on the right side of the equation can be calculated using the real-time identification model (4), and the second part can be computed as follows:where is as defined above.

Theorem 3. *For any given set of weight vector of MTNC used to approximate the optimal control law offline, by taking the weight vector as the initial values for online learning, we have if is updated with the learning rules from (53) to (57).*

*Proof.* For the first case, i.e., if holds, thenAs a matter of fact, and when and , thus holds.

For the second case, i.e., if , then , , and , then