Abstract

This paper proposes a novel adaptive dynamic programming (ADP) approach to the optimal consensus control problem for discrete-time multiagent systems (MASs). In contrast to traditional optimal control algorithms for MASs, the proposed algorithm is designed on the basis of an event-triggered scheme, which saves communication and computation resources. First, the consensus tracking problem is transformed into an input-to-state stability (ISS) problem. On this basis, the event-triggered condition for each agent is designed and the event-triggered ADP algorithm is presented. Second, neural networks are introduced to simplify the implementation of the proposed algorithm. Third, the stability of the MASs under the event-triggered conditions is analyzed, and the estimation errors of the neural network weights are proved to be uniformly ultimately bounded. Finally, simulation results demonstrate the effectiveness of the event-triggered ADP consensus control method.

1. Introduction

Because of their wide applications in the control field [1–6], the consensus control of MASs has gained increasing attention. In recent years, quite a few methods have been reported for the consensus control problem of MASs, such as adaptive control [7, 8] and sliding mode control [9, 10]. It is worth mentioning that these methods focus on the stability of the MASs. However, optimality is also worth considering in the consensus control problem. The optimal consensus control problem aims to find optimal control policies that guarantee the stability of the MASs while minimizing the energy cost. As one of the core methods for obtaining optimal control policies, ADP addresses this issue by approximating the solution of the Hamilton–Jacobi–Bellman (HJB) equation [11–13].

To date, ADP approaches have been applied to the optimal consensus control of MASs [14–20]. In [14], an optimal coordination control algorithm was designed to address the consensus problem of multiagent differential games through fuzzy ADP. The optimal output regulation of heterogeneous MASs was considered in [15]. Building on this work, Gao et al. [16] considered dynamic uncertainties in the cooperative output regulation problem. Zhang et al. [17, 18] considered optimal consensus tracking control for discrete-time/continuous-time MASs. To address the optimal consensus problem for unknown MASs with input delay, a data-driven distributed adaptive controller based on the ADP technique was proposed in [19]. In [20], data-based optimal consensus control was studied for MASs with multiple time delays. All the above results rest on the assumption that the communication and computing resources are sufficient to transmit system data and update the control policy at every time step, which is difficult to satisfy in practice.

Event-triggered control (ETC) is a well-recognized technique for addressing this issue [21–24]. Unlike time-triggered control, whether a system samples its signals depends only on the event-triggered condition: if the condition is satisfied at some time instant, the data are transmitted and the control policy is updated. Therefore, compared with control algorithms based on a time-triggered scheme, event-triggered control algorithms can efficiently save computation resources [25]. In recent years, ETC has been introduced to solve the optimal control problem under limited computing resources [26–29]. In [26], an ETC method based on ADP was developed for continuous-time MASs. Unknown internal states were considered in the event-triggered optimal control of continuous-time MASs in [27]. Multiplayer zero-sum differential games were considered in [28], where an event-triggered optimal consensus tracking controller was designed. In [29], an event-triggered optimal control algorithm was designed for unmatched uncertain nonlinear continuous-time systems. In [30], to save limited network resources, an event-triggered mechanism was introduced to address the consensus problem of linear discrete-time MASs. The event-triggered consensus problem of discrete-time multiagent networks was considered in [31]. It is worth noting that the results in [26–29] studied event-triggered optimal control for continuous-time MASs, while only a few works [30, 31] consider discrete-time MASs.

Motivated by the above discussions, an event-triggered ADP control algorithm is designed to address the optimal consensus tracking problem for discrete-time MASs. The major contributions of this paper are as follows:
(1) Compared with the existing event-triggered ADP consensus control methods [27–29], we design an adaptive event-triggered (ET) condition for every agent in the MASs. Each agent samples data and communicates with its neighbors only when its own event-triggered condition is satisfied. This means the agents may not communicate with their neighbors or update their control policies at the same time instant, which saves communication resources.
(2) We provide a stability analysis of the MASs under the event-triggered conditions, showing that all agents in the discrete-time MASs achieve consensus under the ET condition. We also prove that the weight estimation errors of the critic neural networks (NNs) and actor NNs are uniformly ultimately bounded during the learning process.

The rest of this paper is organized as follows. In Section 2, the discrete-time MASs are considered and the consensus problem is formulated. The event-triggered conditions for each agent are introduced and the stability analysis is given in Section 3. The NN-based event-triggered ADP algorithm is introduced in Section 4, and the simulation results of this algorithm are given in Section 5. Finally, conclusions are drawn in Section 6.

2. Problem Formation

Consider the discrete-time MASs

$x_i(k+1) = A_i x_i(k) + B_i u_i(k), \quad i = 1, 2, \ldots, N$, (1)

where $x_i(k)$ and $u_i(k)$ denote the state and the coordination control of agent $i$, respectively. $A_i$ and $B_i$ are constant matrices.

The leader’s dynamics function is defined as

$x_0(k+1) = A_0 x_0(k)$, (2)

where $x_0(k)$ denotes the state of the leader.

The local neighbor consensus tracking error is defined as

$e_i(k) = \sum_{j \in N_i} a_{ij}\big(x_i(k) - x_j(k)\big) + b_i\big(x_i(k) - x_0(k)\big)$, (3)

where $a_{ij}$ denotes the adjacency element: $a_{ij} > 0$ if agent $i$ can communicate with agent $j$; otherwise, $a_{ij} = 0$. $b_i$ denotes the pinning gain: $b_i > 0$ if agent $i$ can communicate with the leader; otherwise, $b_i = 0$. We assume that at least one agent can receive information from the leader.

Under the event-triggered scheme, the discrete-time MASs transmit system data only when an event is triggered. We define the sequence of event-triggered instants of agent $i$ as $\{k_t^i\}$, $t = 0, 1, 2, \ldots$. At the event-triggered instant $k_t^i$ of agent $i$, the consensus error of agent $i$ is denoted as $e_i(k_t^i)$.

The event-triggered error is defined as

$\delta_i(k) = e_i(k_t^i) - e_i(k), \quad k \in [k_t^i, k_{t+1}^i)$, (4)

which is the difference between the consensus tracking error at the latest event-triggered instant and the current local neighbor consensus tracking error.
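To make the two error signals concrete, the following is a minimal numerical sketch; the scalar agent states, the adjacency matrix, and the function names are our illustrative choices, not the paper's notation.

```python
import numpy as np

def consensus_error(i, x, x0, A, b):
    """Local neighbor consensus tracking error of agent i:
    e_i = sum_j a_ij (x_i - x_j) + b_i (x_i - x0)."""
    e = b[i] * (x[i] - x0)
    for j in range(len(x)):
        e += A[i, j] * (x[i] - x[j])
    return e

def event_error(e_held, e_current):
    """Event-triggered error: error sampled at the last trigger
    minus the current error."""
    return e_held - e_current

# Two agents with scalar states; agent 0 is pinned to the leader.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])      # adjacency: the two agents are neighbors
b = np.array([1.0, 0.0])        # only agent 0 hears the leader
x = np.array([2.0, 0.0])        # follower states
x0 = 1.0                        # leader state

e0 = consensus_error(0, x, x0, A, b)          # (2-1) + (2-0) = 3
delta = event_error(e0, consensus_error(0, x, x0, A, b))
print(e0, delta)
```

At a triggering instant the held sample equals the current error, so the event-triggered error is zero, as the output shows.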

Then, the consensus problem of the discrete-time MASs is to find the distributed feedback control law $u_i(e_i(k_t^i))$, which is held between triggering instants by a zero-order hold (ZOH) device, so that

$u_i(k) = u_i(k_t^i), \quad k \in [k_t^i, k_{t+1}^i)$. (5)

Then, the local cost function is defined as

$J_i(e_i(k)) = \sum_{l=k}^{\infty} \gamma^{l-k} U_i\big(e_i(l), u_i(l), u_{(i)}(l)\big)$, (6)

with the utility function

$U_i\big(e_i(k), u_i(k), u_{(i)}(k)\big) = e_i^T(k) Q_{ii} e_i(k) + u_i^T(k) R_{ii} u_i(k) + \sum_{j \in N_i} u_j^T(k) R_{ij} u_j(k)$, (7)

where
(i) $U_i\big(e_i(k), u_i(k), u_{(i)}(k)\big)$ is the utility function for agent $i$;
(ii) $u_{(i)}(k)$ denotes the controls of the neighbors of agent $i$;
(iii) $Q_{ii}$, $R_{ii}$, and $R_{ij}$ are positive definite symmetric weighting matrices;
(iv) $\gamma$ is the discount factor, $0 < \gamma \le 1$.

According to Bellman’s principle of optimality, the optimal local cost function can be defined as

$J_i^*(e_i(k)) = \min_{u_i(k)} \big\{ U_i\big(e_i(k), u_i(k), u_{(i)}(k)\big) + \gamma J_i^*(e_i(k+1)) \big\}$, (8)

which is also called the discrete-time HJB equation.

The optimal distributed control law is defined as

$u_i^*(k) = \arg\min_{u_i(k)} \big\{ U_i\big(e_i(k), u_i(k), u_{(i)}(k)\big) + \gamma J_i^*(e_i(k+1)) \big\}$. (9)
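As a toy illustration of the Bellman recursion behind the discrete-time HJB equation, the sketch below runs value iteration on a scalar linear-quadratic version of the problem; the dynamics coefficients, cost weights, discount factor, and the quadratic value ansatz are all our illustrative simplifications, not the paper's setup.

```python
# Scalar error dynamics e(k+1) = a*e(k) + b*u(k), stage cost q*e^2 + r*u^2,
# discount gamma.  With a quadratic value guess J(e) = p*e^2, the Bellman
# backup reduces to a scalar Riccati-type iteration on p.
a, b, q, r, gamma = 1.2, 1.0, 1.0, 1.0, 0.95

p = 0.0
for _ in range(200):
    # greedy gain minimizing q e^2 + r u^2 + gamma p (a e + b u)^2 over u=-k e
    kgain = gamma * p * a * b / (r + gamma * p * b * b)
    # Bellman backup of the quadratic value coefficient
    p = q + r * kgain**2 + gamma * p * (a - b * kgain)**2

# the fixed point satisfies the discounted algebraic Riccati equation
resid = p - (q + gamma * p * a * a
             - (gamma * p * a * b)**2 / (r + gamma * p * b * b))
print(round(p, 4), abs(resid) < 1e-8)
```

The iteration converges quickly because the discounted closed loop is a contraction; the greedy control $u^* = -k\,e$ recovered at the fixed point is the scalar analogue of the minimization in the optimal control law above.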

3. Stability Analysis

Assumption 1 (see [32]). There exist a function $V: \mathbb{R}^n \to \mathbb{R}_{\ge 0}$, class $\mathcal{K}_\infty$ functions $\alpha_1$, $\alpha_2$, and $\alpha_3$, and a class $\mathcal{K}$ function $\sigma$ such that

$\alpha_1(\|e_i(k)\|) \le V(e_i(k)) \le \alpha_2(\|e_i(k)\|)$, (10)

$V(e_i(k+1)) - V(e_i(k)) \le -\alpha_3(\|e_i(k)\|) + \sigma(\|\delta_i(k)\|)$. (11)

If (10) and (11) are satisfied, the function $V$ is called an ISS-Lyapunov function for the discrete-time MAS.
Let us consider the situation $k = k_t^i$, which means that the ET condition is satisfied at the sampling instant $k$. In this situation, it is obvious that $\delta_i(k) = 0$. Substituting (1) and (2) into (3) gives the dynamics of the consensus tracking error, and substituting the optimal control law (9) into (14) yields the closed-loop error dynamics. Therefore, the ET condition can be rewritten as (17), which is required to hold for every $k \in [k_t^i, k_{t+1}^i)$.
To better illustrate the control process, a flowchart is displayed in Figure 1. The transmitted data and control policies are updated at the triggering instant $k_t^i$, and the event-triggered error is reset to zero. Once the event-triggered condition is satisfied, the current instant becomes the next triggering instant $k_{t+1}^i$ and the system data are transmitted; otherwise, the transmitted data and control policies remain unchanged.
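The hold-versus-update logic described above can be sketched as follows; the scalar plant, feedback gain, and absolute threshold are illustrative stand-ins for the paper's ET condition, not its actual design.

```python
# Between triggers the zero-order hold keeps the last control; when the
# event-triggered error exceeds the threshold, the data are sampled and
# the control is recomputed.  All numbers below are illustrative.
a, b, kgain, threshold = 1.05, 1.0, 0.6, 0.05

e = 1.0          # consensus tracking error
e_held = e       # error sample kept since the last trigger
u = -kgain * e_held
triggers = []

for k in range(30):
    if abs(e_held - e) > threshold:     # event fires: bound exceeded
        e_held = e                      # sample: event error resets to zero
        u = -kgain * e_held             # update the control policy
        triggers.append(k)
    e = a * e + b * u                   # plant evolves at every step

print(len(triggers), round(e, 3))
```

The point of the sketch is that the plant advances at every step, while sampling and control updates happen only at the recorded trigger instants.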
Next, we prove that the discrete-time MAS is stable under the proposed event-triggered conditions.

Theorem 1. Consider a discrete-time MAS satisfying Assumption 1. If condition (18) holds for every $k \in [k_t^i, k_{t+1}^i)$, then the system is asymptotically stable.

Proof. According to (9) and (11), we obtain (19). Applying (18) to (19) gives (20), and solving (20) yields (21). We then define the function in (22); according to (18), inequality (23) holds for every $k \in [k_t^i, k_{t+1}^i)$.
From (22), we obtain (24), and applying (9) to (24) gives (25). Since (23) and (25) hold, the stability of the discrete-time MAS is proved.

Remark 1. We design an event-triggered condition for each agent in the discrete-time MASs, and the stability of the resulting closed-loop systems is proved.

4. Event-Triggered Controller Design

In this section, considering the good approximation capability of neural networks (NNs) [33, 34], an actor-critic NN structure is introduced to approximate the local cost function $J_i$ and the distributed feedback control law $u_i$. The actor-critic NNs take the form

$F(X) = \omega^T \sigma(\nu^T X)$, (26)

where $X$ denotes the input data, $\sigma(\cdot)$ denotes the activation function, and $\nu$ and $\omega$ denote the weight matrices of the NNs.

4.1. Formulation of the Critic Networks

The critic NN approximates the local cost function as

$\hat{J}_i(k) = \omega_{c_i}^T \sigma_c\big(\nu_{c_i}^T X_{c_i}(k)\big)$, (27)

where $X_{c_i}(k)$ denotes the input vector of the critic NN, constituted by $e_i(k)$, $u_i(k)$, and $u_{(i)}(k)$; $\sigma_c(\cdot)$ denotes the activation function of the critic NN; and $\nu_{c_i}$ and $\omega_{c_i}$ are the weight matrices of the critic NN.

We define the difference between the current cost estimate and the Bellman target as the error function of the critic NN:

$e_{c_i}(k) = \hat{J}_i(k) - \big(U_i(k) + \gamma \hat{J}_i(k+1)\big)$. (28)

Then, the loss function for the critic NN is given as

$E_{c_i}(k) = \frac{1}{2} e_{c_i}^2(k)$. (29)

Our objective is to minimize the loss function during the critic NN training.

The weights of the critic NN are updated according to the gradient-based rule

$\omega_{c_i}(k+1) = \omega_{c_i}(k) - \alpha_c \dfrac{\partial E_{c_i}(k)}{\partial \omega_{c_i}(k)}$, (30)

where $\alpha_c$ denotes the learning rate of the critic NN.
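A minimal sketch of this gradient-based critic update, assuming (as a simplification of the two-layer network) a critic that is linear in fixed features; the feature map, learning rate, and toy transition are our illustrative choices.

```python
import numpy as np

def phi(e):
    """Illustrative fixed features of the tracking error."""
    return np.array([e, e * e])

def critic_update(w, e_k, e_k1, U_k, gamma, alpha_c):
    """One gradient step on E = 0.5 * ec^2, with TD error
    ec = J_hat(e_k) - (U_k + gamma * J_hat(e_k1))."""
    ec = w @ phi(e_k) - (U_k + gamma * w @ phi(e_k1))
    grad = ec * phi(e_k)          # dE/dw, treating the target as fixed
    return w - alpha_c * grad, ec

w = np.zeros(2)
gamma, alpha_c = 0.95, 0.1
# repeated updates on one toy transition (e_k=1 -> e_k1=0.5, stage cost 1)
for _ in range(500):
    w, ec = critic_update(w, 1.0, 0.5, 1.0, gamma, alpha_c)

print(np.round(w, 3), abs(ec) < 1e-6)
```

At convergence the TD error vanishes, i.e. the learned weights satisfy the one-step Bellman consistency on the training transition.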

4.2. Formulation of the Actor Networks

The actor NN approximates the distributed control law $u_i(k)$, which can be formulated as

$\hat{u}_i(k) = \omega_{a_i}^T \sigma_a\big(\nu_{a_i}^T X_{a_i}(k)\big)$, (31)

where $X_{a_i}(k)$ is the input vector of the actor NN, $\sigma_a(\cdot)$ is the activation function of the actor NN, and $\nu_{a_i}$ and $\omega_{a_i}$ are the weight matrices of the actor NN.

We define the difference between the current local cost value and the target cost value as the error function, which is given as

$e_{a_i}(k) = \hat{J}_i(k) - J_i^d(k)$. (32)

In this paper, the target cost value is defined as $J_i^d(k) = 0$.

Then, the loss function for the actor NN is given as

$E_{a_i}(k) = \frac{1}{2} e_{a_i}^T(k) e_{a_i}(k)$. (33)

Our objective is to minimize the loss function during the actor NN training.

The weights of the actor NN are updated according to the gradient-based rule

$\omega_{a_i}(k+1) = \omega_{a_i}(k) - \beta_a \dfrac{\partial E_{a_i}(k)}{\partial \omega_{a_i}(k)}$, (34)

where the gradient is computed by the chain rule through the critic NN and $\beta_a$ is the learning rate of the actor NN.
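A minimal sketch of the actor update, with the actor reduced to a scalar feedback gain and the critic to a known quadratic value function; all coefficients are illustrative, and the chain rule through the plant and critic is written out explicitly.

```python
# Scalar plant e(k+1) = a*e + b*u with policy u = -k*e; fixed quadratic
# critic J_hat(e1) = p*e1^2; target cost 0, so the actor minimizes
# E = 0.5 * J_hat(e1)^2 by gradient descent on the gain k.
a, b, p, beta_a = 1.2, 1.0, 2.0, 0.05

def actor_grad(kgain, e):
    e1 = a * e + b * (-kgain * e)        # next error under the policy
    J = p * e1 * e1                      # critic value of the next error
    dJ_dk = 2.0 * p * e1 * (-b * e)      # chain rule through e1
    return J * dJ_dk                     # dE/dk with E = 0.5 * J^2

kgain = 0.0
for _ in range(2000):
    kgain -= beta_a * actor_grad(kgain, e=1.0)

# the minimizer drives the next error to zero: a - b*k = 0, i.e. k = a/b
print(round(kgain, 4))
```

The gain climbs monotonically toward the deadbeat value $a/b$, which drives the predicted cost, and hence the actor loss, to zero.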

The procedure of the NN-based event-triggered optimal consensus control algorithm for discrete-time MASs is shown in Algorithm 1.

Initialization:
 Give the computation precision $\varepsilon$ and the initial state $x_i(0)$ for each agent $i$;
 Give the initial state $x_0(0)$ for the leader;
 Select the learning rates $\alpha_c$ and $\beta_a$;
 Give the positive definite matrices $Q_{ii}$, $R_{ii}$, and $R_{ij}$;
 Initialize the event-triggered error $\delta_i(0) = 0$;
 Select the positive constant in the event-triggered condition (17);
Iteration:
 Let the iteration index $k = 0$;
repeat:
 Calculate the tracking error $e_i(k)$ and the event-triggered error $\delta_i(k)$;
  IF the event-triggered condition is satisfied:
   Reset the event-triggered error $\delta_i(k) = 0$;
   Set the event-triggered instant $k_t^i = k$;
   Compute the control law $\hat{u}_i(k)$;
   Compute the local cost function $\hat{J}_i(k)$;
   Compute the next state of agent $i$ and the next state of the leader;
   Calculate the next tracking error $e_i(k+1)$;
   Compute the control law $\hat{u}_i(k+1)$;
   Compute the local cost function $\hat{J}_i(k+1)$;
   Update the weight matrix $\omega_{c_i}$ of the critic NN by (30);
   Update the weight matrix $\omega_{a_i}$ of the actor NN by (34);
  ELSE:
   Hold the control law $\hat{u}_i(k) = \hat{u}_i(k_t^i)$;
   Compute the next state of agent $i$ and the next state of the leader
   according to the model NN;
  $k = k + 1$;
until $\|e_i(k)\| \le \varepsilon$
End
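The IF/ELSE bookkeeping of Algorithm 1 can be sketched for one agent as follows; the scalar error dynamics, feedback gain, relative trigger threshold, and precision are illustrative choices, and the critic/actor updates are abstracted into a counter so the triggering logic stays visible.

```python
# Sample data and update the control only when the event-triggered
# condition fires; otherwise the zero-order hold keeps the last control.
# Illustrative scalar error dynamics: e(k+1) = e(k) + u(k).
kgain, sigma, precision = 0.2, 0.5, 1e-3

e = 1.0                 # initial tracking error (follower minus leader)
e_held = e              # last transmitted error sample
u = -kgain * e_held     # initial control
updates, k = 0, 0

while abs(e) > precision and k < 1000:
    if abs(e_held - e) >= sigma * abs(e):   # event-triggered condition
        e_held = e                          # transmit data; event error resets
        u = -kgain * e_held                 # recompute control (critic/actor
        updates += 1                        # weights would be updated here)
    # ELSE branch: hold transmitted data and control unchanged
    e = e + u                               # error evolves at every step
    k += 1

print(k, updates)
```

The loop terminates once the tracking error reaches the precision, with far fewer data transmissions (`updates`) than time steps (`k`), which is the resource saving the algorithm is after.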

Theorem 2. Consider a discrete-time MAS whose critic NN and actor NN weights are updated according to (30) and (34), respectively, under condition (17). Then the state $e_i(k)$, the critic NN weight estimation error $\tilde{\omega}_{c_i}(k)$, and the actor NN weight estimation error $\tilde{\omega}_{a_i}(k)$ of the closed-loop system are uniformly ultimately bounded (UUB).

Proof. Case 1: the ET condition is satisfied at iteration index $k$. Define the Lyapunov function for agent $i$ as in (35), which consists of three terms associated with the system state, the critic NN weight estimation error, and the actor NN weight estimation error, respectively. The difference between $V(k+1)$ and $V(k)$ is evaluated term by term. The difference of the first term is given by (36). The difference of the second term is given by (37); according to the update rule (30) for the critic NN weight matrix, we obtain (38), and substituting (38) into (37) yields (39). The difference of the third term is given by (40); according to the update rule (34) for the actor NN weight matrix, we obtain (41), and substituting (41) into (40) yields (42). Combining (36), (39), and (42), the difference between $V(k+1)$ and $V(k)$ is given by (43). If either of the two stated conditions holds, this difference is negative. This means the system states and the weight estimation errors of the critic NN and actor NN are UUB.
Case 2: the ET condition is not satisfied at iteration index $k$. Consider the same Lyapunov function (35) as in Case 1. The difference of the first term is evaluated as before, while the weight matrices of the critic NN and actor NN are not updated, so the differences of the second and third terms are zero. Combining the three terms, the difference between $V(k+1)$ and $V(k)$ reduces to the state term. If the stated condition holds, this difference is negative. Hence, when the ET condition is not satisfied at time index $k$, the system states and the weight estimation errors of the critic NN and actor NN are UUB.

5. Simulation Analysis

To test the effectiveness of the proposed algorithm, we apply the proposed algorithm in a numerical example. Consider a discrete-time leader-follower MAS consisting of 4 agents with a network topology, as shown in Figure 2. In the topology, agent 0 denotes the leader and the followers are labeled as agent 1 to agent 4. The adjacency elements , , and are set to 1. The other adjacency elements are set to 0. In this numerical example, only agent 1 can communicate with the leader, which means and . The weight matrices of the utility function are selected as , and .

The dynamics matrix for the leader is set to . The dynamics matrices for the followers are set to , , , , and .

The parameters for the critic NN and the actor NN are set to , and , , and . , , , and are the activation functions of the critic NNs. The activation functions of the actor NNs are set to . , , , , and are chosen as the initial states for the leader and the follower agents in the system. We set .

The tracking trajectory of every agent in the discrete-time MAS is shown in Figure 3, from which we can observe that all the agents reach the same state as the leader, i.e., they achieve synchronization. The driving errors of the agents are shown in Figure 4. The agents’ driving errors are not updated at every instant $k$; that is, each agent is driven only when its ET condition is satisfied. Figure 5 compares the event-triggered errors and the thresholds for every agent. We can observe that the event-triggered errors remain smaller than the thresholds during the tracking process; the data are sampled only when an event-triggered error reaches its threshold, so the algorithm samples less data and saves computing resources. Figure 6 compares the number of data transmissions required by the time-triggered and event-triggered ADP algorithms for every agent. The number required by the event-triggered algorithm is much smaller than that required by the time-triggered algorithm.

6. Conclusion

An event-triggered optimal consensus tracking control algorithm based on the ADP structure has been proposed in this paper. To save communication and computation resources, the event-triggered scheme is introduced into the optimal consensus tracking control algorithm, and neural networks are used to simplify its implementation. It is proved that the discrete-time MASs are stable under the proposed algorithm and that the weight estimation errors of the NNs are UUB. The simulation results illustrate the effectiveness of the proposed method.

Data Availability

All data included in this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61803285 and 62001332) and National Defense Pre-Research Foundation of China (H04W201018).