Abstract
This paper addresses the problem of identifying and filtering a class of continuous-time nonlinear dynamic games (nonlinear differential games) subject to additive, undesired deterministic perturbations. Moreover, the mathematical model of this class is completely unknown, with the exception of the control actions of each player, and although the deterministic noises are known, their power (or their effect) is not. Therefore, two differential neural networks are designed in order to obtain a feedback (perfect state) information pattern for this class of games. In this way, the stability conditions for two state identification errors and for a filtering error are established, the upper bounds of these errors are obtained, and two new learning laws for each neural network are suggested. Finally, an illustrative example shows the applicability of this approach.
1. Introduction
1.1. Preliminaries and Motivation
A research field that has developed widely in recent years is the design of controllers for groups of systems involved in a conflicting interaction, that is to say, when the objective of each system involved differs and when the information each system has about the interaction may also differ (e.g., [1]).
More formally, this special class of system groups can be represented by means of dynamic noncooperative game theory, where the decision-making process (or interaction) is called a game and each system involved in it is called a player [2].
From the viewpoint of control theory, a dynamic game is controlled by obtaining its equilibrium solution, and in order to do so, a (mathematical) model of this dynamic game is needed. Hence, several publications on dynamic games (and particularly on continuous-time dynamic games) rely on complete knowledge of the model that describes the game dynamics (see, e.g., [3, 4]). Nevertheless, having a model (or even a partial model) of a continuous-time dynamic game is not always possible.
On the other hand, the equilibrium solution of a dynamic game is also based on the information structure that every player has or, in other words, on the available information that each player can use in the control strategy. For example, one can obtain an open-loop Nash equilibrium solution by using the maximum principle technique or a feedback Nash equilibrium solution by utilizing the dynamic programming method (see [2]).
According to the above, the aim of an identification process in terms of a dynamic game should be the modeling of such a game and the obtaining of its information structure, that is, guaranteeing that the control strategy of every player yields an equilibrium solution for the game despite its dynamic uncertainties. In this way, the works of [5, 6] obtain feedback control strategies for differential games modeled through norm-bounded uncertainties; the studies of [7–9] achieve equilibrium solutions for several classes of differential games under a multimodel approach; the analyses of [10, 11] present adaptive algorithms for determining equilibrium solutions without complete knowledge of the game dynamics; and the work of [12] proposes obtaining a suboptimal equilibrium solution where a differential game is approximated by a fuzzy model.
However, an identification process could be deficient if there exist undesired perturbations in the game dynamics, that is to say, if the obtained information structure of the game is corrupted by (deterministic) noises. Thereby, some deterministic filtering works have been published in order to solve this issue. For example, [13] describes the concept of adaptive noise canceling for the estimation of signals corrupted by additive perturbations; [14] introduces a finite horizon robust filtering method that provides a guaranteed bound for the estimation error in the presence of both parameter uncertainty and a known input signal; [15] presents the filtering of the states of a time-varying uncertain linear system with deterministic disturbances of limited power; and [16] derives an optimal filtering formula for linear time-varying discrete systems with unknown inputs.
Therefore, according to these preliminary works, the motivation of this paper is to solve the problem of identifying and filtering a class of nonlinear differential games with additive deterministic perturbations, where the mathematical model of this class and the effect (or power) of the noises are completely unknown.
1.2. Main Contribution
Since the introduction of continuous-time recurrent neural networks (see [17]), the so-called differential neural networks have proved to be an excellent tool for the identification, state estimation, and control of several systems, including the aforementioned continuous-time dynamic games.
For example, in [18], differential neural networks are used for the identification of dynamical systems; references [19–21] design differential neural network observers for adaptive state estimation; the works of [22, 23] propose neural network controllers for several applications; and [24] shows a compendium of differential neural networks for identification, state estimation, and control of nonlinear systems. Also, [25] treats the state estimation problem for affine nonlinear differential games using a differential neural network observer, and in [26], a nearly optimal Nash equilibrium for classes of deterministic and stochastic nonlinear differential games is obtained using differential neural networks.
Moreover, regarding recurrent neural networks and deterministic filtering, the work of [27] develops a recurrent neural network for robust optimal filter design, and in [28], algorithms are presented for achieving adaptive filtering in nonlinear dynamic systems approximated by neural networks.
Nevertheless, the idea of using differential neural networks for identification and filtering of a class of continuous-time nonlinear dynamic games is a new approach that, as far as the authors know, has not been treated before.
Hence, the main contribution of this paper is the proof that it is possible to identify and to filter the states of a certain class of nonlinear differential games through the design of two differential neural networks. Moreover, this filtered identification process generates a feedback (perfect state) information pattern for the mentioned class of games.
More specifically, although the structure of this class is known, its mathematical model is not; that is to say, the only available information about the nonlinear differential game consists of the control actions of each player. Using only this available information, a first differential neural network will be designed to identify the nonlinear dynamic game together with the undesired perturbations; similarly, a second differential neural network will identify the effect of these additive noises on the dynamics of the nonlinear differential game (the perturbations themselves are known, but their power is not).
According to the above, one of these two differential neural networks will carry out the complete identification of the class of nonlinear differential games, and, notably, the filtering (or canceling) of the undesired perturbations will be achieved by subtracting the state estimates of the two differential neural networks.
Finally, it is important to emphasize that these two differential neural networks have the structure of multilayer perceptrons (see [23, 24, 29]) and that the learning laws for their synaptic weights are derived by means of Lyapunov's second method.
2. Class of Nonlinear Differential Games
Consider the following continuous-time nonlinear dynamic game given by where ; the index denotes the number of players; is the state vector of the game; denotes the admissible control action vector of each player; the mappings and are unknown nonlinear functions; denotes a known deterministic perturbation vector; and is an unknown constant matrix of adequate dimensions.
Similarly, consider now the following set of cost functions (or performance indexes) associated with each player and given by where is well-defined for the player.
Moreover, the information structure of each player (denoted by ) has a standard feedback (perfect state) pattern; that is to say, and a permissible control strategy (or control policy) of the player is defined by the set of functions satisfying Nevertheless, the class of nonlinear differential games (1)–(4) is not completely described if the following assumptions are not fulfilled.
Assumption 1. The admissible control actions are measurable and bounded for all time ; that is, where are known constants.
Assumption 2. In order to guarantee the global existence and uniqueness of the solution of (1), the unknown nonlinear functions and satisfy the Lipschitz condition; that is, there exist , constants such that the equations are fulfilled for all . Moreover, and satisfy the following linear growth condition: where are known constants.
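Although the drift functions of the game are unknown here, the Lipschitz requirement of Assumption 2 is easy to probe numerically for any candidate function. A minimal sketch, where the test function (`np.tanh`) and every constant are our own illustrative choices and not part of the game model: a Monte-Carlo lower estimate of a Lipschitz constant obtained by sampling random pairs of points.

```python
import numpy as np

def estimate_lipschitz(f, dim, n_pairs=10000, box=5.0, seed=0):
    """Monte-Carlo lower estimate of a Lipschitz constant of f on [-box, box]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-box, box, size=(n_pairs, dim))
    y = rng.uniform(-box, box, size=(n_pairs, dim))
    num = np.linalg.norm(f(x) - f(y), axis=1)
    den = np.linalg.norm(x - y, axis=1)
    return float(np.max(num / den))

# np.tanh acts component-wise and is globally 1-Lipschitz, so the
# estimate can never exceed 1 (up to floating-point rounding).
L = estimate_lipschitz(np.tanh, dim=2)
assert 0.0 < L <= 1.0 + 1e-12
```

Such a sampled estimate is only a lower bound on the true constant, but it is a quick sanity check before assuming the Lipschitz condition holds on a region of interest.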
Assumption 3. Under the permissible control strategies , the (feedback) class of nonlinear differential games given by (1)–(4) is quadratically stable; that is, there exists a Lyapunov (maybe unknown) function such that the inequalities are satisfied for any known , constants.
Assumption 4. The deterministic perturbation is measurable and bounded for all time ; that is, where is a known constant.
3. Problem Statement
Let the class of continuous-time nonlinear dynamic games (1)–(4) be such that Assumptions 1 to 4 are fulfilled. Then, if one makes the following change of variables: it is clear that the expected or uncorrupted vector state of the class of nonlinear differential games (1)–(4) can be defined as
Thereby, and in view of the fact that , , and are unknown, the tackled problem in this paper is to obtain a feedback (perfect state) information pattern , given that satisfies the following filtering (or noise canceling) equation: and where are the state estimates of differential equations (10) and (11), respectively.
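At the level of state estimates, the noise canceling equation (13) reduces to a simple subtraction of the two identifiers' outputs. A toy numerical sketch, with hypothetical estimate values of our own choosing (`x1_hat` for the perturbed-game identifier, `x2_hat` for the noise-effect identifier):

```python
import numpy as np

# Hypothetical state estimates produced by the two identifiers (names are ours):
# x1_hat estimates the perturbed game state; x2_hat estimates the accumulated
# effect of the deterministic noise on the dynamics.
x1_hat = np.array([1.3, -0.7])
x2_hat = np.array([0.3, -0.2])

# Filtered (noise-cancelled) state estimate: the difference of the two.
x_filtered = x1_hat - x2_hat
assert np.allclose(x_filtered, [1.0, -0.5])
```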
4. Differential Neural Networks Design
In order to solve the problem described above, consider a first differential neural network given by the following equation: where ; denotes the number of players; is the vector state of the neural network; the matrices , , , and are synaptic weights of the neural network; and and are activation functions of the neural network.
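To make the identifier concrete, the sketch below Euler-integrates a differential neural network of the generic multilayer-perceptron shape reported in [23, 24], roughly x_hat' = A x_hat + W1 sigma(x_hat) + sum_j W2_j phi(x_hat) u_j. The exact structure of (14) is richer, and every matrix, control value, and step size below is an illustrative assumption of ours.

```python
import numpy as np

def sigmoid(x, a=1.0, b=1.0):
    # centered diagonal sigmoid activation; a and b shape the curve (illustrative)
    return a / (1.0 + np.exp(-b * x)) - a / 2.0

def dnn_step(x_hat, u_list, A, W1, W2_list, dt):
    # one explicit-Euler step of the generic identifier dynamics
    dx = A @ x_hat + W1 @ sigmoid(x_hat)
    for W2, u in zip(W2_list, u_list):
        dx = dx + W2 @ (sigmoid(x_hat) * u)   # phi taken as a sigmoid as well
    return x_hat + dt * dx

A = -np.eye(2)                                   # Hurwitz, as Assumption 8 asks
W1 = 0.1 * np.eye(2)
W2_list = [0.05 * np.eye(2), 0.05 * np.eye(2)]   # one weight matrix per player
x_hat = np.array([0.5, -0.5])                    # arbitrary initial estimate
for _ in range(100):                             # 1 second of simulated time
    x_hat = dnn_step(x_hat, [1.0, -1.0], A, W1, W2_list, dt=0.01)
assert np.all(np.isfinite(x_hat))
```

In practice the weight matrices would be updated online by a learning law rather than held fixed as here.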
Remark 5. For simplicity (and from now on) and .
According to [29], the differential neural network (14) is classified as a multilayer perceptron, and its structure was initially taken from [23, 24]. Also, this differential neural network only uses sigmoid activation functions; that is, and have a diagonal structure with elements where and are known constants that shape the geometry of the sigmoid function.
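A single sigmoid entry of such a diagonal activation map can be sketched as follows; the particular values a = 2 and b = 1.5 are ours and merely illustrate how the two constants set the bound of the function and its maximum slope (which equals ab/4 at the origin):

```python
import numpy as np

def sigma(x, a=2.0, b=1.5):
    # one diagonal entry of the activation map; a bounds it, b sets its steepness
    return a / (1.0 + np.exp(-b * x))

x = np.linspace(-10.0, 10.0, 1001)
y = sigma(x)
assert y.min() > 0.0 and y.max() < 2.0      # bounded between 0 and a

# maximum slope occurs at x = 0 and equals a*b/4 = 0.75 here
slope0 = (sigma(1e-6) - sigma(-1e-6)) / 2e-6
assert abs(slope0 - 0.75) < 1e-4
```

Boundedness of the slope is what makes the activation functions Lipschitz, which the error bounds of Section 4 rely on.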
Thus, in view of the fact that and are sigmoid activation functions, they are bounded and they satisfy the following equations: where , , , , , , , , , , and are known constants of adequate dimensions, and
Nevertheless, there are some design conditions that this first differential neural network needs to satisfy.
Assumption 6. According to Assumption 2, the approximation or residual error of (14) corresponding to the unidentified dynamics of (10) and that is given by where , , , and are initial synaptic weights (when ), is bounded and satisfies the following inequality: where is a known constant and , .
Remark 7. The constants and in (22) are not known a priori because of the fact that they depend on the performance of the differential neural network (14); that is to say, the residual error (21) will depend on the number of neurons used in (14) and on its parameters, and therefore and will depend on this too (see [24]).
Assumption 8. There exist values of , , , , , , , , , , and , such that they provide a solution to the algebraic Riccati equation as follows: where where such that is Hurwitz and is known.
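Assumption 8 asks for a solution of an algebraic Riccati equation with a Hurwitz system matrix. The paper's equation (23) carries additional terms coming from the activation bounds, but the generic continuous-time ARE, A'P + PA - P B R^{-1} B' P + Q = 0, can be sketched with the standard Hamiltonian eigenvector method; all matrices below are illustrative, and in practice `scipy.linalg.solve_continuous_are` is the robust choice.

```python
import numpy as np

def solve_care(A, B, Q, R):
    """Stabilizing solution of A'P + PA - P B R^-1 B' P + Q = 0
       via the Hamiltonian eigenvector method (a sketch, not a robust solver)."""
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    H = np.block([[A, -B @ Rinv @ B.T],
                  [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]            # basis of the stable invariant subspace
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    return (P + P.T) / 2.0               # symmetrize against rounding

A = np.array([[-1.0, 0.0], [0.0, -2.0]])  # Hurwitz, illustrative
B, Q, R = np.eye(2), np.eye(2), np.eye(2)
P = solve_care(A, B, Q, R)

# the Riccati residual should vanish and P should be positive definite
res = A.T @ P + P @ A - P @ B @ np.linalg.inv(R) @ B.T @ P + Q
assert np.allclose(res, 0.0, atol=1e-8)
assert np.all(np.linalg.eigvalsh(P) > 0)
```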
On the other hand, consider now a second differential neural network given by the following equation: where ; is the vector state; the matrices and are synaptic weights; and and are sigmoid activation functions.
Then, similar to and , the activation functions and satisfy the following inequalities: where , , , , and are known constants of adequate dimensions, and
Thereby, it is easy to confirm that if , if , if , and if , where , then the differential neural networks (14) and (27) coincide. Hence, two new assumptions (corresponding to Assumptions 6 and 8) must be satisfied for this second differential neural network.
Assumption 9. According (again) to Assumption 2, the approximation or residual error of (27) corresponding to the unidentified dynamics of (11) and that is given by where and are initial synaptic weights (when ), is bounded and satisfies the inequality where is a known constant and , .
Remark 10. Similar to Remark 7, the constants and in (31) are not known a priori because they depend on the performance of the differential neural network (27).
Assumption 11. If there exist values of , , , , and such that and values of and such that then they provide a solution to the algebraic Riccati equation (23), where
Remark 12. Notice that the equalities (32) and (33) were chosen in order to guarantee the same solution of the algebraic Riccati equation (23) in Assumptions 8 and 11. Also, notice that the first term of the right-hand side of (26) will always be greater than the first term of the right-hand side of (36); that is, .
5. Main Result on Identification and Filtering
According to the above, the main result on identification and filtering for the class of nonlinear differential games (1)–(4) deals with both the development of an adaptive learning law for the synaptic weights of the differential neural networks (14) and (27) and the inference of a maximum value of identification error for the dynamics (10) and (11).
Moreover, a maximum value of the filtering error between the uncorrupted states and the identified ones is established; namely, an error is defined by where is given by the noise canceling equation (13) and is given by the expected or uncorrupted vector state (12).
More formally, the main obtained result is described in the following three theorems.
Theorem 13. Let the class of continuous-time nonlinear dynamic games (1)–(4) be such that Assumptions 1 to 4 are fulfilled. Also, let the differential neural network (14) be such that Assumptions 6 and 8 are satisfied. If the synaptic weights of (14) are adjusted with the following learning law: where denotes the identification error, , , and , , , and are known symmetric and positive-definite constant matrices, then it is possible to obtain the next maximum value of identification error in average sense as follows:
Proof. Taking into account the residual error (21), the differential equation (10) can be expressed as Then, by substituting (14) and (41) into the derivative of (39) with respect to and by adding and subtracting the terms , , , and , it is easy to confirm that where and . Now, let the Lyapunov (energetic) candidate function be such that the inequalities (8) are fulfilled (see Assumption 3), and let be the derivative of (43) with respect to . Then, by substituting (42) into the second term of the right-hand side of (44) and by adding and subtracting the term , one may get Next, by analyzing the first five terms of the right-hand side of (45) with the following inequality: which is valid for any pair of matrices and for any constant matrix , where and are positive integers (see [24]), the following is obtained.
(i) Using (46) and (17) in the first term of the right-hand side of (45), then
(ii) Substituting (18) into the second term of the right-hand side of (45) and using (46) and (19), then
(iii) Using (46) and (17) in the third term of the right-hand side of (45), then
(iv) Substituting (18) into the fourth term of the right-hand side of (45) and using (46) and (19), then
(v) Using (46) and (22) in the fifth term of the right-hand side of (45), then
So, by substituting (47)–(51) into the right-hand side of (45) and by adding and subtracting the term , inequality (44) can be expressed as where the algebraic Riccati equation in the first term of the right-hand side of (52) is described in (23) and in (24)–(26), and Thereby, by equating (53) to zero, that is and by, respectively, solving for , , , and , the learning law given by (38) is obtained.
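The proof obtains the learning law (38) by zeroing each weight-error term, which produces gradient-type updates. A minimal numerical sketch of one update of the common form W' = -K P e sigma^T; the exact factors of (38) are richer, and every matrix and value below is our own illustration.

```python
import numpy as np

def weight_update(W, K, P, e, act, dt):
    # explicit-Euler step of a gradient-type law  W' = -K P e act^T
    return W - dt * (K @ P @ np.outer(e, act))

K = np.eye(2)                 # learning-gain matrix (illustrative)
P = np.eye(2)                 # stand-in for the Riccati solution (illustrative)
W = np.zeros((2, 2))          # initial synaptic weights
e = np.array([0.5, -0.5])     # identification error sample (ours)
act = np.array([0.3, 0.7])    # activation vector sigma(x_hat) (ours)
W = weight_update(W, K, P, e, act, dt=0.01)
```

With K = P = I, one step simply moves W by -dt times the outer product of the error and the activation vector, the usual direction that shrinks the identification error.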
Now, by choosing and by solving the algebraic Riccati equation (23), one may get Thus, by integrating both sides of (56) on the time interval , the following is obtained: and by dividing (57) by , it is easy to verify that Finally, by calculating the upper limit as , the maximum value of identification error in average sense is the one described in (40). This means that the identification error is bounded between zero and (40), and, therefore, this means that is stable in the sense of Lyapunov.
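The bound (40) is an average-sense quantity: the squared identification error is integrated over [0, T], divided by T, and the upper limit is taken as T grows. A small sketch of computing such a time average for a hypothetical decaying error profile (the profile and horizon are ours):

```python
import numpy as np

def average_error(e_norm_sq, t):
    # (1/T) * integral_0^T ||e(t)||^2 dt via the composite trapezoidal rule
    dt = t[1] - t[0]                       # uniform grid assumed
    integral = dt * (np.sum(e_norm_sq) - 0.5 * (e_norm_sq[0] + e_norm_sq[-1]))
    return integral / (t[-1] - t[0])

t = np.linspace(0.0, 25.0, 2501)           # the 25-second horizon of Example 16
e_norm_sq = np.exp(-t)                     # hypothetical decaying error profile
avg = average_error(e_norm_sq, t)

# exact value of this average is (1 - exp(-25)) / 25
assert abs(avg - (1.0 - np.exp(-25.0)) / 25.0) < 1e-3
```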
Theorem 14. Let the class of continuous-time nonlinear dynamic games (1)–(4) be such that Assumptions 1 to 4 are fulfilled. Also, let the differential neural network (27) be such that Assumptions 9 and 11 are satisfied. If the synaptic weights of (27) are adjusted with the following learning law: where denotes the identification error and and are known symmetric and positive-definite constant matrices, then it is possible to obtain the next maximum value of identification error in average sense as follows:
Proof. Taking into account the residual error (30), the differential equation (11) can be expressed as Then, by substituting (27) and (62) into the derivative of (60) with respect to , and by adding and subtracting the terms and , it is easy to confirm that where and . Now, let the Lyapunov (energetic) candidate function be such that the inequalities (8) are fulfilled (see Assumption 3), and let be the derivative of (64) with respect to . Then, by substituting (63) into the second term of the right-hand side of (65) and by adding and subtracting the term , one may get Next, by analyzing the first three terms of the right-hand side of (66) with the inequality (46), the following is obtained.
(i) Using (46) and (28) in the first term of the right-hand side of (66), then
(ii) Using (46) and (28) in the second term of the right-hand side of (66), then
(iii) Using (46) and (31) in the third term of the right-hand side of (66), then
So, by substituting (67)–(69) into the right-hand side of (66) and by adding and subtracting the term , inequality (65) can be expressed as where the algebraic Riccati equation in the first term of the right-hand side of (70) is described in (23) and in (34)–(36), and Thereby, by equating (71) and (72) to zero () and by, respectively, solving for and , the learning law given by (59) is obtained. Now, by choosing and by solving the algebraic Riccati equation (23), one may get Thus, by integrating both sides of (74) on the time interval , the following is obtained: and by dividing (75) by , it is easy to verify that Finally, by calculating the upper limit as , the maximum value of identification error in average sense is the one described in (61).
Theorem 15. Let the class of continuous-time nonlinear dynamic games (1)–(4) be such that Assumptions 1 to 4 are fulfilled. Also, let the differential neural networks (14) and (27) be such that the Assumptions 6 and 8 and Assumptions 9 and 11 are satisfied. If the synaptic weights of (14) and (27) are, respectively, adjusted with the learning laws (38) and (59), then it is possible to obtain the following maximum value of filtering error in average sense: where is given by (37).
Proof. Consider the following Lyapunov (energetic) candidate function:
where is the solution of the algebraic Riccati equation (23) with , , and defined by (24)–(26) or (34)–(36). Then, by calculating the derivative of (79) with respect to and by considering the equations (37), (13), and (12), it can be verified that
and, taking into account (42), (63), and the proofs of Theorems 13 and 14, one may get
Thus, by integrating both sides of (81) on the time interval , the following is obtained:
and by dividing (82) by , it is easy to confirm that
By calculating the upper limit as , equation (83) can be expressed as
and, finally, it is clear that
(i) if , the maximum value of filtering error in average sense is the one described in (77);
(ii) if , the maximum value of filtering error in average sense is the one described in (78), given that the left-hand side of the inequality (84) cannot be negative.
Hence, the theorem is proved.
6. Illustrative Example
Example 16. Consider a 2-player nonlinear differential game given by subject to the change of variables (10) and (11) (), where the control actions of each player are The undesired deterministic perturbations are and denotes the Heaviside step function or unit step signal. Then, under Assumptions 1–4, 6, 8, 9 and 11, it is possible to obtain a filtered feedback (perfect state) information pattern .
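The perturbations in (87) are built from Heaviside steps, which keeps them measurable and bounded as Assumption 4 demands. A sketch with step amplitudes and switch times of our own choosing (not the values of the example):

```python
import numpy as np

def perturbation(t):
    # a bounded deterministic perturbation built from unit steps:
    # switches on at t = 5 s and off at t = 15 s (illustrative times)
    return 0.5 * np.heaviside(t - 5.0, 1.0) - 0.5 * np.heaviside(t - 15.0, 1.0)

t = np.linspace(0.0, 25.0, 251)
xi = perturbation(t)
assert np.all(np.abs(xi) <= 0.5)   # bounded, as Assumption 4 requires
```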
Remark 17. As mentioned before, the functions and , as well as the matrix , are unknown; however, in order to instantiate (85) and simulate it, the values of these “unknown” parameters are shown in the Appendix of this paper.
Thereby, consider now a first differential neural network given by
where
and the activation functions are
By proposing the values described in Assumption 8 as
and by choosing and , the solution of the algebraic Riccati equation (23) results in
Then, by applying the learning law (38) described in Theorem 13, the value of the identification error in average sense (40) on a time period of 25 seconds is
Remark 18. Even though is the value of on the time interval , it is not the global minimum value that can take, since the simulation time does not tend to infinity. Hence, finding this global minimum value would require an arbitrarily long simulation.
On the other hand, let a second differential neural network given by be such that the equalities (32) and (33) are fulfilled; that is, and the activation functions are By choosing and , the solution of the algebraic Riccati equation (23) results in the matrix (92), and by applying the learning law (59) described in Theorem 14, the value of the identification error in average sense (61) on a time period of 25 seconds is
Remark 19. As in Remark 18, finding the global minimum value of would require an arbitrarily long simulation.
Finally, taking into account the identification errors (93) and (97), the value of the filtering error in average sense (77)-(78) on a time period of 25 seconds is The simulation of this example was carried out on the MATLAB and Simulink platforms, and its results are shown in Figures 1–4.

[Figure 1: panels (a) and (b)]
[Figure 2: panels (a) and (b)]
[Figure 3: panels (a) and (b)]
[Figure 4: panels (a) and (b)]
As is seen in Figure 1, the differential neural network (88) can perform the identification of the continuous-time nonlinear dynamic game (85), where and (or, similarly, and ) denote the state variables with undesired perturbations and and indicate their state estimates.
On the other hand, according to Figure 2, the differential neural network (94) identifies the dynamics of the additive deterministic noises or, in other words, the dynamics of . Thus, and represent the state variables of the above differential equation and and are their state estimates.
In this way, Figure 3 shows the performance of the filtering process (13) in the nonlinear differential game (85); that is to say, it shows the comparison between the expected or uncorrupted state variables and and the filtered state estimates and .
Finally, Figure 4 also exhibits the described filtering process but comparing the real state variables and with filtered state estimates and .
7. Results Analysis and Discussion
Although the differential neural networks (14) and (27) can perform the identification of the differential equations (10) and (11), it is important to remember that their performance depends on the number of neurons used and on the proposition of all the constant values (or free parameters) that were described in Assumptions 8 and 11. Therefore, the values of and (at a fixed time) can change according to this fact.
In other words, because the differential neural networks are an approximation of a dynamic system (or game), there will always be a residual error that depends on the quality of this approximation.
Thus, in the particular case shown in Example 16, the design of (88) was made using only eight neurons, four for the layer of perceptrons without any relation to the players and two for the layer of perceptrons of each player. Similarly, for the design of (94) (which is subject to the equalities (32)-(33)), six neurons were used.
On the other hand, it is important to mention that (14) and (27) will operate properly only if Assumptions 1–4, 6, 8, 9 and 11 are satisfied; that is to say, there is no guarantee that these differential neural networks perform good identification and filtering processes if, for example, the class of nonlinear differential games (1)–(4) has a stochastic nature.
Finally, although this paper presents a new approach for identification and filtering of nonlinear dynamic games, it should be emphasized that there exist other techniques that might solve the problem treated here, for example, the cited publications in Section 1.
8. Conclusions and Future Work
According to the results of this paper, the differential neural networks (14) and (27) solve the problem of identifying and filtering the class of nonlinear differential games (1)–(4).
However, there is no guarantee that (14) and (27) identify and filter the states of (1)–(4) if Assumptions 1–4, 6, 8, 9 and 11 are not met.
Thereby, the proposed learning laws (38) and (59) yield the maximum values of identification error in average sense, (40) and (61), and consequently the maximum value of filtering error, (77)-(78).
Nevertheless, these errors depend on both the number of neurons used in the differential neural networks (14) and (27) and the proposed values applied in their design conditions.
On the other hand, the simulation results of the illustrative example show the effectiveness of (14) and (27) and verify the applicability of Theorems 13, 14, and 15.
Finally, regarding future work in this research field, one can analyze and discuss the use of (14) and (27) for obtaining equilibrium solutions in the class of nonlinear differential games (1)–(4), that is to say, the use of differential neural networks for controlling nonlinear differential games.
Appendix
The values of the “unknown” nonlinear functions , , and and the “unknown” constant matrix , used in (85) of Example 16, are shown as follows:
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.