Dynamic Programming and Hamilton–Jacobi–Bellman Equations on Time Scales

Zhu, Yingjun; Jia, Guangyan

doi:https://doi.org/10.1155/2020/7683082

Complexity

Volume 2020 | Article ID 7683082 | https://doi.org/10.1155/2020/7683082

Academic Editor: Guang Li

Received 07 Jun 2020

Revised 16 Oct 2020

Accepted 24 Oct 2020

Published 19 Nov 2020

Abstract

The Bellman optimality principle for stochastic dynamic systems on time scales is derived, which includes the continuous-time and discrete-time cases as special cases. At the same time, the Hamilton–Jacobi–Bellman (HJB) equation on time scales is obtained. Finally, an example is given to illustrate our main results.

1. Introduction

The stochastic control problem is to find an optimal control such that a cost functional associated with a stochastic system attains its minimum value. The method of dynamic programming is a powerful approach to solving stochastic optimal control problems. Dynamic programming is a well-established subject [1–4] for dealing with both continuous and discrete optimal control problems, and it has important practical applications in various fields [5, 6]. It is generally assumed that time in a dynamic system is either continuous or discrete. However, this cannot always be guaranteed. In reality, the time domain may be neither purely continuous nor purely discrete. Many processes evolve on a mixture of continuous and discrete time, on nonuniform discrete time, or on a union of disjoint time intervals, such as production and storage processes in economics, investment processes in finance, and the population dynamics of seasonal insects. A more complex time structure makes the control problem more difficult. How can this problem be handled?

Time scales were first introduced by Hilger [7] in 1988 in order to unify differential and difference equations in a general framework. This allows us to treat continuous and discrete analysis from a common point of view. Recently, time scale theory has been studied extensively [8–13]. It is well known that optimal control problems on time scales are an important field for both theory and applications. Since the calculus of variations on time scales was studied by Bohner [14], results on optimal control problems in the time scale setting and their applications have been growing rapidly. The existence of optimal controls for dynamic systems on time scales was discussed in [15–17]. Subsequently, the Pontryagin maximum principle on time scales, which specifies necessary conditions for optimality, was studied in several works [18, 19].

Dynamic programming for dynamic systems on time scales is not a simple matter of uniting the continuous-time and discrete-time cases, because time scales contain more complex time structures. Seiffertt et al. [20] studied approximate dynamic programming for dynamic systems in the isolated time scale setting. In addition, Bellman dynamic programming on general time scales for deterministic optimal control problems was considered in [21, 22]. However, only limited work [23, 24] has been done on stochastic problems, and it concerns the linear quadratic stochastic optimal control problem in the time scale setting. That is to say, the general stochastic optimal control problem on time scales remains open.

Motivated by all these significant works, the purpose of this paper is to study the method of dynamic programming for stochastic optimal control problems on time scales. As we know, the stochastic dynamic programming principle differs from the deterministic one, which reflects the stochastic nature of the optimal control problem. So, the method used in the deterministic case on time scales cannot be applied to the stochastic case directly. In order to overcome this difficulty, we first give a new form of the chain rule on time scales. Based on this idea, we obtain Itô's formula for stochastic processes on time scales. Second, we consider a family of optimal control problems with different initial times and states to establish the Bellman optimality principle in the time scale setting. Third, using Itô's formula and the Bellman optimality principle on time scales, we obtain the associated Hamilton–Jacobi–Bellman (HJB for short) equation on time scales, which is a nonlinear second-order partial differential equation involving expectation. If the HJB equation is solvable, then we can obtain an optimal feedback control. Our work enriches the dynamic programming problem by providing a more general time framework and makes dynamic programming theory a powerful tool for tackling optimal control problems on complex time domains.

The organization of this paper is as follows. In Section 2, we give some preliminaries on time scale theory. Section 3 focuses on the Bellman optimality principle and the HJB equation on time scales. By introducing a new symbol, we present Itô's formula in a new form. On this basis, we obtain the main results. Finally, an illustrative example is given in Section 4 to show the effectiveness of the proposed results.

2. Preliminaries

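A time scale $\mathbb{T}$ is a nonempty closed subset of the real number set $\mathbb{R}$, and we denote $[a,b]_{\mathbb{T}} = [a,b] \cap \mathbb{T}$. In this paper, we always suppose $\mathbb{T}$ is bounded. The forward jump operator $\sigma$ and backward jump operator $\rho$ are, respectively, defined by

$$\sigma(t) = \inf\{s \in \mathbb{T} : s > t\}, \qquad \rho(t) = \sup\{s \in \mathbb{T} : s < t\}, \tag{1}$$

supplemented by $\inf\emptyset = \sup\mathbb{T}$ and $\sup\emptyset = \inf\mathbb{T}$, where $\emptyset$ denotes the empty set. If $\sigma(t) = t$ and $t < \sup\mathbb{T}$, the point $t$ is called right-dense, while if $\sigma(t) > t$, the point $t$ is called right-scattered. Similarly, if $\rho(t) = t$ and $t > \inf\mathbb{T}$, the point $t$ is called left-dense, while if $\rho(t) < t$, the point $t$ is called left-scattered. Moreover, a point is called isolated if it is both left-scattered and right-scattered. For a function $f$, we write $f^{\sigma} = f \circ \sigma$ for the composition of the functions $f$ and $\sigma$. Similarly, we write $f^{\rho} = f \circ \rho$. The graininess function $\mu$ is defined as follows:

$$\mu(t) = \sigma(t) - t. \tag{2}$$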
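As an illustration of the jump operators and the graininess function, the following minimal Python sketch evaluates them on a finite time scale stored as a sorted list. The names `forward_jump`, `backward_jump`, `graininess`, and the sample set `time_scale` are hypothetical and do not come from the paper; the conventions for the empty set follow the standard ones, $\inf\emptyset = \sup\mathbb{T}$ and $\sup\emptyset = \inf\mathbb{T}$.

```python
from bisect import bisect_left, bisect_right

def forward_jump(ts, t):
    """sigma(t) = inf{s in T : s > t}, with inf(empty set) = sup T.

    ts is a sorted list of the points of a finite time scale (an assumption).
    """
    i = bisect_right(ts, t)
    return ts[i] if i < len(ts) else ts[-1]

def backward_jump(ts, t):
    """rho(t) = sup{s in T : s < t}, with sup(empty set) = inf T."""
    i = bisect_left(ts, t)
    return ts[i - 1] if i > 0 else ts[0]

def graininess(ts, t):
    """mu(t) = sigma(t) - t."""
    return forward_jump(ts, t) - t

# Hypothetical nonuniform discrete time scale.
time_scale = [0.0, 0.5, 1.5, 2.0]
for t in time_scale:
    print(t, forward_jump(time_scale, t), backward_jump(time_scale, t),
          graininess(time_scale, t))
```

On a finite time scale every interior point is isolated, so the sketch cannot show right-dense points; it only illustrates the nonuniform discrete case mentioned in the introduction.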

We now present some basic concepts and properties about time scales (see [10, 11]).

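Definition 1 (see [10]). Let $f$ be a function on $\mathbb{T}$. $f$ is called a right-dense continuous function if $f$ is continuous at every right-dense point and has finite left-sided limits at every left-dense point. Similarly, $f$ is called a left-dense continuous function if $f$ is continuous at every left-dense point and has finite right-sided limits at every right-dense point. If $f$ is both right-dense continuous and left-dense continuous, then $f$ is called a continuous function.
Define the set $\mathbb{T}^{\kappa}$ as

$$\mathbb{T}^{\kappa} = \begin{cases} \mathbb{T} \setminus \{\max \mathbb{T}\}, & \text{if } \max\mathbb{T} \text{ is left-scattered}, \\ \mathbb{T}, & \text{otherwise}. \end{cases} \tag{3}$$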

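Definition 2 (see [11]). Let $f: \mathbb{T} \to \mathbb{R}$ be a function and $t \in \mathbb{T}^{\kappa}$. If for all $\varepsilon > 0$, there exists a neighborhood $U$ of $t$ such that

$$|f(\sigma(t)) - f(s) - f^{\Delta}(t)(\sigma(t) - s)| \le \varepsilon\,|\sigma(t) - s|, \quad s \in U, \tag{4}$$

where $\sigma$ is the forward jump operator, we call $f^{\Delta}(t)$ the $\Delta$-derivative of $f$ at $t$.
We denote by $C^{1}(\mathbb{T}; \mathbb{R}^{n})$ the space of $\mathbb{R}^{n}$-valued continuously $\Delta$-differentiable functions on $\mathbb{T}$ and denote by $C^{1,2}(\mathbb{T} \times \mathbb{R}^{n}; \mathbb{R})$ the family of all functions $V(t,x)$ defined on $\mathbb{T} \times \mathbb{R}^{n}$ such that they are continuously $\Delta$-differentiable in $t$ and twice continuously differentiable in $x$.
Furthermore, we give the chain rule for composite functions.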

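Lemma 1 (see [25]). Let $f: \mathbb{T} \to \mathbb{R}$ be $\Delta$-differentiable and $g: \mathbb{R} \to \mathbb{R}$ be continuously differentiable. Then $g \circ f$ is $\Delta$-differentiable, and

$$(g \circ f)^{\Delta}(t) = \left\{ \int_{0}^{1} g'\big(f(t) + h\,\mu(t) f^{\Delta}(t)\big)\,dh \right\} f^{\Delta}(t), \tag{5}$$

where $\mu$ is the graininess function.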
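The chain rule can be checked numerically at a right-scattered point, where the $\Delta$-derivative reduces to the difference quotient $(f(\sigma(t)) - f(t))/\mu(t)$. The sketch below is only an illustration under that assumption: the functions `f`, `g`, the point `t`, and its forward jump `sigma_t` are hypothetical choices, and the integral over $h \in [0,1]$ is approximated by a midpoint rule.

```python
def delta_derivative(f, t, sigma_t):
    """Delta-derivative at a right-scattered point t (requires sigma_t > t)."""
    return (f(sigma_t) - f(t)) / (sigma_t - t)

def chain_rule_rhs(f, g_prime, t, sigma_t, n=1000):
    """Right-hand side of the chain rule: (integral over h in [0, 1]) * f^Delta(t)."""
    mu = sigma_t - t
    f_delta = delta_derivative(f, t, sigma_t)
    # Midpoint rule for the integral of g'(f(t) + h * mu * f^Delta(t)) over [0, 1].
    integral = sum(
        g_prime(f(t) + (k + 0.5) / n * mu * f_delta) for k in range(n)
    ) / n
    return integral * f_delta

# Hypothetical test functions and a right-scattered point t with sigma(t) = 1.5.
f = lambda s: s ** 2
g = lambda x: x ** 3
g_prime = lambda x: 3 * x ** 2
t, sigma_t = 1.0, 1.5

lhs = delta_derivative(lambda s: g(f(s)), t, sigma_t)  # (g o f)^Delta(t)
rhs = chain_rule_rhs(f, g_prime, t, sigma_t)
print(lhs, rhs)
```

The two printed numbers agree up to the quadrature error, because at a right-scattered point the integral is exactly $(g(f(\sigma(t))) - g(f(t)))/(f(\sigma(t)) - f(t))$.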

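In this paper, we adopt the stochastic integral defined by Bohner et al. [26]. Let $(\Omega, \mathcal{F}, P)$ be a complete probability space with an increasing and continuous filtration $\{\mathcal{F}_{t}\}_{t \in \mathbb{T}}$. We define $L^{2}_{\mathcal{F}}(\mathbb{T}; \mathbb{R}^{n})$ as the set of all $\mathcal{F}_{t}$-adapted, $\mathbb{R}^{n}$-valued measurable processes $X(t)$ such that $E\int_{\mathbb{T}} |X(t)|^{2}\,\Delta t < \infty$.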

A Brownian motion indexed by time scales was defined by Grow and Sanyal [13]. Although the Brownian motion on time scales is very similar to that on continuous time, there are also some differences between them. For example, the quadratic variation of a Brownian motion on time scales (see [27]) is still an increasing process, but it is not deterministic. In fact, , where is the Lebesgue measure.

Now, we give the definition of the stochastic $\Delta$-integral and its properties.

Definition 3 (see [26]). The random process is stochastic $\Delta$-integrable on if the corresponding is integrable. Define the integral value of as where and the Brownian motion on the right side of (6) is indexed by continuous time.
We also have the following properties.
Let and . Then where the integral with respect to the quadratic variation of Brownian motion is defined by the Stieltjes integral as .
Let be an -dimensional stochastic process defined by and we have the following Itô's formula.

Lemma 2 (see [28]). Let and satisfy (9), then the following relation holds:

3. Problem Statement and Main Results

Let be a given filtered probability space satisfying the usual condition. Consider the stochastic control system

The control belongs to where is a convex subset of . The functions and satisfy the Lipschitz condition and the linear growth condition in . Consequently, equation (11) admits a unique solution (see Bohner et al. [26]). The cost functional associated with (11) is where the maps and are continuous.

The optimal control problem is to find such that

is called the stochastic optimal control of the problem, and the corresponding is called an optimal state process.

Now, we consider a family of optimal control problems with different initial times and states. Let , consider the state equation along with the cost functional

For any , minimize (16) subject to (15) over . The value function of the optimal control problem is defined as

We first introduce a symbol which is useful in the sequel. Let be $\Delta$-differentiable and be continuously differentiable. For any , define as follows:

Remark 1. Note that depends not only on the functions and but also on the time scale . If is a right-dense point of the time scale , then . On the other hand, if is a right-scattered point and , we have .
With the help of the new symbol, we have the following lemma.

Lemma 3. Let be continuously differentiable and and be $\Delta$-differentiable, then

Proof. If or , it is easy to verify that (19) is true. We only give the proof under the conditions and . If is right-dense, one has When is right-scattered, then This completes the proof.

Remark 2. Similarly, another form can be expressed as

Remark 3. In particular, let be $\Delta$-differentiable and be continuously differentiable. Then . It is easy to see that this equality is equivalent to (5).

Remark 4. It is not hard to obtain the following multidimensional result: where is continuously differentiable and and are $\Delta$-differentiable.
Next, we present Itô's formula in a new form on time scales.

Proposition 1. Let satisfy (9) and , we have where is an indicator function and is the set of all right-dense points.

Proof. Because of Lemma 2, it is enough to show that By some manipulation, namely, it is straightforward to show that the above equation is true.
Now, we state the Bellman optimality principle on time scales.

Theorem 1 (optimality principle). Let . Then for any , we have

Proof. For any , there exists a control such that where is the sigma field generated by .
On the other hand, by the definition of value function (17), we obtain Thus, taking the infimum over , one has It follows that Combining (28) and (31), we get the result.
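To make the optimality principle concrete, the sketch below checks it by brute force in a much simpler setting than the one treated in the theorem: a deterministic scalar system on a finite isolated time scale with a finite control set. Everything in it is an assumption made for illustration (the time scale `TIME_SCALE`, the control set `CONTROLS`, the dynamics $x^{\Delta} = u$, and the quadratic costs). On an isolated time scale the $\Delta$-integral of the running cost is the sum of $\mu(t)$ times the integrand, and the Bellman recursion should reproduce the minimum over all control sequences.

```python
from itertools import product

TIME_SCALE = [0.0, 0.5, 1.5, 2.0]  # hypothetical isolated time scale
CONTROLS = [-1.0, 0.0, 1.0]        # hypothetical finite control set

def step(x, u, mu):
    """x(sigma(t)) = x(t) + mu(t) * u(t), i.e. the dynamics x^Delta = u."""
    return x + mu * u

def running_cost(x, u):
    return x * x + u * u

def terminal_cost(x):
    return x * x

def cost_of_sequence(k, x, controls):
    """Total cost from the k-th time point under a fixed control sequence."""
    total = 0.0
    for j, u in enumerate(controls):
        mu = TIME_SCALE[k + j + 1] - TIME_SCALE[k + j]
        total += mu * running_cost(x, u)  # Delta-integral on an isolated scale
        x = step(x, u, mu)
    return total + terminal_cost(x)

def value_brute_force(k, x):
    """Minimum of the cost over all control sequences (definition of the value)."""
    n = len(TIME_SCALE) - 1 - k
    return min(cost_of_sequence(k, x, seq) for seq in product(CONTROLS, repeat=n))

def value_recursive(k, x):
    """Bellman recursion: one-step cost plus the value at sigma(t)."""
    if k == len(TIME_SCALE) - 1:
        return terminal_cost(x)
    mu = TIME_SCALE[k + 1] - TIME_SCALE[k]
    return min(
        mu * running_cost(x, u) + value_recursive(k + 1, step(x, u, mu))
        for u in CONTROLS
    )

print(value_brute_force(0, 1.0), value_recursive(0, 1.0))
```

The two values coincide, which is the discrete deterministic analogue of the statement above; the stochastic case on a general time scale additionally involves the conditional expectation used in the proof.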
Furthermore, we give the HJB equation on time scales which is similar to continuous and discrete cases.

Theorem 2. Let the value function , then the value function satisfies the following HJB equation: where and we use the notation

Proof. According to the definition of the value function, is satisfied. Fix , and let be the state process corresponding to the control . By the optimality principle and Itô's formula, for any and , we have If is right-dense, then let , while if is right-scattered, then let . This leads to It follows that Conversely, for any , and , we can find a control such that By the same argument as that used in the above proof, we get Therefore, (32) follows. The proof is completed.

Remark 5. It is not surprising that HJB equation (32) on time scales is very similar to the classical HJB equation in continuous and discrete time (see [29, 30]). An intriguing feature of the HJB equation on time scales is that the expectation is involved. When or , we are able to reduce equation (32) into the classical ones.

Remark 6. Suppose , then HJB equation (32) becomes In such a case, it is equivalent to the result in [22].

Remark 7. In particular, in Remark 6, if we further let , HJB equation (32) degenerates into which is just the one given by Seiffertt et al. [20].
From the above, we end up with the following verification theorem.

Theorem 3 (verification theorem). Let be the solution of HJB equation (32), and suppose there exists a function such that Then , and is an optimal control.

Proof. Let , we have This yields In addition, for any admissible pair , we have Namely, By the arbitrariness of , it follows that Hence, by (44) and (47), one has Finally, inequality (44) together with (48) proves the optimality of .
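The logic of the verification theorem can be illustrated on a toy problem that is not taken from the paper: a scalar linear quadratic problem with additive noise on a finite isolated time scale. The sketch assumes the dynamics $X(\sigma(t)) = X(t) + \mu(t)\,(aX(t) + u(t)) + s\,(W(\sigma(t)) - W(t))$, with Brownian increments of variance $\mu(t)$, and the cost $E\big[\sum_t \mu(t)(X(t)^2 + u(t)^2) + X(T)^2\big]$. Under these assumptions the candidate value function is $V(t,x) = P(t)x^2 + r(t)$, and the feedback $u = -K(t)x$ comes from the minimizer in the one-step Bellman equation. All names (`solve_lq`, `expected_cost`, `TS`, `A`, `S`, `X0`) are hypothetical.

```python
def solve_lq(ts, a, s):
    """Backward recursion for V(t, x) = P(t) x^2 + r(t) and the feedback gains K(t)."""
    n = len(ts)
    P = [0.0] * n
    r = [0.0] * n
    K = [0.0] * (n - 1)
    P[-1] = 1.0  # terminal cost x^2
    for k in range(n - 2, -1, -1):
        mu = ts[k + 1] - ts[k]
        c = 1.0 + mu * a
        denom = 1.0 + mu * P[k + 1]
        K[k] = P[k + 1] * c / denom        # minimizer: u = -K[k] * x
        P[k] = mu + P[k + 1] * c * c / denom
        r[k] = r[k + 1] + P[k + 1] * s * s * mu
    return P, r, K

def expected_cost(ts, a, s, gains, x0):
    """Exact expected cost of the feedback u = -gains[k] * x.

    Uses propagation of the second moment E[X^2]; no sampling is involved.
    """
    m2 = x0 * x0
    total = 0.0
    for k, gain in enumerate(gains):
        mu = ts[k + 1] - ts[k]
        total += mu * (1.0 + gain * gain) * m2
        closed = 1.0 + mu * (a - gain)
        m2 = closed * closed * m2 + s * s * mu  # noise increment has variance mu
    return total + m2

# Hypothetical data.
TS = [0.0, 0.5, 1.5, 2.0]
A, S, X0 = 0.3, 0.2, 1.0

P, r, K = solve_lq(TS, A, S)
value = P[0] * X0 ** 2 + r[0]
print(value, expected_cost(TS, A, S, K, X0))
print(expected_cost(TS, A, S, [g + 0.1 for g in K], X0))  # a perturbed feedback
```

The cost of the feedback built from the minimizer equals the candidate value, while the perturbed feedback costs more. This mirrors the two inequalities combined in the proof, but only within the class of linear feedbacks and for this special case.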

4. Example

The dynamic programming on time scales contains not only continuous and discrete cases but also other more general cases. In order to illustrate our result, we give an example. Consider the quantum time scale . The state equation on time scales is as follows:

The goal is to find the sequence of optimal controls such that

In this example, we have . By Theorem 2, the value function satisfies the following equation:

We can see that the graininess function affects the solution of HJB equation (51).

Besides, by applying Theorem 3, we can find the optimal control through the above equation. This implies that the optimal strategy also depends on the structure of the time scale.
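To see why the graininess matters on a quantum time scale, the following sketch tabulates it for the standard choice $\{q^{k} : k = 0, 1, 2, \dots\}$ with $q > 1$. This specific set and the value `q = 2.0` are assumptions, since the exact time scale and parameters of the example are not reproduced here. On this set $\sigma(t) = qt$, so $\mu(t) = (q - 1)t$ grows linearly with $t$.

```python
def quantum_time_scale(q, n_points):
    """First n_points elements of {q**k : k = 0, 1, 2, ...}, assuming q > 1."""
    return [q ** k for k in range(n_points)]

def quantum_graininess(q, t):
    """mu(t) = sigma(t) - t = (q - 1) * t, because sigma(t) = q * t."""
    return (q - 1.0) * t

q = 2.0  # hypothetical parameter
points = quantum_time_scale(q, 5)
for t, t_next in zip(points, points[1:]):
    # The gap to the next point equals the graininess at t.
    print(t, t_next - t, quantum_graininess(q, t))
```

Because the step length is different at every point, any one-step recursion derived from the HJB equation weights the running cost and the noise differently at each time, which is one way to read the dependence on the time scale noted above.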

5. Conclusions

In this paper, we developed the dynamic programming principle for stochastic systems on time scales, for which we presented Itô's formula on time scales in a new form by introducing a new symbol. Similar to the classical cases, we constructed the HJB equation on time scales and proved the verification theorem. The results of this paper are more general than the classical ones: the continuous and discrete versions of dynamic programming are special cases. An example has been given to demonstrate the effectiveness of our results.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Key R&D Program of China (Grant no. 2018YFA0703900) and the Major Project of National Social Science Foundation of China (Grant no. 19ZDA091).

References

[1] R. Bellman, “Dynamic programming,” Science, vol. 153, no. 3731, pp. 34–37, 1966.
[2] D. Blackwell, “Discrete dynamic programming,” The Annals of Mathematical Statistics, vol. 33, no. 2, pp. 719–726, 1962.
[3] E. Bandini, A. Cosso, M. Fuhrman, and H. Pham, “Randomized filtering and Bellman equation in Wasserstein space for partial observation control problem,” Stochastic Processes and Their Applications, vol. 129, no. 2, pp. 674–711, 2019.
[4] H. Pham and X. Wei, “Dynamic programming for optimal control of stochastic McKean–Vlasov dynamics,” SIAM Journal on Control and Optimization, vol. 55, no. 2, pp. 1069–1101, 2017.
[5] C. Mu, Y. Zhang, H. Jia, and H. He, “Energy-storage-based intelligent frequency control of microgrid with stochastic model uncertainties,” IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1748–1758, 2020.
[6] C. Mu and Y. Zhang, “Learning-based robust tracking control of quadrotor with time-varying and coupling uncertainties,” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 1, pp. 259–273, 2020.
[7] S. Hilger, “Analysis on measure chains – a unified approach to continuous and discrete calculus,” Results in Mathematics, vol. 18, no. 1-2, pp. 18–56, 1990.
[8] G. S. Guseinov, “Integration on time scales,” Journal of Mathematical Analysis and Applications, vol. 285, no. 1, pp. 107–127, 2003.
[9] M. Bohner and G. S. Guseinov, “Partial differentiation on time scales,” Dynamic Systems and Applications, vol. 13, no. 3-4, pp. 351–379, 2004.
[10] M. Bohner and A. Peterson, Dynamic Equations on Time Scales: An Introduction with Applications, Birkhäuser Boston, Boston, MA, USA, 2001.
[11] M. Bohner and A. Peterson, Advances in Dynamic Equations on Time Scales, Birkhäuser Boston, Boston, MA, USA, 2002.
[12] N. H. Du and N. T. Dieu, “The first attempt on the stochastic calculus on time scale,” Stochastic Analysis and Applications, vol. 29, no. 6, pp. 1057–1080, 2011.
[13] D. Grow and S. Sanyal, “Brownian motion indexed by a time scale,” Stochastic Analysis and Applications, vol. 29, no. 3, pp. 457–472, 2011.
[14] M. Bohner, “Calculus of variations on time scales,” Dynamic Systems and Applications, vol. 13, no. 3-4, pp. 339–349, 2004.
[15] Z. Zhan and W. Wei, “On existence of optimal control governed by a class of the first-order linear dynamic systems on time scales,” Applied Mathematics and Computation, vol. 215, no. 6, pp. 2070–2081, 2009.
[16] Y. Gong and X. Xiang, “A class of optimal control problems of systems governed by the first order linear dynamic equations on time scales,” Journal of Industrial and Management Optimization, vol. 5, no. 1, pp. 1–10, 2009.
[17] Y. Peng, X. Xiang, and Y. Jiang, “Nonlinear dynamic systems and optimal control problems on time scales,” ESAIM: Control, Optimisation and Calculus of Variations, vol. 17, no. 3, pp. 654–681, 2011.
[18] Z. Zhan, S. Chen, and W. Wei, “A unified theory of maximum principle for continuous and discrete time optimal control problems,” Mathematical Control and Related Fields, vol. 2, no. 2, pp. 195–215, 2012.
[19] M. Bohner, K. Kenzhebaev, O. Lavrova, and O. Stanzhytskyi, “Pontryagin’s maximum principle for dynamic systems on time scales,” Journal of Difference Equations and Applications, vol. 23, no. 7, pp. 1161–1189, 2017.
[20] J. Seiffertt, S. Sanyal, and D. C. Wunsch, “Hamilton-Jacobi-Bellman equations and approximate dynamic programming on time scales,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 38, no. 4, pp. 918–923, 2008.
[21] Z. Zhan, W. Wei, and H. Xu, “Hamilton–Jacobi–Bellman equations on time scales,” Mathematical and Computer Modelling, vol. 49, no. 9-10, pp. 2019–2028, 2009.
[22] R. Š. Hilscher and V. Zeidan, “Hamilton–Jacobi theory over time scales and applications to linear-quadratic problems,” Nonlinear Analysis: Theory, Methods and Applications, vol. 75, no. 2, pp. 932–950, 2012.
[23] Y. Zhu and G. Jia, “Stochastic linear quadratic control problem on time scales,” submitted.
[24] Y. Zhu and G. Jia, “Linear feedback of mean-field stochastic linear quadratic optimal control problems on time scales,” Mathematical Problems in Engineering, vol. 2020, Article ID 8051918, 11 pages, 2020.
[25] C. Pötzsche, “Chain rule and invariance principle on measure chains,” Journal of Computational and Applied Mathematics, vol. 141, no. 1-2, pp. 249–254, 2002.
[26] M. Bohner, O. M. Stanzhytskyi, and A. O. Bratochkina, “Stochastic dynamic equations on general time scales,” Electronic Journal of Differential Equations, vol. 2013, no. 57, pp. 1–15, 2013.
[27] D. Grow and S. Sanyal, “The quadratic variation of Brownian motion on a time scale,” Statistics and Probability Letters, vol. 82, no. 9, pp. 1677–1680, 2012.
[28] W. Hu, “Itô’s formula, the stochastic exponential, and change of measure on general time scales,” Abstract and Applied Analysis, vol. 2017, Article ID 9140138, 13 pages, 2017.
[29] M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Springer Science & Business Media, Berlin, Germany, 2008.
[30] L. Grüne, “Error estimation and adaptive discretization for the discrete stochastic Hamilton–Jacobi–Bellman equation,” Numerische Mathematik, vol. 99, no. 1, pp. 85–112, 2004.

Copyright

Copyright © 2020 Yingjun Zhu and Guangyan Jia. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
