Research Article | Open Access
Yingjun Zhu, Guangyan Jia, "Dynamic Programming and Hamilton–Jacobi–Bellman Equations on Time Scales", Complexity, vol. 2020, Article ID 7683082, 11 pages, 2020. https://doi.org/10.1155/2020/7683082
Dynamic Programming and Hamilton–Jacobi–Bellman Equations on Time Scales
The Bellman optimality principle for stochastic dynamic systems on time scales is derived, which includes the continuous-time and discrete-time settings as special cases. At the same time, the Hamilton–Jacobi–Bellman (HJB) equation on time scales is obtained. Finally, an example is employed to illustrate our main results.
1. Introduction
The stochastic control problem is to find an optimal control such that a cost functional associated with a stochastic system reaches its minimum value. Dynamic programming is a powerful approach to solving stochastic optimal control problems and a well-established subject [1–4] for dealing with continuous and discrete optimal control problems, with great practical applications in various fields [5, 6]. It is generally assumed that time is either continuous or discrete in dynamic systems. In reality, however, the time scale may be neither: many processes evolve on a mixture of continuous and discrete time, on nonuniform discrete time, or on a union of disjoint time intervals, such as production and storage processes in economics, investment processes in finance, and populations of seasonal insects. Such complex time structures make the control problem more difficult. How can this problem be handled?
Time scales were first introduced by Hilger [7] in 1988 in order to unify differential and difference equations into a general framework, which allows us to treat continuous and discrete analyses from a common point of view. Recently, time scale theory has been extensively studied in many works [8–13]. It is well known that optimal control problems on time scales form an important field for both theory and applications. Since the calculus of variations on time scales was studied by Bohner [14], results on optimal control problems in the time scale setting and their applications have been growing rapidly. The existence of optimal controls for dynamic systems on time scales was discussed in [15–17]. Subsequently, the Pontryagin maximum principle on time scales, which specifies the necessary conditions for optimality, was studied in several works [18, 19].
Dynamic programming for dynamic systems on time scales is not simply a matter of uniting the continuous-time and discrete-time cases, because time scales contain more complex time structures. Seiffertt et al. [20] studied approximate dynamic programming for dynamic systems in the isolated time scale setting. In addition, Bellman dynamic programming on general time scales for deterministic optimal control problems was considered in [21, 22]. However, only limited work [23, 24] has been done on the linear quadratic stochastic optimal control problem in the time scale setting. That is to say, the general setting of stochastic optimal control problems on time scales is completely open.
Motivated by all these significant works, the purpose of this paper is to study the method of dynamic programming for stochastic optimal control problems on time scales. As is well known, the stochastic dynamic programming principle differs from that of deterministic systems, reflecting the stochastic nature of the optimal control problem, so the method used in the deterministic case on time scales cannot be applied to the stochastic case directly. In order to overcome this difficulty, we first give a new form of the chain rule on time scales. Based on this idea, we obtain Itô's formula for stochastic processes on time scales. Second, we consider a family of optimal control problems with different initial times and states to establish the Bellman optimality principle in the time scale setting. Third, using Itô's formula and the Bellman optimality principle obtained on time scales, we also derive the associated Hamilton–Jacobi–Bellman (HJB for short) equation on time scales, which is a nonlinear second-order partial differential equation involving an expectation. If the HJB equation is solvable, then we can obtain an optimal feedback control. Our work enriches the dynamic programming literature by providing a more general time framework and makes dynamic programming a powerful tool for tackling optimal control problems on complex time structures.
The organization of this paper is as follows. In Section 2, we present some preliminaries on time scale theory. Section 3 focuses on the Bellman optimality principle and the HJB equations on time scales. By introducing a new symbol, we present Itô's formula in a new form; on this basis, we obtain the main results. Finally, an illustrative example is given to show the effectiveness of the proposed results.
2. Preliminaries
A time scale is a nonempty closed subset of the real number set $\mathbb{R}$, and we denote it by $\mathbb{T}$. In this paper, we always suppose $\mathbb{T}$ is bounded. The forward jump operator $\sigma$ and the backward jump operator $\rho$ are, respectively, defined by
$$\sigma(t) = \inf\{s \in \mathbb{T} : s > t\}, \qquad \rho(t) = \sup\{s \in \mathbb{T} : s < t\},$$
supplemented by $\inf \emptyset = \sup \mathbb{T}$ and $\sup \emptyset = \inf \mathbb{T}$, where $\emptyset$ denotes the empty set. If $t < \sup \mathbb{T}$ and $\sigma(t) = t$, the point $t$ is called right-dense, while if $\sigma(t) > t$, the point $t$ is called right-scattered. Similarly, if $t > \inf \mathbb{T}$ and $\rho(t) = t$, the point $t$ is called left-dense, while if $\rho(t) < t$, the point $t$ is called left-scattered. Moreover, a point is called isolated if it is both left-scattered and right-scattered. For a function $f$ on $\mathbb{T}$, we denote $f^{\sigma} = f \circ \sigma$ to represent the composition of the functions $f$ and $\sigma$. Similarly, we denote $f^{\rho} = f \circ \rho$. The graininess function $\mu$ is defined by
$$\mu(t) = \sigma(t) - t.$$
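To make the jump operators concrete, the following small numerical sketch (our own illustration, not part of the original article) computes $\sigma$, $\rho$, and $\mu$ on a finite, purely isolated time scale:

```python
# A small numerical sketch (our own illustration, not from the article) of
# the forward jump operator sigma, the backward jump operator rho, and the
# graininess mu on a finite, purely isolated time scale T.

T = [0.0, 0.5, 1.0, 2.0, 4.0]  # a bounded time scale of isolated points

def sigma(t):
    """Forward jump inf{s in T : s > t}; by convention sigma(max T) = max T."""
    later = [s for s in T if s > t]
    return min(later) if later else max(T)

def rho(t):
    """Backward jump sup{s in T : s < t}; by convention rho(min T) = min T."""
    earlier = [s for s in T if s < t]
    return max(earlier) if earlier else min(T)

def mu(t):
    """Graininess mu(t) = sigma(t) - t; it vanishes exactly at right-dense points."""
    return sigma(t) - t

for t in T:
    print(t, sigma(t), rho(t), mu(t))
```

Note how the conventions $\inf \emptyset = \sup \mathbb{T}$ and $\sup \emptyset = \inf \mathbb{T}$ make $\sigma$ and $\rho$ total on the bounded time scale.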
Definition 1 (see [10]). Let $f$ be a function on $\mathbb{T}$. $f$ is called a right-dense continuous function if $f$ is continuous at every right-dense point and has finite left-sided limits at every left-dense point. Similarly, $f$ is called a left-dense continuous function if $f$ is continuous at every left-dense point and has finite right-sided limits at every right-dense point. If $f$ is both right-dense continuous and left-dense continuous, then $f$ is called a continuous function.
Define the set $\mathbb{T}^{\kappa}$ as follows: if $\mathbb{T}$ has a left-scattered maximum $m$, then $\mathbb{T}^{\kappa} = \mathbb{T} \setminus \{m\}$; otherwise, $\mathbb{T}^{\kappa} = \mathbb{T}$.
Definition 2 (see [10]). Let $f: \mathbb{T} \rightarrow \mathbb{R}$ be a function and $t \in \mathbb{T}^{\kappa}$. If for all $\varepsilon > 0$, there exists a neighborhood $U$ of $t$ such that
$$|f(\sigma(t)) - f(s) - f^{\Delta}(t)(\sigma(t) - s)| \leq \varepsilon |\sigma(t) - s| \quad \text{for all } s \in U,$$
we call $f^{\Delta}(t)$ the $\Delta$-derivative of $f$ at $t$.
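At a right-scattered point, the $\Delta$-derivative reduces to the difference quotient $(f(\sigma(t)) - f(t))/\mu(t)$. A minimal sketch (the time scale and test function are our own assumptions) checks this against the known time-scale identity $f^{\Delta}(t) = t + \sigma(t)$ for $f(t) = t^{2}$:

```python
# Delta derivative at right-scattered points of an assumed isolated time scale.

T = [0.0, 1.0, 3.0, 4.0]

def sigma(t):
    later = [s for s in T if s > t]
    return min(later) if later else max(T)

def delta_derivative(f, t):
    """f^Delta(t) = (f(sigma(t)) - f(t)) / mu(t) at a right-scattered point t."""
    mu = sigma(t) - t
    if mu == 0:
        raise ValueError("t is right-dense; the ordinary limit is needed instead")
    return (f(sigma(t)) - f(t)) / mu

f = lambda t: t ** 2
# For f(t) = t^2 one has f^Delta(t) = t + sigma(t) on any time scale:
print(delta_derivative(f, 1.0))  # (3**2 - 1**2) / 2 = 4 = 1 + sigma(1)
```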
We denote by $C^{1}(\mathbb{T})$ the space of $\mathbb{R}^{n}$-valued continuously $\Delta$-differentiable functions on $\mathbb{T}$, and we denote by $C^{1,2}(\mathbb{T} \times \mathbb{R}^{n})$ the family of all functions $V(t, x)$ defined on $\mathbb{T} \times \mathbb{R}^{n}$ such that they are continuously $\Delta$-differentiable in $t$ and twice continuously differentiable in $x$.
Furthermore, we give the differentiation rule for compound functions.
Lemma 1 (see [25]). Let $f: \mathbb{T} \rightarrow \mathbb{R}$ be $\Delta$-differentiable and $g: \mathbb{R} \rightarrow \mathbb{R}$ be continuously differentiable. Then $g \circ f$ is $\Delta$-differentiable, and
$$(g \circ f)^{\Delta}(t) = \left\{\int_{0}^{1} g'\left(f(t) + h \mu(t) f^{\Delta}(t)\right) \mathrm{d}h\right\} f^{\Delta}(t).$$
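The chain rule can be checked numerically at a right-scattered point, where $(g \circ f)^{\Delta}(t)$ is just the difference quotient across the gap. The sketch below (our own check, with assumed $f$, $g$, and time scale) compares that quotient with the integral form via a midpoint quadrature:

```python
# Numerical check (our own, with assumed f, g, and time scale) of the
# time-scale chain rule at a right-scattered point:
# (g o f)^Delta(t) = [ integral_0^1 g'(f(t) + h*mu(t)*f^Delta(t)) dh ] * f^Delta(t).

t, sig = 1.0, 3.0            # assume sigma(1) = 3 on the chosen time scale
mu = sig - t                 # graininess mu(1) = 2

f = lambda s: s ** 2
g = lambda y: y ** 3
g_prime = lambda y: 3 * y ** 2

f_delta = (f(sig) - f(t)) / mu       # f^Delta(1) = (9 - 1) / 2 = 4
lhs = (g(f(sig)) - g(f(t))) / mu     # (g o f)^Delta(1) = (729 - 1) / 2 = 364

# Midpoint quadrature for integral_0^1 g'(f(t) + h*mu*f_delta) dh.
n = 100_000
integral = sum(g_prime(f(t) + ((k + 0.5) / n) * mu * f_delta) for k in range(n)) / n
rhs = integral * f_delta

print(lhs, rhs)  # both sides agree (approximately 364)
```

The averaged integrand interpolates $g'$ along the segment from $f(t)$ to $f(\sigma(t))$, which is exactly what makes the formula collapse to the classical chain rule when $\mu(t) = 0$.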
In this paper, we adopt the stochastic integral defined by Bohner et al. [26]. Let $(\Omega, \mathcal{F}, P)$ be a complete probability space with an increasing and continuous filtration $\{\mathcal{F}_{t}\}_{t \in \mathbb{T}}$. We define $L^{2}_{\mathcal{F}}(\mathbb{T})$ to be the set of all $\mathcal{F}_{t}$-adapted, $\mathbb{R}^{n}$-valued measurable processes $x(\cdot)$ such that $E \int_{\mathbb{T}} |x(t)|^{2} \Delta t < \infty$.
A Brownian motion indexed by time scales was defined by Grow and Sanyal [13]. Although Brownian motion on time scales is very similar to that on continuous time, there are also some differences between them. For example, the quadratic variation of a Brownian motion on time scales (see [27]) is still an increasing process, but it is not deterministic; it can be expressed in terms of the Lebesgue measure.
Now, we give the definition of the stochastic $\Delta$-integral and its properties.
Definition 3 (see [26]). The random process $f$ is stochastic $\Delta$-integrable on $\mathbb{T}$ if its corresponding continuous-time extension is integrable, and the integral value of $f$ is defined through that extension, where the Brownian motion on the right side of (6) is indexed by continuous time.
We also have the following properties.
Let $f$ and $g$ be stochastic $\Delta$-integrable processes. Then the corresponding integral identities hold, where the integral with respect to the quadratic variation of the Brownian motion is defined as a Stieltjes integral.
Let $x(\cdot)$ be an $n$-dimensional stochastic process defined by equation (9); we then have the following Itô's formula.
3. Problem Statement and Main Results
Let $(\Omega, \mathcal{F}, \{\mathcal{F}_{t}\}_{t \in \mathbb{T}}, P)$ be a given filtered probability space satisfying the usual conditions. Consider the stochastic control system
The control $u(\cdot)$ belongs to the set of admissible controls, where the control domain is a convex subset of Euclidean space, and the coefficient functions satisfy the Lipschitz condition and the linear growth condition in the state variable. Obviously, equation (11) admits a unique solution (see Bohner et al. [26]). The cost functional associated with (11) consists of a running cost and a terminal cost, whose defining maps are continuous.
The optimal control problem is to find a control attaining the infimum of the cost functional over all admissible controls. Such a control is called a stochastic optimal control of the problem, and the corresponding state process is called an optimal state process.
Now, we consider a family of optimal control problems with different initial times and states. For each initial pair, consider the state equation along with the cost functional
We first introduce a symbol which will be useful in the sequel. Let $f$ be $\Delta$-differentiable and $g$ be continuously differentiable. For any $t \in \mathbb{T}^{\kappa}$, define the symbol as follows:
Remark 1. Note that the symbol depends not only on the functions $f$ and $g$ but also on the time scale $\mathbb{T}$: it takes one form at a right-dense point and another at a right-scattered point.
With the help of the new symbol, we have the following lemma.
Lemma 3. Let $g$ be continuously differentiable and let $f_{1}$ and $f_{2}$ be $\Delta$-differentiable; then
Proof. In the trivial cases, it is easy to verify that (19) is true, so we only give the proof in the remaining cases. If $t$ is right-dense, the identity follows directly; when $t$ is right-scattered, it follows by a direct computation. This completes the proof.
Remark 2. Similarly, another form can be expressed as
Remark 3. In particular, let $f$ be $\Delta$-differentiable and $g$ be continuously differentiable. Then the corresponding identity holds, and it is easy to see that this equality is equivalent to (5).
Remark 4. It is not hard to obtain the following multidimensional version, where $g$ is continuously differentiable and $f_{1}$ and $f_{2}$ are $\Delta$-differentiable:
Next, we present Itô's formula on time scales in a new form.
Proposition 1. Let $x(\cdot)$ satisfy (9) and $V \in C^{1,2}(\mathbb{T} \times \mathbb{R}^{n})$. Then we have the following formula, where $I_{D}$ denotes the indicator function of $D$, the set of all right-dense points.
Proof. By Lemma 2, it is enough to show the stated identity. After some manipulation, it is straightforward to show that the above equation is true.
Now, we state the Bellman optimality principle on time scales.
Theorem 1 (optimality principle). Let the value function be defined by (17). Then, for any intermediate time, we have
Proof. For any $\varepsilon > 0$, there exists a control such that inequality (28) holds, where the sigma field is generated by the underlying process up to the intermediate time.
On the other hand, by the definition of the value function (17), we obtain the reverse estimate. Thus, taking the infimum over admissible controls, one has (31). Combining (28) and (31), we get the result.
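On a purely isolated time scale, the optimality principle collapses to a finite backward recursion: in the deterministic case, $V(t, x) = \min_{u} \{ f(t, x, u)\,\mu(t) + V(\sigma(t), x + b(t, x, u)\,\mu(t)) \}$. The following toy sketch (the time scale, dynamics, costs, and grids are all our own assumptions, not the paper's system) implements this recursion on a coarse state grid:

```python
# Backward dynamic programming on an assumed isolated time scale
# (deterministic toy problem; all model data below are illustrative).

T = [0.0, 0.5, 1.5, 2.0]                  # isolated time scale, terminal time 2.0
controls = [-1.0, -0.5, 0.0, 0.5, 1.0]    # finite control grid
xs = [i * 0.25 for i in range(-8, 9)]     # state grid on [-2, 2]

b = lambda t, x, u: u                     # drift: the state moves by u * mu(t)
f = lambda t, x, u: x ** 2 + u ** 2       # running cost
h = lambda x: x ** 2                      # terminal cost

def nearest(x):
    """Project an off-grid state onto the state grid."""
    return min(xs, key=lambda y: abs(y - x))

V = {x: h(x) for x in xs}                 # V(max T, x) = h(x)
for i in range(len(T) - 2, -1, -1):       # sweep backward through the time scale
    t, mu = T[i], T[i + 1] - T[i]
    V = {x: min(f(t, x, u) * mu + V[nearest(x + b(t, x, u) * mu)]
                for u in controls)
         for x in xs}

print(V[0.0], V[1.0])  # cost-to-go at t = 0 from x = 0 and x = 1
```

Each backward step is one application of the optimality principle between a point and its forward jump; the nonuniform steps of the time scale enter only through $\mu(t)$.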
Furthermore, we give the HJB equation on time scales, which is similar to the continuous and discrete cases.
Theorem 2. Let the value function belong to $C^{1,2}(\mathbb{T} \times \mathbb{R}^{n})$. Then the value function satisfies the following HJB equation (32), where the infimum is taken over the control domain and the accompanying notation is defined below.
Proof. According to the definition of the value function, the terminal condition is satisfied. Fix a control, and let the state process correspond to this control. By the optimality principle and Itô's formula, we obtain an estimate for any admissible choice. If $t$ is right-dense, we let the intermediate time tend to $t$, while if $t$ is right-scattered, we let the intermediate time be $\sigma(t)$. This leads to one inequality of (32). Conversely, for any $\varepsilon > 0$, we can find a control such that the reverse estimate holds up to $\varepsilon$. By the same argument as that used in the above proof, we get the reverse inequality. Therefore, (32) follows. The proof is completed.
Remark 5. It is not surprising that HJB equation (32) on time scales is very similar to the classical HJB equations in continuous and discrete time (see [29, 30]). An intriguing feature of the HJB equation on time scales is that an expectation is involved. When the time scale is a real interval or a set of uniformly spaced discrete points, we are able to reduce equation (32) to the classical ones.
Remark 7. In particular, in the setting of Remark 6, if we further specialize the time scale, HJB equation (32) degenerates into the equation given by Seiffertt et al. [20].
From the above, we end up with the following verification theorem.
Theorem 3 (verification theorem). Let $V$ be the solution of HJB equation (32), and suppose there exists a function $u^{*}$ attaining the infimum in (32). Then $V$ coincides with the value function, and the feedback control induced by $u^{*}$ is an optimal control.
Proof. Applying Itô's formula, we obtain an identity, and this yields (44). In addition, for any admissible pair, we have the corresponding estimate, namely, inequality (47). Since the admissible pair is arbitrary, the estimate passes to the infimum. Hence, by (44) and (47), one has (48). Finally, inequality (44) together with (48) proves the optimality of the candidate control.
The dynamic programming on time scales covers not only the continuous and discrete cases but also other more general cases. In order to illustrate our result, we give an example. Consider the quantum time scale $\mathbb{T} = \{q^{n} : n = 0, 1, \ldots, N\}$ with $q > 1$. The state equation on this time scale is as follows:
The problem is to find the sequence of optimal control policies such that
In this example, we have $\sigma(t) = qt$ and $\mu(t) = (q - 1)t$. By Theorem 2, the value function satisfies the following equation:
We can see that the graininess function $\mu(t) = (q - 1)t$ affects the solution of HJB equation (51).
Moreover, by applying Theorem 3, we can find the optimal control through the above equation. This implies that the optimal strategy also depends on the structure of the time scale.
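The role of the graininess in this example can be made concrete. A minimal sketch, assuming the finite truncation $\mathbb{T} = \{q^{n} : n = 0, 1, \ldots, N\}$ of the quantum time scale, shows that $\mu(t) = (q - 1)t$ grows with $t$ and therefore enters the HJB recursion differently at every point:

```python
# Graininess of an assumed finite quantum time scale T = {q^n : n = 0, ..., N}.

q, N = 2.0, 5
T = [q ** n for n in range(N + 1)]        # [1, 2, 4, 8, 16, 32]

def mu(t):
    """mu(t) = (q - 1) * t at every point except sup T, where mu = 0 by convention."""
    return (q - 1) * t if t < T[-1] else 0.0

print([mu(t) for t in T])  # the graininess grows geometrically from point to point
```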
In this paper, we developed the dynamic programming principle for stochastic systems on time scales, for which we presented Itô's formula on time scales in a new form by introducing a new symbol. Similar to the classical cases, we constructed the HJB equation on time scales and proved the verification theorem. The results of this paper are more general: the continuous and discrete analogues of dynamic programming are special cases. An example has been given to demonstrate the effectiveness of our results.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the National Key R&D Program of China (Grant no. 2018YFA0703900) and the Major Project of National Social Science Foundation of China (Grant no. 19ZDA091).
- R. Bellman, “Dynamic programming,” Science, vol. 153, no. 3731, pp. 34–37, 1966.
- D. Blackwell, “Discrete dynamic programming,” The Annals of Mathematical Statistics, vol. 33, no. 2, pp. 719–726, 1962.
- E. Bandini, A. Cosso, M. Fuhrman, and H. Pham, “Randomized filtering and Bellman equation in Wasserstein space for partial observation control problem,” Stochastic Processes and Their Applications, vol. 129, no. 2, pp. 674–711, 2019.
- H. Pham and X. Wei, “Dynamic programming for optimal control of stochastic McKean–Vlasov dynamics,” SIAM Journal on Control and Optimization, vol. 55, no. 2, pp. 1069–1101, 2017.
- C. Mu, Y. Zhang, H. Jia, and H. He, “Energy-storage-based intelligent frequency control of microgrid with stochastic model uncertainties,” IEEE Transactions on Smart Grid, vol. 11, no. 2, pp. 1748–1758, 2020.
- C. Mu and Y. Zhang, “Learning-based robust tracking control of quadrotor with time-varying and coupling uncertainties,” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 1, pp. 259–273, 2020.
- S. Hilger, “Analysis on measure chains—a unified approach to continuous and discrete calculus,” Results in Mathematics, vol. 18, no. 1-2, pp. 18–56, 1990.
- G. S. Guseinov, “Integration on time scales,” Journal of Mathematical Analysis and Applications, vol. 285, no. 1, pp. 107–127, 2003.
- M. Bohner and G. S. Guseinov, “Partial differentiation on time scales,” Dynamic Systems and Applications, vol. 13, no. 3-4, pp. 351–379, 2004.
- M. Bohner and A. Peterson, Dynamic Equations on Time Scales: An Introduction with Applications, Birkhäuser Boston, Boston, MA, USA, 2001.
- M. Bohner and A. Peterson, Advances in Dynamic Equations on Time Scales, Birkhäuser Boston, Boston, MA, USA, 2002.
- N. H. Du and N. T. Dieu, “The first attempt on the stochastic calculus on time scale,” Stochastic Analysis and Applications, vol. 29, no. 6, pp. 1057–1080, 2011.
- D. Grow and S. Sanyal, “Brownian motion indexed by a time scale,” Stochastic Analysis and Applications, vol. 29, no. 3, pp. 457–472, 2011.
- M. Bohner, “Calculus of variations on time scales,” Dynamic Systems and Applications, vol. 13, no. 3-4, pp. 339–349, 2004.
- Z. Zhan and W. Wei, “On existence of optimal control governed by a class of the first-order linear dynamic systems on time scales,” Applied Mathematics and Computation, vol. 215, no. 6, pp. 2070–2081, 2009.
- Y. Gong and X. Xiang, “A class of optimal control problems of systems governed by the first order linear dynamic equations on time scales,” Journal of Industrial and Management Optimization, vol. 5, no. 1, pp. 1–10, 2009.
- Y. Peng, X. Xiang, and Y. Jiang, “Nonlinear dynamic systems and optimal control problems on time scales,” ESAIM: Control, Optimisation and Calculus of Variations, vol. 17, no. 3, pp. 654–681, 2011.
- Z. Zhan, S. Chen, and W. Wei, “A unified theory of maximum principle for continuous and discrete time optimal control problems,” Mathematical Control and Related Fields, vol. 2, no. 2, pp. 195–215, 2012.
- M. Bohner, K. Kenzhebaev, O. Lavrova, and O. Stanzhytskyi, “Pontryagin’s maximum principle for dynamic systems on time scales,” Journal of Difference Equations and Applications, vol. 23, no. 7, pp. 1161–1189, 2017.
- J. Seiffertt, S. Sanyal, and D. C. Wunsch, “Hamilton-Jacobi-Bellman equations and approximate dynamic programming on time scales,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 38, no. 4, pp. 918–923, 2008.
- Z. Zhan, W. Wei, and H. Xu, “Hamilton–Jacobi–Bellman equations on time scales,” Mathematical and Computer Modelling, vol. 49, no. 9-10, pp. 2019–2028, 2009.
- R. Š. Hilscher and V. Zeidan, “Hamilton–Jacobi theory over time scales and applications to linear-quadratic problems,” Nonlinear Analysis: Theory, Methods and Applications, vol. 75, no. 2, pp. 932–950, 2012.
- Y. Zhu and G. Jia, “Stochastic linear quadratic control problem on time scales,” submitted.
- Y. Zhu and G. Jia, “Linear feedback of mean-field stochastic linear quadratic optimal control problems on time scales,” Mathematical Problems in Engineering, vol. 2020, Article ID 8051918, 11 pages, 2020.
- C. Pötzsche, “Chain rule and invariance principle on measure chains,” Journal of Computational and Applied Mathematics, vol. 141, no. 1-2, pp. 249–254, 2002.
- M. Bohner, O. M. Stanzhytskyi, and A. O. Bratochkina, “Stochastic dynamic equations on general time scales,” Electronic Journal of Differential Equations, vol. 2013, no. 57, pp. 1–15, 2013.
- D. Grow and S. Sanyal, “The quadratic variation of Brownian motion on a time scale,” Statistics and Probability Letters, vol. 82, no. 9, pp. 1677–1680, 2012.
- W. Hu, “Itô’s formula, the stochastic exponential, and change of measure on general time scales,” Abstract and Applied Analysis, vol. 2017, Article ID 9140138, 13 pages, 2017.
- M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Springer Science & Business Media, Berlin, Germany, 2008.
- L. Grüne, “Error estimation and adaptive discretization for the discrete stochastic Hamilton–Jacobi–Bellman equation,” Numerische Mathematik, vol. 99, no. 1, pp. 85–112, 2004.
Copyright © 2020 Yingjun Zhu and Guangyan Jia. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.