Some Results on Bellman Equations of Optimal Production Control in a Stochastic Manufacturing System
The paper studies the production inventory problem of minimizing the expected discounted present value of production cost control in a manufacturing system with degenerate stochastic demand. We establish the existence of a unique solution of the Hamilton-Jacobi-Bellman (HJB) equations associated with this problem. The optimal control is given by a solution to the corresponding HJB equation.
Many manufacturing enterprises use a production inventory system to manage fluctuations in consumer demand for the product. Such a system consists of a manufacturing plant and a finished goods warehouse to store those products which are manufactured but not immediately sold. The advantages of having products in inventory are as follows: first, they are immediately available to meet demand; second, the warehouse can store excess production during low demand periods so that it is available for sale during high demand periods. This usually permits the use of a smaller manufacturing plant than would otherwise be necessary, and also reduces the difficulties of managing the system.
We are concerned with the optimization problem of minimizing the expected discounted cost of production planning in a manufacturing system with degenerate stochastic demand: subject to the dynamics of the state equation, which says that the inventory at time is increased by the production rate and decreased by the demand rate and can be written according to and the demand equation, in which the demand rate is driven by the Brownian motion in the class of admissible controls of production processes with nonnegative constant defined on a complete probability space endowed with the natural filtration generated by carrying a one-dimensional standard Brownian motion , is the inventory level for production rate at time (state variable), is the demand rate at time , is the production rate at time (control variable), is the constant nonnegative discount rate, is a nonzero constant, is the nonzero constant diffusion coefficient, is the initial value of the inventory level, and is the initial value of the demand rate.
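The verbal description of the state and demand dynamics can be made concrete with a small simulation. The following is a minimal sketch and not the paper's model verbatim: it assumes the inventory obeys dx = (p - y) dt, as the text describes, and assumes the demand follows a geometric Brownian motion dy = A y dt + sigma y dw. The symbols `A`, `sigma`, the function name, and the feedback rule `production` are illustrative assumptions, because the displayed equations (1.2) and (1.3) are not reproduced here.

```python
import math
import random


def simulate_inventory_demand(x0, y0, production, A, sigma,
                              T=1.0, steps=1000, seed=0):
    """Euler-Maruyama paths for inventory x and demand y.

    Assumed dynamics (illustrative, not quoted from the paper):
        dx = (p - y) dt
        dy = A * y dt + sigma * y dw
    `production` is a hypothetical feedback rule p = production(x, y);
    negative values are clipped to zero so the control stays nonnegative.
    """
    rng = random.Random(seed)
    dt = T / steps
    x, y = x0, y0
    path = [(0.0, x, y)]
    for k in range(1, steps + 1):
        # Brownian increment over one step: N(0, dt)
        dw = rng.gauss(0.0, math.sqrt(dt))
        p = max(0.0, production(x, y))
        # Update both states from the values at the start of the step
        x_new = x + (p - y) * dt
        y_new = y + A * y * dt + sigma * y * dw
        x, y = x_new, y_new
        path.append((k * dt, x, y))
    return path


if __name__ == "__main__":
    # Hypothetical rule: produce exactly the current demand
    path = simulate_inventory_demand(5.0, 1.0, lambda x, y: y,
                                     A=0.05, sigma=0.2)
    t_end, x_end, y_end = path[-1]
    print(f"t={t_end:.2f} inventory={x_end:.4f} demand={y_end:.4f}")
```

With the rule "produce exactly the current demand", the inventory stays at its initial level while the demand fluctuates, which is a quick sanity check on the sign conventions.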
This optimization control problem of production planning in manufacturing systems has been studied by many authors, such as Fleming et al. , Sethi and Zhang , Sprzeuzkouski , Hwang et al. , Hartl and Sethi , and Feichtinger and Hartl . The Bellman equation associated with the production inventory control problem is quite different from theirs, and it is treated by Bensoussan et al.  for one-dimensional manufacturing systems with an unbounded control region. Generally speaking, similar types of linear control problems have been investigated for stochastic differential systems with invariant measures by Bensoussan  and Borkar . The works of Bensoussan and Frehse  and Da Prato and Ichikawa  on the Bellman equation of ergodic control, without convexity and polynomial growth hypotheses and in the linear quadratic case, address the linear ergodic control problem. This type of optimization problem has also been studied by Morimoto and Kawaguchi  for renewable resources and by Baten and Sobhan Miah  for the one-sector neoclassical growth model with the CES function. The optimality can be shown by an extension of the results given in Fujita and Morimoto , and results for another setting of optimal control in manufacturing systems are available in Morimoto and Okada  and Sethi et al. . These papers treat cases with bounded control regions. In contrast, our control region is unbounded, as in (1.4).
The purpose of the paper is to give an optimal production cost control via the existence of a unique solution of the two-dimensional HJB equation. We apply the dynamic programming principle  to obtain the Riccati-based solution of the reduced (one-dimensional) HJB equation corresponding to the production inventory control problem. This paper is organized as follows. In Section 2, by the principle of optimality of Bellman , we obtain the HJB equation and then reduce the two-dimensional HJB equation to a one-dimensional second-order differential equation. We derive the dynamics of the inventory-demand ratio, which evolves according to a stochastic neoclassical differential equation, through Itô’s lemma. We then find the Riccati-based solution of the production inventory control problem that is satisfied by the value function of this optimization problem. In Section 3 we establish the properties of the value function and show the existence of a unique solution of the reduced (one-dimensional) Hamilton-Jacobi-Bellman (HJB) equation. Finally, in Section 4 we present an application to production control for the optimization problem (1.1) subject to (1.2) and (1.3).
2. Riccati-Based Solution of Hamilton-Jacobi-Bellman Equation
2.1. The Hamilton-Jacobi-Bellman Equation
Suppose is a function whose value is the minimum value of the objective function of the production inventory control problem for the manufacturing system given that we start it at time in state , and . That is, where the value function is finite valued and twice continuously differentiable on . We initially assume that exists for all , , and in the relevant ranges.
Since (1.2) and (1.3) are scalar equations, the subscript here means only time . Thus, and will not cause any confusion and, at the same time, will eliminate the need of writing many parentheses. Thus, is a scalar.
To solve the problem defined by (1.1), (1.2), and (1.3), let , known as the value function, be the expected value of the objective function (1.1) from to infinity, when an optimal policy is followed from to infinity, given Then by the principle of optimality , We assume that is a continuously differentiable function of its arguments. By Taylor’s expansion, we have From (1.2), we can formally write The exact meaning of these expressions comes from the theory of stochastic calculus; see Arnold [18, chapter 5] and Karatzas and Shreve . For our purposes, it is sufficient to know the multiplication rules of the stochastic calculus: Substitute (2.3) into (2.2) and use (2.4), (2.5), (2.6), (2.7), and (2.8) to obtain Note that we have suppressed the arguments of the functions involved in (2.3).
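For reference, the standard multiplication rules of stochastic calculus for a one-dimensional Brownian motion $w_t$ are recalled below; these are presumably the rules the derivation uses. The second line is a consequence under the assumption (made here for illustration, since the displayed equations are not reproduced) that inventory obeys $dx_t = (p_t - y_t)\,dt$ and demand obeys $dy_t = A y_t\,dt + \sigma y_t\,dw_t$.

```latex
\begin{align*}
(dw_t)^2 &= dt, & dw_t\,dt &= 0, & (dt)^2 &= 0, \\
(dx_t)^2 &= 0, & dx_t\,dy_t &= 0, & (dy_t)^2 &= \sigma^2 y_t^2\,dt .
\end{align*}
```

Only the second-order term in the demand variable therefore survives in the Taylor expansion, which is consistent with the HJB equation being degenerate in the inventory direction.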
Canceling the term on both sides of (2.9), dividing the remainder by , and letting , we obtain the dynamic programming partial differential equation or Hamilton-Jacobi-Bellman equation where is the Legendre transform of , that is, and , , , are partial derivatives of with respect to and .
2.2. A Reduction to 1-Dimensional Case
In this subsection, the general (two-dimensional) HJB equation has been reduced to a one-dimensional second-order differential equation. From the two-dimensional state space form (one state for inventory level and the other state for demand rate), it has been reduced to one-dimensional form for () the ratio of inventory to demand.
The main feature of the HJB equation (2.13) is the vanishing of the coefficient of for ; in partial differential equation terminology, the equation is degenerate elliptic. Generally speaking, the difficulty stems from the degeneracy in the second-order term of the HJB equation (2.13).
2.3. Value Function
Let us consider the minimum value of the payoff function as a function of the initial point. The value function can be defined as a function whose value is the minimum value of the objective function of the production inventory control problem (1.1) for the manufacturing system, that is, The value function is a solution to the reduced (one-dimensional) HJB equation (2.13), and the solution of this HJB equation is used to test a controller for optimality or to construct a feedback controller. Again the HJB equation (2.13) arises in the production control problem (1.1), (1.2), (1.3) with constraint
2.4. Stochastic Neoclassical Differential Equation for Dynamics of Inventory-Demand Ratio
As in the certainty optimal production control model, the dynamics of the state equation of inventory level (1.2) can be reduced to a one-dimensional process by working in intensive (per capita) variables. Define To determine the stochastic differential for the inventory-demand ratio, we apply Itô's lemma as follows: From Itô's lemma, From (1.3), we have that Substituting the above expressions into (2.18), we obtain the dynamics of , the inventory-demand ratio at time , which evolves according to the stochastic neoclassical differential equation for demand
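The Itô computation for the ratio can be written out as a worked sketch. It is carried out under assumed dynamics, $dx_t = (p_t - y_t)\,dt$ and $dy_t = A y_t\,dt + \sigma y_t\,dw_t$, with $z_t = x_t / y_t$; the symbols $A$, $\sigma$, and $z_t$ are illustrative choices and may differ from the notation of (2.16) to (2.19).

```latex
\begin{align*}
dz_t &= \frac{1}{y_t}\,dx_t - \frac{x_t}{y_t^{2}}\,dy_t
        + \frac{x_t}{y_t^{3}}\,(dy_t)^2 \\
     &= \Bigl(\frac{p_t}{y_t} - 1\Bigr)dt
        - z_t\,(A\,dt + \sigma\,dw_t) + \sigma^{2} z_t\,dt \\
     &= \Bigl[(\sigma^{2} - A)\,z_t + \frac{p_t}{y_t} - 1\Bigr]dt
        - \sigma z_t\,dw_t .
\end{align*}
```

The first line is Itô's lemma for $f(x, y) = x/y$, using $f_x = 1/y$, $f_y = -x/y^2$, $f_{yy} = 2x/y^3$, and the fact that $x_t$ has no diffusion term. The second line uses $(dy_t)^2 = \sigma^2 y_t^2\,dt$. Under these assumptions the ratio follows a one-dimensional equation driven by the rescaled control $p_t / y_t$.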
2.5. Riccati-Based Solution
This subsection deals with the Riccati-based solution of the reduced one-dimensional HJB equation (2.13) corresponding to the production inventory control problem (2.14) subject to (2.19) using the dynamic programming principle .
To find the Riccati-based solution of HJB equation (2.13), we refer to Da Prato  and Da Prato and Ichikawa  for the degenerate linear control problems related to Riccati equation in case of convex function like
By taking the derivative of (2.13) with respect to and setting it to zero, we can minimize the expression inside the bracket of (2.13) (i.e., with respect to ). This procedure yields Substituting (2.20) into (2.13) yields the equation known as the HJB equation. This is a partial differential equation which has a solution form Then Substituting (2.22) and (2.23) into (2.21) yields Since (2.24) must hold for any value of , we must have called a Riccati equation from which we obtain So, (2.22) is a solution form of (2.21).
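The procedure described above (minimize over the control, substitute a quadratic trial solution, and match coefficients to get a Riccati equation) can be illustrated on a generic scalar problem. The sketch below is not the paper's equation (2.25): it assumes a hypothetical discounted linear-quadratic problem with dynamics dz = (a z + b u) dt + sigma z dw and running cost q z^2 + r u^2, for which the trial solution V(z) = k z^2 leads to the algebraic Riccati equation (b^2/r) k^2 + (beta - 2a - sigma^2) k - q = 0. All parameter names are assumptions introduced for illustration.

```python
import math


def scalar_riccati_coefficient(a, b, q, r, sigma, beta):
    """Positive root k of (b^2/r) k^2 + (beta - 2a - sigma^2) k - q = 0.

    Arises from V(z) = k z^2 in the assumed HJB equation
        beta V = min_u [q z^2 + r u^2 + (a z + b u) V' + 0.5 sigma^2 z^2 V''].
    """
    if q <= 0 or r <= 0 or b == 0:
        raise ValueError("need q > 0, r > 0 and b != 0")
    quad = b * b / r
    lin = beta - 2.0 * a - sigma * sigma
    return (-lin + math.sqrt(lin * lin + 4.0 * quad * q)) / (2.0 * quad)


def feedback_gain(k, b, r):
    """Gain g in u* = -g z, obtained by minimizing r u^2 + b u V'(z)."""
    return b * k / r


def hjb_residual(k, a, b, q, r, sigma, beta, z=1.0):
    """Left side minus right side of the assumed HJB equation at z."""
    v, dv, d2v = k * z * z, 2.0 * k * z, 2.0 * k
    u = -feedback_gain(k, b, r) * z
    rhs = (q * z * z + r * u * u + (a * z + b * u) * dv
           + 0.5 * sigma * sigma * z * z * d2v)
    return beta * v - rhs


if __name__ == "__main__":
    params = dict(a=0.1, b=1.0, q=1.0, r=1.0, sigma=0.2, beta=0.5)
    k = scalar_riccati_coefficient(**params)
    print("k =", k, "gain =", feedback_gain(k, params["b"], params["r"]))
    print("HJB residual =", hjb_residual(k, **params))
```

Because q > 0, the quadratic has exactly one positive root, so the quadratic trial solution is convex and the resulting feedback law is linear in the state. The residual check confirms that the trial solution satisfies the assumed HJB equation up to rounding error.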
3. Bellman Equations for Discounted Cost Control
3.1. Existence and Uniqueness
To solve the Bellman equation (2.13) let us consider this HJB equation associated with the discounted production control problem in the following form: where We make the following assumptions: satisfies the polynomial growth condition such that for some
In order to ensure the integrability of , we assume that This condition (3.5) is needed for the integrability of or . Under (3.5), we have Lemmas 3.1, 3.2, and Theorem 3.5, which ensure the finiteness of and hence the existence of a unique solution of the HJB equation of .
First we have established the properties of the value function of the optimal control problem.
Lemma 3.1. Under (3.5) and for each there exists such that
Proof. We give the proof here because the same kind of calculation will be needed later. By Itô’s formula we have Now by (2.15) and (3.5), taking expectations on both sides, we obtain where Obviously, is bounded above. Thus we can deduce (3.6).
Lemma 3.2. Under (3.5) and for each there exists such that
Proof. By an application of Itô’s formula to we have
Now by (2.15) and (3.5), taking expectations on both sides, we obtain
We choose such that for all Then Thus we get (3.9) with independent of sufficiently small
Proof. For any , there exist such that where We set for Clearly, Hence, by convexity Letting we get which completes the convexity of the value function
Proof. By (3.5), we choose such that
and then such that
Then (3.20) is immediate.
Applying Itô’s formula to we have Now by (3.20), taking expectations on both sides, we obtain from which we deduce (3.21).
The convexity of the value function follows along the same lines as Proposition 3.3. Let be the unique solution of Then by (3.4) and (3.21), which implies (3.22) and satisfies (3.4). This completes the proof.
Proof. Since is Lipschitz continuous, this follows from Bensoussan  in case of Assumption (3.4) except convexity. For the general case, we take a nondecreasing sequence convergent to with It is well known (Bensoussan ) that, for every (3.1) has a unique solution for of the form
in the class of continuous functions vanishing at infinity, where is a solution of (2.19).
To prove (3.29), we recall (3.30). Hence by (3.4) and Lemma 3.2 we have This implies that satisfies (3.29).
To estimate on for we recall the Taylor expansions of : where is any -neighborhood of Then we can obtain the Landau-Kolmogorov inequality: Choosing and by (3.1), (3.2), and (3.34) we have from which Now by (3.34) and (3.4), we have Thus, taking the finite covering of we deduce and hence By the Ascoli-Arzelà theorem, we have taking a subsequence if necessary. Passing to the limit, we can obtain (3.1) and (3.30).
Following the inequality (3.29), we have Hence by Itô’s formula to for convex functions [19, page 219], we have By virtue of (3.1), taking expectations on both sides, we have Now by (3.41), we obtain which is similar to (3.30), and here the infimum is attained by the feedback law with . The proof is complete.
4. An Application to Production Control
In this section we will study the production control problem to minimize the cost (2.14) over the class of all progressively measurable processes such that and for the response to
Let us consider the stochastic differential equation where that is,
We need to establish the following lemmas.
Lemma 4.1. Under (3.5) and for each there exists such that
Proof. Since, we have by Itô’s formula
Now by the assumptions of comparison theorem we have , , then we have and , where . Thus we can see by the comparison theorem of Ikeda and Watanabe . Since the explosion time , we have . Hence .
By the monotonicity of , we have . Then by Itô’s formula, Hence Setting , we obtain where By Gronwall's lemma, we have Therefore, from which we have So, the uniqueness of (4.2) holds. Thus we conclude that (4.2) admits a unique strong solution (see Ikeda and Watanabe [22, Chapter 4, Theorem ]), with
By (3.5) and Itô’s formula, where
By (4.3) it is easily seen that if for sufficiently large Clearly Also Along the same lines as the proof of Lemma 3.1, we see that the right-hand side is bounded from above. This completes the proof.
Proof. We first note by (4.3) that and the minimum is attained by We apply Itô’s formula for convex functions [19, page 219] to obtain
Taking expectations on both sides,
By virtue of (2.13),
Choose such that By (3.4) and Lemma 3.1, we have
which implies that
Hence satisfies (4.1). Then we get from which By (3.4), we have hence
Clearly for every Again following the same construction of (3.45) and by the HJB equation (2.13), we have By (4.1) we have Thus we deduce The proof is complete.
Lemma 4.3. Under (3.5), there exists a unique solution of where
Since is continuous in and , there exists a nonexplosive solution of (4.21).
Now we will show a.s. Suppose Then L’Hospital’s rule gives Letting in (3.1), we have and hence This contradicts the assumption. Thus we get which implies that In case we have at , Therefore
To prove uniqueness, let be two solutions of (4.21). Then satisfies We have Note that the function is increasing. Hence By Gronwall's lemma, we have So, the uniqueness of (4.21) holds. The proof is complete.
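For readers who want the form of Gronwall's lemma used in uniqueness arguments of this kind, the standard integral version is recalled here. The symbols $\varphi$, $C$, and $K$ are generic and are not taken from the paper's notation; in a uniqueness proof, $\varphi$ would typically be an expected squared (or absolute) difference of two solutions.

```latex
\[
\varphi(t) \le C + K \int_0^t \varphi(s)\,ds \quad (0 \le t \le T)
\;\Longrightarrow\;
\varphi(t) \le C\,e^{Kt} \quad (0 \le t \le T),
\]
```

Here $\varphi$ is nonnegative and continuous on $[0, T]$ and $C, K \ge 0$ are constants. In particular, when $C = 0$ the conclusion is $\varphi \equiv 0$ on $[0, T]$, which is how the two solutions are shown to coincide.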