Special Issue: Optimization Theory, Methods, and Applications in Engineering 2014
Research Article | Open Access
A Risk-Averse Inventory Model with Markovian Purchasing Costs
We study several dynamic risk-averse inventory models using additive utility functions. We add Markovian behavior of purchasing costs to our models. Such Markovian purchasing costs can reflect market situations in a global supply chain, such as fluctuations in exchange rates or the existence of product spot markets. We formulate our problems as finite and infinite horizon MDP (Markov decision process) problems. For the finite time models, we first prove the (joint) concavity of the model for each state and obtain a (modified) base-stock optimal policy. Then, we conduct a comparative static analysis for the model parameters and derive monotone properties of the optimal solutions. For the infinite time models, we show the existence of stationary base-stock optimal policies and the inheritance of the monotone properties proven for our finite time models.
1. Introduction
Various inventory problems have been studied as dynamic programming models in the supply chain literature. In these models, there is a product to be sold over a (finite or infinite) time horizon. On the one hand, when demand for the product exceeds supply, the shortage is backordered and fulfilled from the supply in the next period, at a backordering cost. On the other hand, when supply exceeds demand, the excess inventory is carried over to meet potential demand in the next period, at a holding cost. The firm's objective is to determine the optimal ordering quantity so as to optimize the expected total profit or cost.
The literature focuses mainly on risk-neutral performance measures, in which the firm maximizes the expected total profit or minimizes the expected total cost. This implies that inventory managers are risk-neutral. In particular, risk neutrality provides the best decision only on average, so it is consistent with rational decision making. However, we cannot assume that all inventory managers are risk-neutral. In the supply chain literature, Schweitzer and Cachon provide experimental evidence suggesting that inventory managers may be risk-averse for products with high profits.
In recent years, risk-averse inventory models have received increasing attention in the literature. Most work to date focuses on single-period (newsvendor) models. For the single-period risk-averse models, Chen et al. provide an excellent review and summary of the literature in this direction. Choi et al. also review the more recent literature that followed Chen et al. On the other hand, for multiperiod risk-averse models, Bouakiz and Sobel initiated the analysis with exponential utility functions and characterized the optimal ordering policies. Later, Chen et al. and Chen and Sun studied dynamic inventory models in conjunction with hedging opportunities in financial markets. In this stream of research, the typical key contribution is the characterization of optimal base-stock policies in dynamic inventory models.
This paper follows Chen et al. and Chen and Sun in this stream; the former consider several risk-averse models with a finite time horizon and the latter with an infinite time horizon. Chen et al. consider several joint models of inventory and pricing with and without financial hedging opportunities. They establish a consumption model with an income flow in a multiperiod setting. As risk measures, they use additive exponential utility functions to analyze the models and derive state-dependent (modified) base-stock optimal policies. The work of Chen and Sun is a natural extension of that of Chen et al. to an infinite time horizon. They also use additive exponential utility functions, but without financial hedging opportunities. As a result, they obtain state-dependent (modified) base-stock optimal policies for the infinite time horizon.
In our work, we add Markovian (discrete-state) behavior of purchasing costs, which distinguishes this work from Chen et al. and Chen and Sun, where the model parameters are fixed or at most determined deterministically by their historical trajectory. That is, they do not consider probabilistic characteristics in their model parameters. Our Markovian purchasing costs reflect typical market situations in global supply chains, such as fluctuations in exchange rates or the existence of product spot markets. By exploiting such cost changes, inventory managers can benefit from the fluctuations. Thus, a (random) fluctuation in purchasing cost affects the optimal ordering quantity significantly, and it has been frequently studied in the literature on risk-neutral inventory models (e.g., Gavirneni and Yang and Xia).
Gavirneni considers a risk-neutral multiperiod inventory model. By analyzing the model with both finite and infinite time horizons, he obtains a base-stock optimal policy and, under some conditions, a monotone property of the impact of fluctuations in purchasing costs on the optimal ordering quantity. Yang and Xia study a continuous-review risk-neutral inventory system with a continuous MDP (Markov decision process) formulation. They identify conditions under which the base-stock order-up-to level is monotone in the (random) fluctuations in purchasing costs. However, both of these works study only risk-neutral models. In contrast, our key contributions are to conduct a comparative static analysis of risk-averse models with Markovian purchasing costs, for both finite and infinite time horizons, and to obtain monotone properties of the optimal solution, which have not been studied in the literature.
The remainder of this paper is organized as follows. In Section 2, we establish the models with MDP formulations using general additive utility functions. Then, in Section 3.1, we prove the concavity of the model and the optimality of a state-dependent base-stock policy. This implies that these properties are preserved under risk aversion as well as risk neutrality. In addition, for our comparative static analysis, we prove the impacts of backordering and inventory holding costs on the optimal order-up-to level. Then, for the special case of exponential utility functions in Section 3.2, we also prove the impacts of (random) price changes and cost fluctuations on the optimal solution. We extend the analytical results to infinite time horizon models in Section 4. Computational results that confirm the analytical results are presented in Section 5. Finally, we provide some concluding remarks in Section 6.
2. Problem Formulation
We consider a risk-averse firm that makes sequential decisions at times $t = 1, \ldots, T$, where $T$ is the length of the time horizon. At each time $t$, it faces a nonnegative, bounded, real-valued random demand $D_t$, where demands in different time periods are independent. It also has a (linear) time-invariant resale price $p$, inventory holding cost $h$, and backordering cost $b$ per unit per period.
Let $x_t$ denote the initial on-hand inventory at time $t$ before placing an order. Similarly, $y_t$ is the accumulated inventory at time $t$ after receiving an order. The lead time is zero, so the ordered amount is delivered instantaneously.
Let us also define the fluctuations in purchasing costs. We denote by $K$ the total number of possible values of the purchasing cost at each time $t$, where, without loss of generality, the cost values $c_1, \ldots, c_K$ are indexed in sorted order. The purchasing cost follows a Markov chain with transition matrix $P = (p_{ij})$, where $p_{ij}$ is the probability that the purchasing cost is $c_j$ in the next period given that it is $c_i$ in this period. The current profit function at time $t$ is then defined, when backordering is allowed, for a given target on-hand inventory $y_t$, initial inventory $x_t$, and purchasing cost $c_i$ in state $i$. In addition to the profit function, we assume that inventory managers can borrow or lend money at a risk-free interest rate in financial markets. That is, we consider both consumption and profit income in our model: the current profit and a (nonnegative) consumption level $f_t \geq 0$ at time $t$ change the current wealth level $w_t$ into the next wealth level $w_{t+1}$, and this relation can equivalently be written in terms of a discount rate. Our objective function is then a function of the consumption levels $f_1, \ldots, f_T$. If this function is linear in the consumption levels, the model becomes a risk-neutral model.
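As a minimal sketch of the ingredients described above, the following Python snippet evaluates a one-period profit and simulates a Markovian purchasing-cost path. The profit form used here (revenue earned on all demand because shortages are backordered, minus linear purchasing, holding, and backordering costs), the function names `period_profit` and `simulate_cost_path`, and every numerical value are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def period_profit(x, y, d, c, p, h, b):
    """One-period profit under an ASSUMED form: revenue on all demand d,
    minus purchasing cost c*(y - x), holding cost on leftovers,
    and backordering cost on shortages."""
    return p * d - c * (y - x) - h * max(y - d, 0.0) - b * max(d - y, 0.0)

def simulate_cost_path(P, start, T, rng):
    """Simulate T cost-state indices from a Markov chain with transition matrix P."""
    P = np.asarray(P, dtype=float)
    path = [start]
    for _ in range(T - 1):
        # Draw the next state from the row of P belonging to the current state.
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return path

# Hypothetical data: three cost levels and a "sticky" transition matrix.
costs = [1.0, 2.0, 3.0]
P = [[0.8, 0.2, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]
rng = np.random.default_rng(0)
path = simulate_cost_path(P, start=1, T=5, rng=rng)
profits = [period_profit(x=0.0, y=8.0, d=6.0, c=costs[i], p=10.0, h=1.0, b=4.0)
           for i in path]
```

Here the same ordering decision yields a different profit in each period only because the purchasing cost state moves along the simulated path.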
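Under the additive utility model, the objective is the sum over time of per-period utilities $u_t(f_t)$ of consumption. To be consistent with risk aversion, each $u_t$ is nondecreasing and concave. As a special case of the (general) additive utility function, the exponential utility function has the form $u_t(f_t) = -a_t e^{-f_t/\rho_t}$ with $a_t, \rho_t > 0$. Here, $\rho_t$ can be interpreted as a risk tolerance factor; thus, a lower $\rho_t$ means more risk aversion.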
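To see why a lower risk tolerance factor corresponds to more risk aversion, one can compute the Arrow-Pratt coefficient of absolute risk aversion for an exponential utility. The sketch below assumes the form $u(f) = -a e^{-f/\rho}$ with $a, \rho > 0$; the symbols $a$ and $\rho$ are notation assumed for this illustration.

```latex
u(f) = -a\,e^{-f/\rho}, \qquad
u'(f) = \frac{a}{\rho}\,e^{-f/\rho} > 0, \qquad
u''(f) = -\frac{a}{\rho^{2}}\,e^{-f/\rho} < 0
\quad\Longrightarrow\quad
-\frac{u''(f)}{u'(f)} = \frac{1}{\rho}.
```

Under this assumed form, absolute risk aversion is constant and equal to $1/\rho$, so it grows as $\rho$ shrinks.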
The original model maximizes the expected additive utility of the consumption stream over the ordering and consumption decisions, subject to the inventory and wealth dynamics.
For the risk-neutral model, we consider the profits-to-go function up to the end of the time horizon, $T$, when backordering is allowed, given the initial on-hand inventory $x_t$ and wealth level $w_t$ in cost state $i$ at time $t$; it satisfies a dynamic programming recursion with a boundary condition. Owing to the additivity of the expected value operator, wealth and consumption levels can be separated from the model, as they do not affect the optimal ordering quantity. This implies that our risk-neutral case is an income-flow model without consumption through financial markets, which is equivalent to the model in Gavirneni.
3. Analytical Results
3.1. Additive Utility Functions
In this subsection, we focus on additive utility functions to analyze a dynamic consumption model. First, we define the value function $V_t$, the utility-of-profits-to-go function of additive utility up to the end of the time horizon, $T$, when backordering is allowed, given that the initial on-hand inventory is $x_t$ with purchasing cost state $i$ and wealth level $w_t$ at time $t$. It is defined by a dynamic programming recursion with a boundary condition at the end of the horizon. By an equivalent formulation in terms of the modified income at time $t$, we obtain a new problem, again with a boundary condition.
Proposition 1 (existence of a wealth-dependent base-stock optimal policy). The value function $V_t$ is jointly concave in $x_t$ and $w_t$ for each cost state $i$. In addition, a wealth-dependent base-stock policy is optimal.
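Proof. The proof is by induction. At the terminal boundary, the value function is clearly jointly concave in $x_t$ and $w_t$, because $u_t$ is nondecreasing and concave. Next, we assume that $V_{t+1}$ is jointly concave in $x_{t+1}$ and $w_{t+1}$ for each cost state. We then prove that a wealth-dependent base-stock policy is optimal and that $V_t$ is jointly concave in $x_t$ and $w_t$ for each cost state $i$.
First, the current profit function is concave in $y_t$. Thus, the modified income is jointly concave in the decision and state variables for each $i$. This implies that the objective inside the maximization is jointly concave as well, for each $i$.
Then, we prove that a wealth-dependent base-stock policy is optimal. Let $y_t^*$ be an optimal solution of the maximization problem without the ordering constraint $y_t \geq x_t$. Because the objective is concave in $y_t$ for given $w_t$ and $i$, it is optimal to order up to $y_t^*$ if $x_t \leq y_t^*$ and not to order otherwise. That is, a wealth-dependent base-stock policy is optimal.
Finally, after a proper modification of Theorem A.4 (convexity preservation under minimization) in Porteus, $V_t$ is jointly concave in $x_t$ and $w_t$ for each $i$.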
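The ordering rule established in the proof can be sketched in a few lines of Python. The function names `order_up_to` and `order_quantity` and the sample base-stock value are hypothetical; in the model, the base-stock level would itself depend on the cost state and the wealth level.

```python
def order_up_to(x, base_stock):
    """Post-order inventory under a base-stock policy:
    raise inventory to the base-stock level if below it, otherwise do nothing."""
    return max(x, base_stock)

def order_quantity(x, base_stock):
    """Amount ordered; never negative because inventory is not disposed of."""
    return order_up_to(x, base_stock) - x

# Hypothetical base-stock level of 10 units.
print(order_quantity(3, 10))   # inventory below the level: order the difference
print(order_quantity(12, 10))  # inventory above the level: do not order
```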
Now we conduct a comparative static analysis of the model parameters. For single-period models, this comparative static analysis was done in Eeckhoudt et al. We extend the analysis to multiperiod inventory models with general utility functions in Section 3.1 and exponential utility functions in Section 3.2. The dynamic characteristics of our multiperiod models make the analysis nontrivial and considerably more challenging.
Proposition 2 (the impacts of backordering costs and inventory holding costs on the optimal base-stock level). The optimal base-stock level $y_t^*$ is a nondecreasing (or nonincreasing) function of $b$ (or $h$), which means that the value function is supermodular (or submodular) in $(y_t, b)$ (or $(y_t, h)$). That is, higher backordering costs (or inventory holding costs) mean a higher (or lower) order-up-to level at each time $t$.
Proof. The proof uses supermodularity and has two steps. First, we find the commonality between our model and the (single-period) model in Eeckhoudt et al. Specifically, we show that our boundary case at the terminal time $T$ is the same as the case of Eeckhoudt et al. Second, we show that supermodularity is preserved recursively through the time periods, as maximization preserves it.
First, with respect to backordering costs, our profit function subtracts the shortage amount multiplied by the unit backordering cost $b$, instead of by the emergency ordering cost as in Eeckhoudt et al. Thus, our parameter $b$ has the same monotone property at time $T$ as in Eeckhoudt et al., because our objective function at that time does not have an iterative term. Therefore, similar to Table 1 of Eeckhoudt et al., the terminal-period objective is supermodular in $(y_T, b)$.
For $t < T$, when the value function at time $t + 1$ is supermodular in the order-up-to level and $b$, a convex combination of these value functions is also supermodular. Then, the modified income is supermodular, and so is its utility, because $u_t$ is nondecreasing and concave. Lastly, maximization preserves supermodularity by Theorem 8.2 of Porteus after a proper modification.
For inventory holding costs $h$, we can replace $h$ with $-h$, and this new parameter preserves supermodularity in the same way as above. Thus, the objective is submodular in $(y_t, h)$.
3.2. Additive Exponential Utility Functions
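Now we use the exponential utility function for further analysis. To analyze it with a risk tolerance parameter $\rho$, denote the "certainty equivalent" operator with respect to a random variable $X$ by $C^{\rho}[X] = -\rho \ln E[e^{-X/\rho}]$. We also consider the "effective risk tolerance" per period, defined as $R_t = \sum_{\tau=t}^{T} \alpha^{\tau-t} \rho_\tau$, where $\alpha$ is the discount rate. This implies that $R_t = \rho_t + \alpha R_{t+1}$.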
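For intuition about the certainty equivalent operator, the following Python sketch evaluates $-\rho \ln E[e^{-X/\rho}]$ for a discrete random variable. The function name `certainty_equivalent` and the two-point gamble are illustrative assumptions; a log-sum-exp shift is used only for numerical stability.

```python
import numpy as np

def certainty_equivalent(values, probs, rho):
    """Return -rho * ln E[exp(-X / rho)] for a discrete random variable X."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    z = -values / rho
    shift = z.max()  # log-sum-exp shift avoids overflow for small rho
    return -rho * (shift + np.log(np.sum(probs * np.exp(z - shift))))

# Hypothetical gamble: 0 or 10 with equal probability (mean 5).
payoffs = [0.0, 10.0]
probs = [0.5, 0.5]
for rho in (1.0, 10.0, 1000.0):
    print(rho, certainty_equivalent(payoffs, probs, rho))
```

By Jensen's inequality the certainty equivalent never exceeds the mean, and it approaches the mean as the risk tolerance grows, which matches the risk-neutral limit.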
Then, at the terminal time $T$ with an additive exponential utility function, the value function is equivalent to that of the corresponding single-period (newsvendor) problem.
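A single-period (newsvendor) problem of this kind can be sketched numerically by maximizing the certainty equivalent of profit over a grid of order-up-to levels. The profit form, the uniform demand, the function name `newsvendor_level`, and all parameter values below are assumptions made for illustration; they are not the paper's specification.

```python
import numpy as np

def certainty_equivalent(values, probs, rho):
    """Return -rho * ln E[exp(-X / rho)] for a discrete random variable X."""
    z = -np.asarray(values, dtype=float) / rho
    shift = z.max()
    return -rho * (shift + np.log(np.sum(np.asarray(probs, dtype=float) * np.exp(z - shift))))

def newsvendor_level(demand, dprob, c, p, h, b, rho, grid):
    """Order-up-to level on `grid` maximizing the certainty equivalent of profit,
    starting from zero inventory (assumed profit form, see lead-in)."""
    demand = np.asarray(demand, dtype=float)
    best_y, best_val = None, -np.inf
    for y in grid:
        profit = (p * demand - c * y
                  - h * np.maximum(y - demand, 0.0)
                  - b * np.maximum(demand - y, 0.0))
        val = certainty_equivalent(profit, dprob, rho)
        if val > best_val:
            best_y, best_val = y, val
    return best_y

demand = np.arange(11)            # hypothetical demand, uniform on {0, ..., 10}
dprob = np.full(11, 1 / 11)
grid = np.arange(0, 11)
y_averse = newsvendor_level(demand, dprob, c=1.0, p=10.0, h=1.0, b=5.0, rho=5.0, grid=grid)
y_neutral = newsvendor_level(demand, dprob, c=1.0, p=10.0, h=1.0, b=5.0, rho=1e6, grid=grid)
print(y_averse, y_neutral)
```

With a very large risk tolerance, the grid search recovers the risk-neutral critical-fractile solution, the smallest $y$ with $F(y) \geq (b - c)/(b + h)$, for these assumed numbers.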
Now let us consider the next case at time $t$ with $t < T$, assuming as an induction hypothesis that the value function at time $t + 1$ has the same exponential form.
For any given state and ordering decision, the first-order optimality condition with respect to the consumption level $f_t$ can be solved in closed form after taking logarithms, which yields the optimal consumption level $f_t^*$. Plugging the optimal consumption level back into the value function shows that the wealth level enters only through a multiplicative exponential factor. Therefore, with exponential utility functions, the optimal order-up-to level is independent of the wealth level, which simplifies the model.
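The wealth independence can be illustrated in a simplified one-period setting, which is a sketch rather than the multiperiod derivation with consumption. Assume exponential utility $-a e^{-z/\rho}$ applied to wealth $w$ plus a random income $X(y)$ that depends on the order-up-to level $y$; the symbols $a$, $\rho$, and $X(y)$ are notation assumed for this illustration.

```latex
E\!\left[-a\,e^{-(w + X(y))/\rho}\right]
  = e^{-w/\rho}\; E\!\left[-a\,e^{-X(y)/\rho}\right],
\qquad
\arg\max_{y}\, E\!\left[-a\,e^{-(w + X(y))/\rho}\right]
  = \arg\max_{y}\, \Bigl(-\rho \ln E\!\left[e^{-X(y)/\rho}\right]\Bigr).
```

Because $e^{-w/\rho} > 0$ does not depend on $y$, the maximizer is the same for every wealth level, and the problem reduces to maximizing the certainty equivalent of the income.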
Proposition 3 (the impact of resale price on the optimal base-stock level). The optimal base-stock level $y_t^*$ is a nonincreasing function of $p$, which means that the objective is submodular in $(y_t, p)$. That is, a higher resale price means a lower order-up-to level at each time $t$.
Proof. We prove this proposition similarly to Proposition 2.
First, our profit function has the same revenue part as in Eeckhoudt et al. Thus, our parameter $p$ has the same monotone property at time $T$ as in Eeckhoudt et al. Therefore, similar to Table 1 of Eeckhoudt et al., the terminal-period objective is submodular in $(y_T, p)$, or equivalently supermodular in $(y_T, -p)$.
Next, we discuss the preservation of supermodularity in $(y_t, -p)$ for $t < T$ stepwise. When the value function at time $t + 1$ is supermodular in the order-up-to level and $-p$, the exponential term inside the certainty equivalent is submodular, and this submodularity is invariant under convex combinations. The negative logarithm then turns submodularity back into supermodularity. Finally, the resulting objective is supermodular in $(y_t, -p)$ because the current profit term is also supermodular. Lastly, the certainty equivalent operator and maximization preserve supermodularity.
Proposition 4 (the impact of fluctuations on the optimal base-stock level). When $p_{ij} = p_j$ for each $i$ (the cost in a period is independent of the cost in the previous period), the base-stock solution is order-preserving with respect to the purchasing costs, for all $t$.
Proof. The proof uses the concept of supermodularity. We first introduce the objective of the maximization at time $t$ as a function of the order-up-to level and the cost state. Its supermodularity is equivalent to an increasing-differences inequality. For this proposition, it is sufficient to prove that the state space and the set of feasible state-action pairs are lattices and that this inequality holds, owing to the concavity proven in Proposition 1. At time $T$, the required inequality follows directly from the ordering of the cost values and of the order-up-to levels. Finally, the state space and the action space trivially satisfy the definition of a lattice.
For time $t < T$, we need to prove that the argument of the certainty equivalent operator is supermodular in the order-up-to level and the cost state, because the certainty equivalent operator (refer to Table 1 of Topkis) and maximization, as discussed for the case of time $T$, preserve supermodularity. As the current profit term is supermodular, a sufficient condition is that the continuation term is also supermodular. When $p_{ij} = p_j$ for all $i$, the continuation term is not a function of the current cost state $i$, which guarantees the required supermodularity.
4. Extension to Infinite Time Horizon Model
In this section, we consider the infinite time horizon problem as a special and limiting case of the finite time horizon problem when $T \to \infty$. For the infinite time horizon model, we focus on a subset of the additive exponential utility functions of Section 3.2. For this subset, we specifically consider two conditions. The first is uniform boundedness, under which our utility has a finite value over all action and state spaces. The second is that the utility parameters are the same for all states and for all times $t$, so that analytical results similar to those of the finite time MDP models can be discussed. Under this second condition, cost parameters are stationary.
We now consider the infinite horizon model with stationary parameters and study it under an expected discounted profit criterion.
Proposition 5 (existence of a stationary and wealth-independent base-stock optimal policy). A stationary and wealth-independent base-stock policy is optimal for the infinite time horizon model.
Proof. There are two methods to prove stationarity for the infinite time horizon model. The first is to prove that our optimal value operator is a contraction mapping; then, by Banach's fixed point theorem, there exists a unique optimal solution that satisfies stationarity. In this paper, we use an alternative method shown in Puterman. By Theorem 6.11.10 of Puterman, what we need to show is the uniform boundedness on which we focus in this section, as all other conditions are trivial. The only difference between the finite and infinite time horizon models is that maximization is replaced by a supremum, as we consider continuous state and action spaces. Thus, for each state, there exists a stationary optimal value function, which is moreover the unique solution of the corresponding optimality equation.
Corollary 6 (inheritance of the monotone optimal policy for backordering costs and inventory holding costs in the infinite time horizon problem). The base-stock solution is order-preserving (or order-reversing) with respect to the backordering costs (or inventory holding costs); that is, the optimal base-stock level is a nondecreasing (or nonincreasing) function of $b$ (or $h$).
Corollary 7 (inheritance of the monotone policy for purchasing costs and resale price in the infinite time horizon problem). The base-stock solution is order-preserving (or order-reversing) with respect to the purchasing costs (or resale price); in particular, the optimal base-stock level is a nonincreasing function of $p$.
Proof. The proof is the same as in Corollary 6.
5. Computational Study
In this section, we provide numerical results to confirm the analytical results in Section 3. We consider an additive exponential utility with the same parameters in every period. As a base case, we fix a planning horizon, a resale price, backordering costs, and inventory holding costs, together with a discount rate. In addition, we define a Markovian fluctuation of purchasing costs whose transition matrix satisfies $p_{ij} = p_j$ for all $i$; that is, the purchasing cost in a period is independent of the cost in the previous period. Finally, demands are iid across time periods and have a bounded support. They follow a truncated and discretized normal distribution, specified through the expected value and variance of the original (unbounded) distribution.
Figure 1 shows how the risk tolerance factor affects the optimal solution. We vary the factor over several values and compare the optimal order-up-to levels with the risk-neutral solutions. As the risk tolerance factor increases, the optimal solution becomes higher and eventually converges to the risk-neutral solution in the limit.
Figures 2–4 present the numerical results of the comparative static analysis for backordering costs, inventory holding costs, and resale price, respectively. In Figure 2, we vary the backordering costs over several values, with all other parameters the same as in our base case. As backordering costs increase, the optimal solutions also increase for each time $t$ and each cost state. For Figures 3 and 4, we study the impacts of inventory holding costs and resale price on the optimal solutions. Similarly, we take the same values as in our base case, except for the inventory holding costs in Figure 3 and the resale price in Figure 4. In all our cases, the analytical results are confirmed: the optimal solutions decrease when inventory holding costs or the resale price increases.
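A computation in the spirit of this study can be sketched with a backward recursion on certainty equivalents. This is a deliberately simplified sketch, not the paper's procedure: it assumes a constant risk tolerance `rho` in every period, omits the wealth and consumption dynamics, uses a zero terminal value, works on an integer inventory grid (clipping next-period inventory to the grid), and adopts an assumed profit form. The function names and all parameter values are hypothetical.

```python
import numpy as np

def certainty_equivalent(values, probs, rho):
    """Return -rho * ln E[exp(-X / rho)] for a discrete random variable X."""
    z = -np.asarray(values, dtype=float) / rho
    shift = z.max()
    return -rho * (shift + np.log(np.sum(np.asarray(probs) * np.exp(z - shift))))

def base_stock_levels(T, costs, P, demand, dprob, p, h, b, alpha, rho, grid):
    """Backward recursion G_t(x, i) = c_i * x + max_{y >= x} F_t(y, i), where
    F_t(y, i) = -c_i * y + CE[income(y, D) + alpha * G_{t+1}(y - D, j)]."""
    costs, P = np.asarray(costs, dtype=float), np.asarray(P, dtype=float)
    demand, dprob = np.asarray(demand), np.asarray(dprob, dtype=float)
    K, n = len(costs), len(grid)
    G = np.zeros((K, n))                      # assumed terminal value: zero
    S = np.zeros((T, K), dtype=int)           # base-stock level per period and state
    for t in reversed(range(T)):
        G_next, G = G, np.empty((K, n))
        for i in range(K):
            # Joint probabilities of (demand, next cost state) given state i.
            weights = (dprob[:, None] * P[i][None, :]).ravel()
            F = np.empty(n)
            for k, y in enumerate(grid):
                income = (p * demand - h * np.maximum(y - demand, 0)
                          - b * np.maximum(demand - y, 0))
                nxt = np.clip(y - demand - grid[0], 0, n - 1)  # index of y - D
                vals = income[:, None] + alpha * G_next[:, nxt].T
                F[k] = -costs[i] * y + certainty_equivalent(vals.ravel(), weights, rho)
            S[t, i] = grid[np.argmax(F)]
            # Running maximum from the right implements max over y >= x.
            G[i] = costs[i] * grid + np.maximum.accumulate(F[::-1])[::-1]
    return S

demand = np.arange(11)                                  # hypothetical support {0, ..., 10}
dprob = np.exp(-(demand - 5.0) ** 2 / (2 * 2.0 ** 2))   # discretized, truncated normal
dprob /= dprob.sum()
P = [[0.3, 0.4, 0.3]] * 3                               # identical rows: iid cost states
S = base_stock_levels(T=4, costs=[1.0, 2.0, 3.0], P=P, demand=demand, dprob=dprob,
                      p=10.0, h=1.0, b=5.0, alpha=0.95, rho=20.0,
                      grid=np.arange(-15, 31))
print(S)  # rows: periods; columns: cost states in increasing order of cost
```

With identical rows in the transition matrix, the continuation term does not depend on the current cost state, so in this sketch the computed base-stock levels are monotone across cost states in every period (here, nonincreasing as the cost rises).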
6. Concluding Remarks
This paper reconsiders risk-averse inventory models in the supply chain literature. Unlike previous work, we combine two key features, namely multiperiod models and fluctuations in purchasing costs. Although most of the results appear consistent with those in the literature, they are analytically challenging and need to be proved rigorously through independent investigation. In fact, most multiperiod inventory models focus on characterizing base-stock optimal ordering policies in general, regardless of risk preferences. This paper helps fill this gap in the literature by conducting a comparative static analysis as further research in this stream.
As a limitation, the impact of the risk tolerance factor has been discussed only numerically, not analytically, in this paper. Figure 1 in Section 5 suggests that the risk tolerance factor may have a monotone impact in multiperiod inventory models. In the literature on risk-averse inventory models, such a monotone impact of the risk tolerance factor has been studied in various single-period models (e.g., Eeckhoudt et al.). To the best of our knowledge, it has not yet been proved for any multiperiod risk-averse inventory model, so it remains an interesting conjecture and a possible line of further research in this stream.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The first author, Sungyong Choi, was supported (in part) by the Yonsei University Future-Leading Research Initiative of 2014. The corresponding author, Kyungbae Park, was supported by Sangji University Research Fund, 2014. The authors are very grateful to the anonymous referees for their comments and suggestions.
References
- M. E. Schweitzer and G. P. Cachon, “Decision bias in the newsvendor problem with a known demand distribution: experimental evidence,” Management Science, vol. 46, no. 3, pp. 404–420, 2000.
- X. Chen, M. Sim, D. Simchi-Levi, and P. Sun, “Risk aversion in inventory management,” Operations Research, vol. 55, no. 5, pp. 828–842, 2007.
- S. Choi, A. Ruszczyński, and Y. Zhao, “A multiproduct risk-averse newsvendor with law-invariant coherent measures of risk,” Operations Research, vol. 59, no. 2, pp. 346–364, 2011.
- M. Bouakiz and M. J. Sobel, “Inventory control with an exponential utility criterion,” Operations Research, vol. 40, no. 3, pp. 603–608, 1992.
- X. Chen and P. Sun, “Optimal structural policies for ambiguity and risk averse inventory and pricing models,” SIAM Journal on Control and Optimization, vol. 50, no. 1, pp. 133–146, 2012.
- S. Gavirneni, “Periodic review inventory control with fluctuating purchasing costs,” Operations Research Letters, vol. 32, no. 4, pp. 374–379, 2004.
- J. Yang and Y. Xia, “Acquisition management under fluctuating raw material prices,” Production and Operations Management, vol. 18, no. 2, pp. 212–225, 2009.
- E. L. Porteus, Foundations of Stochastic Inventory Theory, Stanford University Press, Stanford, Calif, USA, 2002.
- L. Eeckhoudt, C. Gollier, and H. Schlesinger, “The risk-averse (and prudent) newsboy,” Management Science, vol. 41, no. 5, pp. 786–794, 1995.
- D. M. Topkis, “Minimizing a submodular function on a lattice,” Operations Research, vol. 26, no. 2, pp. 305–321, 1978.
- M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley-Interscience, New York, NY, USA, 2nd edition, 1994.
- D. P. Heyman and M. J. Sobel, Stochastic Models in Operations Research: Volume II, Stochastic Optimization, Dover Publications, Mineola, NY, USA, 2004.
Copyright © 2015 Sungyong Choi and Kyungbae Park. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.