Mathematical Problems in Engineering

Research Article

Dynamic Request Routing for Online Video-on-Demand Service: A Markov Decision Process Approach

Interval value iteration algorithm.

Input: a BMDP , a value function , and a place for holding the policy in the current iteration
Output: and
(1) Create , ; and hold order sequence of states in , i.e.,
(2) Create , ; { and hold the transition probabilities for the order-maximizing MDP with respect to order and , respectively.}
(3) Create , ; { and are the order-maximizing indices for order sequences and , respectively.}
(4) Create ; { is the index into an ordering .}
(5) ;
(6) ;
(7) for all do
(8) for all do
(9) ;
(10) ;
{find order-maximizing index for transition probability in state under action according to (18).}
(11) for to do
(12) Update and according to (19);
(13) end for
(14) end for
(15) ;(*)
(16) if and then
(17) ;
(18) ;
(19) else
(20) ;(**)
(21) ;
(22) end if
(23) end for
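
The symbols of the listing above did not survive extraction, so the following is only a minimal sketch of the order-maximizing step it refers to, written under the assumption that the usual construction for bounded-parameter MDPs applies: each successor state starts at its lower probability bound, and the remaining probability mass is handed out along a state ordering until each upper bound is reached. The names `order_maximizing_probs`, `interval_backup`, `lower`, `upper`, `gamma`, and `optimistic` are hypothetical and do not come from the article; the code does not reproduce equations (18) and (19).

```python
from typing import List, Sequence


def order_maximizing_probs(lower: Sequence[float],
                           upper: Sequence[float],
                           order: Sequence[int]) -> List[float]:
    """Pick one distribution inside the interval bounds [lower, upper].

    Assumption: sum(lower) <= 1 <= sum(upper), so a valid distribution exists.
    States that appear earlier in `order` receive as much mass as allowed.
    """
    probs = list(lower)                  # every state gets at least its lower bound
    remaining = 1.0 - sum(lower)         # mass still to be distributed
    for s in order:
        # Raise state s toward its upper bound, limited by the mass left.
        bump = min(upper[s] - lower[s], max(remaining, 0.0))
        probs[s] += bump
        remaining -= bump
    return probs


def interval_backup(reward: float,
                    lower: Sequence[float],
                    upper: Sequence[float],
                    values: Sequence[float],
                    gamma: float,
                    optimistic: bool) -> float:
    """One Bellman backup for a single (state, action) pair.

    optimistic=True orders successors by decreasing value (upper value bound);
    optimistic=False orders them by increasing value (lower value bound).
    """
    order = sorted(range(len(values)), key=lambda s: values[s], reverse=optimistic)
    probs = order_maximizing_probs(lower, upper, order)
    return reward + gamma * sum(p * v for p, v in zip(probs, values))
```

Calling `interval_backup` twice for the same state and action, once with `optimistic=True` and once with `optimistic=False`, gives an upper and a lower estimate of the backed-up value; an interval value iteration would repeat this over all states and actions until both bounds stop changing.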