Research Article
Decomposition Methods for Solving Finite-Horizon Large MDPs
Algorithm 1: Ameliorated backward induction.
(1) ABI (In MDP: Out
(2)
(3) Take
(4) Repeat
(5)
(6) For each Do
(7) //The Deterministic Decision Rule
(8)
(9) For each Do
(10) is an optimal policy and is the optimal value vector.
(11) Return
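For orientation, the listing above follows the general shape of backward induction for a finite-horizon MDP: start from the terminal values, step backward through the decision epochs, pick a maximizing action in each state (the deterministic decision rule), and return the resulting policy and value vector. The sketch below is the textbook form of that recursion only; it does not reproduce the ameliorated variant of Algorithm 1, whose exact update steps are not recoverable from the listing. All names (`backward_induction`, `P`, `R`, `horizon`, `terminal`) and the stationary rewards and transitions are illustrative assumptions, not the authors' notation.

```python
def backward_induction(P, R, horizon, terminal=None):
    """Textbook finite-horizon backward induction (illustrative sketch).

    Assumed, hypothetical layout:
      P[a][s][s2] : probability of moving from state s to s2 under action a
      R[s][a]     : immediate reward for taking action a in state s
      horizon     : number of decision epochs
      terminal    : optional terminal reward per state (defaults to zeros)
    """
    n_states = len(R)
    n_actions = len(R[0])

    # V[t][s] is the optimal expected total reward from epoch t onward.
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    if terminal is not None:
        V[horizon] = [float(x) for x in terminal]

    # policy[t][s] is a maximizing action at epoch t in state s.
    policy = [[0] * n_states for _ in range(horizon)]

    # Step backward from the last decision epoch to the first.
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_a, best_q = 0, float("-inf")
            for a in range(n_actions):
                # Immediate reward plus expected value of the next epoch.
                q = R[s][a] + sum(
                    P[a][s][s2] * V[t + 1][s2] for s2 in range(n_states)
                )
                if q > best_q:
                    best_a, best_q = a, q
            policy[t][s] = best_a
            V[t][s] = best_q
    return policy, V


# Toy example (made-up numbers): action 0 stays, action 1 switches state.
P_demo = [
    [[1, 0], [0, 1]],  # action 0
    [[0, 1], [1, 0]],  # action 1
]
R_demo = [[0, 1], [2, 0]]  # R_demo[s][a]
policy_demo, V_demo = backward_induction(P_demo, R_demo, horizon=2)
```

On this toy problem the first-epoch values come out as 3 for state 0 and 4 for state 1, with action 1 chosen in state 0 and action 0 in state 1 at both epochs. The inner sum visits every state; a variant that iterates only over successors with nonzero transition probability would compute the same values with less work on sparse models.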