Decomposition Methods for Solving Finite-Horizon Large MDPs

<table class="table-group" id="tab1"><tr><td><table class="table"><tr><td class="thead-hr" colspan="4"><hr/></td></tr><tr class="thead"><td class="align_left"> </td><td class="align_center"><i>MDP</i>’s horizon</td><td class="align_center">Number of possible states</td><td class="align_center">Number of <i>SCC</i>s</td></tr><tr><td class="thead-hr" colspan="4"><hr/></td></tr><tr><td class="align_left">Race-1</td><td class="align_center">31</td><td class="align_center">132750</td><td class="align_center">5120</td></tr><tr><td class="align_left">Race-2</td><td class="align_center">18</td><td class="align_center">106200</td><td class="align_center">5103</td></tr><tr><td class="align_left">Race-3</td><td class="align_center">48</td><td class="align_center">123300</td><td class="align_center">3554</td></tr><tr class="table-tr"><td colspan="4"><hr class="tbody-hr"/></td></tr></table></td></tr><tr class="table-fn"><td><div><i>Note.</i> The value iteration (<i>VI</i>) algorithm under the infinite-horizon discounted <i>MDP</i> is used in [<a href="/journals/jmath/2022/8404716/#B30" target="_blank">30</a>] to solve racetrack problems. The comparison between the <i>VI</i>, <i>BI</i>, and <i>HBI</i> algorithms shows that the <i>BI</i> algorithm outperforms the <i>VI</i> algorithm, and that the proposed <i>HBI</i> algorithm is more efficient than the <i>BI</i> algorithm.<br/></div></td></tr></table>

<div>Characteristics of the three racetracks.</div>
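
For readers unfamiliar with backward induction (BI), one of the algorithms compared in the note above, the following is a minimal sketch of the generic textbook procedure on a finite-horizon MDP. It is not the authors' implementation and does not reproduce the HBI decomposition. The function name `backward_induction`, the `TOY_MDP` data, and the zero terminal values are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: transitions[state][action] is a list of
# (probability, next_state, reward) outcomes.
Transitions = Dict[int, Dict[str, List[Tuple[float, int, float]]]]


def backward_induction(transitions: Transitions, horizon: int):
    """Generic finite-horizon backward induction (total expected reward).

    Returns the value function at the first stage and one decision rule
    per stage. Terminal values are assumed to be zero.
    """
    states = list(transitions)
    value = {s: 0.0 for s in states}  # assumed terminal values
    policy = []
    for _ in range(horizon):
        new_value, decision = {}, {}
        for s in states:
            best_action, best_q = None, float("-inf")
            for action, outcomes in transitions[s].items():
                # Expected immediate reward plus value of the next stage.
                q = sum(p * (r + value[s2]) for p, s2, r in outcomes)
                if q > best_q:
                    best_action, best_q = action, q
            new_value[s] = best_q
            decision[s] = best_action
        value = new_value
        policy.insert(0, decision)  # stages are computed last to first
    return value, policy


# Toy two-state example, unrelated to the racetrack data in the table.
TOY_MDP: Transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.5, 1, 1.0), (0.5, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)]},
}

value, policy = backward_induction(TOY_MDP, horizon=2)
```

Each stage is swept exactly once, so the cost grows linearly with the horizon. This is the usual reason to prefer BI over iterating an infinite-horizon scheme to convergence when the horizon is known.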

Journal of Mathematics


Table 1
