Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning

<div>Global model planning (<span id="M218"><i>θ</i>, <i>β</i>, <i>η</i><sub>1</sub>, …, <i>η</i><sub><i>k</i></sub>, <i>ς</i>, <i>e</i>, number, <i>P</i><sub><i>g</i></sub></span>).</div>

Computational Intelligence and Neuroscience

Algorithm 4: Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning