Dynamic Resource Allocation with Integrated Reinforcement Learning for a D2D-Enabled LTE-A Network with Access to Unlicensed Band

<div>The average number of RL iterations (slots) necessary for convergence of utilities in JRA with different values of <svg height="9.25202pt" id="M297" style="vertical-align:-3.29111pt" version="1.1" viewbox="-0.0498162 -5.96091 11.851 9.25202" width="11.851pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M457 332C457 400 429 448 397 448C384 448 371 441 365 435L366 428C376 413 388 372 388 315C388 152 306 45 220 45C166 45 134 85 158 180L197 332C204 361 207 385 207 402C207 434 196 448 174 448C125 448 66 406 23 342L43 319C84 368 110 383 121 383C130 383 129 374 122 343L76 156C71 137 69 120 69 105C69 25 125 -12 181 -12C314 -12 457 166 457 332Z" id="g113-242"></path><glyph.data ascent="3473" descent="-2876" horiz-adv-x="480" vert-adv-y="480"></glyph.data></g><g transform="matrix(.0091,0,0,-0.0091,5.915,3.132)"><path d="M558 106L533 129C501 84 476 67 464 67C446 67 435 116 414 239C457 297 508 367 550 440L539 451L484 438C458 390 429 338 404 300H402C375 407 351 451 285 451C168 451 24 312 24 153C24 56 66 -12 132 -12C206 -12 284 64 345 156H347C367 24 386 -12 422 -12C457 -12 503 7 558 106ZM338 206C272 99 215 56 173 56C140 56 116 96 116 170C116 303 194 405 255 405C304 405 324 299 338 206Z" id="g50-223"></path><glyph.data ascent="3443" descent="-2856" horiz-adv-x="582" vert-adv-y="582"></glyph.data></g></svg> and fixed <svg height="11.439pt" id="M298" style="vertical-align:-2.15067pt" version="1.1" viewbox="-0.0498162 -9.28833 43.2343 11.439" width="43.2343pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M162 -163V703H101V-163H162Z" id="g113-9"></path><glyph.data ascent="3473" descent="-2876" horiz-adv-x="263" vert-adv-y="263"></glyph.data></g><g transform="matrix(.013,0,0,-0.013,3.419,0)"><path d="M625 186C574 85 512 32 429 32C293 32 194 153 194 337C194 485 274 621 419 621C496 621 573 586 606 468L643 475C636 
537 629 583 620 638C593 644 519 665 438 665C199 665 42 523 42 317C42 158 152 -15 418 -15C486 -15 578 5 604 11C621 46 647 122 660 171L625 186Z" id="g13-65"></path><glyph.data ascent="1024" descent="-360" horiz-adv-x="676" vert-adv-y="676"></glyph.data></g><g transform="matrix(.013,0,0,-0.013,12.207,0)"><path d="M162 -163V703H101V-163H162Z" id="g113-9"></path><glyph.data ascent="3473" descent="-2876" horiz-adv-x="263" vert-adv-y="263"></glyph.data></g><g transform="matrix(.013,0,0,-0.013,19.258,0)"><path d="M535 323V373H52V323H535ZM535 138V188H52V138H535Z" id="g117-34"></path><glyph.data ascent="3473" descent="-2876" horiz-adv-x="587" vert-adv-y="587"></glyph.data></g><g transform="matrix(.013,0,0,-0.013,30.521,0)"><path d="M384 0V27C293 34 287 42 287 114V635C232 613 172 594 109 583V559L157 557C201 555 205 550 205 499V114C205 42 199 34 109 27V0H384Z" id="g113-50"></path><glyph.data ascent="3473" descent="-2876" horiz-adv-x="480" vert-adv-y="480"></glyph.data></g><g transform="matrix(.013,0,0,-0.013,36.761,0)"><path d="M241 635C89 635 35 457 35 312C35 153 89 -12 240 -12C390 -12 443 166 443 312C443 466 390 635 241 635ZM238 602C329 602 354 454 354 312C354 172 330 22 240 22C152 22 124 173 124 313S148 602 238 602Z" id="g113-49"></path><glyph.data ascent="3473" descent="-2876" horiz-adv-x="480" vert-adv-y="480"></glyph.data></g></svg>.</div>

Mobile Information Systems

Figure 3
