Decentralized and Dynamic Band Selection in Uplink Enhanced Licensed-Assisted Access: Deep Reinforcement Learning Approach

<table class="table-group" id="tab2"><tr><td><table class="table"><tr><td class="thead-hr" colspan="2"><hr/></td></tr><tr class="thead"><td class="align_left">Parameter</td><td class="align_center">Value</td></tr><tr><td class="thead-hr" colspan="2"><hr/></td></tr><tr><td class="align_left">Discount factor <svg height="9.39034pt" id="M145" style="vertical-align:-3.42943pt" version="1.1" viewbox="-0.0498162 -5.96091 6.63704 9.39034" width="6.63704pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M478 372C478 418 458 448 431 448C409 448 389 431 389 410C389 404 391 400 394 395C398 388 406 371 406 348C406 253 308 122 251 51H249C254 122 249 257 231 336C212 421 189 448 159 448C126 448 75 412 23 327L48 306C83 354 103 371 115 371C125 371 134 360 144 334C185 224 192 64 183 -19C146 -100 116 -202 110 -244L125 -261C154 -259 208 -234 222 -220C222 -194 225 -84 235 -23C247 -3 273 36 308 79C379 165 478 288 478 372Z"></path></g></svg></td><td class="align_center">0.9</td></tr><tr><td class="align_left">Learning rate <svg height="12.7178pt" id="M146" style="vertical-align:-3.42947pt" version="1.1" viewbox="-0.0498162 -9.28833 7.68094 12.7178" width="7.68094pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M558 587C558 666 497 712 432 712C379 712 330 691 284 650C212 586 178 508 148 348L71 -65C49 -185 35 -229 23 -235L27 -261C49 -259 101 -251 124 -224C131 -216 138 -178 159 -24C171 66 197 200 227 356C264 550 295 668 393 668C443 668 479 632 479 575C479 516 446 458 383 418C361 404 344 398 318 397L296 350C395 338 460 281 460 192C460 110 411 40 300 40C258 40 215 55 192 69L181 51C191 18 222 -16 266 -16C308 -16 351 1 397 26C471 67 545 142 545 221C545 315 486 365 401 395C469 437 558 498 558 587Z"></path></g></svg></td><td class="align_center">0.01</td></tr><tr><td class="align_left">Exploration <svg height="6.1673pt" id="M147" 
style="vertical-align:-0.2063904pt" version="1.1" viewbox="-0.0498162 -5.96091 5.44961 6.1673" width="5.44961pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M387 375C387 402 357 448 257 448C172 448 82 404 82 326C82 289 108 255 156 241V239C85 223 23 181 23 116C23 39 89 -12 182 -12C265 -12 336 31 378 91L361 114C320 73 269 47 216 47C157 47 115 82 115 137C115 191 160 219 218 219C243 219 262 218 272 217L304 259L302 266C295 265 281 264 255 264C195 264 163 294 163 335C163 377 200 416 249 416C293 416 321 389 329 342C331 332 335 329 341 329C355 329 387 352 387 375Z"></path></g></svg> in <span class="nowrap"><svg height="6.1673pt" id="M148" style="vertical-align:-0.2063904pt" version="1.1" viewbox="-0.0498162 -5.96091 5.44961 6.1673" width="5.44961pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M387 375C387 402 357 448 257 448C172 448 82 404 82 326C82 289 108 255 156 241V239C85 223 23 181 23 116C23 39 89 -12 182 -12C265 -12 336 31 378 91L361 114C320 73 269 47 216 47C157 47 115 82 115 137C115 191 160 219 218 219C243 219 262 218 272 217L304 259L302 266C295 265 281 264 255 264C195 264 163 294 163 335C163 377 200 416 249 416C293 416 321 389 329 342C331 332 335 329 341 329C355 329 387 352 387 375Z"></path></g></svg>-</span>greedy policy</td><td class="align_center">0.05 to 0.01</td></tr><tr><td class="align_left">Target network update frequency <svg height="8.8423pt" id="M149" style="vertical-align:-0.2064009pt" version="1.1" viewbox="-0.0498162 -8.6359 8.8162 8.8423" width="8.8162pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M645 631C614 643 545 666 457 666C215 666 23 519 23 283C23 90 158 -16 337 -16C412 -16 489 2 522 10C543 39 590 127 606 167L580 181C519 89 459 18 348 18C201 18 122 136 122 287C122 464 244 632 435 632C544 
632 602 595 608 472L639 475C643 526 645 581 645 631Z"></path></g></svg></td><td class="align_center">300</td></tr><tr><td class="align_left">Mini batch size</td><td class="align_center">32</td></tr><tr><td class="align_left">Replay buffer <svg height="9.49473pt" id="M150" style="vertical-align:-0.3238297pt" version="1.1" viewbox="-0.0498162 -9.1709 13.031 9.49473" width="13.031pt" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g transform="matrix(.013,0,0,-0.013,0,0)"><path d="M889 613C845 597 837 593 797 566C722 646 604 699 467 699C386 699 292 682 215 633C147 590 93 522 93 432C93 299 180 230 319 230C416 230 558 314 558 498C558 510 558 514 557 534H507C507 346 408 288 318 288C230 288 184 347 184 424C184 492 218 545 269 584S384 644 473 644C576 644 678 605 739 525C654 456 610 391 557 306C475 176 431 117 388 85C333 113 281 142 202 142C165 142 135 137 111 126C80 111 60 90 60 61S76 11 109 -2C136 -13 171 -17 216 -17C279 -17 325 -9 374 7C421 -9 479 -21 543 -21C723 -21 898 127 898 324C898 392 876 458 837 518C862 543 875 553 906 566L889 613ZM461 49C569 102 649 196 702 300C725 345 752 401 781 444H782C793 411 800 376 800 336C800 135 674 34 539 34C512 34 481 40 461 47V49ZM310 51C281 42 250 38 215 38C172 38 151 45 151 65C151 82 182 87 195 87C231 87 269 79 310 52V51Z"></path></g></svg> size</td><td class="align_center">1000</td></tr><tr class="table-tr"><td colspan="2"><hr class="tbody-hr"/></td></tr></table></td></tr></table>
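The values in the table can be collected into a single configuration object. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `DQNConfig`, `epsilon_at`, `should_sync_target`, and `ReplayBuffer` are hypothetical, the linear shape of the exploration decay and its length (`eps_decay_steps`) are assumptions because the table gives only the endpoints 0.05 and 0.01, and the target network update frequency is assumed to be counted in training steps.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNConfig:
    """Hyperparameters from the table; names are illustrative."""
    gamma: float = 0.9            # discount factor
    learning_rate: float = 0.01   # learning rate
    eps_start: float = 0.05       # initial exploration rate
    eps_end: float = 0.01         # final exploration rate
    target_update: int = 300      # target network update frequency
    batch_size: int = 32          # mini-batch size
    buffer_size: int = 1000       # replay buffer size
    eps_decay_steps: int = 10_000  # ASSUMPTION: not given in the table


def epsilon_at(step: int, cfg: DQNConfig) -> float:
    """Linearly anneal epsilon from eps_start to eps_end (assumed schedule)."""
    frac = min(max(step, 0) / cfg.eps_decay_steps, 1.0)
    return cfg.eps_start + frac * (cfg.eps_end - cfg.eps_start)


def should_sync_target(step: int, cfg: DQNConfig) -> bool:
    """True on every target_update-th step (assumed unit: training steps)."""
    return step > 0 and step % cfg.target_update == 0


class ReplayBuffer:
    """Fixed-capacity FIFO buffer; the oldest transitions are evicted first."""

    def __init__(self, cfg: DQNConfig):
        self._items = deque(maxlen=cfg.buffer_size)
        self._batch_size = cfg.batch_size

    def push(self, transition) -> None:
        self._items.append(transition)

    def __len__(self) -> int:
        return len(self._items)

    def sample(self, rng=random) -> list:
        """Return one uniform mini-batch, or [] until enough data is stored."""
        if len(self._items) < self._batch_size:
            return []
        return rng.sample(list(self._items), self._batch_size)
```

With these defaults the buffer never holds more than 1000 transitions, each sampled mini-batch contains 32 of them, and the target network would be synchronized at steps 300, 600, 900, and so on.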

Wireless Communications and Mobile Computing

Table 2: Decentralized and Dynamic Band Selection in Uplink Enhanced Licensed-Assisted Access: Deep Reinforcement Learning Approach