A UAV Pursuit-Evasion Strategy Based on DDPG and Imitation Learning

<table class="table-group" id="tab1"><tr><td><table class="table"><tr><td class="thead-hr" colspan="3"><hr/></td></tr>
<tr class="thead"><td class="align_left">Training hyperparameter</td><td class="align_center">Symbol</td><td class="align_center">Value</td></tr>
<tr><td class="thead-hr" colspan="3"><hr/></td></tr>
<tr><td class="align_left">Discounting factor</td><td class="align_center">γ</td><td class="align_center">0.9</td></tr>
<tr><td class="align_left">Inertial update rate</td><td class="align_center">τ</td><td class="align_center">0.01</td></tr>
<tr><td class="align_left">Memory size</td><td class="align_center">M</td><td class="align_center">30000</td></tr>
<tr><td class="align_left">Size of batch experience</td><td class="align_center">Batch size</td><td class="align_center">64</td></tr>
<tr><td class="align_left">Simulation time step</td><td class="align_center">ΔT</td><td class="align_center">0.1</td></tr>
<tr><td class="align_left">Learning rate of Critic network</td><td class="align_center">α<sub>Q</sub></td><td class="align_center">0.002</td></tr>
<tr><td class="align_left">Learning rate of Actor network</td><td class="align_center">α<sub>u</sub></td><td class="align_center">0.001</td></tr>
<tr><td class="align_left">Number of episodes</td><td class="align_center">MaxEpisode</td><td class="align_center">4000</td></tr>
<tr><td class="align_left">Number of steps in one episode</td><td class="align_center">MaxStep</td><td class="align_center">300</td></tr>
<tr class="table-tr"><td colspan="3"><hr class="tbody-hr"/></td></tr></table></td></tr></table>

<div>Table 1: Training hyperparameters.</div>

International Journal of Aerospace Engineering
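
The values in Table 1 can be gathered into a single configuration object, which makes the quantities they imply easy to check. The sketch below is illustrative only: the class name `DDPGConfig`, its field names, and the helper `soft_update` are hypothetical and do not come from the paper. It also assumes that the "inertial update rate" is the coefficient of the standard DDPG soft target-network update, which the table itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DDPGConfig:
    """Values copied from Table 1; the field names are illustrative."""
    gamma: float = 0.9         # discounting factor
    tau: float = 0.01          # inertial update rate (assumed: soft target update)
    memory_size: int = 30000   # memory size M
    batch_size: int = 64       # size of batch experience
    dt: float = 0.1            # simulation time step (unit not given in the table)
    lr_critic: float = 0.002   # learning rate of the Critic network
    lr_actor: float = 0.001    # learning rate of the Actor network
    max_episode: int = 4000    # number of episodes
    max_step: int = 300        # number of steps in one episode


def soft_update(target, online, tau):
    """Standard DDPG target update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online, target)]


cfg = DDPGConfig()

# Upper bound on environment steps over the whole training run.
total_steps = cfg.max_episode * cfg.max_step

# Simulated time covered by one full-length episode, in the same unit as dt.
episode_duration = cfg.max_step * cfg.dt

# Number of full-length episodes that fit in the memory at once.
episodes_in_memory = cfg.memory_size // cfg.max_step

# One soft update of a toy two-weight target network.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], cfg.tau)
```

Under these assumptions, a run covers at most 4000 × 300 environment steps, and a memory of 30000 transitions holds the equivalent of 100 full-length episodes. With τ = 0.01, each soft update moves a target weight only 1% of the way toward its online counterpart.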