Minimizing the Cost of Spatiotemporal Searches Based on Reinforcement Learning with Probabilistic States

<div>Training of QDP. \(s_t\) denotes the current state, \(a_t\) denotes the action performed at state <span class="nowrap">\(s_t\),</span> \(r_{t+1}\) denotes the reward (reward is the inverse of cost) of the action \(a_t\) at state <span class="nowrap">\(s_t\),</span> \(\{p^{i}_{t+1}\}\) denotes the probability distribution of next states, and \(r^{i}_{t+1}\) denotes the reward of the optimal action at each possible state.</div>
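The caption names the quantities involved in one training step but not the update rule itself. The following is a minimal sketch of how those quantities could combine in a tabular update, assuming a standard Q-learning target in which the usual maximum over a single next state is replaced by an expectation over the next-state distribution. The function name `qdp_update`, the learning rate `alpha`, the discount `gamma`, and the state and action labels are all hypothetical and do not come from the source.

```python
from collections import defaultdict


def qdp_update(q, s_t, a_t, r_next, next_dist, actions, alpha=0.1, gamma=0.9):
    """One tabular update when the next state is known only as a distribution.

    next_dist maps each candidate next state s^i to its probability p^i_{t+1}.
    The update rule is an assumption (expected Q-learning target), not the
    paper's confirmed formula.
    """
    # r^i_{t+1}: value of the best action at each candidate next state.
    best = {s_i: max(q[(s_i, a)] for a in actions) for s_i in next_dist}
    # Weight each candidate's best value by its probability p^i_{t+1}.
    expected_best = sum(p_i * best[s_i] for s_i, p_i in next_dist.items())
    # r_{t+1} is the (negative-cost) reward of taking a_t at s_t.
    target = r_next + gamma * expected_best
    q[(s_t, a_t)] += alpha * (target - q[(s_t, a_t)])
    return q[(s_t, a_t)]


# Hypothetical two-state, two-action example.
q = defaultdict(float)
actions = ["search_cell_A", "search_cell_B"]
q[("B", "search_cell_A")] = 2.0
new_value = qdp_update(
    q, "A", "search_cell_A",
    r_next=-1.0,
    next_dist={"A": 0.25, "B": 0.75},
    actions=actions, alpha=0.5, gamma=0.5,
)
```

With a deterministic next state (a distribution with a single entry of probability 1), this sketch reduces to the ordinary Q-learning update.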

Wireless Communications and Mobile Computing



Figure 4: Minimizing the Cost of Spatiotemporal Searches Based on Reinforcement Learning with Probabilistic States