Review Article
DRL-Based Intelligent Resource Allocation for Diverse QoS in 5G and toward 6G Vehicular Networks: A Comprehensive Survey
Input: state space, action space, reward
Output: …
Initialization:
1: Central server:
     Initialize the global DRL model with random values at the beginning of the decision period
2: Local vehicles:
     Initialize the local DRL models
     Download the global model from the central server and copy it into the local model
     Initialize the replay memory
Iteration: for each decision period do
3: function FL
     Local vehicles:
4:   while … do
5:     for each vehicle in parallel do
6:       download the global weights from the controller
7:       set the local weights to the downloaded global weights
8:       train the DRL agent locally on the current service requests
9:       upload the trained weights to the central server
10:      observe … in …
11:    end for
12:    Central server:
13:      receive all weight updates
14:      perform federated averaging
15:      broadcast the averaged weights
16:  end while
17: end function
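The download, local-training, upload, and averaging steps of the procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the surveyed implementation: the source says only "perform federated averaging", so an unweighted element-wise mean is assumed here, and weights are modelled as flat lists of floats. The names `federated_average`, `fl_round`, and `make_toy_trainer` are hypothetical, and the toy trainer merely stands in for a local DRL agent.

```python
from typing import Callable, List

Weights = List[float]
Trainer = Callable[[Weights], Weights]


def federated_average(local_weights: List[Weights]) -> Weights:
    """Unweighted element-wise mean of the vehicles' weight vectors.

    Assumption: every vehicle contributes equally; the source does not
    say whether updates are weighted (e.g. by local sample count).
    """
    if not local_weights:
        raise ValueError("need at least one weight update")
    n = len(local_weights)
    return [sum(column) / n for column in zip(*local_weights)]


def fl_round(global_weights: Weights, trainers: List[Trainer]) -> Weights:
    """One communication round: download, train locally, upload, average."""
    updates = []
    for train in trainers:  # conceptually runs in parallel, one per vehicle
        local = list(global_weights)  # download and copy the global weights
        updates.append(train(local))  # local training, then upload
    return federated_average(updates)  # server-side averaging


def make_toy_trainer(target: Weights, lr: float = 0.5) -> Trainer:
    """Hypothetical stand-in for a local DRL agent.

    Takes one gradient step on the squared distance to `target`,
    which plays the role of the vehicle's local objective.
    """
    def train(weights: Weights) -> Weights:
        return [w - lr * (w - t) for w, t in zip(weights, target)]
    return train


if __name__ == "__main__":
    trainers = [make_toy_trainer([1.0, 1.0]), make_toy_trainer([3.0, 3.0])]
    weights = [0.0, 0.0]
    for _ in range(3):  # the server broadcasts the average each round
        weights = fl_round(weights, trainers)
    print(weights)
```

In a real deployment the trainers would be DRL agents updating neural-network parameters from their replay memories, and the average would typically be taken per parameter tensor rather than over a flat list.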