Journal of Sensors

Research Article

Trading and Pricing Sensor Data in Competing Edge Servers with Double Auction Markets

I-PDQN algorithm.

Input: the two market pricing parameters, the number of buyers, the number of sellers, and the trader bidding space
Output: the Nash equilibrium trading strategy of the trader
1 Initialization: for each trader, initialize the exploration parameter, the batch size, and the uniform distribution; randomly initialize the network weights; and set the initial state
2 while the traders' loss functions have not converged do
3 For each trader, calculate the continuous parameter corresponding to each discrete action according to the current state;
4 Select an action according to the following rule:
5
6 When the bidding time of the current stage ends, each trader obtains its immediate return and the state of the next stage through the market rules;
7 For each trader, the resulting tuple is stored in replay memory;
8 Strategy training:
9 Each trader draws samples from replay memory and calculates according to the following equation:
10
11 Calculate the stochastic gradients according to equations (17) and (19), and update the weights according to the following equations:
12 and
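
The selection rule in step 5 and the update equations in steps 10 and 12 are not reproduced in the text above, so the following is only a minimal sketch of how steps 3 to 9 might look for a single trader. It assumes a standard epsilon-greedy rule over (discrete action, continuous parameter) pairs and a one-step bootstrapped target, as in generic parameterized deep Q-learning; this may differ from the authors' exact rule. The names `select_action`, `td_target`, and `ReplayMemory`, the discount factor `gamma`, and the parameter bounds `low` and `high` are all hypothetical, and the Q-values are passed in as plain lists rather than computed by a network.

```python
import random
from collections import deque


def select_action(q_values, params, epsilon, low, high, rng):
    """Epsilon-greedy choice of a (discrete action, continuous parameter) pair.

    q_values[k] is the estimated value of discrete action k when it is played
    with its continuous parameter params[k] (step 3 of the algorithm).
    Assumption: exploration draws both parts uniformly at random.
    """
    if rng.random() < epsilon:
        # Explore: uniform discrete action and uniform continuous parameter.
        k = rng.randrange(len(q_values))
        return k, rng.uniform(low, high)
    # Exploit: the discrete action with the highest estimated value.
    k = max(range(len(q_values)), key=lambda i: q_values[i])
    return k, params[k]


def td_target(reward, next_q_values, gamma):
    """Assumed one-step target: reward + gamma * max over next-state Q-values."""
    return reward + gamma * max(next_q_values)


class ReplayMemory:
    """Fixed-capacity store of transition tuples (steps 7 and 9)."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def store(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        # Never request more samples than are currently stored.
        size = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), size)

    def __len__(self):
        return len(self.buffer)


if __name__ == "__main__":
    rng = random.Random(0)
    memory = ReplayMemory(capacity=100)
    # Hypothetical values: three discrete bids, each with one continuous price.
    action = select_action([0.2, 0.9, 0.5], [1.0, 2.5, 4.0],
                           epsilon=0.1, low=0.0, high=5.0, rng=rng)
    memory.store(("state", action, 1.0, "next_state"))
    print(action, td_target(1.0, [0.5, 2.0], gamma=0.9))
```

Passing the random generator explicitly keeps each trader's exploration reproducible, which is convenient when several independent learners are trained side by side.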