Research Article
Deep Reinforcement Learning-Based Trading Strategy for Load Aggregators on Price-Responsive Demand
Table 1
Description of the variables of the DDPG.
| Name | Meaning | Meaning in this study |
| --- | --- | --- |
| Agent | Intelligent entity to be controlled | Price-responsive load aggregator |
| State, S | Status of the agent | Current electricity spot market price (, ) |
| Action, a | Actions that the agent can take | Purchase and sale of electricity (output sell/buy) |
| Reward, r | Immediate return value from the environment, used to evaluate the quality of an action taken by the agent | Revenue of the load aggregator |
| Policy, P | Strategy by which the agent decides its next action based on the current state | Buying and selling actions in the next cycle are determined based on the status of the load aggregator in the previous cycle |
| Value | Long-term return of the agent’s actions, as distinct from the short-term return represented by the reward | Total revenue over the period of operation |
| Environment | Environment of the agent | Fluctuations in electricity prices (input t real-time locational marginal price, input t real-time l demand) |
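The roles listed in Table 1 can be illustrated with a small Python sketch. It is not the environment or the DDPG agent used in the study. The names `AggregatorEnv`, `threshold_policy`, `run_episode`, and `discounted_return` are hypothetical. The sketch also makes three assumptions: the action is a normalized quantity in [-1, 1], where positive means sell and negative means buy; the reward is price times quantity; and the discount factor `gamma` is chosen arbitrarily. A fixed threshold rule stands in for the learned policy.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# State: (spot price, demand) at the current step (assumed representation).
State = Tuple[float, float]


@dataclass
class AggregatorEnv:
    """Toy environment mapping the roles in Table 1 (hypothetical)."""
    prices: List[float]   # environment input: real-time prices
    demands: List[float]  # environment input: real-time demand
    t: int = 0

    def reset(self) -> State:
        self.t = 0
        return (self.prices[0], self.demands[0])

    def step(self, action: float) -> Tuple[State, float, bool]:
        # Assumed sign convention: action > 0 sells, action < 0 buys.
        quantity = max(-1.0, min(1.0, action))
        # Assumed reward: revenue earned in this step.
        reward = self.prices[self.t] * quantity
        self.t += 1
        done = self.t >= len(self.prices)
        idx = min(self.t, len(self.prices) - 1)
        return (self.prices[idx], self.demands[idx]), reward, done


def threshold_policy(state: State, threshold: float = 25.0) -> float:
    """Placeholder policy: sell above the threshold price, otherwise buy."""
    price, _demand = state
    return 1.0 if price > threshold else -1.0


def run_episode(env: AggregatorEnv,
                policy: Callable[[State], float]) -> List[float]:
    """Roll out one episode and collect the per-step rewards."""
    state, done, rewards = env.reset(), False, []
    while not done:
        state, reward, done = env.step(policy(state))
        rewards.append(reward)
    return rewards


def discounted_return(rewards: List[float], gamma: float = 0.99) -> float:
    """Value of an episode: discounted sum of rewards (long-term return)."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))
```

In a DDPG setting, an actor network that outputs a continuous action would replace `threshold_policy`, and a critic network would estimate the quantity that `discounted_return` computes from sampled rewards.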