Research Article

Deep Reinforcement Learning-Based Trading Strategy for Load Aggregators on Price-Responsive Demand

Table 1

Description of the variables of the DDPG.

Name | Meaning | Meaning in this study
Agent | Entity to be controlled | Price-responsive load aggregator
State, S | State of the agent | Current electricity spot market price (, )
Action, a | Actions that the agent can take | Purchase and sale of electricity (output sell/buy)
Reward, r | Immediate return value from the environment, used to evaluate the quality of the agent's action | Revenue of the load aggregator
Policy, P | Strategy by which the agent chooses its next action based on the current state | Buying and selling actions in the next cycle are determined from the state of the load aggregator in the previous cycle
Value | Long-term return of the agent's actions, as distinguished from the short-term return represented by the reward | Total revenue over the period of operation
Environment | Environment in which the agent operates | Fluctuations in electricity prices (input t real-time locational marginal price, input t real-time l demand)
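
The mapping in Table 1 can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the class `AggregatorEnv`, the functions `threshold_policy`, `run_episode`, and `discounted_return`, the sign convention for actions, and the price and demand values are all hypothetical and chosen only for illustration. In particular, the fixed price-threshold rule stands in for the policy that DDPG would learn, and the reward is assumed to be price times traded quantity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

State = Tuple[float, float]  # (spot price, demand); assumed state layout


@dataclass
class AggregatorEnv:
    """Toy environment: replays given price and demand series."""
    prices: List[float]   # hypothetical real-time price series
    demands: List[float]  # hypothetical real-time demand series
    t: int = 0

    def reset(self) -> State:
        self.t = 0
        return self.prices[0], self.demands[0]

    def step(self, action: float) -> Tuple[Optional[State], float, bool]:
        """Assumed convention: action > 0 sells energy, action < 0 buys it."""
        reward = self.prices[self.t] * action  # revenue for this cycle
        self.t += 1
        done = self.t >= len(self.prices)
        state = None if done else (self.prices[self.t], self.demands[self.t])
        return state, reward, done


def threshold_policy(state: State, threshold: float = 30.0) -> float:
    """Stand-in for the learned policy: sell when the price is high."""
    price, _demand = state
    return 1.0 if price > threshold else -1.0


def run_episode(env: AggregatorEnv) -> List[float]:
    """Roll the policy through one operating period, collecting rewards."""
    rewards = []
    state, done = env.reset(), False
    while not done:
        state, reward, done = env.step(threshold_policy(state))
        rewards.append(reward)
    return rewards


def discounted_return(rewards: List[float], gamma: float = 0.99) -> float:
    """Value-style quantity: discounted sum of the per-cycle rewards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


env = AggregatorEnv(prices=[20.0, 40.0, 25.0], demands=[5.0, 6.0, 4.0])
rewards = run_episode(env)
print(rewards, discounted_return(rewards, gamma=1.0))
```

In this sketch the reward is the per-cycle revenue, while the value corresponds to the (discounted) total revenue over the whole operating period, which mirrors the distinction drawn in Table 1.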