Initialize and for all ; |
Set parameters and decision time; |
Give the initial state ; |
Repeat |
(1) Choose an exploration action based on the mixed strategy set ; |
(2) Apply the exploration action to the AGC units and run the LFC system for the next sec; |
(3) Observe a new state via CPS1 and ACE; |
(4) Obtain a short-term reward using Eq. (19); |
(5) Update eligibility trace according to Eq. (2); |
(6) Update Q function using Eq. (3); |
(7) Select the variable learning rate δ using Eq. (7); |
(8) Compute by Eq. (5) and Eq. (6); |
(9) Calculate and according to Eq. (8); |
(10) Update the mixed strategy according to Eq. (4); |
(11) Obtain the total power of the GSGi; |
(12) Determine the ramp rate according to Eq. (13); |
(13) Execute CC algorithm according to Eq. (14) and Eq. (15); |
(14) Calculate the uth unit power in GSGi; |
(15) If the power limit is not exceeded, go to step 17; |
(16) Calculate and according to Eq. (17), and update using Eq. (9), Eq. (11) and Eq. (18); |
(17) Calculate the power error according to Eq. (16); |
(18) If is not satisfied, return to step 13; |
(19) Output the uth unit power ; |
(20) Set , and return to step 1. |
End |
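
The learning part of the loop, steps (1) and (5) to (10), can be illustrated with a minimal Python sketch. Eqs. (2) to (8) are not reproduced in this section, so the sketch substitutes textbook forms and should be read as an assumption rather than as the exact method: a replacing-style eligibility trace with decay `gamma * lam`, a Q(λ) update with a greedy bootstrap, and the standard WoLF-PHC win/lose test in place of the criterion of Eq. (8). All names (`learning_step`, `choose_action`, `delta_win`, `delta_lose`, `pi_avg`, `visits`) and all default parameter values are hypothetical.

```python
import random


def choose_action(policy_row, rng):
    """Step (1): sample an action index from the mixed strategy of one state."""
    return rng.choices(range(len(policy_row)), weights=policy_row, k=1)[0]


def learning_step(Q, E, pi, pi_avg, visits, s, a, r, s_next,
                  alpha=0.1, gamma=0.9, lam=0.9,
                  delta_win=0.01, delta_lose=0.04):
    """Steps (5)-(10) under assumed textbook update rules (not the paper's Eqs.)."""
    n_states, n_actions = len(Q), len(Q[s])

    # TD error, computed before any table is modified (assumed greedy bootstrap).
    td = r + gamma * max(Q[s_next]) - Q[s][a]

    # Step (5): decay every trace, then reinforce the visited pair (assumed form).
    for si in range(n_states):
        for ai in range(n_actions):
            E[si][ai] *= gamma * lam
    E[s][a] += 1.0

    # Step (6): propagate the TD error along the traces.
    for si in range(n_states):
        for ai in range(n_actions):
            Q[si][ai] += alpha * td * E[si][ai]

    # Running average of the mixed strategy in state s.
    visits[s] += 1
    for ai in range(n_actions):
        pi_avg[s][ai] += (pi[s][ai] - pi_avg[s][ai]) / visits[s]

    # Steps (7)-(9): pick the learning rate with the standard WoLF test
    # (assumption: the source uses its own criterion, Eq. (8)).
    value_now = sum(p * q for p, q in zip(pi[s], Q[s]))
    value_avg = sum(p * q for p, q in zip(pi_avg[s], Q[s]))
    delta = delta_win if value_now > value_avg else delta_lose

    # Step (10): shift probability mass toward the greedy action while
    # keeping pi[s] a valid distribution.
    greedy = max(range(n_actions), key=lambda ai: Q[s][ai])
    for ai in range(n_actions):
        if ai != greedy:
            step = min(pi[s][ai], delta / (n_actions - 1))
            pi[s][ai] -= step
            pi[s][greedy] += step
    return delta


if __name__ == "__main__":
    rng = random.Random(0)
    n_s, n_a = 2, 3
    Q = [[0.0] * n_a for _ in range(n_s)]
    E = [[0.0] * n_a for _ in range(n_s)]
    pi = [[1.0 / n_a] * n_a for _ in range(n_s)]
    pi_avg = [[1.0 / n_a] * n_a for _ in range(n_s)]
    visits = [0] * n_s
    a = choose_action(pi[0], rng)
    learning_step(Q, E, pi, pi_avg, visits, s=0, a=a, r=1.0, s_next=1)
    print(pi[0])
```

The policy update moves at most `delta` of probability mass per step, so the mixed strategy changes slowly when the agent is judged to be winning and faster when it is losing. The power-dispatch steps (11) to (19) are not covered by this sketch.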