Research Article

Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices

Figure 9

Maximum reward obtained for the Fault 1 operation mode of the IPR in one year of operation.