
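(1) Initialize the q-values for all rules and all candidate actions, and observe the state.
(2) For every rule, determine its degree of activation according to the membership function.
(3) For every rule, select an action with an EEP (formula (7)).
(4) Calculate the inferred action (formula (8)).
(5) Calculate the corresponding value of the inferred action (formula (9)).
(6) Perform the inferred action, receive the reward, and observe the next state.
(7) For every rule, calculate its degree of activation in the next state.
(8) Calculate the quantity given by formula (10).
(9) Calculate the quantity given by formula (11).
(10) Apply the update given by formula (12).
(11) Take the next state as the current state and go to (3).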
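The loop above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: formulas (7) to (12) are not reproduced in this listing, so the code substitutes forms commonly used in fuzzy Q-learning (an epsilon-greedy rule standing in for the EEP, activation-weighted sums for the inferred action and its value, and a temporal-difference update). The triangular membership functions, the one-dimensional toy environment, the reward, and all names and parameter values (`gamma`, `eta`, `epsilon`, `centers`, `actions`) are assumptions made for the example.

```python
import random


def triangular(x, left, center, right):
    """Triangular membership function (an assumed shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


def firing_strengths(x, centers, width):
    """Normalized degree of activation of each rule in state x."""
    raw = [triangular(x, c - width, c, c + width) for c in centers]
    total = sum(raw)
    if total == 0.0:
        return [1.0 / len(raw)] * len(raw)
    return [v / total for v in raw]


def select_action(q_row, epsilon, rng):
    """Epsilon-greedy choice, used here as a stand-in for the EEP."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def fuzzy_q_learning(steps=2000, gamma=0.9, eta=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    centers = [0.0, 0.5, 1.0]   # hypothetical rule centers on [0, 1]
    actions = [-0.1, 0.0, 0.1]  # hypothetical candidate actions per rule

    # Initialize q for all rules and actions, and observe the state.
    q = [[0.0] * len(actions) for _ in centers]
    x = rng.random()
    # Degree of activation of every rule in the current state.
    alpha = firing_strengths(x, centers, 0.5)

    for _ in range(steps):
        # Select one action per rule (assumed form of formula (7)).
        chosen = [select_action(row, epsilon, rng) for row in q]
        # Inferred action: activation-weighted sum (assumed formula (8)).
        a = sum(al * actions[j] for al, j in zip(alpha, chosen))
        # Value of the inferred action (assumed formula (9)).
        q_value = sum(al * q[i][j]
                      for i, (al, j) in enumerate(zip(alpha, chosen)))
        # Toy environment: move along [0, 1], reward is best near 0.5.
        x_next = min(1.0, max(0.0, x + a))
        reward = -abs(x_next - 0.5)
        # Degree of activation of every rule in the next state.
        alpha_next = firing_strengths(x_next, centers, 0.5)
        # Value of the next state (assumed formula (10)).
        v_next = sum(al * max(q[i]) for i, al in enumerate(alpha_next))
        # Temporal-difference error (assumed formula (11)).
        delta = reward + gamma * v_next - q_value
        # Update the q-values of the selected actions (assumed formula (12)).
        for i, (al, j) in enumerate(zip(alpha, chosen)):
            q[i][j] += eta * delta * al
        # The next state becomes the current state; loop back to selection.
        x, alpha = x_next, alpha_next
    return q
```

Calling `fuzzy_q_learning()` returns the learned table with one row per rule and one column per candidate action. Because the activations for the next state are already computed inside the loop, the sketch jumps back to action selection rather than recomputing them, which mirrors the "go to (3)" in the listing.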
