Research Article

Composition of Web Services Using Markov Decision Processes and Dynamic Programming

Algorithm 4: Sarsa algorithm.
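(1) initialize $Q(s, a)$ arbitrarily
(2) foreach training episode do
(3)  initialize $s$
(4)  choose $a$ from $s$ using policy derived from $Q$
(5)  repeat for each step of episode
(6)   take action $a$, observe $r$, $s'$
(7)   choose $a'$ from $s'$ using policy derived from $Q$
(8)   $Q(s, a) \leftarrow Q(s, a) + \alpha [r + \gamma Q(s', a') - Q(s, a)]$
(9)   $s \leftarrow s'$; $a \leftarrow a'$
(10)  until $s$ is terminal
(11) end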
(11) end
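
The listing above can be illustrated with a minimal tabular sketch in Python. It is not the authors' implementation. The epsilon-greedy rule used as the "policy derived from Q", the parameter values, and the names `sarsa`, `epsilon_greedy`, and `chain_step` are all assumptions made for illustration. The five-state chain environment is hypothetical and has nothing to do with web service composition. It only provides a small episodic task on which the update rule can run.

```python
import random
from collections import defaultdict


def epsilon_greedy(Q, state, actions, epsilon, rng):
    """Assumed 'policy derived from Q': random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(Q[(state, a)] for a in actions)
    # Break ties at random so an all-zero table does not favor one action.
    return rng.choice([a for a in actions if Q[(state, a)] == best])


def sarsa(step, start_state, actions, episodes=500,
          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Sarsa; `step(s, a)` must return (reward, next_state, done)."""
    rng = random.Random(seed)
    Q = defaultdict(float)                      # step (1): Q(s, a) starts at 0
    for _ in range(episodes):                   # step (2)
        s = start_state                         # step (3)
        a = epsilon_greedy(Q, s, actions, epsilon, rng)               # step (4)
        done = False
        while not done:                         # steps (5) and (10)
            r, s_next, done = step(s, a)        # step (6)
            a_next = epsilon_greedy(Q, s_next, actions, epsilon, rng)  # step (7)
            # Step (8). Terminal states are never updated, so their Q stays 0.
            Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
            s, a = s_next, a_next               # step (9)
    return Q


def chain_step(state, action):
    """Hypothetical toy task: states 0..4 on a line, state 4 is terminal (reward 1)."""
    nxt = min(state + 1, 4) if action == "right" else max(state - 1, 0)
    done = nxt == 4
    return (1.0 if done else 0.0), nxt, done


if __name__ == "__main__":
    actions = ["left", "right"]
    Q = sarsa(chain_step, start_state=0, actions=actions)
    for s in range(4):
        print(s, {a: round(Q[(s, a)], 3) for a in actions})
```

Because the update in step (8) uses the action actually chosen in step (7), the learned values reflect the exploring policy itself. That is what makes Sarsa an on-policy method.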