Research Article

Composition of Web Services Using Markov Decision Processes and Dynamic Programming

Algorithm 5

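$Q$-learning algorithm.
(1) initialize $Q(s, a)$ arbitrarily
(2) foreach training episode do
(3)  initialize $s$
(4)  repeat for each step of episode
(5)   choose $a$ from $s$ using policy derived from $Q$
(6)   take action $a$, observe $r$, $s'$
(7)   $Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$
(8)   $s \leftarrow s'$;
(9)  until $s$ is terminal
(10) end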
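
The steps of Algorithm 5 can be sketched in Python as tabular Q-learning. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `q_learning`, the epsilon-greedy action choice, the hyperparameter values, and the four-state chain environment (`chain_step`, `chain_terminal`) are all hypothetical and chosen only to make the example self-contained.

```python
import random


def q_learning(n_states, n_actions, step, start_state, is_terminal,
               episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; `step(s, a)` must return (reward, next_state)."""
    rng = random.Random(seed)
    # (1) initialize Q(s, a) arbitrarily (here: all zeros, an assumption)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):                      # (2) each training episode
        s = start_state                            # (3) initialize s
        while not is_terminal(s):                  # (4)/(9) until s is terminal
            # (5) choose a from s with an epsilon-greedy policy derived from Q
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(Q[s])
                a = rng.choice([i for i, q in enumerate(Q[s]) if q == best])
            r, s_next = step(s, a)                 # (6) take a, observe r, s'
            # (7) move Q(s, a) toward r + gamma * max_a' Q(s', a');
            # a terminal s' contributes no future value
            future = 0.0 if is_terminal(s_next) else max(Q[s_next])
            Q[s][a] += alpha * (r + gamma * future - Q[s][a])
            s = s_next                             # (8) s <- s'
    return Q


# Hypothetical toy environment: a chain of states 0..3, state 3 is terminal.
# Action 0 moves left (state 0 stays put), action 1 moves right.
# Reaching state 3 yields reward 1; every other transition yields 0.
def chain_step(s, a):
    s_next = max(s - 1, 0) if a == 0 else s + 1
    return (1.0 if s_next == 3 else 0.0), s_next


def chain_terminal(s):
    return s == 3


Q = q_learning(4, 2, chain_step, start_state=0, is_terminal=chain_terminal)
# Greedy policy for the non-terminal states, read off the learned table
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(3)]
print(policy)
```

In this sketch the learned greedy policy is read off the table by taking the action with the largest value in each state. In a service-composition setting the states, actions, and rewards would instead come from the composition model, which is outside the scope of this example.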