Research Article

Ubiquitous Robotic Technology for Smart Manufacturing System

Algorithm 3: Q-learning for SMDP.
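Input: SMDP problem P with goal G; initialize value function VF to zero
while error > threshold
    s = random state
    while s does not satisfy G
        for each action a that is applicable to s
            apply a to s, observe the outcome state s' and cost c
        end for
        update VF(s) from the observed costs c and the outcome values VF(s')
        error = max(error, ΔVF(s))
        s = s' (outcome state of the selected action)
    end while
end while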
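The loop structure of Algorithm 3 can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the article: the four-state toy problem, the action names `step` and `jump`, the learning rate `alpha`, the discount `gamma` raised to the action duration `tau`, and the helper names `apply_action`, `value`, and `q_learning_smdp`. Two further choices differ from the listing and are made only so that the stopping test is meaningful: the error is reset at the start of each sweep, and each sweep starts one episode from every non-goal state in random order rather than from a single random state.

```python
import random

GOAL = 3
# Hypothetical toy SMDP: action -> (states advanced, cost, duration tau).
ACTIONS = {"step": (1, 1.0, 1), "jump": (2, 1.5, 2)}


def apply_action(state, action):
    """Deterministic simulator: return (next_state, cost, duration)."""
    advance, cost, tau = ACTIONS[action]
    return min(state + advance, GOAL), cost, tau


def value(q, state):
    """Cost-to-go of a state: 0 at the goal, else the cheapest Q-value."""
    return 0.0 if state == GOAL else min(q[state].values())


def q_learning_smdp(gamma=0.9, alpha=0.5, threshold=1e-6, seed=0):
    rng = random.Random(seed)
    # Value function initialized to zero for every state-action pair.
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(GOAL)}
    error = float("inf")
    while error > threshold:
        error = 0.0  # assumption: reset per sweep so the loop can end
        starts = list(range(GOAL))
        rng.shuffle(starts)  # random start order covering every state
        for s in starts:
            while s != GOAL:
                outcomes = {}
                for a in ACTIONS:  # every action is applicable here
                    s_next, cost, tau = apply_action(s, a)
                    # SMDP target: cost plus duration-discounted cost-to-go.
                    target = cost + gamma ** tau * value(q, s_next)
                    delta = alpha * (target - q[s][a])
                    q[s][a] += delta
                    error = max(error, abs(delta))
                    outcomes[a] = s_next
                # Follow the currently cheapest action to its outcome state.
                best = min(q[s], key=q[s].get)
                s = outcomes[best]
    return q


if __name__ == "__main__":
    learned = q_learning_smdp()
    for state in sorted(learned):
        print(state, learned[state])
```

In this toy problem the learned cost-to-go values approach 2.31, 1.5, and 1.0 for states 0, 1, and 2, and the greedy policy takes `jump` in states 0 and 1 and `step` in state 2. Because costs are minimized rather than rewards maximized, the greedy choice uses `min` where a reward-based formulation would use `max`.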