Research Article

Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach

Algorithm 2

algorithm.
(1) Input: , feature , , and learning rate
(2)
(3) compute teacher model {using equations (4)–(6)}
(4) enhance the demonstration {using equation (9)}
(5) while not converged
(6)   compute , and {using equations (3) and (12)}
(7)   compute {using equation (11)}
(8)
(9) end while
(10) Output:
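
The iterative part of the listing (steps 5 to 9) has the shape of a generic gradient-based update that repeats until the parameters stop changing. The following is a minimal sketch of that loop shape only, not the authors' implementation. The names `iterate_until_converged`, `gradient_fn`, `tol`, and `max_iters` are hypothetical, the ascent sign convention and the convergence test are assumptions, and the teacher model and demonstration enhancement of steps 3 and 4 are left out because equations (4)–(6) and (9) are not reproduced in this section.

```python
from typing import Callable, List, Sequence


def iterate_until_converged(
    weights: Sequence[float],
    gradient_fn: Callable[[Sequence[float]], Sequence[float]],
    learning_rate: float,
    tol: float = 1e-6,
    max_iters: int = 10_000,
) -> List[float]:
    """Repeat a gradient step until the largest weight change falls below tol.

    gradient_fn is a hypothetical stand-in for steps 6 and 7 of the listing:
    it receives the current weights and returns a gradient of the same length.
    """
    w = list(weights)
    for _ in range(max_iters):
        grad = gradient_fn(w)
        # Assumed ascent update; flip the sign for a descent formulation.
        new_w = [wi + learning_rate * gi for wi, gi in zip(w, grad)]
        step = max(abs(a - b) for a, b in zip(new_w, w))
        w = new_w
        if step < tol:  # assumed convergence criterion
            break
    return w


if __name__ == "__main__":
    # Toy gradient for illustration only: it pulls the weights toward a
    # fixed target vector and has nothing to do with the paper's objective.
    target = [1.0, -2.0]
    result = iterate_until_converged(
        [0.0, 0.0],
        lambda w: [t - wi for t, wi in zip(target, w)],
        learning_rate=0.1,
    )
    print(result)
```

Passing the gradient computation in as a callable keeps the loop independent of how the policy and feature statistics are obtained, so any concrete realization of steps 6 and 7 can be plugged in without changing the stopping logic.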