Research Article
Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
(1) Input: , feature , , and learning rate
(2)
(3) compute teacher model {using equations (4)–(6)}
(4) enhance the demonstration {using equation (9)}
(5) while not converged
(6)     compute , and {using equations (3) and (12)}
(7)     compute {using equation (11)}
(8)
(9) end while
(10) Output:
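The control flow of the listing above can be sketched in Python as follows. This is only a minimal skeleton under stated assumptions, not the authors' implementation: the symbols and the referenced equations are not recoverable from the listing, so the teacher model, the demonstration enhancement, and the gradient are passed in as hypothetical callables (`build_teacher_model`, `enhance_demos`, `compute_gradient`). The zero initialisation at step (2), the gradient-ascent update at step (8), and the step-size stopping rule are likewise assumptions.

```python
from typing import Any, Callable, List

Vector = List[float]


def run_irl_loop(
    demos: Any,
    feedback: Any,
    build_teacher_model: Callable[[Any], Any],
    enhance_demos: Callable[[Any, Any], Any],
    compute_gradient: Callable[[Vector, Any], Vector],
    n_features: int,
    learning_rate: float = 0.1,
    tol: float = 1e-6,
    max_iters: int = 1000,
) -> Vector:
    """Skeleton of the loop; every callable is a hypothetical stand-in."""
    # Step (2): initialise the reward weights (assumed: all zeros).
    weights = [0.0] * n_features
    # Step (3): build the teacher model from the evaluative feedback.
    teacher = build_teacher_model(feedback)
    # Step (4): enhance the demonstrations using the teacher model.
    enhanced = enhance_demos(demos, teacher)
    # Step (5): iterate until the update becomes negligible.
    for _ in range(max_iters):
        # Steps (6)-(7): quantities needed for the gradient, then the gradient.
        grad = compute_gradient(weights, enhanced)
        # Step (8): assumed gradient-ascent update with the learning rate.
        step = [learning_rate * g for g in grad]
        weights = [w + s for w, s in zip(weights, step)]
        # Assumed convergence test: largest component of the step below tol.
        if max(abs(s) for s in step) < tol:
            break
    # Step (10): return the learned weights.
    return weights
```

Passing the model-specific pieces as callables keeps the skeleton independent of the particular teacher model and gradient expressions, which would have to be supplied from the paper's equations.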