Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
Figure 9
(a) The performance of the framework under different feedback error values in the interactive phase, given 100 state-action pairs of 60% optimal demonstrations (see Table 2). (b) The performance of our previous work [15] under a setting similar to (a).