Research Article

Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach

Figure 5

Stage-two performance for abundant stage-one demonstrations of different optimality levels (points “A1”, …, “A8” and “B10” in Figure 3), as a function of the number of evaluative feedbacks. The black curve corresponds to the case with no initial demonstration (point “C” in Figure 3).