Research Article
Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
Figure 3
Performance of the standard method used in the first stage of our framework. The plain curves show the mean scores as a function of the number of demonstration steps and the nonoptimality degree. The blue, red, and black circles mark different initialization settings for stage 2 of our framework.