Research Article

Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach

Figure 8

Performance under different exploration policy types used in the interactive phase (stage two), given 100 state-action pairs of 60% optimal demonstrations (see Table 2).