Research Article
Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
Table 3
Learning the driving styles from a human teacher under different demonstration conditions.
| Driving style | Demonstration type | Demonstration time | Optimality degree | Feedbacks (all) | Feedbacks (negative) | EV of learned policy | EV of teacher policy |
|---|---|---|---|---|---|---|---|
| Style 1 | Abundant | 120 sec | 70% | 164 | 62 | 16.821 | 17.164 |
| Style 1 | Sparse | 20 sec | 100% | 302 | 84 | 16.906 | 17.164 |
| Style 1 | No demo | — | — | 409 | 239 | 16.684 | 17.164 |
| Style 2 | Abundant | 120 sec | 70% | 124 | 48 | 14.893 | 15.275 |
| Style 2 | Sparse | 20 sec | 100% | 218 | 59 | 14.969 | 15.275 |
| Style 2 | No demo | — | — | 306 | 182 | 14.877 | 15.275 |