Research Article
Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning
Table 3
Impact of optical flow on the proposed system's performance. The model was trained on 1A + 2B and tested on 2A + 1B (unseen data) and on all data.
| Method | mAP (%), all data | mAP (%), unseen data (2A + 1B) |
| --- | --- | --- |
| 2Stream – WE-VLAD | 17.8 | 14.83 |
| 3Stream – WE-VLAD | 19.21 | 16.65 |
| 2Stream + WE-VLAD | 19.5 | 16.08 |
| 3Stream + WE-VLAD | 20.84 | 17.32 |
| 2Stream + WE + VLAD (rnn) | 20.86 | 16.96 |
| 3Stream + WE + VLAD (rnn) | 21.27 | 17.63 |
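To make the effect reported in Table 3 easier to read, the following minimal sketch computes the difference in mAP between each 3Stream row and its 2Stream counterpart. It assumes, based on the caption only, that the 3Stream variants differ from the 2Stream variants by the added optical-flow stream. The names `RESULTS`, `stream_gain`, and `GAINS` are hypothetical and are not part of the original system; the numbers are copied from the table.

```python
# mAP (%) values copied from Table 3 as (all data, unseen data 2A + 1B).
# Configuration labels are kept as they appear in the table.
RESULTS = {
    "– WE-VLAD": {"2Stream": (17.8, 14.83), "3Stream": (19.21, 16.65)},
    "+ WE-VLAD": {"2Stream": (19.5, 16.08), "3Stream": (20.84, 17.32)},
    "+ WE + VLAD (rnn)": {"2Stream": (20.86, 16.96), "3Stream": (21.27, 17.63)},
}


def stream_gain(results):
    """Return 3Stream minus 2Stream mAP (percentage points) per configuration."""
    gains = {}
    for config, rows in results.items():
        two, three = rows["2Stream"], rows["3Stream"]
        # Round to two decimals to hide floating-point noise.
        gains[config] = tuple(round(t - s, 2) for s, t in zip(two, three))
    return gains


GAINS = stream_gain(RESULTS)

for config, (gain_all, gain_unseen) in GAINS.items():
    print(f"{config}: all data {gain_all:+.2f}, unseen data {gain_unseen:+.2f}")
```

Under that assumption, the 3Stream variant scores higher than the 2Stream variant in every configuration of the table, on both all data and unseen data, with the smallest difference in the WE + VLAD (rnn) rows.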