Research Article

Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning

Table 3

Impact of optical flow on the proposed system’s performance. The model was trained on 1A + 2B and tested on 2A + 1B (unseen data) and on all data.

Method                       mAP (%), all data    mAP (%), unseen data (2A + 1B)

2Stream – WE-VLAD            17.8                 14.83
3Stream – WE-VLAD            19.21                16.65
2Stream + WE-VLAD            19.5                 16.08
3Stream + WE-VLAD            20.84                17.32
2Stream + WE + VLAD (rnn)    20.86                16.96
3Stream + WE + VLAD (rnn)    21.27                17.63