Research Article

Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning

Table 4

Impact of the RNN type (LSTM or GRU) on the proposed system’s performance. Training and testing are performed on the same subset, and results are averaged over the four subsets.

Method                       mAP (%)
                             LSTM     GRU

2Stream + WE - VLAD          20.28    20.33
3Stream + WE - VLAD          20.94    20.94
2Stream + WE + VLAD (rnn)    20.74    20.78
3Stream + WE + VLAD (rnn)    21.31    21.35