Research Article
RGB-D Human Action Recognition of Deep Feature Enhancement and Fusion Using Two-Stream ConvNet
Table 7
Comparison of SV-GCN with other state-of-the-art methods.
| Network | CS (top 1) | CV (top 1) |
| --- | --- | --- |
| DSSCA-SSLM [32] | 74.9% | |
| TCN [33] | 74.3% | 83.10% |
| GCA-LSTM [16] | 76.1% | 84.0% |
| Skelemotion [34] | 76.5% | 84.7% |
| Slowfastnet [12] | 80.25% | 93.74% |
| St-gcn [18] | 81.5% | 88.3% |
| LSTM-CNN [35] | 82.9% | 91.0% |
| Two-stream CNN [36] | 83.2% | 89.3% |
| DPRL+GCNN [19] | 83.5% | 89.8% |
| Cross-attention [21] | 84.2% | 89.3% |
| SV-GCN (ours) | 85.51% | 94.15% |