Research Article
Multimodal Semantics Extraction from User-Generated Videos
Table 3
Performance comparison for the event genre classification task using different feature-sets.
| Event | Ground truth event genre | Automatic classification, feature-set (audio) | Automatic classification, feature-set (sensors) | Automatic classification, feature-set (DSIFT) | Automatic classification, feature-set (global visual features) |
| --- | --- | --- | --- | --- | --- |
| Football match 1 | Sport | Live music | Sport | Sport | Sport |
| Football match 2 | Sport | Sport | Sport | Sport | Sport |
| Football match 3 | Sport | Live music | Sport | Sport | Sport |
| Ice-hockey match 1 | Sport | Live music | Sport | Live music | Sport |
| Ice-hockey match 2 | Sport | Live music | Sport | Live music | Live music |
| Concert 1 | Live music | Live music | Live music | Live music | Live music |
| Concert 2 | Live music | Live music | Live music | Live music | Sport |
| Concert 3 | Live music | Sport | Live music | Live music | Live music |
| Concert 4 | Live music | Live music | Sport | Live music | Live music |
| Total accuracy (%) | - | 44.4 | 88.9 | 77.8 | 77.8 |
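The total accuracy row of Table 3 can be checked directly from the per-event predictions. The following is a minimal sketch, assuming that total accuracy is simply the percentage of the nine events whose predicted genre matches the ground truth. The names `ROWS`, `FEATURE_SETS`, and `total_accuracy` are illustrative and do not come from the article.

```python
# Per-event results transcribed from Table 3.
# Tuple layout: (event, ground truth, audio, sensors, DSIFT, global visual features)
S, M = "Sport", "Live music"
ROWS = [
    ("Football match 1",   S, M, S, S, S),
    ("Football match 2",   S, S, S, S, S),
    ("Football match 3",   S, M, S, S, S),
    ("Ice-hockey match 1", S, M, S, M, S),
    ("Ice-hockey match 2", S, M, S, M, M),
    ("Concert 1",          M, M, M, M, M),
    ("Concert 2",          M, M, M, M, S),
    ("Concert 3",          M, S, M, M, M),
    ("Concert 4",          M, M, S, M, M),
]
FEATURE_SETS = ["audio", "sensors", "DSIFT", "global visual features"]


def total_accuracy(rows):
    """Return the percentage of correctly classified events per feature-set.

    Assumption: accuracy = 100 * (number of matches with ground truth) / (number of events).
    """
    result = {}
    for i, name in enumerate(FEATURE_SETS):
        # Predictions start at tuple index 2; index 1 holds the ground truth.
        correct = sum(1 for row in rows if row[2 + i] == row[1])
        result[name] = round(100.0 * correct / len(rows), 1)
    return result


accuracy = total_accuracy(ROWS)
for name, value in accuracy.items():
    print(f"{name}: {value}%")
```

Under this assumption the computed values agree with the reported totals: 4 of 9 events are correct for audio, 8 of 9 for sensors, and 7 of 9 for both DSIFT and the global visual features.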