Research Article

Multimodal Semantics Extraction from User-Generated Videos

Table 3

Performance comparison for the event genre classification task using different feature-sets.

Automatic event genre classification
| Event | Ground truth event genre | Feature-set (audio) | Feature-set (sensors) | Feature-set (DSIFT) | Feature-set (global visual features) |
|---|---|---|---|---|---|
| Football match 1 | Sport | Live music | Sport | Sport | Sport |
| Football match 2 | Sport | Sport | Sport | Sport | Sport |
| Football match 3 | Sport | Live music | Sport | Sport | Sport |
| Ice-hockey match 1 | Sport | Live music | Sport | Live music | Sport |
| Ice-hockey match 2 | Sport | Live music | Sport | Live music | Live music |
| Concert 1 | Live music | Live music | Live music | Live music | Live music |
| Concert 2 | Live music | Live music | Live music | Live music | Sport |
| Concert 3 | Live music | Sport | Live music | Live music | Live music |
| Concert 4 | Live music | Live music | Sport | Live music | Live music |
| Total accuracy (%) | - | 44.4 | 88.9 | 77.8 | 77.8 |
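
The total accuracy figures can be checked against the per-event predictions: each value is the percentage of the nine events whose predicted genre matches the ground truth. The following is a minimal Python sketch of that arithmetic, with the labels transcribed by hand from Table 3; the variable names and the one-letter encoding (S = Sport, L = Live music) are illustrative assumptions, not part of the article.

```python
# Labels transcribed from Table 3, one character per event in table order
# (3 football matches, 2 ice-hockey matches, 4 concerts).
# S = Sport, L = Live music. The encoding is an illustrative assumption.
ground_truth = "SSSSSLLLL"

predictions = {
    "audio": "LSLLLLLSL",
    "sensors": "SSSSSLLLS",
    "DSIFT": "SSSLLLLLL",
    "global visual features": "SSSSLLSLL",
}


def accuracy_percent(predicted: str, truth: str) -> float:
    """Return the percentage of events whose predicted genre equals the ground truth."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return round(100.0 * correct / len(truth), 1)


accuracy = {name: accuracy_percent(pred, ground_truth)
            for name, pred in predictions.items()}

for name, value in accuracy.items():
    print(f"{name}: {value}%")
```

With nine events, each misclassification costs about 11.1 percentage points, so the 88.9% for the sensor feature-set corresponds to a single error (Concert 4), and 77.8% corresponds to two.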