Research Article

Multimodal Semantics Extraction from User-Generated Videos

Table 4

Performance comparison for the event genre classification task using different feature-sets.

Automatic event genre classification
| Event | Ground truth event genre | Feature-set (audio, sensors) | Feature-set (DSIFT, sensors) | Feature-set (global visual, sensors) | Feature-set (audio, DSIFT, sensors) | Feature-set (audio, global visual, sensors), proposed set |
|---|---|---|---|---|---|---|
| Football match 1 | Sport | Sport | Sport | Sport | Sport | Sport |
| Football match 2 | Sport | Sport | Sport | Sport | Sport | Sport |
| Football match 3 | Sport | Sport | Sport | Sport | Sport | Sport |
| Ice-hockey match 1 | Sport | Sport | Sport | Sport | Live music | Sport |
| Ice-hockey match 2 | Sport | Sport | Sport | Sport | Live music | Sport |
| Concert 1 | Live music | Sport | Sport | Sport | Live music | Live music |
| Concert 2 | Live music | Live music | Live music | Live music | Live music | Live music |
| Concert 3 | Live music | Live music | Live music | Live music | Live music | Live music |
| Concert 4 | Live music | Live music | Live music | Live music | Live music | Live music |
| Total accuracy (%) | | 88.9 | 88.9 | 88.9 | 77.8 | 100 |
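
The total accuracy figures follow directly from the per-event predictions: each is the share of the nine events whose predicted genre matches the ground truth. The short sketch below recomputes them from the labels in Table 4. The helper name `total_accuracy` and the dictionary layout are illustrative assumptions, not part of the article's method.

```python
# Per-event labels transcribed from Table 4, in row order:
# football matches 1-3, ice-hockey matches 1-2, concerts 1-4.
S, M = "Sport", "Live music"

ground_truth = [S, S, S, S, S, M, M, M, M]

# Predicted genre per event for each feature-set (keys are illustrative labels).
predictions = {
    "audio, sensors": [S, S, S, S, S, S, M, M, M],
    "DSIFT, sensors": [S, S, S, S, S, S, M, M, M],
    "global visual, sensors": [S, S, S, S, S, S, M, M, M],
    "audio, DSIFT, sensors": [S, S, S, M, M, M, M, M, M],
    "audio, global visual, sensors": [S, S, S, S, S, M, M, M, M],
}


def total_accuracy(predicted, truth):
    """Return the percentage of events whose predicted genre equals the ground truth."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return 100.0 * correct / len(truth)


# Accuracy per feature-set, rounded to one decimal as in the table.
accuracies = {
    name: round(total_accuracy(labels, ground_truth), 1)
    for name, labels in predictions.items()
}

for name, value in accuracies.items():
    print(f"{name}: {value}")
```

With nine events, a single misclassification (Concert 1) gives 8/9, or 88.9%, and two misclassifications (both ice-hockey matches) give 7/9, or 77.8%.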