Research Article

A Symmetric Fusion Learning Model for Detecting Visual Relations and Scene Parsing

Table 4

Ablation studies on the key components of our model. We report relation and phrase detection performance as R@n scores (R@50 and R@100).

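| K | Method | Relation detection R@50 | Relation detection R@100 | Phrase detection R@50 | Phrase detection R@100 |
|---|---|---|---|---|---|
| 1 | Baseline (20% noisy labels) | 18.00 | 22.44 | 26.37 | 31.42 |
| 1 | Baseline +  +  (20% noisy labels) | 24.09 | 28.13 | 29.34 | 34.51 |
| 70 | Baseline (20% noisy labels) | 21.52 | 26.38 | 28.24 | 35.44 |
| 70 | Baseline +  +  (20% noisy labels) | 25.06 | 30.83 | 30.47 | 38.07 |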