Research Article

A Symmetric Fusion Learning Model for Detecting Visual Relations and Scene Parsing

Table 3

Ablation studies on the key components of our method. We report phrase detection and relationship detection performance as R@n scores (R@50 and R@100).

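| K | Method | Relation detection R@50 | Relation detection R@100 | Phrase detection R@50 | Phrase detection R@100 |
|---|---|---|---|---|---|
| 1 | Baseline | 25.39 | 29.67 | 31.50 | 37.00 |
| | Baseline + | 25.62 | 29.92 | 31.45 | 37.21 |
| | Baseline +  + | 26.01 | 29.90 | 32.02 | 37.31 |
| 70 | Baseline | 27.38 | 34.33 | 33.56 | 41.90 |
| | Baseline + | 27.38 | 34.81 | 33.91 | 42.46 |
| | Baseline +  + | 28.63 | 35.21 | 34.88 | 43.07 |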
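
For readers unfamiliar with the R@n scores reported above, the following is a minimal sketch of how a recall-at-n percentage can be computed. It is an illustration under stated assumptions, not the evaluation code used for this table: the function name `recall_at_n` and the data layout are hypothetical, and a prediction is counted as correct when its (subject, predicate, object) triple exactly equals a ground-truth triple, which leaves out the bounding-box overlap check that detection benchmarks normally apply.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); assumed layout


def recall_at_n(
    predictions: List[List[Tuple[float, Triple]]],
    ground_truth: List[Set[Triple]],
    n: int,
) -> float:
    """Percentage of ground-truth triples found among the top-n predictions.

    predictions[i] holds (confidence, triple) pairs for image i, and
    ground_truth[i] holds the annotated triples for the same image.
    Matching is exact triple equality (a simplifying assumption).
    """
    hits = 0
    total = 0
    for preds, gt in zip(predictions, ground_truth):
        # Keep only the n most confident predictions for this image.
        top = sorted(preds, key=lambda p: p[0], reverse=True)[:n]
        top_triples = {triple for _, triple in top}
        hits += len(gt & top_triples)
        total += len(gt)
    # Recall is pooled over all images, then expressed as a percentage.
    return 100.0 * hits / total if total else 0.0


# Toy data: one image, three scored predictions, two annotated relations.
toy_predictions = [[
    (0.9, ("person", "ride", "horse")),
    (0.8, ("person", "wear", "hat")),
    (0.1, ("horse", "on", "grass")),
]]
toy_ground_truth = [{("person", "ride", "horse"), ("horse", "on", "grass")}]

print(recall_at_n(toy_predictions, toy_ground_truth, n=2))
print(recall_at_n(toy_predictions, toy_ground_truth, n=3))
```

In this sketch a larger n can only keep or raise the score, since the top-n set grows with n. This matches the pattern in the table, where each R@100 value is at least as large as the corresponding R@50 value.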