Research Article
A Symmetric Fusion Learning Model for Detecting Visual Relations and Scene Parsing
Table 2
Comparison with state-of-the-art methods on the VRD data set.
| Method | Relation detection, k = 1, R@50 | Relation detection, k = 1, R@100 | Relation detection, k = 70, R@50 | Relation detection, k = 70, R@100 | Phrase detection, k = 1, R@50 | Phrase detection, k = 1, R@100 | Phrase detection, k = 70, R@50 | Phrase detection, k = 70, R@100 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| VRD [3] | 17.03 | 16.17 | 24.90 | 20.04 | 14.70 | 13.86 | 21.51 | 17.35 |
| KL distillation [23] | 19.17 | 21.34 | 22.68 | 31.89 | 23.14 | 24.03 | 26.32 | 29.43 |
| Zoom-net [25] | 18.92 | 21.41 | 21.37 | 27.30 | 24.82 | 28.09 | 29.05 | 37.34 |
| CAI + SCA-M [25] | 19.54 | 22.39 | 22.34 | 28.52 | 25.21 | 28.89 | 29.64 | 38.39 |
| RelDN [37] | 18.92 | 22.96 | 21.52 | 26.38 | 26.37 | 31.42 | 28.24 | 35.44 |
| AVR [42] | 22.83 | 25.41 | 27.35 | 32.96 | 29.33 | 33.27 | 34.51 | 41.36 |
| GPS-Net [43] | 21.50 | 24.30 | — | — | 28.90 | 34.00 | — | — |
| SABRA [44] | 24.47 | 29.16 | 27.27 | 33.99 | 30.57 | 36.80 | 33.39 | 41.79 |
| Ours | | | | | | | | |