Research Article

A Symmetric Fusion Learning Model for Detecting Visual Relations and Scene Parsing

Table 1

Comparison with state-of-the-art methods on the Visual Genome (VG) data set.

Method                      SGGen                SGCls                PredCls
                            R@20   R@50   R@100  R@20   R@50   R@100  R@20   R@50   R@100

IMP [29]                    14.6   20.7   24.5   31.7   34.6   35.4   52.7   59.3   61.3
Frequency [31]              17.7   23.5   27.6   27.7   32.4   34.0   49.4   59.9   64.1
Frequency + overlap [31]    20.1   26.2   30.1   29.3   32.3   32.9   53.6   60.6   62.2
MotifNet-LeftRight [31]     21.4   27.2   30.3   32.9   35.8   36.5   58.5   65.2   67.1
Graph R-CNN [39]            -      11.4   13.7   -      29.6   31.6   -      54.2   59.1
VCTREE-SL [40]              21.7   27.7   31.1   35.0   37.9   38.6   59.8   66.2   67.9
RelDN [37]                  21.1   28.3   32.7   36.1   36.8   36.8   66.9   68.4   68.4
VCTREE + TranstextNet [41]  -      28.1   31.7   -      38.3   39.3   -      66.9   68.7
Ours