Research Article

Context-Fused Guidance for Image Captioning Using Sequence-Level Training

Table 4

Performance comparison of the ablative models.

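| Model | BLEU4 (cross-entropy training) | CIDEr (cross-entropy training) | SPICE (cross-entropy training) | BLEU4 (CIDEr optimization) | CIDEr (CIDEr optimization) | SPICE (CIDEr optimization) |
| --- | --- | --- | --- | --- | --- | --- |
| CFGV | 36.1 | 112.8 | 20.3 | 37.7 | 123.9 | 21.0 |
| CFGE | 36.1 | 112.9 | 20.5 | 37.8 | 124.6 | 21.1 |
| CFGA | 36.3 | 113.0 | 20.6 | 38.1 | 124.6 | 21.4 |
| CFG | 36.8 | 114.0 | 20.8 | 38.3 | 125.4 | 21.6 |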