Research Article
Context-Fused Guidance for Image Captioning Using Sequence-Level Training
Table 2
Performance comparisons on MS COCO Karpathy test split under cross-entropy training.
| Method (cross-entropy loss) | BLEU1 | BLEU2 | BLEU3 | BLEU4 | METEOR | ROUGE-L | CIDEr | SPICE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NIC [16] | — | — | — | 29.6 | — | 52.6 | 94.0 | — |
| SCST [11] | — | — | — | 30.0 | 25.9 | 53.4 | 99.4 | — |
| Up-down [4] | 77.2 | — | — | 36.2 | 27.0 | 56.4 | 113.5 | 20.3 |
| RFNet [17] | 76.4 | 60.4 | 46.6 | 35.8 | 27.4 | 56.8 | 112.5 | 20.5 |
| HAN [20] | 77.2 | 61.2 | 47.7 | 36.2 | 27.5 | 56.6 | 114.8 | 20.6 |
| RAtt-Soft [29] | 79.2 | 61.8 | 47.6 | 36.9 | 28.3 | 60.9 | 114.3 | 20.8 |
| CFG | 77.1 | 61.5 | 47.9 | 36.8 | 27.7 | 56.7 | 114.0 | 20.8 |
The best results (%) are highlighted in boldface. The symbol “—” indicates the results are not reported.