Research Article
Automatic Image Captioning Based on ResNet50 and LSTM with Soft Attention
Table 5
Performance comparison on the MS COCO 2014 dataset.
| Model | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR | CIDEr |
| --- | --- | --- | --- | --- | --- | --- |
| Nearest neighbor [27] | 0.48 | 0.281 | 0.166 | 0.1 | 0.157 | 0.383 |
| Google NIC [28] | 0.666 | 0.461 | 0.329 | 0.246 | — | — |
| LRCN [24] | 0.628 | 0.442 | 0.304 | — | — | — |
| MS research [29] | — | — | — | 0.211 | 0.207 | — |
| Chen and Zitnick [23] | — | — | — | 0.19 | 0.204 | 0.141 |
| Log bilinear [25] | 0.708 | 0.489 | 0.344 | 0.243 | 0.2 | — |
| DVS [26] | 0.625 | 0.45 | 0.321 | 0.23 | 0.195 | 0.66 |
| AICRL-ResNet50 | 0.731 | 0.562 | 0.41 | 0.326 | 0.261 | 0.872 |
| AICRL-VGG16 | 0.702 | 0.536 | 0.398 | 0.295 | 0.236 | 0.857 |
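To make the margins in Table 5 easier to read, the following minimal sketch computes, for each metric, the gap between AICRL-ResNet50 and the best previously published baseline that reports that metric. It is an illustration only, not part of the original evaluation pipeline: the scores are copied from the table, blank entries are encoded as `None`, and the names `BASELINES`, `best_baseline`, and `margins` are hypothetical helpers introduced here. The second AICRL variant is left out for brevity.

```python
# Scores copied from Table 5; None marks an entry the table leaves blank.
METRICS = ["BLEU-1", "BLEU-2", "BLEU-3", "BLEU-4", "METEOR", "CIDEr"]

# Previously published baselines only (hypothetical container name).
BASELINES = {
    "Nearest neighbor [27]": [0.48, 0.281, 0.166, 0.1, 0.157, 0.383],
    "Google NIC [28]": [0.666, 0.461, 0.329, 0.246, None, None],
    "LRCN [24]": [0.628, 0.442, 0.304, None, None, None],
    "MS research [29]": [None, None, None, 0.211, 0.207, None],
    "Chen and Zitnick [23]": [None, None, None, 0.19, 0.204, 0.141],
    "Log bilinear [25]": [0.708, 0.489, 0.344, 0.243, 0.2, None],
    "DVS [26]": [0.625, 0.45, 0.321, 0.23, 0.195, 0.66],
}

AICRL_RESNET50 = [0.731, 0.562, 0.41, 0.326, 0.261, 0.872]


def best_baseline(metric_index):
    """Return (model, score) for the top baseline that reports this metric."""
    reported = [
        (name, scores[metric_index])
        for name, scores in BASELINES.items()
        if scores[metric_index] is not None
    ]
    return max(reported, key=lambda pair: pair[1])


def margins():
    """Map each metric to (best baseline, AICRL-ResNet50 score minus its score)."""
    result = {}
    for i, metric in enumerate(METRICS):
        name, score = best_baseline(i)
        result[metric] = (name, round(AICRL_RESNET50[i] - score, 3))
    return result


if __name__ == "__main__":
    for metric, (name, gap) in margins().items():
        print(f"{metric}: +{gap:.3f} over {name}")
```

Because several baselines do not report every metric, the strongest reported baseline differs from column to column, so the margins should be read per metric rather than as a comparison against a single competing model.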