Research Article
Bag of Visual Words Model with Deep Spatial Features for Geographical Scene Classification
Table 4
Classification accuracy of different fine-tuning models on the 12-scene dataset.
| Number | Method | Accuracy on 12-scene (%) | Time (h) |
|---|---|---|---|
| 1 | Means + SVM | 59.80 | 0.5 |
| 2 | SIFT + BoVW | 61.02 | 0.5 |
| 3 | Local–global feature BoVW [34] | 60.23 | 1.2 |
| 4 | Fine-tuning CIFAR + BoVW | 51.12 | 12 |
| 5 | Fine-tuning AlexNet + BoVW | 67.01 ± 1.22 | 28 |
| 6 | Fine-tuning GoogLeNet + BoVW | 68.21 ± 0.61 | 36 |
| 7 | Our approach | 75.12 | 23 |