Review Article

Bag-of-Words Representation in Image Annotation: A Review

Table 1

Comparisons of interest point detection, visual words generation, and learning models.

| Work | Region/point detection | Local descriptor | Clustering algorithm | No. of visual words | Weighting scheme | Learning model |
|---|---|---|---|---|---|---|
| 2012 | | | | | | |
| de Campos et al. [70] | DoG | SIFT | | | | Logistic regression |
| Elfiky et al. [97] | Harris-Laplace | SIFT/HSV color + SIFT | k-means | | | SVM |
| Fernando et al. [68] | Harris-Laplace | PCA-SIFT/SIFT/SURF¹ | k-means | 2000 | | SVM |
| Gavves et al. [77] | | SIFT/SURF | | 200000 | | |
| Kesorn and Poslad [80] | DoG | SIFT | SLAC² | | Binary/TF/TF-IDF | Naïve Bayes/SVM-linear/SVM-RBF |
| Lee and Grauman [103] | NCuts³ | Texton histogram | k-means | 400 | | SVM |
| Qin and Yung [64] | | Color SIFT | k-means | | | SVM-linear/SVM-poly/SVM-RBF |
| Romberg et al. [102] | | SIFT | k-means | | | mm-pLSA⁴ |
| Shang and Xiao [99] | | SIFT | k-means | 1000 | | SVM |
| Stottinger et al. [104] | Harris-Laplace | RGB Harris with Laplacian scale selection | k-means | 4000 | | SVM |
| Tong et al. [100] | Harris-Laplace | SIFT | AKM⁵ | | | |
| 2011 | | | | | | |
| Hare et al. [73] | DoG/MSER | SIFT | AKM | 1000–100000 | IDF | |
| López-Sastre et al. [78] | Hessian-Laplace | SIFT | CPM and Adaptive Refinement | 3818 | | SVM |
| Luo et al. [18] | DoG | SIFT | k-means | 500 | TF | SVM |
| Van Gemert [65] | Harris and Hessian-Laplace | SIFT | k-means | 2000 | | |
| Yang et al. [37] | | SIFT | k-means | 1000 | | SVM |
| Zhang et al. [76] | DoG | SIFT | HKM⁶ | 32357 | TF-IDF | |
| Zhang et al. [38] | DoG | SIFT | HKM | 32400 | TF-IDF | |
| 2010 | | | | | | |
| Bae and Juang [79] | Dense sampling | | | 171329 | | |
| Chen et al. [62] | Hessian-Laplace | SIFT | GMM-BIC⁷ | 3500 | TF | |
| Cheng and Wang [82] | Mean-shift⁸ | HSV color histogram and co-occurrence matrix | | | | SVM |
| Ding et al. [105] | DoG | PCA-SIFT | k-means | 2000 | | SVM |
| Jégou et al. [22] | Hessian-Laplace | SIFT | k-means | 200000 | TF-IDF | |
| Jiang et al. [17] | DoG | SIFT | k-means | 500–10000 | Binary/TF/TF-IDF/soft-weighting | SVM |
| Li and Godil [87] | DoG | SIFT | k-means | 500/700/800 | TF | pLSA |
| Qin and Yung [106] | | PCA-SIFT | Accelerated k-means | 32/128/2048/4096 | | SVM |
| Tirilly et al. [107] | Hessian-Laplace | SIFT | HKM | 6556 to 117151 | | |
| Uijlings et al. [33] | | PCA-SIFT | k-means/random forest | 4096 | | SVM |
| Wu et al. [69] | | SIFT | k-means | 2500–4500 | | Naïve Bayes/SVM |
| 2009 | | | | | | |
| Chen et al. [39] | DoG | SIFT | k-means | 1000 | Spatial weighting | |
| Lu and Ip [41] | Dense sampling | HSV color + Gabor texture | k-means | 100/200 | | SVM |
| Lu and Ip [42] | Dense sampling | HSV color + Gabor texture | k-means | 100/200 | | LLP⁹/GLP¹⁰/SVM |
| S. Kim and D. Kim [40] | Dense sampling | SIFT/SURF | k-means | 500/1500/3000 | TF | pLSA/SVM |
| Uijlings et al. [43] | Dense sampling | SIFT | k-means | 4096 | | SVM |
| Xiang et al. [108] | NCuts | 36 region features¹¹ | | | | MRFA¹² |
| Zhang et al. [94] | | SIFT | HKM | 32357 | TF-IDF | |
| 2008 | | | | | | |
| Bosch et al. [98] | Harris-Laplace | Color SIFT | k-means | 1500 | | k-NN/SVM |
| Liu et al. [96] | Harris-Laplace | SIFT | k-means | 1000 | | SVM-linear |
| Marszałek and Schmid [109] | Harris-Laplace | SIFT | k-means | 8000 | | SVM |
| Rasiwasia and Vasconcelos [66] | | DCT¹³ coefficients | | | | Hierarchical Dirichlet models/SVM |
| Tirilly et al. [81] | | SIFT | HKM | 6556/61687 | TF-IDF | SVM |
| Van de Sande et al. [110] | Harris-Laplace | Color SIFT | k-means | 4000 | | SVM |
| Zheng et al. [71] | DoG + Hessian-Laplace | SIFT + Spin¹⁴ | k-means | 1010 | | SVM |
| 2007 | | | | | | |
| Bosch et al. [24] | Dense sampling | HSV color + co-occurrence + edge | k-means | 700 | | pLSA |
| Chum et al. [52] | Hessian-Laplace | SIFT | k-means | | TF-IDF | |
| Gökalp and Aksoy [28] | Dense sampling | HSV color | k-means | | | Bayesian classifier |
| Hörster and Lienhart [21] | DoG/dense sampling | Color SIFT | k-means | | | LDA |
| Jégou et al. [74] | | SIFT | k-means | 30000 | | |
| Li and Fei-Fei [111] | Dense sampling | SIFT | k-means | 300 | TF | LDA |
| Lienhart and Slaney [93] | | SIFT | k-means | | TF | pLSA |
| Philbin et al. [45] | Hessian-Laplace | SIFT | AKM | 1 M | | |
| Quelhas et al. [13] | DoG | SIFT | k-means | 1000 | | SVM/pLSA |
| Wu et al. [46] | Dense sampling | Texture histogram | | | | Unigram/bigram/trigram models |
| Junsong et al. [112] | DoG | PCA-SIFT | k-means | 160/500 | | |
| 2006 | | | | | | |
| Agarwal and Triggs [47] | Dense sampling | SIFT | EM¹⁵ | | | LDA/SVM |
| Bosch et al. [29] | Dense sampling | Color SIFT | k-means | 1500 | | k-NN/pLSA |
| Lazebnik et al. [48] | Dense sampling | SIFT | k-means | 200/400 | | SVM |
| Marszałek and Schmid [49] | Harris-Laplace | SIFT | k-means | 1000 | TF | SVM |
| Monay et al. [50] | DoG | SIFT | k-means | 1000 | TF | pLSA |
| Moosmann et al. [72] | Dense sampling/DoG | HSV color + wavelet/SIFT | Extremely randomized trees | | | SVM |
| Perronnin et al. [113] | DoG | PCA-SIFT | | 1024 | | SVM-linear |

¹ Speeded up robust features [114].
² Search ant and labor ant clustering algorithm [115].
³ Normalized cuts [116].
⁴ Multilayer modality pLSA.
⁵ Approximate k-means.
⁶ Hierarchical k-means.
⁷ Gaussian mixture model with Bayesian information criterion.
⁸ Mean shift region segmentation algorithm [117].
⁹ Local label propagation on the k-NN graph.
¹⁰ Global label propagation on the complete graph.
¹¹ Region color and standard deviation, region average orientation energy (12 filters), region size, location, convexity, first moment, and ratio of region area to boundary length squared [118].
¹² Multiple Markov random fields.
¹³ Discrete cosine transform.
¹⁴ A rotation-invariant two-dimensional histogram of intensities within an image region [71].
¹⁵ Expectation maximization.
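
The columns of Table 1 correspond to successive stages of a bag-of-words pipeline: local descriptors are assigned to visual words from a codebook, the word counts are weighted (binary, TF, or TF-IDF), and the resulting vector is passed to a learning model. The following Python sketch illustrates only the quantization and weighting stages and is not taken from any of the surveyed works. It rests on several assumptions: the 2-D toy descriptors stand in for real descriptors such as 128-dimensional SIFT, the three-word codebook is fixed by hand rather than learned by k-means, and the TF-IDF variant (length-normalized TF times log(N/df)) is one common choice that individual papers may define differently. All function names are hypothetical.

```python
import math
from collections import Counter


def nearest_word(descriptor, codebook):
    """Return the index of the codebook centre closest to the descriptor
    (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def tf_histogram(descriptors, codebook):
    """Term frequency: how many local descriptors map to each visual word."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]


def binary_weights(tf):
    """Binary weighting: 1 if the visual word occurs in the image, else 0."""
    return [1 if count > 0 else 0 for count in tf]


def tf_idf_weights(tf_per_image):
    """TF-IDF weighting (assumed variant): normalized TF times log(N / df)."""
    n_images = len(tf_per_image)
    n_words = len(tf_per_image[0])
    # Document frequency: number of images in which each word occurs.
    df = [sum(1 for tf in tf_per_image if tf[k] > 0) for k in range(n_words)]
    idf = [math.log(n_images / df[k]) if df[k] else 0.0 for k in range(n_words)]
    weighted = []
    for tf in tf_per_image:
        total = sum(tf) or 1  # avoid division by zero for empty images
        weighted.append([(tf[k] / total) * idf[k] for k in range(n_words)])
    return weighted


# Hand-picked toy codebook of three "visual words" (assumption: in practice
# this would be learned, e.g. by k-means over training descriptors).
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# Toy local descriptors for two images.
image_a = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0)]
image_b = [(1.0, 9.0), (0.0, 11.0), (10.0, 9.0)]

tf_a = tf_histogram(image_a, codebook)   # [2, 1, 0]
tf_b = tf_histogram(image_b, codebook)   # [0, 1, 2]
bin_a = binary_weights(tf_a)             # [1, 1, 0]
tfidf_a, tfidf_b = tf_idf_weights([tf_a, tf_b])

print(tf_a, tf_b, bin_a)
print(tfidf_a, tfidf_b)
```

In this toy example the second visual word occurs in both images, so its IDF is zero and TF-IDF suppresses it, whereas binary and TF weighting keep it. The weighted vectors would then be the input to whichever learning model (SVM, pLSA, and so on) a given work uses.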