ISRN Artificial Intelligence
Volume 2012 (2012), Article ID 376804, 19 pages
http://dx.doi.org/10.5402/2012/376804
Review Article

Bag-of-Words Representation in Image Annotation: A Review

Department of Information Management, National Central University, Jhongli 32001, Taiwan

Received 26 August 2012; Accepted 19 September 2012

Academic Editors: F. Camastra, J. A. Hernandez, P. Kokol, J. Wang, and S. Zhu

Copyright © 2012 Chih-Fong Tsai. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-based image retrieval at the end of the early years,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349–1380, 2000.
  2. M. L. Kherfi, D. Ziou, and A. Bernardi, “Image retrieval from the World Wide Web: issues, techniques, and systems,” ACM Computing Surveys, vol. 36, no. 1, pp. 35–67, 2004.
  3. R. Datta, D. Joshi, J. Li, and J. Z. Wang, “Image retrieval: ideas, influences, and trends of the new age,” ACM Computing Surveys, vol. 40, no. 2, article 5, 2008.
  4. Y. Choi and E. M. Rasmussen, “Users' relevance criteria in image retrieval in American history,” Information Processing and Management, vol. 38, no. 5, pp. 695–726, 2002.
  5. M. Markkula, M. Tico, B. Sepponen, K. Nirkkonen, and E. Sormunen, “A test collection for the evaluation of content-based image retrieval algorithms—a user and task-based approach,” Information Retrieval, vol. 4, no. 3-4, pp. 275–293, 2001.
  6. A. Goodrum and A. Spink, “Image searching on the Excite Web search engine,” Information Processing and Management, vol. 37, no. 2, pp. 295–311, 2001.
  7. C. F. Tsai and C. Hung, “Automatically annotating images with keywords: a review of image annotation systems,” Recent Patents on Computer Science, vol. 1, no. 1, pp. 55–68, 2008.
  8. A. Hanbury, “A survey of methods for image annotation,” Journal of Visual Languages and Computing, vol. 19, no. 5, pp. 617–627, 2008.
  9. D. Zhang, M. M. Islam, and G. Lu, “A review on automatic image annotation techniques,” Pattern Recognition, vol. 45, pp. 346–362, 2011.
  10. A. Pinz, “Object categorization,” Foundations and Trends in Computer Graphics and Vision, vol. 1, no. 4, pp. 255–353, 2006.
  11. C. F. Tsai, K. McGarry, and J. Tait, “CLAIRE: a modular support vector image indexing and classification system,” ACM Transactions on Information Systems, vol. 24, no. 3, pp. 353–379, 2006.
  12. W.-C. Lin, M. Oakes, J. Tait, and C.-F. Tsai, “Improving image annotation via useful representative feature selection,” Cognitive Processing, vol. 10, no. 3, pp. 233–242, 2009.
  13. P. Quelhas, F. Monay, J. M. Odobez, D. Gatica-Perez, and T. Tuytelaars, “A thousand words in a scene,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 9, pp. 1575–1589, 2007.
  14. J. Sivic and A. Zisserman, “Video Google: a text retrieval approach to object matching in videos,” in Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV '03), pp. 1470–1477, October 2003.
  15. J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, and W. T. Freeman, “Discovering objects and their location in images,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), pp. 370–377, October 2005.
  16. R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman, “Learning object categories from Google's image search,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), pp. 1816–1823, October 2005.
  17. Y. G. Jiang, J. Yang, C. W. Ngo, and A. G. Hauptmann, “Representations of keypoint-based semantic concept detection: a comprehensive study,” IEEE Transactions on Multimedia, vol. 12, no. 1, pp. 42–53, 2010.
  18. H. L. Luo, H. Wei, and L. L. Lai, “Creating efficient visual codebook ensembles for object categorization,” IEEE Transactions on Systems, Man, and Cybernetics Part A, vol. 41, no. 2, pp. 238–253, 2010.
  19. J. Fan, Y. Gao, and H. Luo, “Multi-level annotation of natural scenes using dominant image components and semantic concepts,” in Proceedings of the 12th ACM International Conference on Multimedia (MM '04), pp. 540–547, October 2004.
  20. G. Wang, Y. Zhang, and L. Fei-Fei, “Using dependent regions for object categorization in a generative framework,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), pp. 1597–1604, June 2006.
  21. E. Hörster and R. Lienhart, “Fusing local image descriptors for large-scale image retrieval,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, June 2007.
  22. H. Jégou, M. Douze, and C. Schmid, “Improving bag-of-features for large scale image search,” International Journal of Computer Vision, vol. 87, no. 3, pp. 316–336, 2010.
  23. D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
  24. A. Bosch, X. Muñoz, and R. Martí, “Which is the best way to organize/classify images by content?” Image and Vision Computing, vol. 25, no. 6, pp. 778–791, 2007.
  25. K. Mikolajczyk, B. Leibe, and B. Schiele, “Local features for object class recognition,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), pp. 1792–1799, October 2005.
  26. T. Tuytelaars and K. Mikolajczyk, “Local invariant feature detectors: a survey,” Foundations and Trends in Computer Graphics and Vision, vol. 3, no. 3, pp. 177–280, 2007.
  27. K. Mikolajczyk, T. Tuytelaars, C. Schmid et al., “A comparison of affine region detectors,” International Journal of Computer Vision, vol. 65, no. 1-2, pp. 43–72, 2005.
  28. D. Gökalp and S. Aksoy, “Scene classification using bag-of-regions representations,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, June 2007.
  29. A. Bosch, A. Zisserman, and X. Muñoz, “Scene classification via pLSA,” in European Conference on Computer Vision, pp. 517–530, 2006.
  30. F. Jurie and B. Triggs, “Creating efficient codebooks for visual recognition,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), pp. 604–610, October 2005.
  31. L. Fei-Fei and P. Perona, “A Bayesian hierarchical model for learning natural scene categories,” in Proceedings of the 6th IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '05), pp. 524–531, June 2005.
  32. Y. Ke and R. Sukthankar, “PCA-SIFT: a more distinctive representation for local image descriptors,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '04), pp. 506–513, July 2004.
  33. J. R. R. Uijlings, A. W. M. Smeulders, and R. J. H. Scha, “Real-time visual concept classification,” IEEE Transactions on Multimedia, vol. 12, no. 7, pp. 665–681, 2010.
  34. K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615–1630, 2005.
  35. J. Zhang, M. Marszałek, S. Lazebnik, and C. Schmid, “Local features and kernels for classification of texture and object categories: a comprehensive study,” International Journal of Computer Vision, vol. 73, no. 2, pp. 213–238, 2007.
  36. Z. Li, Z. Shi, X. Liu, Z. Li, and Z. Shi, “Fusing semantic aspects for image annotation and retrieval,” Journal of Visual Communication and Image Representation, vol. 21, no. 8, pp. 798–805, 2010.
  37. L. Yang, N. Zheng, and J. Yang, “A unified context assessing model for object categorization,” Computer Vision and Image Understanding, vol. 115, no. 3, pp. 310–322, 2011.
  38. S. Zhang, Q. Tian, G. Hua et al., “Modeling spatial and semantic cues for large-scale near-duplicated image retrieval,” Computer Vision and Image Understanding, vol. 115, no. 3, pp. 403–414, 2011.
  39. X. Chen, X. Hu, and X. Shen, “Spatial weighting for bag-of-visual-words and its application in content-based image retrieval,” in Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 867–874, 2009.
  40. S. Kim and D. Kim, “Scene classification using pLSA with visterm spatial location,” in Proceedings of the 1st ACM International Workshop on Interactive Multimedia for Consumer Electronics (IMCE '09), pp. 57–66, October 2009.
  41. Z. Lu and H. H. S. Ip, “Image categorization with spatial mismatch kernels,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 397–404, June 2009.
  42. Z. Lu and H. H. S. Ip, “Image categorization by learning with context and consistency,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 2719–2726, June 2009.
  43. J. R. R. Uijlings, A. W. M. Smeulders, and R. J. H. Scha, “What is the spatial extent of an object?” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 770–777, June 2009.
  44. L. Cao and L. Fei-Fei, “Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes,” in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, October 2007.
  45. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, “Object retrieval with large vocabularies and fast spatial matching,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, June 2007.
  46. L. Wu, M. Li, Z. Li, W. Y. Ma, and N. Yu, “Visual language modeling for image classification,” in Proceedings of the 9th ACM SIG Multimedia International Workshop on Multimedia Information Retrieval (MIR '07), pp. 115–124, September 2007.
  47. A. Agarwal and B. Triggs, “Hyperfeatures—multilevel local coding for visual recognition,” in European Conference on Computer Vision, pp. 30–43, 2006.
  48. S. Lazebnik, C. Schmid, and J. Ponce, “Beyond bags of features: spatial pyramid matching for recognizing natural scene categories,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), pp. 2169–2178, June 2006.
  49. M. Marszałek and C. Schmid, “Spatial weighting for bag-of-features,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), pp. 2118–2125, June 2006.
  50. F. Monay, P. Quelhas, J. M. Odobez, and D. Gatica-Perez, “Integrating co-occurrence and spatial contexts on patch-based scene segmentation,” in Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR '06), pp. 14–21, June 2006.
  51. K. E. A. Van De Sande, T. Gevers, and C. G. M. Snoek, “Empowering visual categorization with the GPU,” IEEE Transactions on Multimedia, vol. 13, no. 1, pp. 60–70, 2011.
  52. O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, “Total recall: automatic query expansion with a generative feature model for object retrieval,” in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, October 2007.
  53. S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, “Indexing by latent semantic analysis,” Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391–407, 1990.
  54. T. Hofmann, “Unsupervised learning by probabilistic latent semantic analysis,” Machine Learning, vol. 42, no. 1-2, pp. 177–196, 2001.
  55. D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, no. 4-5, pp. 993–1022, 2003.
  56. T. Mitchell, Machine Learning, McGraw-Hill, New York, NY, USA, 1997.
  57. S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall, Upper Saddle River, NJ, USA, 2nd edition, 1999.
  58. V. Vapnik, Statistical Learning Theory, John Wiley & Sons, New York, NY, USA, 1998.
  59. M. Szummer and R. W. Picard, “Indoor-outdoor image classification,” in IEEE International Workshop on Content-Based Access of Image and Video Databases, pp. 42–50, 1998.
  60. P. Quelhas, F. Monay, J. M. Odobez, D. Gatica-Perez, T. Tuytelaars, and L. Van Gool, “Modeling scenes with local descriptors and latent aspects,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), pp. 883–890, October 2005.
  61. Z. Wu, Q. Ke, M. Isard, and J. Sun, “Bundling features for large scale partial-duplicate web image search,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 25–32, June 2009.
  62. K. T. Chen, K. H. Lin, Y. H. Kuo, Y. L. Wu, and W. H. Hsu, “Boosting image object retrieval and indexing by automatically discovered pseudo-objects,” Journal of Visual Communication and Image Representation, vol. 21, no. 8, pp. 815–825, 2010.
  63. P. Gehler and S. Nowozin, “On feature combination for multiclass object classification,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV '09), pp. 221–228, 2009.
  64. J. Qin and N. H. Yung, “Feature fusion within local region using localized maximum-margin learning for scene categorization,” Pattern Recognition, vol. 45, pp. 1671–1683, 2012.
  65. J. C. Van Gemert, “Exploiting photographic style for category-level image classification by generalizing the spatial pyramid,” in Proceedings of the 1st ACM International Conference on Multimedia Retrieval (ICMR '11), pp. 1–8, April 2011.
  66. N. Rasiwasia and N. Vasconcelos, “Scene classification with low-dimensional semantic spaces and weak supervision,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, June 2008.
  67. H. Jégou, M. Douze, and C. Schmid, “Product quantization for nearest neighbor search,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 1, pp. 117–128, 2011.
  68. B. Fernando, E. Fromont, D. Muselet, and M. Sebban, “Supervised learning of Gaussian mixture models for visual vocabulary generation,” Pattern Recognition, vol. 45, pp. 897–907, 2011.
  69. L. Wu, S. C. H. Hoi, and N. Yu, “Semantics-preserving bag-of-words models and applications,” IEEE Transactions on Image Processing, vol. 19, no. 7, pp. 1908–1920, 2010.
  70. T. de Campos, G. Csurka, and F. Perronnin, “Images as sets of locally weighted features,” Computer Vision and Image Understanding, vol. 116, pp. 68–85, 2012.
  71. Y. T. Zheng, M. Zhao, S. Y. Neo, T. S. Chua, and Q. Tian, “Visual synset: towards a higher-level visual representation,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, June 2008.
  72. F. Moosmann, B. Triggs, and F. Jurie, “Fast discriminative visual codebooks using randomized clustering forests,” in International Conference on Neural Information Processing Systems, pp. 985–992, 2006.
  73. J. S. Hare, S. Samangooei, and P. H. Lewis, “Efficient clustering and quantisation of SIFT features: exploiting characteristics of the SIFT descriptor and interest region detectors under image inversion,” in Proceedings of the 1st ACM International Conference on Multimedia Retrieval (ICMR '11), pp. 1–8, April 2011.
  74. H. Jégou, H. Harzallah, and C. Schmid, “A contextual dissimilarity measure for accurate and efficient image search,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, June 2007.
  75. J. Winn, A. Criminisi, and T. Minka, “Object categorization by learned universal visual dictionary,” in Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV '05), pp. 1800–1807, October 2005.
  76. S. Zhang, Q. Tian, G. Hua, Q. Huang, and W. Guo, “Generating descriptive visual words and visual phrases for large-scale image applications,” IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2664–2677, 2011.
  77. E. Gavves, C. G. M. Snoek, and A. W. Smeulders, “Visual synonyms for landmark image retrieval,” Computer Vision and Image Understanding, vol. 116, pp. 238–249, 2012.
  78. R. J. López-Sastre, T. Tuytelaars, F. J. Acevedo-Rodríguez, and S. Maldonado-Bascón, “Towards a more discriminative and semantic visual vocabulary,” Computer Vision and Image Understanding, vol. 115, no. 3, pp. 415–425, 2011.
  79. S. H. Bae and B. H. Juang, “IPSILON: incremental parsing for semantic indexing of latent concepts,” IEEE Transactions on Image Processing, vol. 19, no. 7, pp. 1933–1947, 2010.
  80. K. Kesorn and S. Poslad, “An enhanced bag-of-visual words vector space model to represent visual content in athletics images,” IEEE Transactions on Multimedia, vol. 14, no. 1, pp. 211–222, 2012.
  81. P. Tirilly, V. Claveau, and P. Gros, “Language modeling for bag-of-visual words image categorization,” in Proceedings of the International Conference on Image and Video Retrieval (CIVR '08), pp. 249–258, July 2008.
  82. H. Cheng and R. Wang, “Semantic modeling of natural scenes based on contextual Bayesian networks,” Pattern Recognition, vol. 43, no. 12, pp. 4042–4054, 2010.
  83. D. Larlus, J. Verbeek, and F. Jurie, “Category level object segmentation by combining bag-of-words models with Dirichlet processes and random fields,” International Journal of Computer Vision, vol. 88, no. 2, pp. 238–253, 2010.
  84. Y. Wang and G. Mori, “Human action recognition by semilatent topic models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 10, pp. 1762–1774, 2009.
  85. B. Fasel, F. Monay, and D. Gatica-Perez, “Latent semantic analysis of facial action codes for automatic facial expression recognition,” in Proceedings of the 6th ACM SIGMM International Workshop on Multimedia Information Retrieval (MIR '04), pp. 181–188, October 2004.
  86. J. Wang, Y. Li, Y. Zhang et al., “Bag-of-features based medical image retrieval via multiple assignment and visual words weighting,” IEEE Transactions on Medical Imaging, vol. 30, no. 11, pp. 1996–2011, 2011.
  87. X. Li and A. Godil, “Investigating the bag-of-words method for 3D shape retrieval,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 108130, 2010.
  88. R. Toldo, U. Castellani, and A. Fusiello, “A bag of words approach for 3D object categorization,” in International Conference on Computer Vision/Computer Graphics Collaboration Techniques, pp. 116–127, 2009.
  89. P. Ye and D. Doermann, “No-reference image quality assessment using visual codebooks,” IEEE Transactions on Image Processing, vol. 21, no. 7, pp. 3129–3138, 2012.
  90. A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, “Describing objects by their attributes,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 1778–1785, June 2009.
  91. E. B. Sudderth, A. Torralba, W. T. Freeman, and A. S. Willsky, “Describing visual scenes using transformed objects and parts,” International Journal of Computer Vision, vol. 77, no. 1–3, pp. 291–330, 2008.
  92. J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, “Lost in quantization: improving particular object retrieval in large scale image databases,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, June 2008.
  93. R. Lienhart and M. Slaney, “PLSA on large scale image databases,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), pp. IV1217–IV1220, April 2007.
  94. S. Zhang, Q. Tian, G. Hua, Q. Huang, and S. Li, “Descriptive visual words and visual phrases for image applications,” in Proceedings of the 17th ACM International Conference on Multimedia (MM '09), pp. 75–84, October 2009.
  95. A. Torralba and A. A. Efros, “Unbiased look at dataset bias,” in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR '11), pp. 1521–1528, 2011.
  96. D. Liu, G. Hua, P. Viola, and T. Chen, “Integrated feature selection and higher-order spatial feature extraction for object categorization,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, June 2008.
  97. N. M. Elfiky, F. S. Khan, J. van de Weijer, and J. Gonzalez, “Discriminative compact pyramids for object and scene recognition,” Pattern Recognition, vol. 45, pp. 1627–1636, 2012.
  98. A. Bosch, A. Zisserman, and X. Muñoz, “Scene classification using a hybrid generative/discriminative approach,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 4, pp. 712–727, 2008.
  99. L. Shang and B. Xiao, “Discriminative features for image classification and retrieval,” Pattern Recognition Letters, vol. 33, pp. 744–751, 2012.
  100. W. Tong, F. Li, R. Jin, and A. Jain, “Large-scale near-duplicate image retrieval by kernel density estimation,” International Journal of Multimedia Information Retrieval, vol. 1, pp. 45–58, 2012.
  101. J. Shotton, M. Johnson, and R. Cipolla, “Semantic texton forests for image categorization and segmentation,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, June 2008.
  102. S. Romberg, R. Lienhart, and E. Hörster, “Multimodal image retrieval: fusing modalities with multilayer multimodal pLSA,” International Journal of Multimedia Information Retrieval, vol. 1, no. 1, pp. 31–44, 2012.
  103. Y. J. Lee and K. Grauman, “Object-graphs for context-aware visual category discovery,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 2, pp. 346–358, 2012.
  104. J. Stottinger, A. Hanbury, N. Sebe, and T. Gevers, “Sparse color interest points for image retrieval and object categorization,” IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2681–2692, 2012.
  105. G. Ding, J. Wang, and K. Qin, “A visual word weighting scheme based on emerging itemsets for video annotation,” Information Processing Letters, vol. 110, no. 16, pp. 692–696, 2010.
  106. J. Qin and N. H. C. Yung, “Scene categorization via contextual visual words,” Pattern Recognition, vol. 43, no. 5, pp. 1874–1888, 2010.
  107. P. Tirilly, V. Claveau, and P. Gros, “Distances and weighting schemes for bag of visual words image retrieval,” in Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval (MIR '10), pp. 323–332, March 2010.
  108. Y. Xiang, X. Zhou, T. S. Chua, and C. W. Ngo, “A revisit of generative model for automatic image annotation using Markov random fields,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 1153–1160, June 2009.
  109. M. Marszałek and C. Schmid, “Constructing category hierarchies for visual recognition,” in European Conference on Computer Vision, pp. 479–491, 2008.
  110. K. E. A. Van De Sande, T. Gevers, and C. G. M. Snoek, “A comparison of color features for visual concept classification,” in Proceedings of the International Conference on Image and Video Retrieval (CIVR '08), pp. 141–150, July 2008.
  111. L. J. Li and L. Fei-Fei, “What, where and who? Classifying events by scene and object recognition,” in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, October 2007.
  112. J. Yuan, Y. Wu, and M. Yang, “Discovery of collocation patterns: from visual words to visual phrases,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, June 2007.
  113. F. Perronnin, C. Dance, G. Csurka, and M. Bressan, “Adapted vocabularies for generic visual categorization,” in European Conference on Computer Vision, pp. 464–475, 2006.
  114. H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, “Speeded-Up Robust Features (SURF),” Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008.
  115. H. Lee, G. Shim, Y. B. Kim, J. Park, and J. Kim, “A search ant and labor ant algorithm for clustering data,” in International Conference on Ant Colony Optimization and Swarm Intelligence, pp. 500–501, 2006.
  116. J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
  117. D. Comaniciu and P. Meer, “Mean shift: a robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
  118. P. Duygulu, K. Barnard, J. F. G. de Freitas, and D. A. Forsyth, “Object recognition as machine translation: learning a lexicon for a fixed image vocabulary,” in European Conference on Computer Vision, pp. 97–112, 2002.
  119. M. Stark and B. Schiele, “How good are local features for classes of geometric objects,” in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, October 2007.
  120. S. K. Divvala, D. Hoiem, J. H. Hays, A. A. Efros, and M. Hebert, “An empirical study of context in object detection,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 1271–1278, June 2009.
  121. D. Nistér and H. Stewénius, “Scalable recognition with a vocabulary tree,” in IEEE International Conference on Computer Vision and Pattern Recognition (CVPR '06), pp. 1470–1477, 2006.
  122. J. Vogel and B. Schiele, “Semantic modeling of natural scenes for content-based image retrieval,” International Journal of Computer Vision, vol. 72, no. 2, pp. 133–157, 2007.
  123. M. R. Boutell, J. Luo, and C. M. Brown, “Scene parsing using region-based generative models,” IEEE Transactions on Multimedia, vol. 9, no. 1, pp. 136–146, 2007.
  124. J. Jeon, V. Lavrenko, and R. Manmatha, “Automatic image annotation and retrieval using cross-media relevance models,” in ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 119–126, 2003.
  125. V. Lavrenko, R. Manmatha, and J. Jeon, “A model for learning the semantics of pictures,” in International Conference on Neural Information Processing Systems, pp. 553–560, 2003.
  126. F. Monay and D. Gatica-Perez, “Modeling semantic aspects for cross-media image indexing,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 10, pp. 1802–1817, 2007.
  127. C. Siagian and L. Itti, “Gist: a mobile robotics application of context-based vision in outdoor environment,” in IEEE International Conference on Computer Vision and Pattern Recognition (CVPR '05), pp. 1063–1069, 2005.
  128. C. Siagian and L. Itti, “Rapid biologically-inspired scene classification using features shared with visual attention,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 2, pp. 300–312, 2007.
  129. A. Bar-Hillel, T. Hertz, N. Shental, and D. Weinshall, “Learning distance functions using equivalence relations,” in Proceedings of the 20th International Conference on Machine Learning, pp. 11–18, August 2003.
  130. J. V. Davis, B. Kulis, P. Jain, S. Sra, and I. S. Dhillon, “Information-theoretic metric learning,” in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 209–216, June 2007.
  131. J. Goldberger, S. Roweis, G. Hinton, and R. Salakhutdinov, “Neighbourhood components analysis,” in International Conference on Neural Information Processing Systems, pp. 513–520, 2004.
  132. K. Weinberger, J. Blitzer, and L. Saul, “Distance metric learning for large margin nearest neighbor classification,” in International Conference on Neural Information Processing Systems, pp. 1473–1480, 2006.
  133. J. Yang, Y. G. Jiang, A. G. Hauptmann, and C. W. Ngo, “Evaluating bag-of-visual-words representations in scene classification,” in Proceedings of the 9th ACM SIG Multimedia International Workshop on Multimedia Information Retrieval (MIR '07), pp. 197–206, September 2007.
  134. S. L. Feng, R. Manmatha, and V. Lavrenko, “Multiple Bernoulli relevance models for image and video annotation,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '04), pp. 1002–1009, July 2004.
  135. S. Savarese, J. Winn, and A. Criminisi, “Discriminative object class models of appearance and shape by correlatons,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), pp. 2033–2040, June 2006.
  136. J. Liu and M. Shah, “Scene modeling using co-clustering,” in Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV '07), pp. 1–8, October 2007.
  137. A. Vailaya, M. A. T. Figueiredo, A. K. Jain, and H. J. Zhang, “Image classification for content-based indexing,” IEEE Transactions on Image Processing, vol. 10, no. 1, pp. 117–130, 2001.
  138. J. Zhang, M. Marszalek, S. Lazebnik, and C. Schmid, “Local features and kernels for classification of texture and object categories: an in-depth study,” Tech. Rep. RR-5737, INRIA Rhône-Alpes, 2005.
  139. A. Opelt, M. Fussenegger, A. Pinz, and P. Auer, “Weak hypotheses and boosting for generic object detection and recognition,” in European Conference on Computer Vision, pp. 71–84, 2004.
  140. J. Farquhar, S. Szedmak, H. Meng, and J. Shawe-Taylor, “Improving ‘bag-of-keypoints’ image categorisation,” Tech. Rep., University of Southampton, 2005.
  141. T. Deselaers, D. Keysers, and H. Ney, “Classification error rate for quantitative evaluation of content-based image retrieval systems,” in Proceedings of the 17th International Conference on Pattern Recognition (ICPR '04), pp. 505–508, August 2004.
  142. F. Li, W. Tong, R. Jin, A. K. Jain, and J. E. Lee, “An efficient key point quantization algorithm for large scale image retrieval,” in Proceedings of the 1st ACM Workshop on Large-Scale Multimedia Retrieval and Mining (LS-MMRM '09), pp. 89–96, October 2009.
  143. A. K. Bhogal, N. Singla, and M. Kaur, “Comparison of algorithms for segmentation of complex scene images,” International Journal of Advanced Engineering Sciences and Technologies, vol. 8, no. 2, pp. 306–310, 2011.
  144. H. Zhang, J. E. Fritts, and S. A. Goldman, “Image segmentation evaluation: a survey of unsupervised methods,” Computer Vision and Image Understanding, vol. 110, no. 2, pp. 260–280, 2008.
  145. A. Perina, M. Cristani, U. Castellani, V. Murino, and N. Jojic, “Free energy score spaces: using generative information in discriminative classifiers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 7, pp. 1249–1262, 2012.