Computational Intelligence and Neuroscience
Volume 2018, Article ID 7068349, 13 pages
https://doi.org/10.1155/2018/7068349
Review Article

Deep Learning for Computer Vision: A Brief Review

1Department of Informatics, Technological Educational Institute of Athens, 12210 Athens, Greece
2National Technical University of Athens, 15780 Athens, Greece

Correspondence should be addressed to Athanasios Voulodimos; thanosv@mail.ntua.gr

Received 17 June 2017; Accepted 27 November 2017; Published 1 February 2018

Academic Editor: Diego Andina

Copyright © 2018 Athanasios Voulodimos et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” Bulletin of Mathematical Biology, vol. 5, no. 4, pp. 115–133, 1943.
  2. Y. LeCun, B. Boser, J. Denker et al., “Handwritten digit recognition with a back-propagation network,” in Advances in Neural Information Processing Systems 2 (NIPS*89), D. Touretzky, Ed., Denver, CO, USA, 1990.
  3. S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
  4. G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006.
  5. TensorFlow, Available online: https://www.tensorflow.org.
  6. F. Bastien, P. Lamblin, R. Pascanu et al., “Theano: new features and speed improvements,” in Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop, 2012, http://deeplearning.net/software/theano/.
  7. Mxnet, Available online: http://mxnet.io.
  8. W. Ouyang, X. Zeng, X. Wang et al., “DeepID-Net: Object Detection with Deformable Part Based Convolutional Neural Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 7, pp. 1320–1334, 2017.
  9. A. Diba, V. Sharma, A. Pazandeh, H. Pirsiavash, and L. V. Gool, “Weakly Supervised Cascaded Convolutional Networks,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5131–5139, Honolulu, HI, July 2017.
  10. N. Doulamis and A. Voulodimos, “FAST-MDL: Fast Adaptive Supervised Training of multi-layered deep learning models for consistent object tracking and classification,” in Proceedings of the 2016 IEEE International Conference on Imaging Systems and Techniques, IST 2016, pp. 318–323, October 2016.
  11. N. Doulamis, “Adaptable deep learning structures for object labeling/tracking under dynamic visual environments,” Multimedia Tools and Applications, pp. 1–39, 2017.
  12. L. Lin, K. Wang, W. Zuo, M. Wang, J. Luo, and L. Zhang, “A deep structured model with radius-margin bound for 3D human activity recognition,” International Journal of Computer Vision, vol. 118, no. 2, pp. 256–273, 2016.
  13. S. Cao and R. Nevatia, “Exploring deep learning based solutions in fine grained activity recognition in the wild,” in Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 384–389, Cancun, December 2016.
  14. A. Toshev and C. Szegedy, “DeepPose: Human pose estimation via deep neural networks,” in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, pp. 1653–1660, USA, June 2014.
  15. X. Chen and A. L. Yuille, “Articulated pose estimation by a graphical model with image dependent pairwise relations,” in Proceedings of the NIPS, 2014.
  16. H. Noh, S. Hong, and B. Han, “Learning deconvolution network for semantic segmentation,” in Proceedings of the 15th IEEE International Conference on Computer Vision, ICCV 2015, pp. 1520–1528, Santiago, Chile, December 2015.
  17. J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '15), pp. 3431–3440, IEEE, Boston, Mass, USA, June 2015.
  18. D. H. Hubel and T. N. Wiesel, “Receptive fields, binocular interaction, and functional architecture in the cat's visual cortex,” The Journal of Physiology, vol. 160, pp. 106–154, 1962.
  19. K. Fukushima, “Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position,” Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.
  20. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2323, 1998.
  21. Y. LeCun, B. Boser, J. S. Denker et al., “Backpropagation applied to handwritten zip code recognition,” Neural Computation, vol. 1, no. 4, pp. 541–551, 1989.
  22. M. Tygert, J. Bruna, S. Chintala, Y. LeCun, S. Piantino, and A. Szlam, “A mathematical motivation for complex-valued convolutional networks,” Neural Computation, vol. 28, no. 5, pp. 815–825, 2016.
  23. M. Oquab, L. Bottou, I. Laptev, and J. Sivic, “Is object localization for free? - Weakly-supervised learning with convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, pp. 685–694, June 2015.
  24. C. Szegedy, W. Liu, Y. Jia et al., “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '15), pp. 1–9, Boston, Mass, USA, June 2015.
  25. Y. L. Boureau, J. Ponce, and Y. LeCun, “A theoretical analysis of feature pooling in visual recognition,” in Proceedings of the ICML, 2010.
  26. D. Scherer, A. Müller, and S. Behnke, “Evaluation of pooling operations in convolutional architectures for object recognition,” Lecture Notes in Computer Science, vol. 6354, no. 3, pp. 92–101, 2010.
  27. H. Wu and X. Gu, “Max-Pooling Dropout for Regularization of Convolutional Neural Networks,” in Neural Information Processing, vol. 9489 of Lecture Notes in Computer Science, pp. 46–54, Springer International Publishing, Cham, 2015.
  28. K. He, X. Zhang, S. Ren, and J. Sun, “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition,” in Computer Vision – ECCV 2014, vol. 8691 of Lecture Notes in Computer Science, pp. 346–361, Springer International Publishing, Cham, 2014.
  29. K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1904–1916, 2015.
  30. W. Ouyang, X. Wang, X. Zeng et al., “DeepID-Net: Deformable deep convolutional neural networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, pp. 2403–2412, USA, June 2015.
  31. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS '12), pp. 1097–1105, Lake Tahoe, Nev, USA, December 2012.
  32. R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 580–587, Columbus, Ohio, USA, June 2014.
  33. Y. Bengio, “Learning deep architectures for AI,” Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–27, 2009.
  34. P. Smolensky, “Information processing in dynamical systems: Foundations of harmony theory,” in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, pp. 194–281, MIT Press, Cambridge, MA, USA, 1986.
  35. G. E. Hinton and T. J. Sejnowski, “Learning and Relearning in Boltzmann Machines,” vol. 1, p. 4.2, MIT Press, Cambridge, MA, 1986.
  36. M. A. Carreira-Perpinan and G. E. Hinton, “On contrastive divergence learning,” in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, NP: Society for Artificial Intelligence and Statistics, pp. 33–40, 2005.
  37. G. Hinton, “A practical guide to training restricted Boltzmann machines,” Momentum, vol. 9, p. 926, 2010.
  38. K. Cho, T. Raiko, and A. Ilin, “Enhanced gradient for training restricted Boltzmann machines,” Neural Computation, vol. 25, no. 3, pp. 805–831, 2013.
  39. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
  40. I. Arel, D. C. Rose, and T. P. Karnowski, “Deep machine learning—a new frontier in artificial intelligence research,” IEEE Computational Intelligence Magazine, vol. 5, no. 4, pp. 13–18, 2010.
  41. Y. Bengio, A. Courville, and P. Vincent, “Representation learning: a review and new perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1798–1828, 2013.
  42. H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, “Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations,” in Proceedings of the 26th Annual International Conference (ICML ’09), pp. 609–616, ACM, Montreal, Canada, June 2009.
  43. H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, “Unsupervised learning of hierarchical representations with convolutional deep belief networks,” Communications of the ACM, vol. 54, no. 10, pp. 95–103, 2011.
  44. G. B. Huang, H. Lee, and E. Learned-Miller, “Learning hierarchical representations for face verification with convolutional deep belief networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '12), pp. 2518–2525, June 2012.
  45. R. Salakhutdinov and G. Hinton, “Deep Boltzmann machines,” in Proceedings of the International Conference on Artificial Intelligence and Statistics, vol. 24, pp. 448–455, 2009.
  46. L. Younes, “On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates,” Stochastics and Stochastics Reports, vol. 65, no. 3-4, pp. 177–228, 1999.
  47. R. Salakhutdinov and H. Larochelle, “Efficient learning of deep Boltzmann machines,” in Proceedings of the AISTATS, 2010.
  48. N. Srivastava and R. Salakhutdinov, “Multimodal learning with deep Boltzmann machines,” Journal of Machine Learning Research, vol. 15, pp. 2949–2980, 2014.
  49. R. Salakhutdinov and G. Hinton, “An efficient learning procedure for deep Boltzmann machines,” Neural Computation, vol. 24, no. 8, pp. 1967–2006, 2012.
  50. R. Salakhutdinov and G. Hinton, “A better way to pretrain Deep Boltzmann Machines,” in Proceedings of the 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012, pp. 2447–2455, USA, December 2012.
  51. K. Cho, T. Raiko, A. Ilin, and J. Karhunen, “A two-stage pretraining algorithm for deep Boltzmann machines,” Lecture Notes in Computer Science, vol. 8131, pp. 106–113, 2013.
  52. G. Montavon and K. Müller, “Deep Boltzmann Machines and the Centering Trick,” in Neural Networks: Tricks of the Trade, vol. 7700 of Lecture Notes in Computer Science, pp. 621–637, Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
  53. I. Goodfellow, M. Mirza, A. Courville et al., “Multi-prediction deep Boltzmann machines,” in Proceedings of the NIPS, 2013.
  54. H. Bourlard and Y. Kamp, “Auto-association by multilayer perceptrons and singular value decomposition,” Biological Cybernetics, vol. 59, no. 4-5, pp. 291–294, 1988.
  55. N. Japkowicz, S. J. Hanson, and M. A. Gluck, “Nonlinear autoassociation is not equivalent to PCA,” Neural Computation, vol. 12, no. 3, pp. 531–545, 2000.
  56. P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” in Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML'08), W. W. Cohen, A. McCallum, and S. T. Roweis, Eds., pp. 1096–1103, ACM, 2008.
  57. P. Gallinari, Y. LeCun, S. Thiria, and F. Fogelman-Soulie, “Memoires associatives distribuees,” in Proceedings of COGNITIVA 87, Paris, La Villette, 1987.
  58. H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, “An empirical evaluation of deep architectures on problems with many factors of variation,” in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 473–480, Corvallis, Ore, USA, June 2007.
  59. Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, “Greedy layer-wise training of deep networks,” in Advances in Neural Information Processing Systems (NIPS06), B. Schölkopf, J. Platt, and T. Hoffman, Eds., vol. 19, pp. 153–160, MIT Press, 2007.
  60. J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, “Selective search for object recognition,” International Journal of Computer Vision, vol. 104, no. 2, pp. 154–171, 2013.
  61. R. Girshick, “Fast R-CNN,” in Proceedings of the 15th IEEE International Conference on Computer Vision (ICCV '15), pp. 1440–1448, December 2015.
  62. S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2017.
  63. J. Hosang, R. Benenson, and B. Schiele, “How good are detection proposals, really?” in Proceedings of the 25th British Machine Vision Conference, BMVC 2014, UK, September 2014.
  64. B. Hariharan, P. Arbeláez, R. Girshick, and J. Malik, “Simultaneous detection and segmentation,” in Computer Vision—ECCV 2014, vol. 8695 of Lecture Notes in Computer Science, pp. 297–312, Springer, 2014.
  65. J. Dong, Q. Chen, S. Yan, and A. Yuille, “Towards unified object detection and semantic segmentation,” Lecture Notes in Computer Science, vol. 8693, no. 5, pp. 299–314, 2014.
  66. Y. Zhu, R. Urtasun, R. Salakhutdinov, and S. Fidler, “SegDeepM: Exploiting segmentation and context in deep neural networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, pp. 4703–4711, USA, June 2015.
  67. J. Liu, N. Lay, Z. Wei et al., “Colitis detection on abdominal CT scans by rich feature hierarchies,” in Proceedings of the Medical Imaging 2016: Computer-Aided Diagnosis, vol. 9785 of Proceedings of SPIE, San Diego, Calif, USA, February 2016.
  68. G. Luo, R. An, K. Wang, S. Dong, and H. Zhang, “A Deep Learning Network for Right Ventricle Segmentation in Short-Axis MRI,” in Proceedings of the 2016 Computing in Cardiology Conference.
  69. T. Chen, S. Lu, and J. Fan, “S-CNN: Subcategory-aware convolutional networks for object detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
  70. W. Diao, X. Sun, X. Zheng, F. Dou, H. Wang, and K. Fu, “Efficient Saliency-Based Object Detection in Remote Sensing Images Using Deep Belief Networks,” IEEE Geoscience and Remote Sensing Letters, vol. 13, no. 2, pp. 137–141, 2016.
  71. V. Nair and G. E. Hinton, “3D object recognition with deep belief nets,” in Proceedings of the NIPS, 2009.
  72. N. Doulamis and A. Doulamis, “Fast and adaptive deep fusion learning for detecting visual objects,” Lecture Notes in Computer Science, vol. 7585, no. 3, pp. 345–354, 2012.
  73. N. Doulamis and A. Doulamis, “Semi-supervised deep learning for object tracking and classification,” pp. 848–852.
  74. H.-C. Shin, M. R. Orton, D. J. Collins, S. J. Doran, and M. O. Leach, “Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4D patient data,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1930–1943, 2013.
  75. J. Li, C. Xia, and X. Chen, “A benchmark dataset and saliency-guided stacked autoencoders for video-based salient object detection,” IEEE Transactions on Image Processing, 2017.
  76. D. Chen, X. Cao, F. Wen, and J. Sun, “Blessing of dimensionality: high-dimensional feature and its efficient compression for face verification,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '13), pp. 3025–3032, June 2013.
  77. X. Cao, D. Wipf, F. Wen, G. Duan, and J. Sun, “A practical transfer learning algorithm for face verification,” in Proceedings of the 14th IEEE International Conference on Computer Vision (ICCV '13), pp. 3208–3215, December 2013.
  78. T. Berg and P. N. Belhumeur, “Tom-vs-Pete classifiers and identity-preserving alignment for face verification,” in Proceedings of the 23rd British Machine Vision Conference (BMVC '12), pp. 1–11, September 2012.
  79. D. Chen, X. Cao, L. Wang, F. Wen, and J. Sun, “Bayesian face revisited: a joint formulation,” in Computer Vision—ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7–13, 2012, Proceedings, Part III, vol. 7574 of Lecture Notes in Computer Science, pp. 566–579, Springer, Berlin, Germany, 2012.
  80. S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, “Face recognition: a convolutional neural-network approach,” IEEE Transactions on Neural Networks and Learning Systems, vol. 8, no. 1, pp. 98–113, 1997.
  81. X. Wu, R. He, Z. Sun, and T. Tan, “A light CNN for deep face representation with noisy labels,” https://arxiv.org/abs/1511.02683.
  82. O. M. Parkhi, A. Vedaldi, and A. Zisserman, “Deep Face Recognition,” in Proceedings of the British Machine Vision Conference 2015, pp. 41.1–41.12, Swansea.
  83. F. Schroff, D. Kalenichenko, and J. Philbin, “FaceNet: a unified embedding for face recognition and clustering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '15), pp. 815–823, IEEE, Boston, Mass, USA, June 2015.
  84. Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “DeepFace: closing the gap to human-level performance in face verification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14), pp. 1701–1708, Columbus, Ohio, USA, June 2014.
  85. B. Amos, B. Ludwiczuk, and M. Satyanarayanan, “Openface: a general-purpose face recognition library with mobile applications,” CMU-CS-16-118, CMU School of Computer Science, 2016.
  86. A. S. Voulodimos, D. I. Kosmopoulos, N. D. Doulamis, and T. A. Varvarigou, “A top-down event-driven approach for concurrent activity recognition,” Multimedia Tools and Applications, vol. 69, no. 2, pp. 293–311, 2014.
  87. A. S. Voulodimos, N. D. Doulamis, D. I. Kosmopoulos, and T. A. Varvarigou, “Improving multi-camera activity recognition by employing neural network based readjustment,” Applied Artificial Intelligence, vol. 26, no. 1-2, pp. 97–118, 2012.
  88. K. Makantasis, A. Doulamis, N. Doulamis, and K. Psychas, “Deep learning based human behavior recognition in industrial workflows,” in Proceedings of the 23rd IEEE International Conference on Image Processing, ICIP 2016, pp. 1609–1613, September 2016.
  89. C. Gan, N. Wang, Y. Yang, D.-Y. Yeung, and A. G. Hauptmann, “DevNet: A Deep Event Network for multimedia event detection and evidence recounting,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, pp. 2568–2577, USA, June 2015.
  90. T. Kautz, B. H. Groh, J. Hannink, U. Jensen, H. Strubberg, and B. M. Eskofier, “Activity recognition in beach volleyball using a DEEp Convolutional Neural NETwork: leveraging the potential of DEEp Learning in sports,” Data Mining and Knowledge Discovery, vol. 31, no. 6, pp. 1678–1705, 2017.
  91. A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and F.-F. Li, “Large-scale video classification with convolutional neural networks,” in Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition, (CVPR '14), pp. 1725–1732, Columbus, OH, USA, June 2014.
  92. C. A. Ronao and S.-B. Cho, “Human activity recognition with smartphone sensors using deep learning neural networks,” Expert Systems with Applications, vol. 59, pp. 235–244, 2016.
  93. J. Shao, C. C. Loy, K. Kang, and X. Wang, “Crowded Scene Understanding by Deeply Learned Volumetric Slices,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 3, pp. 613–623, 2017.
  94. K. Tang, B. Yao, L. Fei-Fei, and D. Koller, “Combining the right features for complex event recognition,” in Proceedings of the 2013 14th IEEE International Conference on Computer Vision, ICCV 2013, pp. 2696–2703, Australia, December 2013.
  95. S. Song, V. Chandrasekhar, B. Mandal et al., “Multimodal Multi-Stream Deep Learning for Egocentric Activity Recognition,” in Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2016, pp. 378–385, USA, July 2016.
  96. R. Kavi, V. Kulathumani, F. Rohit, and V. Kecojevic, “Multiview fusion for activity recognition using deep neural networks,” Journal of Electronic Imaging, vol. 25, no. 4, Article ID 043010, 2016.
  97. H. Yalcin, “Human activity recognition using deep belief networks,” in Proceedings of the 24th Signal Processing and Communication Application Conference, SIU 2016, pp. 1649–1652, Turkey, May 2016.
  98. A. Kitsikidis, K. Dimitropoulos, S. Douka, and N. Grammalidis, “Dance analysis using multiple kinect sensors,” in Proceedings of the 9th International Conference on Computer Vision Theory and Applications, VISAPP 2014, pp. 789–795, Portugal, January 2014.
  99. P. F. Felzenszwalb and D. P. Huttenlocher, “Pictorial structures for object recognition,” International Journal of Computer Vision, vol. 61, no. 1, pp. 55–79, 2005.
  100. A. Jain, J. Tompson, and M. Andriluka, “Learning human pose estimation features with convolutional networks,” in Proceedings of the ICLR, 2014.
  101. J. J. Tompson, A. Jain, Y. LeCun et al., “Joint training of a convolutional network and a graphical model for human pose estimation,” in Proceedings of the NIPS, 2014.
  102. L. Fei-Fei, R. Fergus, and P. Perona, “One-shot learning of object categories,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 594–611, 2006.
  103. A. Krizhevsky and G. Hinton, “Learning multiple layers of features from tiny images,” 2009.
  104. S. A. Nene, S. K. Nayar, and H. Murase, “Columbia object image library (COIL-20),” 1996.
  105. T. Skauli and J. Farrell, “A collection of hyperspectral images for imaging systems research,” in Proceedings of the Digital Photography IX, USA, February 2013.
  106. M. F. Baumgardner, L. L. Biehl, and D. A. Landgrebe, “220 band AVIRIS hyperspectral image data set: June 12, 1992 Indian Pine Test Site 3,” Datasets, 2015.
  107. E. Eidinger, R. Enbar, and T. Hassner, “Age and gender estimation of unfiltered faces,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 12, pp. 2170–2179, 2014.
  108. G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” Tech. Rep., University of Massachusetts, Amherst, 2007.
  109. X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, and R. M. Summers, “ChestX-Ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases,” in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3462–3471, Honolulu, HI, July 2017.
  110. A. Seff, L. Lu, A. Barbu, H. Roth, H.-C. Shin, and R. M. Summers, “Leveraging mid-level semantic boundary cues for automated lymph node detection,” Lecture Notes in Computer Science, vol. 9350, pp. 53–61, 2015.
  111. A. Voulodimos, D. Kosmopoulos, G. Vasileiou et al., “A dataset for workflow recognition in industrial scenes,” in Proceedings of the 2011 18th IEEE International Conference on Image Processing, ICIP 2011, pp. 3249–3252, Belgium, September 2011.
  112. A. Voulodimos, D. Kosmopoulos, G. Vasileiou et al., “A threefold dataset for activity and workflow recognition in complex industrial environments,” IEEE MultiMedia, vol. 19, no. 3, pp. 42–52, 2012.
  113. D. I. Kosmopoulos, A. S. Voulodimos, and A. D. Doulamis, “A system for multicamera task recognition and summarization for structured environments,” IEEE Transactions on Industrial Informatics, vol. 9, no. 1, pp. 161–171, 2013.
  114. S. Abu-El-Haija et al., “YouTube-8M: A large-scale video classification benchmark,” Tech. Rep., 2016, https://arxiv.org/abs/1609.08675.