Mathematical Problems in Engineering
Volume 2017 (2017), Article ID 5190490, 2 pages

Optimization for Detection and Recognition in Images and Videos

1Department of Electronics Engineering, Konkuk University, Seoul, Republic of Korea
2Department of Electrical Engineering, Hanbat National University, Daejeon, Republic of Korea
3Department of Informatics, Systems and Communication, University of Milan-Bicocca, Milan, Italy

Correspondence should be addressed to Wonjun Kim

Received 27 November 2016; Accepted 28 November 2016; Published 2 March 2017

Copyright © 2017 Wonjun Kim et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This special issue aims to present new optimization techniques for detection and recognition in computer vision. Optimization techniques are widely understood and employed to find accurate solutions to computer vision tasks. For example, visual SLAM (simultaneous localization and mapping), camera calibration, denoising, and segmentation can be robustly implemented with optimization algorithms [1–3]. Although their ability to solve a variety of problems has been demonstrated, most of them are rarely deployed on mobile and robot platforms because of their heavy computation, which requires many iterations of complex operations. To overcome this limitation, many researchers have devoted considerable effort to constructing simple yet powerful optimization frameworks.

The key to solving a given problem with an optimization technique is the efficient construction of sophisticated probabilistic and statistical models. However, such models have mostly been applied to the rather traditional computer vision problems mentioned above. With the rapidly increasing demand for high-level intelligence on mobile and robot platforms, they need to be applied to more advanced applications such as object detection, classification, and recognition [4, 5]. From this point of view, the articles published in this special issue show the rich possibilities in computer vision and machine learning. Specifically, some are devoted to face detection and recognition with novel optimization techniques, one of the most active topics in this field. Other articles attempt to precisely segment salient regions in a given image by optimizing a graph model, while the rest deal with more specific problems in images and videos, for example, hyperspectral image classification and highway visibility detection. Taken together, we believe that the articles in this special issue make a valuable contribution to object detection and recognition.

X. Lu et al. propose fusing features from two different neural networks for face detection in their article titled “Feature Extraction and Fusion using Deep Convolutional Neural Networks for Face Detection.” To extract features that describe the face, they use the Clarifai and VGG (16 layers) networks and optimize the features with a PCA scheme during the training phase. The proposed method is evaluated on two benchmark databases (i.e., FDDB and AFW).
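
As a rough illustration of fusing two feature sets and reducing them with PCA, the following minimal sketch may help. It is not the authors' pipeline: fusion by simple concatenation, the function name `fuse_features_pca`, the random stand-in features, and all dimensions are assumptions made only for this example.

```python
import numpy as np

def fuse_features_pca(feats_a, feats_b, n_components):
    """Concatenate two per-sample feature sets and reduce them with PCA."""
    # Assumed fusion rule: plain concatenation along the feature axis.
    fused = np.hstack([feats_a, feats_b])        # (n_samples, d_a + d_b)
    mean = fused.mean(axis=0)
    centered = fused - mean
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components

# Random stand-ins for features from two networks (hypothetical sizes).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(50, 8))
feats_b = rng.normal(size=(50, 6))
reduced, mean, components = fuse_features_pca(feats_a, feats_b, n_components=4)
print(reduced.shape)
```

At test time, a new fused feature vector would be projected with the stored `mean` and `components` rather than refitting the PCA.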

In the article “Customized Dictionary Learning for Subdatasets with Fine Granularity” by L. Ye et al., the authors propose customizing a global dictionary, already trained on a variety of subjects, to the target people in order to improve face recognition performance. To do this, they employ a regularizer that penalizes the difference between the global and subdataset dictionaries under a sparsity constraint and demonstrate the reconstruction performance on a benchmark dataset.
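
One plausible way to write such an objective is sketched below. This is only an assumed form for illustration, not the formulation from the article: the symbols $Y$ (subdataset samples), $D$ (customized dictionary), $D_g$ (global dictionary), $X$ (sparse codes with columns $x_i$), the weight $\lambda$, and the sparsity level $T$ are all introduced here.

```latex
\min_{D,\,X} \; \lVert Y - D X \rVert_F^2 + \lambda \, \lVert D - D_g \rVert_F^2
\quad \text{subject to} \quad \lVert x_i \rVert_0 \le T \ \text{for all } i
```

Under this assumed form, a large $\lambda$ keeps the customized dictionary close to the global one, while a small $\lambda$ lets it adapt more freely to the subdataset.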

R. Huang et al. explore a new neural network structure based on the supervised autoencoder in their article titled “Adaptive Deep Supervised Autoencoder Based Image Reconstruction for Face Recognition.” In this method, characteristic features from corrupted and clean faces are trained together with their labels; thus, reconstruction is achieved efficiently from a low-dimensional feature vector. A test image is recognized by comparing its reconstructed image with the individual gallery images. The authors show that their adaptive scheme improves face recognition performance on various datasets, for example, AR, PubFig, and Extended Yale B.

H. Sima et al. introduce a simple optimization scheme for merging superpixels in image segmentation in their article titled “Objectness Supervised Merging Algorithm for Color Image Segmentation.” Specifically, they build two hierarchical segmentation models, based on region growing and color features, respectively. Notably, to smooth out textures and measure objectness, the authors propose a total variation-based optimization that efficiently yields the true boundary condition (i.e., the object boundary). Both color and objectness features are computed to check the regional similarity between superpixel pairs, and the mixed standard deviation of the union feature is computed to adaptively stop the merging process. The authors demonstrate that this combined approach reliably provides object boundaries in the segmentation of diverse natural images.

In the article “Graph-Based Salient Region Detection through Linear Neighborhoods” by L. Xu et al., the authors formulate salient region detection as a graph labeling process that learns from partially selected seeds (i.e., labeled data) in the graph. Based on the assumption that every node in the graph can be optimally reconstructed by a linear combination of its neighbor nodes, an undirected weighted graph is constructed in place of Gaussian-based pairwise weighting, which improves detection performance in cluttered backgrounds. Since salient regions can be directly and efficiently applied to a wide range of computer vision tasks, for example, image classification and object recognition, the method can be seen as a preprocessing step for further applications.
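
The idea of reconstructing a node from a linear combination of its neighbors can be sketched as follows. This is a generic sum-to-one least-squares solution, not the article's exact weighting: it omits any non-negativity constraint, and the function name `linear_neighborhood_weights`, the regularization value, and the toy points are assumptions for illustration.

```python
import numpy as np

def linear_neighborhood_weights(x, neighbors, reg=1e-3):
    """Weights that best reconstruct x from its neighbors, summing to one."""
    diffs = neighbors - x                     # (k, d) offsets from the node
    gram = diffs @ diffs.T                    # (k, k) local Gram matrix
    # Regularize so the system stays solvable when neighbors are collinear.
    k = len(neighbors)
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(k)
    w = np.linalg.solve(gram, np.ones(k))
    return w / w.sum()                        # enforce the sum-to-one constraint

# Toy example: x lies midway between the first two neighbors.
x = np.array([1.0, 1.0])
neighbors = np.array([[0.0, 0.0], [2.0, 2.0], [2.0, 0.0]])
w = linear_neighborhood_weights(x, neighbors)
reconstruction = w @ neighbors
print(w, reconstruction)
```

In this toy case most of the weight falls on the two neighbors that straddle `x`, so the edge weights reflect local geometry rather than a fixed Gaussian bandwidth.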

Y. Tang et al. establish two dictionaries (i.e., background and union) for pixel representation to improve hyperspectral image classification in their article titled “Sparse Representation Based Binary Hypothesis Model for Hyperspectral Image Classification.” Since the proposed dictionaries are built on the neighboring region centered at each pixel position, the performance is robust to noise in the captured image. Moreover, a kernel method is employed to improve interclass separability. In the high-dimensional feature space induced by the kernel function, a coding vector is finally computed by kernel-based orthogonal matching pursuit (KOMP).
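
For readers unfamiliar with orthogonal matching pursuit, a minimal sketch of the plain (non-kernel) algorithm is given below. It is not the kernel-based variant used in the article; the function name `omp`, the small orthonormal dictionary, and the test signal are assumptions chosen so that the result is easy to check by hand.

```python
import numpy as np

def omp(dictionary, signal, n_nonzero):
    """Greedy sparse coding with n_nonzero >= 1 atoms and least-squares refits."""
    residual = signal.astype(float).copy()
    support = []
    for _ in range(n_nonzero):
        # Select the unused atom most correlated with the current residual.
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = 0.0
        support.append(int(np.argmax(correlations)))
        # Refit all selected atoms jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ sol
    coef = np.zeros(dictionary.shape[1])
    coef[support] = sol
    return coef

# Hypothetical orthonormal dictionary (columns are the atoms).
D = 0.5 * np.array([[1.0, 1.0, 1.0, 1.0],
                    [1.0, -1.0, 1.0, -1.0],
                    [1.0, 1.0, -1.0, -1.0],
                    [1.0, -1.0, -1.0, 1.0]])
signal = 3.0 * D[:, 1] - 2.0 * D[:, 3]
code = omp(D, signal, n_nonzero=2)
print(code)
```

A kernel variant would replace the inner products `dictionary.T @ residual` and the least-squares refit with expressions written in terms of a kernel matrix, which this sketch does not attempt.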

In the article “Visibility Video Detection with Dark Channel Prior on Highway” by J. Zhao et al., the authors attempt to remove the effect of varying illumination conditions for image enhancement under hazy conditions on the highway. To do this, they combine several techniques with an optimization scheme and provide extensive experimental results.
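
The dark channel named in the article title is commonly defined as the per-pixel minimum over the color channels followed by a minimum over a local patch. The sketch below computes only that quantity and says nothing about the rest of the article's pipeline; the function name `dark_channel`, the odd patch size, and the toy image are assumptions for illustration.

```python
import numpy as np

def dark_channel(image, patch=3):
    """Per-pixel minimum over color channels and a square patch (odd size)."""
    min_rgb = image.min(axis=2)                    # (H, W) channel-wise minimum
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")     # replicate border pixels
    h, w = min_rgb.shape
    dark = np.full((h, w), np.inf)
    # Sliding-window minimum, taken as the minimum over shifted views.
    for dy in range(patch):
        for dx in range(patch):
            dark = np.minimum(dark, padded[dy:dy + h, dx:dx + w])
    return dark

# Toy image: uniformly bright except for one dark value in a single channel.
image = np.ones((5, 5, 3))
image[2, 2, 0] = 0.2
dark = dark_channel(image, patch=3)
print(dark)
```

In haze-free outdoor regions the dark channel tends to be close to zero, so large values are often read as an indication of haze.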

In summary, this special issue covers various topics that attract considerable interest in the field of detection and recognition for images and videos. Despite a certain randomness in the submission of manuscripts for publication, we believe that this special issue will be of interest to all those who work with computer vision and machine learning in the vast field of engineering.


We would like to thank the authors of the articles published in this issue for their contributions. We are grateful to all the anonymous reviewers for their valuable work.

Wonjun Kim
Chanho Jung
Simone Bianco


  1. N. Karlsson, E. Di Bernardo, J. Ostrowski, L. Goncalves, P. Pirjanian, and M. E. Munich, “The vSLAM algorithm for robust localization and mapping,” in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 24–29, Barcelona, Spain, April 2005.
  2. Y. Xie, S. Gu, Y. Liu, W. Zuo, W. Zhang, and L. Zhang, “Weighted Schatten p-norm minimization for image denoising and background subtraction,” IEEE Transactions on Image Processing, vol. 25, no. 10, pp. 4842–4857, 2016.
  3. Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222–1239, 2001.
  4. D. Novotny and J. Matas, “Cascaded sparse spatial bins for efficient and effective generic object detection,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV '15), pp. 1152–1160, Santiago, Chile, December 2015.
  5. H. Liu, D. Guo, and F. Sun, “Object recognition using tactile measurements: kernel sparse coding methods,” IEEE Transactions on Instrumentation and Measurement, vol. 65, no. 3, pp. 656–665, 2016.