Ho Yub Jung, Kyoung Mu Lee, "Image Segmentation by Edge Partitioning over a Nonsubmodular Markov Random Field", Mathematical Problems in Engineering, vol. 2015, Article ID 683176, 9 pages, 2015. https://doi.org/10.1155/2015/683176
Image Segmentation by Edge Partitioning over a Nonsubmodular Markov Random Field
Edge weight-based segmentation methods, such as normalized cut or minimum cut, require the number of partitions to be specified in their energy formulation. The number of partitions plays an important role in the overall segmentation quality. However, finding a suitable partition number is a nontrivial problem, and the number is ordinarily assigned manually. This is an aspect of the general partition problem, where finding the partition number is an important and difficult issue. In this paper, the edge weights, instead of the pixels, are partitioned to segment the images. By partitioning the edge weights into two disjoint sets, that is, cut and connect, an image can be partitioned into all possible disjoint segments. The proposed energy function is independent of the number of segments. The energy is minimized by iterating the QPBO-α-expansion algorithm over the pairwise Markov random field and the mean estimation of the cut and connected edges. Experiments using the Berkeley database show that the proposed segmentation method can obtain comparably accurate segmentation results without designating the number of segments.
1. Introduction
There are numerous approaches and applications for unsupervised image segmentation in computer vision. Many different theories have been proposed for the varying roles of unsupervised segmentation. As a low-level vision problem, an image can be simplified by oversegmentation using a number of different approaches, such as mode-seeking mean shift, multilevel thresholding, histogram-based neural networks, superpixel algorithms, and various graph-based methods [1–4]. Conversely, semantic segmentation is attempted for simultaneous detection, recognition, and segmentation [5].
Generally, the role of unsupervised segmentation falls between image simplification and full semantic segmentation, where semantically meaningful segments are expected to be found but not necessarily recognized. Segmentation is posed as an image-coloring problem that minimizes specific energy functions. Energy functions can be optimized using stochastic methods such as deterministic annealing and stochastic clustering [6–10]. For graph theoretic segmentation approaches, the spectral method and graph cut are efficient deterministic optimization methods [11–13]. Another traditional segmentation method is the variational method, which evolves boundary contours in a level set framework [14, 15].
The edge weight-based segmentation methods have evolved together with graph partition problems. When edge weights are all positive, the minimum cut can be found; however, the minimum cut has a bias toward smaller cuts. Adding negative edge weights can prevent this bias, but the graph then becomes nonsubmodular and the problem becomes NP-hard [16]. Different algorithms have been introduced to approximate the correlation clustering problem [17, 18]. In contrast, Shi and Malik normalized nonnegative edge weights so that the bias toward smaller cuts was eliminated [11].
For the graph theoretic segmentation and level set methods, the number of segments must be predefined. The choice of segment number greatly influences the quality of segmentation, especially for a normalized cut. Nonetheless, there have been attempts to solve this problem. The number of segments can be controlled by setting a threshold value for the recursive normalized cut [11]. For level set approaches, the four-color theorem was used to segment images with an arbitrary number of phases with one or two level set functions [19]. However, these methods are still functions of $k$, the number of segments.
In this paper, transforming the pixel clustering problem into an edge partition problem circumvents the segment number selection problem. Edges among adjacent pixels can represent dissimilarity or similarity weights. Two edge partitions are always sufficient for pixel-partitioning problems. An edge can be in a cut set or connected set, which can then be translated into a unique segmentation, as in Figure 1(c). The cut edges indicate that the two node labels are different, whereas the connected edges indicate that two nodes have the same labels. In most cases, however, the cut or connect assignments on the edges are not enough to define a specific segmentation configuration, as in Figure 1(d). Random cut and connect assignments on the edges may result in contradiction of the node labels. However, under the pixel coloring framework, cut and connect assignments on the edges are defined concurrently with pixel labels, and inconsistencies, such as those in Figure 1(d), are prevented.
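The contradiction illustrated in Figure 1(d) can be checked mechanically. The following is a minimal sketch, not part of the original method: the function name `is_consistent` and the `(p, q, state)` edge encoding are assumptions made for illustration. Connected edges are merged with a union-find structure, and a cut edge whose two endpoints end up in the same merged component signals an assignment that no pixel labeling can realize.

```python
def is_consistent(num_nodes, edges):
    """edges: list of (p, q, state) with state 'cut' or 'connect' (hypothetical encoding)."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Connected edges force both endpoints into the same segment.
    for p, q, state in edges:
        if state == "connect":
            parent[find(p)] = find(q)

    # A cut edge inside one merged segment contradicts the connected edges.
    return all(find(p) != find(q) for p, q, state in edges if state == "cut")


# 2x2 pixel grid with nodes 0 1 / 2 3; the four edges form a cycle.
square = [(0, 1), (1, 3), (3, 2), (2, 0)]
two_cuts = [(p, q, s) for (p, q), s in zip(square, ["cut", "connect", "cut", "connect"])]
one_cut = [(p, q, s) for (p, q), s in zip(square, ["cut", "connect", "connect", "connect"])]
print(is_consistent(4, two_cuts))  # left column vs. right column: realizable
print(is_consistent(4, one_cut))   # a single cut on a cycle: contradictory
```

Deriving the edge states from pixel labels, as the paper does, makes such a check unnecessary, because a labeling can never produce a contradictory edge assignment.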
Under the pixel-labeling framework, a label number selection problem arises. Although the label number selection might seem similar to the segment number selection problem, there are subtle differences. First, pixels do not need to use all label assignments; thus, low numbers of segments are possible with large numbers of labels. Second, under the four-color map theorem, the maximum number of labels for two-dimensional (2D) segmentation can be as low as four. The four-color map theorem states that any 2D map can be colored with intact borders using a maximum of four colors [26]. This theorem can be translated directly to the segmentation problem: any 2D image segmentation can be represented using four labels.
In the following sections, a new energy function is introduced for image segmentation through the edge partition. The edge partitions can uniquely define the image segmentation with the hard constraints enforced by the image-labeling framework. Next, an energy minimization algorithm is proposed for the edge partitioning. The experimental section discusses tests of the proposed algorithm using the Berkeley image segmentation database.
2. Pixel Clustering
Image segmentation can be viewed as a pixel-partitioning problem. Many image segmentation methods borrow their ideas from general partitioning techniques. The $k$-means algorithm minimizes the following function and segments the image into $k$ regions:

$$E_{k\text{-means}} = \sum_{i=1}^{k} \sum_{p \in S_i} \lVert x_p - \mu_i \rVert^2, \tag{1}$$

where $x_p$ is the pixel feature value and $\mu_i$ is the mean value of the respective partition $S_i$. The $k$-means algorithm minimizes the sum of the squared distances from the mean of each partition. The energy function must have a fixed segmentation number $k$. However, estimating the number of segments is a difficult task, and the number of partitions is often designated by human discretion.
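As a concrete illustration of the objective in (1), the sketch below runs Lloyd-style iterations on scalar gray values. It is a simplified example under stated assumptions: the function name `kmeans_1d`, the scalar features, and the initial means are choices made here, not details from the paper. Note that the number of partitions is fixed in advance by the length of the initial `means` list.

```python
def kmeans_1d(values, means, iterations=20):
    """Lloyd iterations on scalar features; returns (means, assignments, energy)."""
    means = list(means)

    def nearest(x):
        # Index of the closest partition mean for one feature value.
        return min(range(len(means)), key=lambda i: (x - means[i]) ** 2)

    for _ in range(iterations):
        assign = [nearest(x) for x in values]
        # Update step: recompute the mean of every non-empty partition.
        for i in range(len(means)):
            members = [x for x, a in zip(values, assign) if a == i]
            if members:
                means[i] = sum(members) / len(members)

    assign = [nearest(x) for x in values]
    # Sum of squared distances to the assigned partition mean, as in (1).
    energy = sum((x - means[a]) ** 2 for x, a in zip(values, assign))
    return means, assign, energy


gray = [10, 12, 11, 200, 205, 198]
means, assign, energy = kmeans_1d(gray, means=[0.0, 255.0])
print(means, assign, energy)
```

Passing three initial means instead of two would yield a different partition of the same pixels, which is the dependence on the partition number that the edge formulation is designed to avoid.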
3. Edge Partitions
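An image can be represented as a graph $G = (V, \mathcal{E})$ with a set of nodes $V$ and a set of edges $\mathcal{E}$. An edge $e_{pq} \in \mathcal{E}$ between nodes $p, q \in V$ is assigned a weight $w_{pq}$. For each node $p$, a label $l_p$ from $\{1, \ldots, L\}$ is assigned to define a segmentation.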
3.1. Energy Function
The segmentation problem is formulated in terms of edge partitions. The edges $\mathcal{E}$ of the image graph can be partitioned into two sets, $C$ (cut) and $N$ (connect), such that $C \cup N = \mathcal{E}$ and $C \cap N = \emptyset$. If an edge is in $N$, the pixel nodes connected by the edge have the same label. Otherwise, if an edge is in $C$, the pixel nodes connected by the edge have different labels:

$$e_{pq} \in N \iff l_p = l_q, \qquad e_{pq} \in C \iff l_p \neq l_q. \tag{2}$$

In (2), $e_{pq}$ is an edge between pixel nodes $p$ and $q$. The pixel labels for pixels $p$ and $q$ are denoted by $l_p$ and $l_q$, respectively. $w_{pq}$ is a positive edge weight between pixel nodes $p$ and $q$; it can be a similarity or dissimilarity measure between the two pixel nodes. A simple example of $w_{pq}$ is the absolute difference between two pixel colors. Thus, if the colors of the two pixels differ greatly, the edge will likely be in $C$. If the two pixel colors differ only slightly, the edge should be in $N$. The mean edge weight values of the $C$ and $N$ edge sets are

$$\mu_C = \frac{1}{|C|} \sum_{e_{pq} \in C} w_{pq}, \qquad \mu_N = \frac{1}{|N|} \sum_{e_{pq} \in N} w_{pq}, \tag{3}$$

and the energy function associated with the edge partitions is defined as

$$E = \sum_{e_{pq} \in C} (w_{pq} - \mu_C)^2 + \sum_{e_{pq} \in N} (w_{pq} - \mu_N)^2, \tag{4}$$

where $|\cdot|$ is the cardinality of a set. The energy function (4) is the same as the $k$-means objective in (1), except that the number of partitions is set to $2$. The proposed energy function has two mean centers, but it also has the hard constraints in (2). Regardless of the segmentation number, there can only be two partitions for the edges: cut and connected edges.
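The definitions above can be traced on a tiny image. The sketch below is illustrative only: the helper names `grid_edges` and `edge_partition_energy`, the 4-connected neighborhood, and the gray-level absolute difference as edge weight are assumptions for this example. Given a pixel labeling, it derives the cut and connected edge sets from the label rule (2), computes the two means of (3), and evaluates the energy (4).

```python
def grid_edges(image):
    """4-connected edges ((r, c), (r2, c2), weight); weight = absolute gray-level difference."""
    rows, cols = len(image), len(image[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                edges.append(((r, c), (r, c + 1), abs(image[r][c] - image[r][c + 1])))
            if r + 1 < rows:
                edges.append(((r, c), (r + 1, c), abs(image[r][c] - image[r + 1][c])))
    return edges


def edge_partition_energy(edges, labels):
    """Split edges by the label rule (2), then evaluate the means (3) and the energy (4)."""
    cut = [w for p, q, w in edges if labels[p[0]][p[1]] != labels[q[0]][q[1]]]
    connect = [w for p, q, w in edges if labels[p[0]][p[1]] == labels[q[0]][q[1]]]
    if not cut or not connect:
        # An empty set would cause a division by zero in the mean computation.
        raise ValueError("both edge sets must be non-empty")
    mu_cut = sum(cut) / len(cut)
    mu_connect = sum(connect) / len(connect)
    energy = (sum((w - mu_cut) ** 2 for w in cut)
              + sum((w - mu_connect) ** 2 for w in connect))
    return mu_cut, mu_connect, energy


image = [[10, 12, 200],
         [11, 13, 205]]
labels = [[0, 0, 1],
          [0, 0, 1]]
mu_cut, mu_connect, energy = edge_partition_energy(grid_edges(image), labels)
print(mu_cut, mu_connect, energy)
```

Here the two edges crossing the dark/bright boundary form the cut set and the remaining five edges form the connected set, so the energy stays small regardless of how many segments the labeling contains.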
The proposed energy function breaks down into an image-labeling problem in order to maintain the label consistency conditions of (2). The image label state $l$ that minimizes (4) under the constraints (2) and (3) is the proposed segmentation state. The number of labels must be at least two to avoid division by zero in (3). Under the well-known four-color map theorem, four labels are sufficient to define all possible segment configurations for 2D images.
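Given the image label state $l$, the mean values $\mu_C$ (cut) and $\mu_N$ (connect) can be estimated as in (3). Conversely, if $\mu_C$ and $\mu_N$ are kept constant, the image label state can be found by optimizing the following pairwise energy function:

$$E(l) = \sum_{e_{pq}} V_{pq}(l_p, l_q), \qquad V_{pq}(l_p, l_q) = \begin{cases} (w_{pq} - \mu_N)^2, & l_p = l_q, \\ (w_{pq} - \mu_C)^2, & l_p \neq l_q. \end{cases} \tag{5}$$

If the labels at the two ends of an edge are not the same, the edge is considered to be in the cut set; otherwise, it belongs to the connected set. With $\mu_C$ and $\mu_N$ constant, minimizing (5) is equivalent to minimizing the edge partition function (4).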
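With the two means held fixed, the energy of a labeling is a plain sum of per-edge costs. The sketch below shows this under stated assumptions: the function names, the dictionary-based labels, and the numeric values are hypothetical and chosen only for illustration. Each edge pays the squared distance to the connected mean when its endpoint labels agree, and to the cut mean when they differ.

```python
def pairwise_potential(w, same_label, mu_cut, mu_connect):
    """Cost of one edge under fixed means: squared distance to the mean of its set."""
    mu = mu_connect if same_label else mu_cut
    return (w - mu) ** 2


def labeling_energy(edges, labels, mu_cut, mu_connect):
    """edges: (p, q, w) triples; labels: dict mapping node -> label."""
    return sum(pairwise_potential(w, labels[p] == labels[q], mu_cut, mu_connect)
               for p, q, w in edges)


# A four-node chain with one strong boundary in the middle.
edges = [("a", "b", 2.0), ("b", "c", 190.0), ("c", "d", 4.0)]
mu_cut, mu_connect = 190.0, 3.0
split = {"a": 0, "b": 0, "c": 1, "d": 1}
merged = {"a": 0, "b": 0, "c": 0, "d": 0}
print(labeling_energy(edges, split, mu_cut, mu_connect))
print(labeling_energy(edges, merged, mu_cut, mu_connect))
```

The labeling that cuts the heavy edge has a far lower energy than the one that merges all four nodes, since the merged labeling forces the weight 190 to be explained by the connected mean.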
The multilabel pairwise energy function (5) can be solved by QPBO-α-expansion. QPBO-α-expansion optimizes multilabel MRFs by iteratively expanding a single label $\alpha$ using graph cut [27]. Graph cut can find the optimal expansion if the expansion is submodular. In this problem, the expansions are nonsubmodular. The pairwise potentials for QPBO-α-expansion, where $l'$ is the current label state, can be defined as follows:

$$\theta_{pq}(x_p, x_q) = V_{pq}\big(l_p(x_p), l_q(x_q)\big), \qquad l_p(x_p) = \begin{cases} l'_p, & x_p = 0, \\ \alpha, & x_p = 1, \end{cases} \tag{6}$$

where the binary variable $x_p \in \{0, 1\}$ indicates whether node $p$ keeps its current label or switches to $\alpha$, and $V_{pq}$ is the pairwise term of (5). This nonsubmodular binary labeling problem can be approached using the QPBO algorithm [28], with the possibility of a large number of unlabeled nodes. The more recently introduced QPBO-improve (QPBOI) algorithm can cope with unlabeled regions [28]; however, this algorithm is not as efficient as the graph cut, which minimizes submodular potentials. The QPBOI algorithm can randomly improve the solution, but iterating the improvement steps can be time-consuming for large numbers of nodes.
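To see why the expansion moves are nonsubmodular here, the binary potentials of a single edge can be tabulated and tested against the usual submodularity condition (cost(0,0) + cost(1,1) <= cost(0,1) + cost(1,0)). The sketch below is a hedged illustration: `expansion_table`, `is_submodular`, and the numeric means are assumptions made for this example, and no QPBO solver is invoked.

```python
def expansion_table(w, lp, lq, alpha, mu_cut, mu_connect):
    """Binary potentials theta[xp][xq]; x = 0 keeps the current label, x = 1 takes alpha."""
    def cost(a, b):
        mu = mu_connect if a == b else mu_cut
        return (w - mu) ** 2

    return [[cost(lp, lq), cost(lp, alpha)],
            [cost(alpha, lq), cost(alpha, alpha)]]


def is_submodular(theta):
    """Standard condition for a pairwise binary term to be graph-cut representable."""
    return theta[0][0] + theta[1][1] <= theta[0][1] + theta[1][0]


mu_cut, mu_connect = 190.0, 3.0
weak_edge = expansion_table(2.0, lp=0, lq=0, alpha=1, mu_cut=mu_cut, mu_connect=mu_connect)
strong_edge = expansion_table(188.0, lp=0, lq=0, alpha=1, mu_cut=mu_cut, mu_connect=mu_connect)
print(is_submodular(weak_edge))    # agreement is cheap: submodular
print(is_submodular(strong_edge))  # disagreement is cheap: nonsubmodular
```

Whenever an edge weight lies closer to the cut mean than to the connected mean, the term rewards disagreement between its endpoints, which is the case that plain graph cut cannot handle and that motivates QPBO.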
Similar to the original $k$-means algorithm, good initialization is helpful to the optimization. The initial estimates of the cut and connected means can be found by a $k$-means minimization (with $k = 2$) of the edge partition energy (4) without the labeling constraint of (2). To estimate the initial label state, the pixel clustering $k$-means algorithm (1) can be used. The general framework is illustrated in Algorithm 1.
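The alternation between mean estimation and label optimization can be sketched as follows. This is only an outline under loudly stated assumptions: the label step uses a greedy iterated-conditional-modes (ICM) sweep as a stand-in, NOT the QPBO-α-expansion used in the paper; the initial labeling is all zeros rather than a pixel $k$-means result; and all function names are hypothetical.

```python
def two_means(weights, iterations=20):
    """Unconstrained 2-means on scalar edge weights; returns (mu_connect, mu_cut)."""
    lo, hi = min(weights), max(weights)
    for _ in range(iterations):
        near_lo = [w for w in weights if abs(w - lo) <= abs(w - hi)]
        near_hi = [w for w in weights if abs(w - lo) > abs(w - hi)]
        lo = sum(near_lo) / len(near_lo)
        if near_hi:
            hi = sum(near_hi) / len(near_hi)
    return lo, hi


def edge_means(edges, labels, previous):
    """Re-estimate both means from the labeling; keep the old value if a set is empty."""
    connect = [w for p, q, w in edges if labels[p] == labels[q]]
    cut = [w for p, q, w in edges if labels[p] != labels[q]]
    mu_connect = sum(connect) / len(connect) if connect else previous[0]
    mu_cut = sum(cut) / len(cut) if cut else previous[1]
    return mu_connect, mu_cut


def icm_sweep(neighbors, labels, num_labels, mu_connect, mu_cut):
    """Stand-in optimizer (not QPBO-alpha-expansion): greedy per-node label update."""
    for p in range(len(labels)):
        def local_cost(candidate):
            return sum((w - (mu_connect if candidate == labels[q] else mu_cut)) ** 2
                       for q, w in neighbors[p])
        labels[p] = min(range(num_labels), key=local_cost)


def segment(edges, num_nodes, num_labels=4, rounds=5):
    neighbors = [[] for _ in range(num_nodes)]
    for p, q, w in edges:
        neighbors[p].append((q, w))
        neighbors[q].append((p, w))
    mu_connect, mu_cut = two_means([w for _, _, w in edges])  # initialization
    labels = [0] * num_nodes  # assumption: trivial initial labeling
    for _ in range(rounds):
        icm_sweep(neighbors, labels, num_labels, mu_connect, mu_cut)
        mu_connect, mu_cut = edge_means(edges, labels, (mu_connect, mu_cut))
    return labels, mu_connect, mu_cut


chain = [(0, 1, 2.0), (1, 2, 190.0), (2, 3, 4.0)]
labels, mu_connect, mu_cut = segment(chain, num_nodes=4)
print(labels, mu_connect, mu_cut)
```

ICM is order dependent and easily trapped in poor local minima on real images, so it serves here only to make the alternating structure visible; it should not be read as a substitute for the paper's optimizer.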
3.3. Edge Weights
Various examples of the edge partition segmentation results using the color distance edge weights are shown in Figure 2 for the MSRC image database [29]. The color distance from the neighboring pixels is sufficient for some image segmentation problems, but more rigorous weight calculations are often better suited for semantic segmentation. Instead of proposing new edge weight calculations, an existing state-of-the-art contour detection algorithm is incorporated.
The global probability of the boundary (GPB) edge detection method [25, 30], which scored best for the Berkeley database (http://www.cs.berkeley.edu/projects/vision/bsds), is employed as the edge weights. The edge weights can be connected between the pixel nodes, and the proposed edge partitioning algorithm can be implemented. Figure 3 shows further segmentation results under the pixel-to-pixel edge connections. Although Figures 3(a) and 3(b) show a good segmentation result, the QPBOI algorithm cannot obtain a good segmentation in Figures 3(c) and 3(d). The QPBOI algorithm often fails in the presence of a large number of nodes. Thus, to reduce both the computational time and the chance of failure in the QPBOI algorithm, an oversegmentation process is adopted from prior work in this segmentation. The edges are connected between the superpixels instead of the pixels. The number of oversegments is between 400 and 1000. On average, the edge partitioning algorithm segments a BSDS image in under 5 seconds.
4. Experiments
The proposed edge partition approach is evaluated using the popular Berkeley image database. The set contains 300 images with at least four human segment annotations per image. The three quantitative evaluation methods used are as follows: Probabilistic Rand Index (PRI) [31], Variation of Information (VoI) [32], and Boundary Displacement Error (BDE) [33]. Global Consistency Error (GCE) [34] is not included in this evaluation. GCE measures the extent to which one segmentation can be viewed as a refinement of another. However, both one pixel per segment and one segment for an entire image can give zero error for GCE. GCE thus favors extremely oversegmented or undersegmented results, and both cases are unwanted for a semantic segmentation. GCE is therefore deemed to be an inconsistent evaluation method.
The evaluation methods used in this study are PRI, VoI, and BDE. PRI counts the number of consistent labels between the segmentation and the ground truth. VoI measures the segmentation randomness that cannot be explained by the ground truth. BDE is the average displacement error of the boundary pixels between two segmentation results. PRI counts the correctness in segmentation, while VoI and BDE measure the errors between the segmentation and ground truth. In the first subsection, the proposed method is evaluated against various segmentation methods. In the second subsection, the comparison between the proposed and the merge-threshold methods is demonstrated using the same edge weights.
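The degenerate behavior of GCE can be reproduced on a toy example. The sketch below follows the commonly stated definition of GCE (the per-pixel local refinement error, summed in each direction, with the smaller sum averaged over pixels); the function name `gce` and the flat label-list representation are assumptions for this illustration.

```python
from collections import Counter


def gce(seg_a, seg_b):
    """Global Consistency Error between two flat label lists of equal length."""
    n = len(seg_a)
    size_a, size_b = Counter(seg_a), Counter(seg_b)
    overlap = Counter(zip(seg_a, seg_b))
    # Local refinement error at each pixel: |R(A,p) minus R(B,p)| / |R(A,p)|.
    err_ab = sum((size_a[a] - overlap[(a, b)]) / size_a[a] for a, b in zip(seg_a, seg_b))
    err_ba = sum((size_b[b] - overlap[(a, b)]) / size_b[b] for a, b in zip(seg_a, seg_b))
    return min(err_ab, err_ba) / n


truth = [0, 0, 1, 1, 2, 2]
one_pixel_per_segment = [0, 1, 2, 3, 4, 5]
single_segment = [0, 0, 0, 0, 0, 0]
shifted = [0, 1, 1, 2, 2, 0]

print(gce(one_pixel_per_segment, truth))  # extreme oversegmentation
print(gce(single_segment, truth))         # extreme undersegmentation
print(gce(shifted, truth))                # a genuinely different segmentation
```

Both trivial segmentations score a perfect zero while the shifted segmentation is penalized, which is the inconsistency described above.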
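Two of these measures are simple enough to sketch directly. The code below is an illustrative implementation under stated assumptions: segmentations are flat label lists, PRI is computed as the Rand index averaged over the available ground truths, VoI is reported in nats, and the function names are hypothetical. BDE is omitted because it requires boundary geometry.

```python
from collections import Counter
from itertools import combinations
from math import log


def rand_index(seg, truth):
    """Fraction of pixel pairs on which both labelings agree (same vs. different)."""
    pairs = list(combinations(range(len(seg)), 2))
    agree = sum((seg[i] == seg[j]) == (truth[i] == truth[j]) for i, j in pairs)
    return agree / len(pairs)


def probabilistic_rand_index(seg, truths):
    """Average Rand index over several human ground-truth segmentations."""
    return sum(rand_index(seg, t) for t in truths) / len(truths)


def variation_of_information(seg, truth):
    """VoI = H(seg) + H(truth) - 2 I(seg; truth), in nats; lower is better."""
    n = len(seg)
    joint = Counter(zip(seg, truth))
    ps, pt = Counter(seg), Counter(truth)
    h_s = -sum(c / n * log(c / n) for c in ps.values())
    h_t = -sum(c / n * log(c / n) for c in pt.values())
    mi = sum(c / n * log((c / n) / ((ps[a] / n) * (pt[b] / n)))
             for (a, b), c in joint.items())
    return h_s + h_t - 2 * mi


seg = [0, 0, 1, 1]
truth_a = [5, 5, 9, 9]   # same partition, different label names
truth_b = [0, 0, 0, 1]
print(probabilistic_rand_index(seg, [truth_a, truth_b]))
print(variation_of_information(seg, truth_a))
```

Both measures depend only on the induced partition and not on the label names, which matters here because four-label colorings reuse labels across disconnected segments; such a labeling would first need to be converted to connected components before evaluation.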
4.1. Comparison to the Previous Segmentation Methods
Generally, the parameters are held constant over the entire database for all tested methods. This evaluation includes mean shift (MShift) [1], graph-based segmentation (GBIS) [21], JSEG [20], Normalized Tree Partitioning (NTP) [22], saliency-based segmentation (Saliency) [23], Boundary Encoding Based Segmentation (TBES) [24], normalized cut (Ncut) [11], and fully connected spectral segmentation (SpecSeg) [13]. Additionally, contour to region (CtoR) [25] uses the same edge weights as the proposed method. Table 1 summarizes the performance of these methods. Many of the evaluation results are taken from previously published evaluations.
For PRI measurements, the merge-threshold method of CtoR ranks first. The proposed segmentation ranks first for VoI and BDE. The CtoR implementation is made publicly available by its authors. The threshold value for the CtoR method was chosen to be 80 for its highest average ranking. A number of segmentation results of CtoR and of the proposed EPartition are shown in Figure 4. For the normalized cut and fully connected spectral segmentation, the segmentation number is chosen for each image, so these two methods are excluded from the rankings.
CtoR and EPartition use the same edge weights; thus, their performances are similar. However, in CtoR, a merge-threshold algorithm is used for segmentation. Different thresholds among integer intervals are shown for the PRI, VoI, and BDE evaluation methods in Figure 5. Generally, PRI and BDE favor oversegmentation and VoI favors undersegmentation. The optimal threshold value is generally smaller for PRI and BDE than VoI.
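The sensitivity to the threshold can be illustrated with a deliberately simplified stand-in for a merge-threshold rule. This is not the CtoR implementation: the function `threshold_segments`, the chain graph, and the weights are assumptions made for this example. Nodes joined by an edge whose weight falls below the threshold are merged, and the resulting segment count is reported.

```python
def threshold_segments(num_nodes, edges, threshold):
    """Merge the endpoints of every edge lighter than the threshold; count the segments."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p, q, w in edges:
        if w < threshold:
            parent[find(p)] = find(q)
    return len({find(x) for x in range(num_nodes)})


# Six nodes in a chain with boundary strengths between neighbors.
edges = [(0, 1, 5), (1, 2, 40), (2, 3, 8), (3, 4, 90), (4, 5, 3)]
counts = {t: threshold_segments(6, edges, t) for t in (10, 50, 100)}
print(counts)
```

The same edge weights yield three, two, or one segment depending on the threshold, so a threshold tuned for one evaluation measure effectively selects a segment count that another measure may not favor.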
In contrast, the edge partitioning segmentation is independent of a threshold value. Figure 5 shows the performance of the CtoR merge-threshold method in terms of threshold values. The proposed EPartition segmentation evaluation scores for PRI, BDE, and VoI are very close to the highest evaluation score of CtoR. However, the merge-threshold method in CtoR requires a specific threshold value for each segmentation evaluation method. The advantage of EPartition is that correct segmentation is possible without the designation of segmentation number or a threshold value.
4.2. Comparison to Trained Threshold
In the previous experiments, EPartition was shown to have competitive performance with CtoR when the optimal threshold value is hand-picked for CtoR. In this section, the threshold value is trained on the Berkeley 300 set, and the segmentation performances are compared on the Weizmann segmentation set [35]. The Weizmann set contains 100 images with three human segmentation annotations.
In Table 2, the segmentation evaluations of the CtoR and EPartition methods are compared. There is a minuscule difference for PRI and a small difference for VoI. For the BDE evaluation, EPartition clearly outperforms the CtoR method. The trained threshold value was not robust across the different segmentation evaluation approaches. By partitioning the edges through minimizing the mean squared distance, the proposed EPartition shows adaptive performance among the three evaluation methods. Various comparative segmentation results are shown in Figure 6.
5. Conclusion and Future Works
In this paper, image segmentation by edge partitioning is proposed. In contrast with previous edge weight-based segmentation methods, such as normalized cut, the proposed method is independent of the number of segments. Segmentation by edge partitioning has been shown to be competitive with previous segmentation techniques on the Berkeley database, without the need for segmentation number selection. The advantage of the proposed method lies in its adaptive nature for handling edge weights without threshold values or segment number assignments.
The proposed algorithm can be extended to general partitioning problems. Four labels are sufficient when segmenting 2D images. However, for fully connected graphs, the number of labels can be arbitrarily large. If a maximum number of labels is chosen, the edge partitioning method can be incorporated into a general partition problem without designating the specific number of partitions among nodes.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by Institute for Information & Communications Technology Promotion (IITP) Grant funded by the Korea government (MSIP) (no. R0101-15-0171, Development of Multimodality Imaging and 3D Simulation-Based Integrative Diagnosis-Treatment Support Software System for Cardiovascular Diseases). This work was also supported by Hankuk University of Foreign Studies Research Fund.
References
- D. Comaniciu and P. Meer, “Mean shift: a robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
- X. Ren and J. Malik, “Learning a classification model for segmentation,” in Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV '03), vol. 1, pp. 10–17, IEEE, Nice, France, October 2003.
- A. Fabijańska and J. Gocławski, “New accelerated graph-based method of image segmentation applying minimum spanning tree,” IET Image Processing, vol. 8, no. 4, pp. 239–251, 2014.
- R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, “SLIC superpixels compared to state-of-the-art superpixel methods,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2274–2281, 2012.
- Z. Tu, X. Chen, A. L. Yuille, and S.-C. Zhu, “Image parsing: unifying segmentation, detection, and recognition,” International Journal of Computer Vision, vol. 63, no. 2, pp. 113–140, 2005.
- J. Puzicha, T. Hofmann, and J. M. Buhmann, “Non-parametric similarity measures for unsupervised texture segmentation and image retrieval,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 267–272, IEEE, June 1997.
- Y. Gdalyahu, D. Weinshall, and M. Werman, “Stochastic image segmentation by typical cuts,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Fort Collins, Colo, USA, June 1999.
- N. Shental, A. Zomet, T. Hertz, and Y. Weiss, “Learning and inferring image segmentations using the GBP typical cut algorithm,” in Proceedings of the 9th IEEE International Conference on Computer Vision, vol. 2, pp. 1243–1250, Nice, France, October 2003.
- O. O. Olugbara, E. Adetiba, and S. A. Oyewole, “Pixel intensity clustering algorithm for multilevel image segmentation,” Mathematical Problems in Engineering, vol. 2015, Article ID 649802, 19 pages, 2015.
- E. Cuevas, A. González, F. Fausto, D. Zaldívar, and M. Pérez-Cisneros, “Multithreshold segmentation by using an algorithm based on the behavior of locust swarms,” Mathematical Problems in Engineering, vol. 2015, Article ID 805357, 25 pages, 2015.
- J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
- R. Zabih and V. Kolmogorov, “Spatially coherent clustering using graph cuts,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '04), vol. 2, pp. II-437–II-444, IEEE, June-July 2004.
- T. H. Kim, K. M. Lee, and S. U. Lee, “Learning full pairwise affinities for spectral segmentation,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '10), pp. 2101–2108, IEEE, San Francisco, Calif, USA, June 2010.
- T. F. Chan and L. A. Vese, “Active contours without edges,” IEEE Transactions on Image Processing, vol. 10, no. 2, pp. 266–277, 2001.
- S. Osher and N. Paragios, Geometric Level Set Methods in Imaging, Vision, and Graphics, Springer, New York, NY, USA, 2004.
- S. Chopra and M. R. Rao, “The partition problem,” Mathematical Programming, vol. 59, no. 1, pp. 87–115, 1993.
- E. D. Demaine and N. Immorlica, “Correlation clustering with partial information,” in Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, vol. 2764 of Lecture Notes in Computer Science, pp. 1–13, Springer, Berlin, Germany, 2003.
- S. Nowozin and S. Jegelka, “Solution stability in linear programming relaxations: graph partitioning and unsupervised learning,” in Proceedings of the 26th Annual International Conference on Machine Learning (ICML '09), Montreal, Canada, June 2009.
- E. Hodneland, X.-C. Tai, and H.-H. Gerdes, “Four-color theorem and level set methods for watershed segmentation,” International Journal of Computer Vision, vol. 83, no. 3, pp. 264–283, 2009.
- Y. Deng and B. S. Manjunath, “Unsupervised segmentation of color-texture regions in images and video,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 8, pp. 800–810, 2001.
- P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient graph-based image segmentation,” International Journal of Computer Vision, vol. 59, no. 2, pp. 167–181, 2004.
- J. Wang, Y. Jia, X.-S. Hua, C. Zhang, and L. Quan, “Normalized tree partitioning for image segmentation,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.
- M. Donoser, M. Urschler, M. Hirzer, and H. Bischof, “Saliency driven total variation segmentation,” in Proceedings of the IEEE 12th International Conference on Computer Vision, pp. 817–824, Kyoto, Japan, October 2009.
- S. Rao, H. Mobahi, A. Yang, S. Sastry, and Y. Ma, “Natural image segmentation with adaptive texture and boundary encoding,” in Computer Vision—ACCV 2009: 9th Asian Conference on Computer Vision, Xi'an, September 23–27, 2009, Revised Selected Papers, Part I, vol. 5994 of Lecture Notes in Computer Science, pp. 135–146, Springer, Berlin, Germany, 2010.
- P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “From contours to regions: an empirical evaluation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 2294–2301, IEEE, Miami, Fla, USA, June 2009.
- K. Appel and W. Haken, “Solution of the four color map problem,” Scientific American, vol. 237, no. 4, pp. 108–121, 1977.
- Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222–1239, 2001.
- C. Rother, V. Kolmogorov, V. Lempitsky, and M. Szummer, “Optimizing binary MRFs via extended roof duality,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, IEEE, Minneapolis, Minn, USA, June 2007.
- J. Shotton, J. Winn, C. Rother, and A. Criminisi, “TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation,” in Computer Vision-ECCV 2006, vol. 3951 of Lecture Notes in Computer Science, pp. 1–15, Springer, Berlin, Germany, 2006.
- M. Maire, P. Arbeláez, C. Fowlkes, and J. Malik, “Using contours to detect and localize junctions in natural images,” in Proceedings of the 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '08), pp. 1–8, Anchorage, Alaska, USA, June 2008.
- R. Unnikrishnan, C. Pantofaru, and M. Hebert, “Toward objective evaluation of image segmentation algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 929–944, 2007.
- M. Meila, “Comparing clustering: an axiomatic view,” in Proceedings of the 22nd International Conference on Machine Learning (ICML '05), pp. 577–584, ACM Press, 2005.
- J. Freixenet, X. Muñoz, D. Raba, J. Martí, and X. Cufí, “Yet another survey on image segmentation: region and boundary information integration,” in Computer Vision ECCV 2002: 7th European Conference on Computer Vision Copenhagen, Denmark, May 28–31, 2002 Proceedings, Part III, vol. 2352 of Lecture Notes in Computer Science, pp. 408–422, Springer, Berlin, Germany, 2002.
- D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics,” in Proceedings of the 8th International Conference on Computer Vision, pp. 416–423, July 2001.
- S. Alpert, M. Galun, R. Basri, and A. Brandt, “Image segmentation by probabilistic bottom-up aggregation and cue integration,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '07), pp. 1–8, Minneapolis, Minn, USA, June 2007.
Copyright © 2015 Ho Yub Jung and Kyoung Mu Lee. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.