Abstract

One challenge of unsupervised MRI brain image segmentation is the central gray matter, whose contrast with the surrounding white matter is faint. In this paper, the necessity of supervised image segmentation is addressed, and a soft Mumford-Shah model is introduced. Then, a framework of semisupervised image segmentation based on the soft Mumford-Shah model is developed. The main contribution of this paper lies in the development of a framework of semisupervised soft image segmentation using both the Bayesian principle and the principle of soft image segmentation. The developed framework classifies pixels in a semisupervised and interactive way, where the class of a pixel is determined not only by its features but also by its distance from the known regions. The developed semisupervised soft segmentation model turns out to be an extension of the unsupervised soft Mumford-Shah model. The framework is then applied to MRI brain image segmentation. Experimental results demonstrate that the developed framework outperforms the state-of-the-art methods of unsupervised segmentation. The new method can produce segmentation as precise as required.

1. Introduction

In recent years, MRI-based medical image processing and analysis have been studied widely. In this research, segmentation is the first stage and is fundamental for postprocessing and analysis. One of the most important applications in medical image processing is MRI brain image segmentation. It has been noticed that, by calculating changes in the volumes of different brain tissues (called white matter, gray matter, and cerebrospinal fluid in image processing), some brain-related diseases can be found at an early stage [1]. However, there are two challenges in calculating the volumes of different matters in MRI brain images. One challenge is the calculation of partial volumes, which usually appear at the border of different tissues due to limited resolution [2–5]; the other challenge is the segmentation of central gray matter, due to its faint contrast with the surrounding white matter [6]. This paper addresses the latter challenge.

Central gray matter lies in the central area of the brain. Its intensity is usually very close to that of white matter located outside the central area. In detail, the intensity of central gray matter is slightly lower than that of white matter in the central area, but usually very close to or even greater than that of white matter near the outer layer. As a result, intensity-based unsupervised segmentation methods are ill-suited to distinguishing central gray matter from white matter in MRI brain images.

In general, unsupervised methods explore the intrinsic data features to partition an image into regions with different statistics. The segmentation procedure can be carried out automatically by a given algorithm without human interaction or interference. There are several cases in which unsupervised methods either fail to work or are deficient. One case is intensity inhomogeneity. Another case is when some parts of different classes have almost the same intensities or features. The first case is handled by bias correction methods [7, 8], by stochastic methods [9, 10], or by both [6]. When bias correction methods are used, the bias field is always assumed to be smooth. When a bias field is not smooth, an alternative is to use stochastic methods that treat pixel intensities as random variables. However, stochastic methods for dealing with bias only work well when the bias is not strong. Nevertheless, some traditional unsupervised methods can still work for the first case (with bias). In contrast, unsupervised segmentation methods usually fail to work efficiently for the second case.

Different from unsupervised segmentation, supervised image segmentation is a technique to partition an image using either known images or known features of some parts of the image to direct the segmentation.

Machine-learning based image segmentation is a method that uses a collection of known images of the same type as the given image (to be segmented) to direct the image segmentation [11, 12]. The direction is implemented by a learning mechanism. Machine-learning based image segmentation methods originate from general classification methods. Such methods, when applied to image segmentation, treat each pixel as an isolated object without considering its relation to its neighbors, such as the smoothness of the intensities inside a class. Moreover, the methods are usually based on algorithms rather than on a mathematical model and are therefore mathematically less precise.

Another way to perform supervised segmentation is to use some patches of a given image to direct the image segmentation. It assigns some regions to each class in advance based on prior knowledge and then uses the features of the known regions as constraints to model image segmentation. A class of such methods is supervised image matting [13–15]. Image matting studies the problem of accurate foreground estimation in images and videos. It is essentially a two-phase image segmentation and usually deals with natural images that are very complicated. In image matting, supervised methods are usually applied by assigning some regions as foreground and then using the assigned regions as a reference to help extract the foreground. Image matting also provides interactive segmentation. An interactive method is also discussed in the well-known GrabCut method [16], which treats an image as a graph under discrete settings.

There are two shortcomings when using supervised image matting for image segmentation. First, image matting works only for images with two classes and assumes that the image is a linear combination of background and foreground. Second, there is no theoretical proof addressing why a supervised or interactive matting method is more reliable than an unsupervised one. Results are claimed to be better based only on visual effect.

The Mumford-Shah model is a multiphase image segmentation model that has been extensively investigated [10, 17–22]. The original Mumford-Shah model assumes an image to be a piecewise smooth function [23]. Later research usually assumes each piece of the function to have some special property, such as being piecewise polynomial [24]. Most often, the image in a Mumford-Shah model is assumed to be piecewise constant [18–20]. In the latter case, the model is sometimes implemented under the assumption that the mean of each class is known from prior knowledge. Since the piecewise constant assumption is too strong and may limit the model's application, some variants of the Mumford-Shah model have also been developed [8, 22, 25].

Considering that a soft segmentation model is usually more flexible and makes it possible to produce a globally optimized result, Jianhong Shen extended the Mumford-Shah model to soft segmentation [10], where each pixel can partly belong to more than one class. Membership functions are used in the model to denote the percentage or probability with which a pixel belongs to each class. The value of a membership function at some pixel can be viewed either as the probability of the pixel belonging to the corresponding class, as in fuzzy segmentation models [26, 27], or as the percentage of the pixel belonging to the corresponding class, as in partial volume segmentation [5, 28–30].

In this paper, a soft version of the piecewise constant Mumford-Shah model is introduced. Then, a framework of semisupervised and interactive image segmentation is developed based on the soft piecewise constant Mumford-Shah model using the Bayesian principle. The developed model is proved to be an extension of the general unsupervised soft Mumford-Shah model. The semisupervised and interactive framework can produce segmentation results as precise as required. The rest of the paper is organized as follows. Section 2 addresses the importance of supervised segmentation methods and their basic idea. For a given synthetic image, the different segmentation results obtained with an unsupervised method and a supervised method are presented. Section 3 introduces the development of the proposed framework. Section 4 presents the numerical analysis and algorithm implementation. The efficiency of the framework is shown experimentally in Section 5, with particular attention to the application to MRI brain images. Finally, some comments, conclusions, and future work are addressed in Section 6.

2. Introduction to Semisupervised Segmentation

Unsupervised image segmentation utilizes the inherent image features to partition an image into different classes such that the pixels in the same class share the same or similar features while pixels in different classes have quite different features. The lowest level image feature is image intensity. Most unsupervised image segmentation models directly use image intensities to classify pixels. The advantages of unsupervised image segmentation are well known. For example, it is fast; it does not need human interaction; even the number of classes is not required to be known before implementation. Meanwhile, the disadvantages are also well known. For example, in the Mumford-Shah model, if the number of classes is unknown, it is hard to obtain an expected result: different numbers of classes lead to different segmentation results. Another example is the initialization during the implementation of a nonconvex model. When a model is nonconvex, the implementation usually leads to a local minimizer that may not be the expected result. There are also other shortcomings of unsupervised image segmentation. Some shortcomings can be or have been resolved by new mathematical methods. For example, when the number of classes is unknown, Chiu [31] and Wang [32] proposed different ways to solve the problem under an unsupervised setting. However, some drawbacks of unsupervised image segmentation cannot be resolved because of the inherent features of digital images. For example, when the pixels in the same class have quite different intensities or pixels in different classes have very close intensities, it is generally impossible to achieve an ideal result using an unsupervised segmentation model that is based on intensities only (this statement is not entirely true for texture image segmentation). Figure 1 shows the difference in an extreme case.

Figure 1 is a binary image. Intuitively, the image in Figure 1 contains two classes: the background (black part) and the foreground (white parts). Obviously, the foreground contains two parts. Any unsupervised image segmentation method can easily segment the image into two classes. However, we now reconsider the problem under a different assumption. Assume that we are only interested in the left white part and that the right white part is actually a part of the background which is blocked by something or presents a quite different intensity for some reason and therefore should not be classified as foreground. We are interested in obtaining, directly by running the code of an algorithm, a segmentation whose foreground contains only the left white part (see Figure 2).

It is almost impossible for any unsupervised image segmentation method to achieve such a result based only on the intensities of the image. The framework of semisupervised and interactive image segmentation addressed in this paper provides a way to achieve such a segmentation by assigning some regions of the image to each specific class before implementation. The main idea of the semisupervised image segmentation developed in this paper is to introduce a classification strategy in which each pixel is classified based not only on its intensity but also on its distances from the known regions. If a pixel's intensity is closer to class i but its position is closer to a known region that belongs to a different class, say class j, then it is very likely that the pixel will be classified into class j rather than class i.

For example, we can draw two regions in the image, a red mask and a green mask, as shown in Figure 3, and assign the red region to the foreground and the green region to the background. The red mask contains only white pixels while the green mask contains both black and white regions. Using the method developed in this paper, those regions that are also white but closer to the green mask are classified as background rather than foreground. Therefore, the resulting foreground segmentation contains only the left white part, as shown in Figure 2, after using the proposed framework.
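The idea can be illustrated with a deliberately simplified sketch. The Python code below is not the model developed in this paper; it is a toy rule, assumed here only for illustration, in which each pixel of a single image row copies the label of the nearest marked pixel (seed) whose intensity is similar. The names `threshold_labels`, `seed_guided_labels`, and the tolerance `tol` are hypothetical.

```python
def threshold_labels(row, thresh=0.5):
    """Intensity-only rule: every bright pixel becomes foreground."""
    return ["fg" if v > thresh else "bg" for v in row]


def seed_guided_labels(row, seeds, tol=0.25):
    """Toy rule: copy the label of the nearest seed with a similar intensity.

    seeds maps a pixel index to the class label of a hand-marked pixel.
    """
    labels = []
    for x, value in enumerate(row):
        # Only seeds whose intensity resembles this pixel may vote.
        similar = [s for s in seeds if abs(row[s] - value) <= tol]
        if not similar:
            labels.append(None)  # no similar seed: leave the pixel undecided
            continue
        nearest = min(similar, key=lambda s: abs(s - x))
        labels.append(seeds[nearest])
    return labels


# One image row with two bright parts; only the left one is of interest.
row = [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
# A "red" stroke on the left bright part; a "green" stroke that covers
# both a dark pixel and a pixel of the right bright part.
seeds = {2: "fg", 5: "bg", 8: "bg"}

print(threshold_labels(row))
print(seed_guided_labels(row, seeds))
```

With these seeds, the intensity-only rule marks both bright parts as foreground, whereas the seed-guided rule keeps only the left bright part, which mirrors the behavior described for Figures 1 through 3.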

3. Model Development

In this section, we start from two-phase images to model semisupervised segmentation using conditional probability. Then, we extend it to multiphase image segmentation.

3.1. Conditional Probability in a Semisupervised Image Segmentation

Let be an image with domain . Suppose contains two classes and , and . For , the conditional probability that providing is given by

In a semisupervised image segmentation, some regions are already assigned with class labels, called known regions. The task of supervised image segmentation is to determine the class for each pixel in unknown regions based on the features of and the features of the intensities of those pixels in known regions. When is in a known region and assigned with class label , we have . So, (1) is reduced to

Still based on conditional probability, we have

Therefore,

provided is known.

Let us consider the case of . Without loss of generality, assume . Combining (2) through (4) and assuming that is labeled in class , we have

In this case, the value of or depends on the similarity of the features of pixel and pixel . The more similar their features are, the larger the conditional probability should be. In other words, we are interested in the similarity between the feature of and the feature of . The right side of (5) can be characterized by both the probability that belongs to class (namely, ) and the similarity between the feature of and the feature of (namely, ). That is, the probability that both and belong to the same class is proportional to the similarity between the feature of and the feature of when the probability of is fixed. We use an exponential function to characterize the similarity with , where is the function for features. and represent the features of pixels and , respectively. The simplest feature function is the intensity; namely, . We know that the exponential function lies in the range when the parameter . Moreover, the function is decreasing and reaches its maximum 1 at and approaches 0 when . That is, we have the following result relating to :

(1) when the feature of and the feature of are very close.

(2) when the feature of and the feature of are quite different.

Therefore, the expression can characterize the probability very well.

By summarizing the discussion above, we have from (5) that

provided is known, where is a weight. The choice of depends on the intensity scales in image .

For a gray image , the feature at some point can be simply denoted by its intensity . Then, from (6), we have

Together with (4), we have

provided is known.

Suppose now that it is a region, not a pixel, that is assigned to a class. In this case, we are interested in such a conditional probability that , where is a region (a collection of pixels) that belongs to class . Similar to (8), we use the following formula accordingly:

provided is known, where is the mean intensity of region .

Extension to Multiphase Image Segmentation. Given an image , suppose there are totally classes, denoted by . According to (9), the conditional probability that belong to given that belongs to can be characterized by

where denotes the known region for the th class.
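The role of the exponential similarity can be sketched numerically. The exact expressions of this section are not reproduced here, so the Python code below assumes a squared intensity difference inside the exponential; the names `similarity`, `class_weights`, the weight `lam`, and the region means are hypothetical and serve only as an illustration.

```python
import math


def similarity(feature_x, feature_ref, lam=1.0):
    """Assumed form exp(-lam * difference^2).

    Equals 1 for identical features and decays toward 0 as they differ.
    """
    return math.exp(-lam * (feature_x - feature_ref) ** 2)


def class_weights(intensity, region_means, lam=1.0):
    """Similarity of one pixel intensity to the mean of each class's known region."""
    return [similarity(intensity, m, lam) for m in region_means]


# Hypothetical mean intensities of the known regions of three classes.
region_means = [0.1, 0.5, 0.9]
weights = class_weights(0.55, region_means, lam=20.0)
print([round(w, 3) for w in weights])
```

In this sketch the pixel with intensity 0.55 receives its largest weight from the class whose known region has mean 0.5. The weight `lam` controls how quickly the similarity decays, which is why its choice depends on the intensity scale of the image.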

3.2. Soft Mumford-Shah Model

Let be an image domain and let be a gray image. Let be the pairwise disjoint regions in representing different classes of the image and the boundaries of each class. Then, the classic -class piecewise smooth Mumford-Shah model is to minimize the following energy functional (see [21, 23]):

where are patterns for each class and denotes the measure of the boundary. For a two-dimensional image, is the length of the boundary. The parameters and are weights used to balance among the fitting term, smoothness term, and the boundary term. By minimizing the functional, the goal and essence of the middle term is to force each pattern to be smooth. The length of the boundary can be expressed by the total variation of the indication function of ; that is,

where is the indication function of . It is well-known that -norm based image diffusion is better than -norm based image diffusion in that -norm is anisotropic but -norm is isotropic (see [33, 34]). In our developed model, the -norm of is changed to -norm. Then, the Mumford-Shah model in terms of -norm can be represented in terms of as below:

In a soft segmentation, each point may not exclusively belong to only one class. On the contrary, a point can partly belong to more than one class, which can be expressed using membership functions , . The value can be the percentage that belongs to the th class such as in applications of partial volume segmentation [5, 28] or the probability that a pixel belongs to the th class such as in applications of fuzzy image segmentation [14, 26]. For more details on soft segmentation, we refer readers to [10, 35]. A soft Mumford-Shah model can be viewed as a modification of the classic Mumford-Shah model by replacing the characteristic function of each class to the membership function. Accordingly, the corresponding soft Mumford-Shah model is to minimize the energy functional defined by with respect to patterns and membership functions .
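The statement that the boundary length equals the total variation of an indicator function, and that the same quantity remains defined for a soft membership function, can be checked on a small discrete example. The Python sketch below uses an anisotropic discretization (sum of absolute forward differences) chosen for illustration; the function name `anisotropic_tv` and the arrays are hypothetical and are not taken from the paper's implementation.

```python
def anisotropic_tv(u):
    """Sum of absolute forward differences of a 2D array given as a list of rows."""
    rows, cols = len(u), len(u[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                total += abs(u[i + 1][j] - u[i][j])  # vertical neighbor
            if j + 1 < cols:
                total += abs(u[i][j + 1] - u[i][j])  # horizontal neighbor
    return total


# Hard indicator of a 2x2 square inside a 4x4 domain.
indicator = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# A soft membership for the same class: values in [0, 1] instead of {0, 1}.
membership = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

print(anisotropic_tv(indicator))   # counts the unit edges on the square's boundary
print(anisotropic_tv(membership))  # the same term evaluated on a soft membership
```

For the hard indicator the value is the perimeter of the square measured in pixel edges, while for the soft membership the same expression scales with the membership values, which is what allows the boundary term to be carried over to the soft model.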

3.3. Framework of Semisupervised Image Segmentation Based on Soft Mumford-Shah Model

In the framework of semisupervised image segmentation addressed in this paper, it is always assumed that some subregion is already known for each class , . That is, is already known for some region . But the union of those known regions is not equal to . In general, the union of is much smaller than ; that is, . The task of the framework is to determine the segmentation for the rest of the region based on its intensity distribution and the intensity distribution of the known regions.

Let denote the probability that belongs to the th class provided is known. Then, the semisupervised soft Mumford-Shah model based on (14) can be described by minimizing the following energy functional:

Using (10) to denote the conditional probability , we have

where is the mean intensity for those pixels that are closest to and in the th known region .

It is interesting to notice that the supervised soft Mumford-Shah model (16) turns out to be the unsupervised soft Mumford-Shah model when . Therefore, the developed model is a generalization of the unsupervised soft Mumford-Shah model.

4. Algorithm and Implementation

In the developed model, there are two sets of variables to be determined, the patterns and the membership functions . In order to solve for these variables, we need to calculate first. The task of the semisupervised image segmentation in this paper is to determine the unknown regions, supposing that some known regions are given for each class. In the developed model, we use the means of known regions to determine the class of each pixel in the unknown region. Due to inhomogeneity such as bias, the means of different regions of the same class can differ. For this reason, we choose the means in the model for each pixel not as the overall mean of all known regions for th class, but as the mean of the known region that is closest to .
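The choice of the closest known region, rather than the overall class mean, can be sketched as follows. This Python code is a minimal illustration under assumed data structures (regions as lists of pixel coordinates, brute-force Euclidean distances); the names `region_mean`, `distance_to_region`, and `nearest_region_mean` are hypothetical and do not come from the authors' software.

```python
def region_mean(image, region):
    """Mean intensity over a region given as a list of (row, col) pixels."""
    return sum(image[r][c] for r, c in region) / len(region)


def distance_to_region(pixel, region):
    """Euclidean distance from a pixel to the closest pixel of a region."""
    pr, pc = pixel
    return min(((pr - r) ** 2 + (pc - c) ** 2) ** 0.5 for r, c in region)


def nearest_region_mean(image, pixel, regions):
    """Mean of the known region (of one class) that lies closest to the pixel."""
    closest = min(regions, key=lambda reg: distance_to_region(pixel, reg))
    return region_mean(image, closest)


# Hypothetical 1x8 image whose bright class drifts from 0.9 to 0.7 (bias).
image = [[0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.7, 0.7]]
# Two marked regions that both belong to the same (bright) class.
bright_regions = [[(0, 0), (0, 1)], [(0, 6), (0, 7)]]

print(nearest_region_mean(image, (0, 2), bright_regions))  # uses the left region
print(nearest_region_mean(image, (0, 5), bright_regions))  # uses the right region
```

In this example the two pixels are compared with different reference means for the same class, which is the intended effect when the class intensity varies across the image. A practical implementation on full-size images would likely replace the brute-force search with a distance transform.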

The Euler-Lagrange equation of is

We choose the primal-dual hybrid gradient (PDHG) algorithm [36] to solve the equation for . The primal-dual form with respect to is

The iterations on and are

Similarly, can also be solved using PDHG based on a primal-dual form of the energy functional with respect to . However, such a solution leads to a less supervised segmentation in which the information of the known regions cannot be fully utilized. Instead, we adopt the nearest point principle, described in Section 4.1, to solve for .
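For readers unfamiliar with PDHG, the general shape of such an iteration (a projected dual ascent step, a primal descent step, and an over-relaxation) can be illustrated on a much simpler problem. The Python sketch below applies PDHG to one-dimensional total variation denoising of a signal `f`; it is a generic textbook-style example and not the primal-dual form used for the membership functions in this paper. The function name `pdhg_tv_1d` and the step sizes `tau` and `sigma` are assumptions made for the illustration.

```python
def pdhg_tv_1d(f, lam, iterations=5000, tau=0.45, sigma=0.45):
    """Minimise 0.5 * sum((u - f)^2) + lam * sum(|u[i+1] - u[i]|) by PDHG.

    The step sizes satisfy tau * sigma * 4 < 1, since the 1D forward-difference
    operator has squared norm at most 4.
    """
    n = len(f)
    u = list(f)            # primal variable
    u_bar = list(f)        # extrapolated primal variable
    p = [0.0] * (n - 1)    # dual variable, one entry per neighboring pair

    for _ in range(iterations):
        # Dual ascent step followed by projection onto [-lam, lam].
        for i in range(n - 1):
            step = p[i] + sigma * (u_bar[i + 1] - u_bar[i])
            p[i] = max(-lam, min(lam, step))

        # Primal descent step: apply the adjoint of the difference operator,
        # then the closed-form proximal map of the quadratic fitting term.
        u_new = []
        for i in range(n):
            left = p[i - 1] if i > 0 else 0.0
            right = p[i] if i < n - 1 else 0.0
            adjoint = left - right
            u_new.append((u[i] - tau * adjoint + tau * f[i]) / (1.0 + tau))

        # Over-relaxation of the primal variable.
        u_bar = [2.0 * a - b for a, b in zip(u_new, u)]
        u = u_new
    return u


signal = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print([round(v, 3) for v in pdhg_tv_1d(signal, lam=0.3)])
```

On this step signal the two plateaus are pulled slightly toward each other while the jump is preserved, which is the typical behavior of a total variation term and the reason it is used as the boundary regularizer.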

4.1. Nearest Point Principle

The iteration of patterns is performed not based on the Euler-Lagrange equation, but based on its nearest congeneric points. That is,

Note that the membership functions are still updated in an unsupervised way in the framework. In order to achieve a more supervised segmentation result, we determine the class of each pixel at the decision-making step not only based on the membership values but also based on the similarity between the intensity and each pattern , .

In our framework, the membership functions solved from the above iterations (see (19)) are actually temporary ones. We then update the memberships based on the following rule: if and for some and , then put to . So, the known parts are updated by

Correspondingly, the unknown part is updated according to the following equation:

4.2. Algorithm

We now describe the complete algorithm. Given an image defined in a domain , if the image contains classes, then the complete algorithm for the semisupervised multiphase image segmentation is given below.

(1) Initialization
(a) Initialize known parts using brushes.
(b) Initialize unknown part by .
(c) Initialize memberships: for each and , set and for ; for , set randomly.
(d) Initialize patterns: for each and , set ; for any , set in terms of the nearest point principle as in (20).

(2) Iterations
(a) Update memberships by (19).
(b) Update known areas by (21).
(c) Update unknown area by (22).
(d) Update patterns by (20).

(3) Termination. The iterations will be terminated if . In our application, we terminated the iterations when is very small. In this case, the classes of those pixels in the remaining undetermined region are determined simply by thresholding the membership functions.
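The final decision step for the remaining pixels can be sketched as below. The text only states that the membership functions are thresholded; the Python code assumes, for illustration, that each undetermined pixel is assigned to the class with the largest membership value. The function name `harden` and the sample memberships are hypothetical.

```python
def harden(memberships):
    """Pick, for every pixel, the class with the largest membership value.

    memberships[k][x] is the membership of pixel x in class k.
    """
    n_classes = len(memberships)
    n_pixels = len(memberships[0])
    return [
        max(range(n_classes), key=lambda k: memberships[k][x])
        for x in range(n_pixels)
    ]


# Hypothetical soft memberships of four undetermined pixels in three classes;
# the values for each pixel (each column) sum to 1.
memberships = [
    [0.7, 0.2, 0.1, 0.3],
    [0.2, 0.5, 0.1, 0.3],
    [0.1, 0.3, 0.8, 0.4],
]
print(harden(memberships))
```

For two classes this rule reduces to thresholding a single membership function at 0.5.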

5. Experimental Results

In this part, we first use a natural image to show the difference between supervised image segmentation and unsupervised image segmentation. Then, an application to MRI brain images is elaborated.

In Figure 4, we compare the segmentation of a flower using the unsupervised Mumford-Shah model and using the proposed semisupervised soft Mumford-Shah model. The first row shows the unsupervised segmentation and the second row shows the semisupervised segmentation. (a1) is the original image of a flower, while (a2) is the same image with masks added for the known regions, where the blue mask represents known foreground (the flower region) and the yellow masks represent background (nonflower region). (b1) shows the segmentation of the flower using the unsupervised method while (b2) shows the segmentation of the flower using the semisupervised method (the proposed method). The main difference lies in the center of the flower. In (b1), the black part in the center of the flower indicates that it is misclassified as background. However, by using the semisupervised method and marking some region of the center of the flower, the flower can be well segmented (see (b2)).

Figure 5 shows the shrinking of the unknown area during the first 10 iterations of the flower segmentation, where dark areas are unknown areas. From the graphs, we see that the segmentation is almost complete after only 10 iterations. For the flower image, the iterations take around 3 seconds on our laptop. In contrast, the corresponding unsupervised method takes around 38 seconds under the same convergence settings.

Before presenting the difference between the unsupervised method and the supervised method on MRI brain images, let us look at the challenges of such segmentation. One major challenge in MRI brain image segmentation is the central gray matter (also called deep gray matter), due to the similarity in intensity between white matter and central gray matter. Figure 6 shows the comparison between the unsupervised segmentation and the ground truth. In Figure 6, (a1) and (a2) are the same original MRI brain image (the skull is removed in the preprocessing). (b1), (c1), and (d1) are the respective segmentation results for cerebrospinal fluid (CSF), gray matter, and white matter using the piecewise constant soft Mumford-Shah model. (b2), (c2), and (d2) are segmentation results revised manually under the instructions of experienced radiologists, which are used as the ground truth in the experiment. Note that the two CSF results are almost the same. The major difference between the two sets of segmentation results is in the central part for gray matter and white matter. The white parts in (c1) and (c2) represent gray matter segmentation while the white parts in (d1) and (d2) represent white matter segmentation. By comparing the unsupervised segmentation results and the ground truth, we see that most of the gray matter in the central part, called central gray matter, is misclassified as white matter when the unsupervised Mumford-Shah model is applied.

Figure 7 shows the comparison between the unsupervised soft Mumford-Shah method and the developed method. The figure contains three columns. The first column shows the segmentation using the unsupervised soft Mumford-Shah segmentation model; the second column is the segmentation using the developed segmentation method; and the third column is the segmentation obtained first with the unsupervised method and then corrected under experienced radiologists' instructions (the ground truth). In the first column, (a1) is the original image, and (b1) through (d1) are the segmentation results for cerebrospinal fluid (CSF), gray matter, and white matter, respectively. In the second column, (a2) is the original image and (b2) through (d2) are the segmentation results for CSF, gray matter, and white matter using the developed semisupervised segmentation method. In the third column, (a3) is the image with masks drawn by hand by experienced radiologists and (b3) through (d3) are the segmentation results (ground truth) obtained with the unsupervised segmentation method and then corrected with the masks drawn in (a3).

From the results, we can easily see that the segmentation results using the supervised method are much better than the results using unsupervised segmentation in the central part of gray matter and white matter. The semisupervised segmentation results (b2–d2) are very close to the ground truth (b3–d3).

Figure 8 shows the comparison between using more known regions and using fewer known regions. The first column shows the segmentation results using the developed semisupervised segmentation method with fewer known regions marked; the second column shows the segmentation results also using the developed semisupervised segmentation method but with more known regions marked; and the third column is the ground truth. In each column, the second through fourth rows are the segmentation results for cerebrospinal fluid (CSF), gray matter, and white matter, respectively. From the results, we see that the semisupervised segmentation with more labeled regions is better than that with fewer labeled regions. The results with more labeled regions (b2)–(d2) are closer to the ground truth than the results with fewer labeled regions (b1)–(d1).

The next experiment compares the unsupervised soft Mumford-Shah model, the unsupervised Mumford-Shah model with bias correction, the proposed semisupervised image segmentation, and the ground truth. In Figure 9, the first column shows the segmentation results using the unsupervised soft Mumford-Shah model, the second column shows the segmentation results using the Mumford-Shah model with bias correction, the third column shows the segmentation using the proposed method, and the fourth column is the ground truth. By comparing with the ground truth, we see that the method with bias correction is only a little better than the method without bias correction. Both results are clearly worse than those of the proposed method.

Finally, we compared the computational efficiency of the semisupervised method with the corresponding unsupervised method. Figure 10 contains two curves denoting convergence times (in seconds) for the unsupervised soft Mumford-Shah model and the developed semisupervised method applied to the flower image shown in Figure 4. The experiment was carried out on a Lenovo T400 laptop. In the figure, the horizontal axis denotes the precision at which the iterations terminate and the vertical axis denotes the time in seconds for the segmentation to complete. The upper curve (blue curve) denotes the convergence time using the unsupervised soft Mumford-Shah method while the lower curve denotes the convergence time for the developed semisupervised method after the known regions are marked. The semisupervised method clearly converges much faster than the unsupervised method.

6. Background, Discussion, and Conclusion

In the developed semisupervised segmentation model, some regions must be assigned to each class before the numerical implementation is performed. Therefore, the results of semisupervised soft segmentation depend on two aspects: the intensity distribution of the given image and the known regions assigned. If the segmentation results are not satisfactory, more regions can be assigned to some classes until a satisfactory result is obtained.

This work is motivated by MRI brain image segmentation, which was a part of our previous project supported by an NIH grant. The project was closed in the summer of 2012. It is well known that, in the central area of an MRI brain image, the intensities of gray matter are usually very close to the intensities of white matter. Sometimes (actually very often), the intensities of central gray matter are greater than the intensities of white matter not located in the central part. Therefore, unsupervised segmentation methods cannot obtain the expected result. Even in natural images, an object may have the same intensities as some other objects nearby. Therefore, semisupervised image segmentation is useful and necessary.

In semisupervised image segmentation, it is necessary to choose some regions for each class as known regions. This can be done by embedding some code in the algorithm. However, it is more convenient to construct a software program that integrates the function of choosing regions with the semisupervised segmentation algorithm. In our project, we developed a software program mainly for MRI brain image segmentation, in which the function for choosing some regions for each class is embedded. Interested readers can refer to the technical report [30]. For a three-dimensional image, one can use the developed supervised method to segment all slices of a 3D MRI brain image by choosing known regions from just a single slice or a few slices. In this way, a lot of time can be saved.

The framework of the developed semisupervised image segmentation is based on intensities for gray-level images, as shown in Model (16). Nevertheless, the work can be easily extended to color images. Although the framework of supervised image segmentation developed in this paper is based on the Mumford-Shah model, it can be easily extended to other image segmentation models. For example, when the image contains texture features, the intensity-based framework does not work very efficiently. In this case, a feature-based model must be used in the framework. Let be a function which maps an -dimensional image domain to a multidimensional (-dimensional) space of contextual features . For each point , is a vector containing image statistics or features. Such features can encode contextual knowledge about the regions of interest and their neighboring structures (e.g., size, shape, orientation, and relationships to neighboring structures). Feature-based image segmentation is extensively used in texture segmentation and some medical image segmentation. Therefore, an immediate direction for future work is to develop a framework for feature-based supervised multiphase image segmentation.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This research has been supported by NIH/R01 (no. 7095364), National Natural Science Foundation of China (no. 61572085), and West Virginia Clinical & Translational Science Institute (WVCTSI).