Abstract

Variation in signal intensity within mass lesions and missing boundary information are intensity inhomogeneities inherent in digital mammograms. These inhomogeneities render the performance of a deformable contour susceptible to the location of its initial position and may lead to poor segmentation results for these images. We investigate the dependence of shape-based descriptors and mass segmentation areas on initial contour placement with the Chan-Vese segmentation method and compare these results to the active contours with selective local or global segmentation model. For each mass lesion, final contours were obtained by propagation of a proposed initial level set contour and by propagation of a manually drawn contour enclosing the region of interest. Differences in shape-based descriptors were quantified using absolute percentage differences, Euclidean distances, and Bland-Altman analysis. Segmented areas were evaluated with the area overlap measure. Differences were dependent upon the characteristics of the mass margins. Boundary moments presented large percentage differences. Pearson correlation analysis showed statistically significant correlations between shape-based descriptors from both initial locations. In conclusion, boundary moments of digital mass lesions are sensitive to the placement of initial level set contours while shape-based descriptors such as Fourier descriptors, shape convexity, and shape rectangularity exhibit a certain degree of robustness to changes in the location of the initial level set contours for both segmentation algorithms.

1. Introduction

Breast masses are one of the most common indications of breast cancer. They are frequently identified on mammograms because of their saliency relative to the surrounding regions and to comparable regions on mammograms of the opposite breast acquired with the same projection [1]. Computer-aided detection algorithms for breast mass classification exploit shape-based descriptors derived from the mass boundary that are powerful enough to differentiate between benign and malignant masses. Segmentation algorithms are therefore necessary for mass contouring in direct digital mammography. However, in this imaging modality, mass margins are embedded in complex backgrounds of overlying and underlying tissues, which create missing boundary information and local minima where a deforming contour can be entrapped, producing an undesirable segmentation outcome. Moreover, the wide dynamic range of the flat panel detector systems of direct digital mammography units records small differences between the attenuation coefficients of structures or regions present in a mass lesion, and these remain clearly distinguishable over a wide range of densities, whereas in film screen mammography the exposure latitude of the film limits the dynamic range of the information captured. Hence, masses which may have appeared as dense structures without significant topographical relief features on film screen mammograms can emerge, following digital imaging, as regions with varying densities on soft copy display. These variations may also be enhanced by the postprocessing algorithms of the manufacturer. Small differences in density may appear as low signal areas, which can act as local minima for contour entrapment each time an evolving curve determines its path within the mass lesion. Consequently, local minima and missing boundary information render deformable contours susceptible to their initial locations.

A geometric active contour is a deformable contour based approach for image segmentation. In breast mass segmentation, an initial contour is deformed and driven by a partial differential equation (PDE) towards the boundary of the candidate mass. Geometric active contours are categorized into two groups: edge based models [2–4] and region based models [5–13]. Both groups make use of a stopping term which reduces the speed of the evolving contour as it approaches the boundary of the object, so that the contour finally reaches a steady state at the boundary. In edge based models, the stopping term utilizes an edge indicator function modelled on the image gradient; consequently, objects with weak and noisy boundaries may present some difficulties to this segmentation model [14, 15].

The Chan-Vese region based algorithm models energy functionals as a competition of regional statistical information [16]. Chan and Vese defined the stopping term as a competition of the first moments of the local intensity distribution of the foreground and the background within a narrowband, which takes into consideration only pixels which will influence the propagation of the interface (zero level set function) between these two regions. The energy functionals drive the initial contour from its initial location toward a desirable local minimum, which in principle should correspond to the boundary delineated by an expert radiologist. However, these functionals are determined by localized statistics; hence, the evolution of the curve becomes sensitive to the location of the initial level set contour and segmentation results will depend on the placement of this contour, especially when tuning parameters are fixed for an arbitrary collection of masses. This becomes evident during segmentation of direct digital masses with obscured or ill-defined margins and low signal areas within.

The active contours with selective local or global segmentation model [9] is a region based energy functional formulated as a signed pressure force function which propagates the initial contour by modulating the signs of the pressure forces inside and outside the region of interest. These pressure forces are derived from the means of the local intensity distributions of the foreground and the background. The algorithm penalizes the level set function to be binary and regularizes it with a Gaussian smoothing kernel. It can effectively handle images with weak edges and interior intensity inhomogeneity.

In most segmentation problems, the initial contour is either drawn by the operator or estimated from other segmentation algorithms [17, 22–25], and this may place the initial level set contour at different locations within the mass. Any variation in segmentation outcomes will cause changes in shape-based descriptors and in the area occupied by the segmented mass. Variations in segmentation outcomes which are due to the placement of the initial level set contours in complicated images have been mentioned [11]. Mass lesions on mammograms are complicated image domains for curve evolution, yet variations in the segmented areas of mass lesions, and their influence on shape-based feature vectors, due to changes in the placement of the initial level set contours have not been reported in the literature.

Understanding these inconsistencies can improve the choice of tuneable parameters and initial contour locations for curve evolution, whether the data set of mass lesions has labelled or unlabelled margin characteristics. Shape-based descriptors [26–28] are feature vectors in training sets for binary classification of mass lesions in mammography, and changes in these descriptors can play a role in determining the interclass separability measures, the choice of margin hyperplanes, and hence the classification efficiencies of these algorithms.

In this study, we investigate changes in one-dimensional shape-based descriptors and the segmented areas of masses in direct digital mammograms due to changes in the location of the initial level set contours with the implementation of the Chan-Vese segmentation method and the active contours with selective local or global segmentation model. Two groups of masses are considered in this study, one with obscured or ill-defined margins and low signal areas within and the other with well-defined and distinct margins. We consider a contour which encloses the mass lesion and is propagated towards the margin of the lesion. We propose a semiautomatic method which derives the initial contour as a curve connecting points with maximum gradient in the radial direction, representing an optimum curve characterizing the intrinsic shape of the mass lesion, and then assess the differences in the segmentation results.

2. Background to Mathematical Methods

In mammography, smoothed images present topological surfaces that can be thresholded into multiple layers to obtain topographical relief maps of dominant structures found on the images. Mammograms are filtered with edge-preserving denoising methods such as weighted total variation (TV) scale-space smoothing technique [29, 30] to remove noise and fine details while preserving dominant edge characteristics through different degrees of smoothing.

2.1. Weighted Total Variation Scale-Space Smoothing Technique

Suppose $f$ denotes an image and $\Omega$ the image domain. The variational approach for image denoising for this model involves the minimization of the following energy functional:

$$E_{\mathrm{ROF}}(u, \lambda) = \int_{\Omega} |\nabla u| \, dx + \lambda \int_{\Omega} (u - f)^2 \, dx,$$

where $f$ is the noisy input image and $u$ its regularized approximation. $\lambda$ is the Lagrange multiplier indicating the scale of detail desired in the smoothed image. Bresson et al. proposed a modified model [30] in which the $L^2$-norm square of Rudin et al.’s model is replaced with an $L^1$-norm to preserve image contrast [31] and in addition the TV norm of $u$ is multiplied with a function, $g(x)$, which is an edge indicator function. This represents the weighted TV model with an $L^1$-norm as a data fidelity measure. The energy functional for minimization is given as

$$E(u, \lambda) = \int_{\Omega} g(x) |\nabla u| \, dx + \lambda \int_{\Omega} |u - f| \, dx,$$

with

$$g(x) = \frac{1}{1 + \beta \left| \nabla (G_{\sigma} * f)(x) \right|^2},$$

where $\beta$ is a constant >0 and $G_{\sigma}$ is a Gaussian kernel with standard deviation, $\sigma$. The minimization of $E$ results in the following weighted TV flow equation:

$$\frac{\partial u}{\partial t} = \operatorname{div}\left( g(x) \frac{\nabla u}{|\nabla u|} \right) - \lambda \frac{u - f}{|u - f|}.$$

For small values of $\lambda$, the degree of image smoothing increases and edges are preserved; therefore, the global boundary information which is essential for segmentation algorithms can be modelled as the initial contour for the gradient descent flow equation of the level set. This contour will depend on the boundary properties of a given mass lesion.
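As an illustration of how an edge indicator of this kind behaves, the following minimal Python sketch evaluates the commonly used form $g = 1/(1 + \beta |\nabla u_\sigma|^2)$ on an image that is assumed to have been Gaussian-smoothed beforehand. The function name `edge_indicator`, the central-difference gradient, and the replicated-border handling are illustrative assumptions and are not taken from the implementation used in this study.

```python
def edge_indicator(smoothed, beta=1.0):
    """Return g = 1 / (1 + beta * |grad|^2) for a 2D list of grey levels.

    `smoothed` is assumed to be already Gaussian-smoothed. Gradients use
    central differences; border pixels reuse their own value as the
    missing neighbour (an assumption made for this sketch).
    """
    rows, cols = len(smoothed), len(smoothed[0])
    g = []
    for i in range(rows):
        g_row = []
        for j in range(cols):
            up = smoothed[max(i - 1, 0)][j]
            down = smoothed[min(i + 1, rows - 1)][j]
            left = smoothed[i][max(j - 1, 0)]
            right = smoothed[i][min(j + 1, cols - 1)]
            gx = (right - left) / 2.0
            gy = (down - up) / 2.0
            g_row.append(1.0 / (1.0 + beta * (gx * gx + gy * gy)))
        g.append(g_row)
    return g
```

Values of the indicator close to 1 mark flat regions, where the weighted TV term smooths freely, while values close to 0 mark strong edges, where smoothing is suppressed.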

2.2. Chan-Vese’s Piecewise Constant Model for Binary Segmentation

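Suppose $C$ is an evolving curve that partitions the domain $\Omega$ of an image $I$ into the foreground, $\Omega_{\mathrm{in}}$, and the background, $\Omega_{\mathrm{out}}$. The Chan-Vese model [16] seeks an optimal contour, representing the boundary of an object, by minimizing the following energy functional:

$$F(c_1, c_2, C) = \mu \, \mathrm{Length}(C) + F_R(c_1, c_2, C),$$

where $F_R$ represents the regional term guiding the contour in the image domain and is given by

$$F_R(c_1, c_2, C) = \lambda_1 \int_{\Omega_{\mathrm{in}}} |I(x) - c_1|^2 \, dx + \lambda_2 \int_{\Omega_{\mathrm{out}}} |I(x) - c_2|^2 \, dx,$$

in which $\mu \geq 0$, and $\lambda_1$ and $\lambda_2$ are positive constants while the average image intensities of regions inside and outside the contour are $c_1$ and $c_2$, respectively. In level set formulation, the interface of the foreground and background is embedded as the zero level set of a Lipschitz function, $\phi$:

$$C = \{ x \in \Omega : \phi(x) = 0 \},$$

with $\phi(x) > 0$ for pixel positions in $\Omega_{\mathrm{in}}$ and $\phi(x) < 0$ for pixel positions in $\Omega_{\mathrm{out}}$ whilst $\phi(x) = 0$ on the curve $C$. Using the Heaviside step function, $H$, $F$ can be expressed as

$$F(c_1, c_2, \phi) = \mu \int_{\Omega} |\nabla H(\phi)| \, dx + \lambda_1 \int_{\Omega} |I - c_1|^2 H(\phi) \, dx + \lambda_2 \int_{\Omega} |I - c_2|^2 \left(1 - H(\phi)\right) dx. \tag{8}$$

Minimizing $F$ with respect to $\phi$ yields the following gradient descent flow:

$$\frac{\partial \phi}{\partial t} = \delta(\phi) \left[ \mu \operatorname{div}\left( \frac{\nabla \phi}{|\nabla \phi|} \right) - \lambda_1 (I - c_1)^2 + \lambda_2 (I - c_2)^2 \right],$$

where $\delta$ is the Dirac function.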
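The competition between the two regional means can be made concrete with a small Python sketch. It computes the mean intensity inside and outside a binary mask and the pointwise regional force $-\lambda_1 (I - c_1)^2 + \lambda_2 (I - c_2)^2$ that appears in the standard Chan-Vese gradient descent flow. The function names, the use of a boolean mask in place of a level set function, and the default weights are assumptions made for illustration only; curvature regularization and the Dirac factor are omitted.

```python
def region_means(image, inside):
    """Mean intensity inside (c1) and outside (c2) a boolean mask.

    Both regions are assumed to be non-empty.
    """
    in_vals, out_vals = [], []
    for row, mask_row in zip(image, inside):
        for value, is_inside in zip(row, mask_row):
            (in_vals if is_inside else out_vals).append(value)
    return sum(in_vals) / len(in_vals), sum(out_vals) / len(out_vals)


def region_force(image, inside, lam1=1.0, lam2=1.0):
    """Pointwise regional force of the Chan-Vese flow (curvature term omitted).

    Positive values favour assigning the pixel to the foreground,
    negative values favour the background.
    """
    c1, c2 = region_means(image, inside)
    return [[-lam1 * (v - c1) ** 2 + lam2 * (v - c2) ** 2 for v in row]
            for row in image]
```

For a mask that covers only part of a bright object, bright pixels left outside the mask receive a positive force and dark pixels a negative one, so the contour expands towards the object boundary. Increasing `lam1` relative to `lam2` penalizes intensity variation inside the contour more heavily.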

2.3. Active Contours with Selective Local or Global Segmentation Model

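The signed pressure force function [9] is derived from the means of regions inside and outside the contour and it is defined as

$$\mathrm{spf}(I(x)) = \frac{I(x) - \dfrac{c_1 + c_2}{2}}{\max\left( \left| I(x) - \dfrac{c_1 + c_2}{2} \right| \right)}, \quad x \in \Omega,$$

where $I(x)$ is the image intensity and $c_1$ and $c_2$ are defined in (8). The active contour with selective local or global segmentation model utilizes the geodesic active contour to formulate the level set equation as

$$\frac{\partial \phi}{\partial t} = \mathrm{spf}(I(x)) \left( \operatorname{div}\left( \frac{\nabla \phi}{|\nabla \phi|} \right) + \alpha \right) |\nabla \phi| + \nabla \mathrm{spf}(I(x)) \cdot \nabla \phi.$$

Using the Gaussian filtering process to regularize the level set function, the above equation can be written as follows:

$$\frac{\partial \phi}{\partial t} = \mathrm{spf}(I(x)) \, \alpha \, |\nabla \phi|,$$

where $\alpha$ is a tuneable parameter.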
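A minimal Python sketch of a signed pressure force function of this type is given below. It assumes the widely used normalization in which the deviation of each pixel from the midpoint of the two regional means is divided by the largest absolute deviation in the image, so that the result lies in $[-1, 1]$. The function name `signed_pressure_force` and the handling of a constant image are assumptions made for this example.

```python
def signed_pressure_force(image, c1, c2):
    """Signed pressure force in [-1, 1] for a 2D list of grey levels.

    c1 and c2 are the mean intensities inside and outside the contour.
    Pixels brighter than the midpoint (c1 + c2) / 2 get a positive sign,
    darker pixels a negative sign.
    """
    midpoint = (c1 + c2) / 2.0
    deviation = [[value - midpoint for value in row] for row in image]
    peak = max(abs(d) for row in deviation for d in row)
    if peak == 0:
        # Constant image equal to the midpoint: no pressure anywhere.
        return [[0.0 for _ in row] for row in image]
    return [[d / peak for d in row] for row in deviation]
```

The sign change across the midpoint is what allows the contour to shrink where it lies outside the object and to expand where it lies inside.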

3. Method

3.1. Data Set Description

Direct digital mammograms were acquired from a Hologic Selenia Dimensions system with an image receptor consisting of a 70 μm pixel pitch selenium direct-capture detector. Ninety mammograms with mass lesions were selected for this study. Forty mammograms had masses with low signal areas within the mass and margins described as obscured, or ill-defined, while the others had masses with well-defined or distinct margins. On each mammogram, the region of interest containing the mass lesion was cropped and then resized to a matrix to create a submammogram. Each submammogram was denoised and thresholded to localize the initial level set contour.

3.2. Search Space for Localizing the Initial Level Set Contour

The weighted total variation scale-space smoothed breast mass region is represented as a topological surface in which the grey level value of each pixel is the height of the surface. Let $u$ denote a smoothed image and $\Omega$ the image domain. The image domain is thresholded into multiple regions with an ordered set of equally spaced grey level threshold values within the intensity range of the image domain [32–34]. Suppose $I_{\max}$ = the maximum grey level intensity in the image domain; $I_{\min}$ = minimum grey level intensity; $\{w_i\}$, a finite sequence of equally spaced partition weights in ascending order; $N$ = number of threshold values; and $\{T_i\}$, an ordered set of equally spaced grey level threshold values; then,

$$T_i = I_{\min} + w_i \left( I_{\max} - I_{\min} \right),$$

with $0 < w_i < 1$ and $i = 1, 2, \ldots, N$.

The subregions in the image domain with grey level intensities less than or equal to the threshold value, $T_i$, are given as

$$R_i = \{ x \in \Omega : u(x) \leq T_i \},$$

where $u$ is the smoothed image, and the iso-level contours $C_i$ of these regions are the boundaries of $R_i$. The iso-level contour map of the image domain represents the set of all $C_i$ for $i = 1, 2, \ldots, N$. A graph-based representation of the iso-level contour map evaluates the enclosure relationship between an iso-level contour and its nearest neighbour, to identify the path to the base contour that delineates the mass. Details of this method can be found in the literature [32, 33]. In our implementation, the boundary region of the breast mass is the region around the base contour with a dense nested pattern of iso-level contours, indicating the search space for the actual boundary of the mass and the placement of the initial level set contour. The dense nested pattern of iso-level contours is extracted and superimposed on the gradient map of the smoothed image.
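The multilevel thresholding step can be sketched in Python as follows. The sketch assumes that the thresholds are placed at equal steps strictly between the minimum and maximum grey levels, with weights $w_k = k/(N+1)$; this particular choice of weights, and the function names `threshold_levels` and `sublevel_mask`, are assumptions made for illustration and may differ from the partition used in [32–34].

```python
def threshold_levels(i_min, i_max, count):
    """Equally spaced thresholds strictly between i_min and i_max.

    Assumes weights w_k = k / (count + 1) for k = 1..count.
    """
    return [i_min + (k / (count + 1)) * (i_max - i_min)
            for k in range(1, count + 1)]


def sublevel_mask(image, threshold):
    """Boolean mask of pixels with grey level <= threshold."""
    return [[value <= threshold for value in row] for row in image]
```

Because the thresholds are ordered, the masks are nested: each region contains all regions obtained with lower thresholds, and the boundaries of these nested regions form the iso-level contour map.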

3.3. Placement of the Initial Level Set Contour

A set of uniformly spaced radial lines, , are generated from a point close to the centre of mass of the innermost iso-level contour, defining the search space on the gradient map of the mass as shown in Figure 1(d). Let this point be the reference point. The gradient strength is noted at every point of intersection of the nested iso-level contours and radial lines. Along each radial line, , for , the coordinates of the point of intersection with the greatest gradient strength are noted and the radial distance from this point to the reference point is calculated and noted as .

Let and ; then radial description of the initial level set contour is given byThe spatial coordinates of the points of intersection of ’s and the iso-level contours are the coordinates of the initial level set contour. Figure 1 illustrates the summary of the methodology in acquiring the initial level set contour and Figure 2 shows the variation of the radial distance function, , for , with the scale of observation, , in weighted total variation scale-space smoothing technique. The radial distance function of the initial level set contour corresponds to the radial distance from each point on the initial contour to the reference point with a sampling angle of 1°.
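The selection of the initial contour points can be sketched in Python as below. The sketch assumes that, for each radial line, the intersections with the nested iso-level contours have already been collected as `(x, y, gradient)` triples; this data layout and the function name `initial_contour` are hypothetical and serve only to illustrate the maximum-gradient rule described above.

```python
import math


def initial_contour(samples_per_line, reference):
    """Pick, on every radial line, the intersection with the largest gradient.

    samples_per_line: one list per radial line, each containing
        (x, y, gradient) triples for the intersections on that line.
    reference: (x, y) of the reference point near the centre of mass.
    Returns one (x, y, radial_distance) triple per radial line.
    """
    contour = []
    for samples in samples_per_line:
        x, y, _ = max(samples, key=lambda sample: sample[2])
        radius = math.hypot(x - reference[0], y - reference[1])
        contour.append((x, y, radius))
    return contour
```

With a sampling angle of 1°, `samples_per_line` would hold 360 lists, and the third element of each returned triple gives the radial distance function of the initial contour.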

3.4. Evaluation Metrics of Segmentation Results

Manually drawn initial contours and those obtained from our proposed method were propagated with the Chan-Vese algorithm and the active contours with selective local or global segmentation model. Feature vectors representing boundary-based shape signatures and the areas occupied by the segmented mass lesions were assessed to provide relative measures of the differences between the segmented mass lesions.

3.4.1. Area Metric of Relative Size of Segmented Mass Lesion

Let $A_P$ represent the binary image obtained by evolving the initial level set contour from our proposed method and $A_M$ the binary image obtained from the manually drawn initial level set contour; then, the area overlap measure, which is the Jaccard similarity coefficient between the binary images, $A_P$ and $A_M$, is given as

$$\mathrm{AOM} = \frac{|A_P \cap A_M|}{|A_P \cup A_M|}.$$

AOM lies between 0 and 1. A perfect match between $A_P$ and $A_M$ is achieved as $\mathrm{AOM} \to 1$, indicating the same segmentation outcome for both initial level set contours.
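The area overlap measure can be computed directly from two binary masks, as in the following minimal Python sketch. The function name `area_overlap` and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def area_overlap(mask_a, mask_b):
    """Jaccard similarity coefficient between two equally sized binary masks."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if pixel_a and pixel_b:
                intersection += 1
            if pixel_a or pixel_b:
                union += 1
    # Two empty masks are treated as identical (assumption).
    return intersection / union if union else 1.0
```

Identical masks give a value of 1, disjoint masks give 0, and partial overlap gives the ratio of shared pixels to pixels covered by either mask.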

3.4.2. Evaluation Metrics of Shape-Based Descriptors

Boundary Moments. A boundary-based shape signature of the segmented mass lesion from each initial contour model is represented as the centroid distance function, which is a one-dimensional function representing the Euclidean distance between an ordered set of boundary coordinates $(x(n), y(n))$ and the centroid $(x_c, y_c)$ signifying the centre of mass of the binary image generated from the contour:

$$r(n) = \sqrt{\left( x(n) - x_c \right)^2 + \left( y(n) - y_c \right)^2}, \quad n = 0, 1, \ldots, N - 1,$$

where $N$ is the total number of points on the contour.
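A minimal Python sketch of the centroid distance function is shown below. The centroid is passed in explicitly, since in this study it is the centre of mass of the binary image rather than of the boundary points; the function name `centroid_distance` is an assumption made for this example.

```python
import math


def centroid_distance(boundary, centroid):
    """Euclidean distance from each ordered boundary point to the centroid.

    boundary: ordered list of (x, y) boundary coordinates.
    centroid: (x, y) centre of mass of the segmented region.
    """
    xc, yc = centroid
    return [math.hypot(x - xc, y - yc) for x, y in boundary]
```

A circular boundary yields a constant signature, whereas spiculated or lobulated margins produce fluctuations around the mean radius.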

The centroid distance function, $r(n)$ for $n = 0, 1, \ldots, N - 1$, captures the local and global characteristics of the final shape of the segmented mass lesion. Its statistical characteristics are assessed as shape features derived from the contour sequence moments $m_p$ and $\mu_p$ [35] where the $p$th contour sequence moment is estimated as

$$m_p = \frac{1}{N} \sum_{n=0}^{N-1} \left[ r(n) \right]^p$$

and the $p$th central moment is estimated as

$$\mu_p = \frac{1}{N} \sum_{n=0}^{N-1} \left[ r(n) - m_1 \right]^p.$$

These shape features are normalized low-order boundary moments [36, 37], denoted $F_1$, $F_2$, and $F_3$, where $F_1$ is the normalized amplitude variation and $F_2$ and $F_3$ are indicators of shape roughness.
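The contour sequence moments can be estimated from a sampled centroid distance function as in the Python sketch below. The sketch also computes one normalized feature, the ratio of the square root of the second central moment to the first moment, which is a common definition of a normalized amplitude variation; whether this matches the exact normalization used in [36, 37] is an assumption, and the function names are hypothetical.

```python
def sequence_moment(signature, order):
    """Contour sequence moment: mean of the signature raised to `order`."""
    return sum(value ** order for value in signature) / len(signature)


def central_moment(signature, order):
    """Central moment of the signature about its first moment."""
    mean = sequence_moment(signature, 1)
    return sum((value - mean) ** order for value in signature) / len(signature)


def amplitude_variation(signature):
    """sqrt(mu_2) / m_1, an assumed form of the normalized amplitude variation."""
    return central_moment(signature, 2) ** 0.5 / sequence_moment(signature, 1)
```

Because the feature is a ratio of quantities with the same units, it is unchanged when the contour is uniformly scaled, and it equals zero for a perfectly circular boundary.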

Spicules are fine extensions radiating from the margin of a mass lesion. The presence of these boundary features generates variations in the radial distances, which are indicative of contour roughness along the boundary of a mass lesion. The evaluation metric is the percentage change in the degree of spiculation between the segmentations obtained from the proposed and the manually drawn initial contours and is expressed as the absolute percentage difference in each boundary moment between the two segmentations.

Fourier Descriptors. The centroid distance function can be analysed in the frequency domain to obtain spectral descriptors of its characteristics. Its spectral representation is expressed as the coefficients of its discrete Fourier transform, yielding

$$a_k = \frac{1}{N} \sum_{n=0}^{N-1} r(n) \exp\left( \frac{-j 2 \pi k n}{N} \right), \quad k = 0, 1, \ldots, N - 1,$$

where $r(n)$ is the centroid distance function sampled at $N$ boundary points. Feature vectors which are invariant to translation, scale, and rotation are extracted from these coefficients and are known as the Fourier descriptors (FDs) for shape representation:

$$\mathrm{FD}_k = \frac{|a_k|}{|a_0|}, \quad k = 1, 2, \ldots, N/2.$$

Zhang and Lu [38] have shown that FDs derived from the centroid distance function outperform FDs derived from complex coordinates, cumulative angles, and the curvature function as boundary signatures in a shape-based image retrieval system, and furthermore, in Zhang and Lu [39], they mentioned that 60 FDs are sufficient for shape indexing.

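We define the evaluation metric of the initial level set contours, based on the boundary signatures in the frequency domain of the final contours obtained from the proposed and the manually drawn initial contours, as the Euclidean distance between the Fourier descriptors of the images:

$$\mathrm{DF} = \sqrt{ \sum_{k=1}^{K} \left( \mathrm{FD}_k^{P} - \mathrm{FD}_k^{M} \right)^2 },$$

where $\mathrm{FD}_k^{P}$ and $\mathrm{FD}_k^{M}$ are the $k$th Fourier descriptors of the final contours from the proposed and manual initial contours, respectively, and $K$ is the number of descriptors retained.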
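The Fourier descriptors and their Euclidean distance can be sketched in Python with a direct discrete Fourier transform. The sketch assumes the normalization by the magnitude of the zero-frequency coefficient described by Zhang and Lu for centroid distance signatures; the function names and the direct $O(NK)$ transform are choices made for this example, and both signatures are assumed to be resampled to the same number of points.

```python
import cmath
import math


def fourier_descriptors(signature, count):
    """First `count` Fourier descriptors |a_k| / |a_0| of a centroid distance signature.

    Requires count < len(signature) and a non-zero mean radius.
    """
    n = len(signature)
    coefficients = []
    for k in range(count + 1):
        total = sum(signature[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        coefficients.append(total / n)
    dc = abs(coefficients[0])
    return [abs(c) / dc for c in coefficients[1:]]


def descriptor_distance(fd_a, fd_b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fd_a, fd_b)))
```

Taking magnitudes removes the dependence on the starting point of the contour, and dividing by the zero-frequency magnitude removes the dependence on scale, so two contours of the same shape give a distance close to zero.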

Shape Convexity. Shape convexity measures the degree of spiculation in masses. The shape convexity of a binary image is defined as the ratio of the area of the binary image to the area of its convex hull [26]. Let $C_P$ and $C_M$ be the convexities of the binary images obtained from the proposed and the manually drawn initial contours, respectively; the evaluation metric of the difference between the shape convexities of these images is defined as the absolute percentage difference between $C_P$ and $C_M$.
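Shape convexity can be illustrated with the following Python sketch, which works on the ordered vertices of a simple polygon rather than on a pixel mask; this polygon approximation, the monotone chain convex hull, and the function names are assumptions made for the example and are not the implementation used in this study.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity(vertices):
    """Ratio of the polygon area to the area of its convex hull."""
    return polygon_area(vertices) / polygon_area(convex_hull(vertices))
```

A convex outline gives a value of 1, and the value decreases as concavities between spicules or lobulations become deeper.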

Shape Rectangularity. Shape rectangularity [40] is defined as the ratio of the area of the binary image to the area of its minimal bounding rectangle. Let $R_P$ and $R_M$ be the shape rectangularities of the binary images obtained from the proposed and the manually drawn initial contours, respectively; the evaluation metric of the difference between the shape rectangularities of these images is defined as the absolute percentage difference between $R_P$ and $R_M$.

Differences in shape-based descriptors of the final contours were further evaluated with Bland-Altman analysis to explore the agreement and trends between placements of the initial level set contours in digital mass lesion segmentation, while Pearson correlation analysis assessed the correlation between these descriptors.

4. Experimental Results and Discussion

In our implementation of the Chan-Vese method, we set , , and . We chose to give a greater weight to the variance of pixels in the foreground so as to achieve measurable segmentation differences between the proposed locations for the initial level set contours. Furthermore, we assigned to investigate changes in the final segmentation results due to differences in tuneable parameters. In practice, for a given database of masses, the values assigned to and depend on the similarity indices between segmentation results of a proposed algorithm and the gold standard of a training set of masses, which in some cases is a subset of the database. For the active contour with selective local or global segmentation model, we set for this database so that masses with ill-defined boundaries should be accurately segmented. The segmentation performances of this algorithm were poor with values of for this group of masses. The average time for curve evolution for these images was  s for the segmentation methods.

Boundary information represents sharp changes in image properties. Figure 2 shows that as the degree of smoothing increases the radial distance functions of the initial level set contours form a dense nested pattern of curves. The differences between these curves are very small because edge is preserved through different values of ’s in weighted TV scale-space smoothing technique; consequently, segmentation results with the initial level set contours generated from these curves are expected to be similar.

Segmentation results for some masses with low signal areas and having obscured, or ill-defined, margins are shown in Figure 3. The proposed method defines the initial level set contour as the curve connecting points with maximum gradients in the radial direction as shown in column 3. Each curve characterizes the intrinsic shape of its mass lesion and its evolution is guided by the statistics of pixels surrounding the region. For this group of masses, the mean area overlap measure between segmented areas generated from the final contours of our proposed method and that of the manually drawn initial level set contours were for the Chan-Vese model and with the selective local or global segmentation model. This is almost comparable to the mean area overlap measures between expert radiologists [17] and expert radiologists against segmentation methods [17–21] as shown in Table 1. Therefore, changes in shape-based descriptors as expressed in our setup will be suggestive of changes in shape-based descriptors encountered by the abovementioned publications.

Table 2 shows the variation in the area overlap measures with percentage differences in boundary moments , , and when masses in Figure 3 were evolved with tuneable parameters , . The area overlap measure of mass D is greater than 0.8; however, the percentage difference in boundary moments was above 50%, with being 87.0%. The mean values of , , and for this group were 23.9% (range 1.0–87.0%), 24.5% (range 1.7–86.8%), and 32% (range 1.4–86.0%), respectively, as shown in Table 6. The mean values are large with wide range. For , , the mean values of the percentage change of each boundary moment were less than 20.2%. These large ranges and mean values show that boundary moments are sensitive to the location of the initial level set contour for masses with obscured or ill-defined margins and the degree of sensitivity depends on the choice of tuneable parameters. As shown in Table 7, the mean values of boundary moments , , and were obtained as 15.1% (range 0–74%), 15.4% (range 0–67.5%), and 23.5% (range 0–52%), respectively, by using the selective local or global segmentation model. These values are comparable to values obtained by implementing the Chan-Vese model for , .

In Table 3, the variation in Euclidean distances of the Fourier descriptors and the percentage differences in shape convexity and rectangularity for the masses in Figure 3 are illustrated. In Table 6, for , , the mean Euclidean distance between the Fourier descriptors of the segmented areas was while the mean values of percentage changes in shape convexity and rectangularity were 8.3% (range 0.0–28.1%) and 11.7% (range 0.1–42.0%), respectively, with more than 50% reduction in the mean values with tuneable parameters , . The values for the mean percentage difference in shape convexity and rectangularity and their range were less than those from boundary moments for both Chan-Vese algorithms. The selective local or global segmentation model presented similar results for the percentage differences in shape convexity and shape rectangularity as shown in Table 7.

Figure 4 illustrates the segmentation results with different locations for the initial level set contours for some masses with distinct, or well-defined, margins. The initial level set contour from the proposed method is shown in column 3. Fewer points defining the maximum gradients in the radial direction are found within the mass lesion, as compared with the previous group. Most points defining the maximum gradients in the radial direction are found on the mass boundary; consequently, the statistics of the pixels surrounding the initial level set contour will be similar to those of the manually drawn contour when it arrives at the edge of the mass lesion.

Table 4 shows the variation in the area overlap measures and the percentage differences in boundary moments , , and while Table 5 illustrates the variation in Euclidean distances between the Fourier descriptors (DF), percentage differences in shape convexity , and shape rectangularity when the masses in Figure 4 were evolved with tuneable parameters , . The area overlap measure of mass B was greater than 0.95; however, the percentage differences in boundary moments were above 18%. For masses with distinct or well-defined margins, similar segmentation results are expected and this is confirmed with a mean area overlap measure of as shown in Table 6. For this category of masses, the mean value of was 8.9% (range 0.3–25.0%); of , 8.6% (range 2.1–33%); and of , 14.1% (range 0.9–53.0%). The mean Euclidean distance between the Fourier descriptors of the segmented areas was and the mean values of percentage changes of shape convexity and rectangularity were 4.5% (range 0.07–17.2%) and 5.7% (range 0.04−14.9%), respectively. The values for the mean percentage differences in shape convexity and rectangularity were almost 50% less than those from boundary moments. This group presented a small percentage change in shape convexity and shape rectangularity and also a small mean Euclidean distance of the Fourier descriptors as compared to the previous group due to segmentation results having relatively similar shapes. For these groups of masses, shape-based descriptors derived from final contours of tuneable parameters , were less sensitive to changes in the location of the initial level set contours. Table 6 shows that the mean percentage differences of the shape convexity and shape rectangularity are less than the values for the boundary moments. 
Table 7 illustrates similar trends with the selective local or global segmentation model; however, the Jaccard similarity indices of the Chan-Vese segmentation model for this group of masses were greater than values obtained by using the selective local or global segmentation model.

The evaluation metrics of shape-based descriptors of both groups of masses were combined and assessed with Bland-Altman plots to investigate the intermethod agreement between placements of the initial level set contours. Each Bland-Altman plot was evaluated within a 95% confidence interval as the limits of agreement.
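The agreement statistics can be sketched in Python as follows. The sketch assumes the usual Bland-Altman convention in which the 95% limits of agreement are the mean difference plus or minus 1.96 sample standard deviations of the differences, and it computes the Pearson correlation coefficient for the same paired descriptors. The function names are hypothetical, and at least two pairs with non-constant values are assumed.

```python
import math


def bland_altman(values_a, values_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * sample standard deviation
    of the paired differences.
    """
    differences = [a - b for a, b in zip(values_a, values_b)]
    n = len(differences)
    bias = sum(differences) / n
    variance = sum((d - bias) ** 2 for d in differences) / (n - 1)
    spread = 1.96 * math.sqrt(variance)
    return bias, bias - spread, bias + spread


def pearson_r(values_a, values_b):
    """Pearson correlation coefficient between two paired samples."""
    n = len(values_a)
    mean_a = sum(values_a) / n
    mean_b = sum(values_b) / n
    covariance = sum((a - mean_a) * (b - mean_b)
                     for a, b in zip(values_a, values_b))
    spread_a = sum((a - mean_a) ** 2 for a in values_a)
    spread_b = sum((b - mean_b) ** 2 for b in values_b)
    return covariance / math.sqrt(spread_a * spread_b)
```

Here `values_a` and `values_b` would hold one descriptor, for example shape convexity, computed for every mass from the proposed and from the manual initial contour. A high correlation does not by itself imply agreement, which is why the bias and limits of agreement are reported alongside it.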

Figures 5 and 6 illustrate the linear regression plots of boundary moments, shape rectangularity, and shape convexity with their associated Bland-Altman plots with the Chan-Vese segmentation method. The Pearson correlation analysis indicated good correlations between the shape-based descriptors: shape rectangularity () and shape convexity () resulting from the final contours of the proposed and manual methods as compared to boundary moments , , and . The selective local or global segmentation method gave higher correlation coefficients for these shape descriptors. Table 8 shows the summary results of the linear regression analysis of shape-based descriptors for these masses and their variation with tuneable parameters. values indicated that the correlations of shape-based descriptors derived from these methods were statistically significant (). The strength of the linear relationship () between the descriptors derived from these methods depends on the values of tuneable parameters, and , for the Chan-Vese model. For this database of masses, the correlation coefficients of descriptors obtained with tuneable parameters and were higher than those with parameters and ; however, this does not imply that tuneable parameters and will provide higher values of similarity measures when segmentation results are compared with segmentation outcomes of expert radiologists. Overall, the performance of the selective local or global segmentation model was similar to the performance of the Chan-Vese segmentation model for this database of direct digital mammographic masses.

The difference plots in Figures 5 and 6 show that differences in shape-based features for masses with distinct or well-defined margins are scattered very close to the central bias line as compared to masses with obscured, or ill-defined, margins, thus indicating that the magnitude of differences in shape-based descriptors due to changes in the placement of the initial level set contours depends on the mass margin characteristics. Other researchers have reported the variation of segmentation accuracy with the characteristics of the mass margins for a given segmentation algorithm [41]. The correlations (, ) between differences in shape-based descriptors due to changes in the placement of the initial level set contours and the average magnitude of descriptors from both algorithms were very poor and they were not significantly different from zero.

In general, the mean area overlap measure of the combined categories was , the mean Euclidean distance between the Fourier descriptors was , and moreover, in the Bland-Altman plots, the differences in shape-based descriptors of 90% of these masses are within the limits of agreement; therefore the interplacement agreement of the initial level set contours based on these descriptors is acceptable. However, both segmentation methods illustrated large variation in boundary moments as compared to shape-based descriptors such as shape convexity, shape rectangularity, and Euclidean distance of the Fourier descriptors. Hence, boundary moments should be utilized with caution because they exhibit large percentage differences.

Interobserver variability amongst radiologists and intermethod variability in delineating masses in mammography translate to differences in shape-based feature vectors. The magnitude of these differences should however not be so large as to compromise the interclass separability measures and hence the classification accuracies of shape-based binary classifiers. This can be achieved if these feature vectors show a certain degree of robustness to interobserver and intermethod variability in segmented masses.

5. Conclusion

We have investigated and quantified the variations in shape-based features in segmentation outcomes due to differences in the location of the initial level set contour for mass lesion segmentation in direct digital mammography. The Chan-Vese segmentation method and the active contours with selective local or global segmentation model presented similar results. The results show that the magnitude of these variations, expressed as area overlap measures and percentage differences in shape-based features, depends on the characteristics of the mass margins and the choice of tuneable parameters. For masses with distinct or well-defined margins, percentage differences are smaller than for those with ill-defined or obscured margins for both segmentation algorithms. The mean percentage differences in boundary moments and their ranges were large as compared to those of shape convexity and shape rectangularity, even though the area overlap measures were within acceptable values. The influence of these variations on the classification accuracy of shape-based binary classifiers will depend on the magnitude of the interclass separability measures; however, large fluctuations in these values for the same mass are undesirable. Finally, we conclude that boundary moments are sensitive to the placement of initial level set contours while Fourier descriptors, shape convexity, and shape rectangularity exhibit a certain degree of robustness to changes in the location of the initial level set contours.

Conflict of Interests

Both authors declare that there is no conflict of interests regarding the publication of this paper.