Abstract

The prevalence of melanoma skin cancer is rapidly increasing, and the number of recorded deaths among its patients continues to rise every year. Reliable segmentation of skin lesion is an essential requirement of an efficient noninvasive computer aided diagnosis tool for accelerating the identification of melanoma. This paper presents a new algorithm based on perceptual color difference saliency along with binary morphological analysis for segmentation of melanoma skin lesion in dermoscopic images. The new algorithm is compared with existing image segmentation algorithms on benchmark dermoscopic images acquired from public corpora. The results of both qualitative and quantitative evaluations are encouraging, as the new algorithm compares favorably with the existing image segmentation algorithms.

1. Introduction

The purpose of this study is to test the performance of a perceptual color difference saliency algorithm for segmentation of melanoma skin lesion in dermoscopic images. Melanoma is a cancer of the pigment-producing melanocytes and is one of the most serious, complex, aggressive, and fatal forms of skin cancer [1]. It typically results from environmental factors such as exposure to sunlight [2]. It originates in parts of the body that contain melanocytes, such as the skin, eyes, brain, spinal cord, and mucous membranes. Its ability to spread widely to other parts of the body makes melanoma one of the deadliest skin cancers. Its prevalence is rapidly increasing across the world, and the number of recorded deaths among its patients continues to rise every year [1, 3, 4]. Prevention and early detection are recommended as the best strategies for improving outcomes in melanoma and reducing the mortality rate of the disease, because treatment at later stages can be hard [1, 4–6].

Digital dermoscopy is a widely used noninvasive tool that combines optical magnification and special illumination techniques to render an improved dermoscopic image for clinical diagnosis of melanoma. Dermatologists have regularly applied this tool for several decades to analyze the surface structure of human skin that is invisible to the naked eye [7, 8]. However, this diagnostic process is time-consuming and highly subjective, and it requires a great deal of experience from a dermatologist [4, 8]. Due to the complexity of melanoma treatment at later stages, researchers are attempting to develop an efficient noninvasive automated computer aided system to make its diagnosis faster and easily accessible to nonexpert practitioners [4, 8–11]. Such an automated system relies heavily on reliable segmentation of skin lesion, pertinent extraction of skin lesion features, and effective classification of skin lesion using the extracted features [11].

This study focuses on segmentation of melanoma skin lesion in dermoscopic images because the subsequent diagnostic stages heavily depend on its output [7, 8, 12]. Moreover, segmentation is one of the central stages for computer aided diagnosis of melanoma with dermoscopic images [13]. The automatic segmentation of skin lesion is particularly challenging because of the possible presence of undesirable factors in the form of skin hairs, specular reflections, variegated coloring, weak edges, low contrast, irregular and fuzzy borders, marker ink, color chart, ruler marks, dark corners, skin lines, blood vessels, and air or oil bubbles [10, 14–16].

Saliency based segmentation has emerged as an important tool for medical image analysis because of its capability to identify salient objects in images [17, 18]. Its application in computer vision is largely inspired by the finding that human visual perception is more likely to focus on the part of an image that carries useful information [17, 19]. The cognitive properties of visual saliency incorporated into conventional saliency segmentation methods are based upon local or global visual rarity such as the contrast prior, color prior, brightness prior, and center prior. The contrast prior is one of the most frequently used; it assumes that the color contrast between object and background is usually high. The color prior assumes that the background has uniform color while salient object colors are variegated. The brightness prior assumes that the brightness of the background is higher than that of the salient object [13]. However, while methods based on these cognitive properties have performed well on certain images, they can fail to accurately detect salient objects that are homogeneous with the background and salient objects that slightly touch the image border [20]. The center prior assumes that images are acquired such that a salient object is framed near the image center while the background is distributed along the borders. However, salient objects in many images appear off center, which makes the center prior map incorrectly suppress salient objects far from the image center and highlight certain background regions near the image center [21, 22].

The methodology of the perceptual color difference saliency segmentation algorithm reported in this paper consists of four essential stages. They are color image transformation, luminance image enhancement, salient pixel computation, and image artifact filtering. The main contributions of this paper are as follows:
(a) The new saliency algorithm effectively segments melanoma skin lesion in dermoscopic images through the aggregation of the color feature of a background pixel and the color feature of an object pixel.
(b) The new saliency algorithm uses a simple decision rule that does not follow the conventional thresholding methods for binary segmentation of melanoma skin lesion in a grayscale dermoscopic saliency map.
(c) The outputs computed by the new saliency algorithm are qualitatively evaluated using test images acquired from public medical corpora and quantitatively evaluated in terms of precision, recall, accuracy, and dice, which are widely used statistical metrics for evaluating binary segmentation results.
(d) A detailed evaluation against existing saliency and nonsaliency benchmark algorithms is performed to provide a fair comparison that demonstrates the performance of the new saliency algorithm.

The discussion of related studies is organized in four dimensions in order to show currency, originality, relevance, and relatedness of this study to the previous research and to justify the suitability of the study methodology. These dimensions are nonsaliency based segmentation, saliency based segmentation, color image models, and perceptual color difference.

2.1. Nonsaliency Based Segmentation

Many image segmentation algorithms have been developed to deal with the complex problem of segmenting skin lesion from the healthy skin. They can be appositely categorized into region, edge, and pixel based methods [8, 23]. Region based methods such as the modified JSEG [12], region growing [24], modified watershed [25], and statistical region merging [26] group image pixels into clusters and maintain connectivity between cluster pixels. Edge based methods such as zero-crossing of Laplacian-of-Gaussian [27] and geodesic active contour [28] are aimed at detecting discontinuities in image pixel intensity values [29]. Pixel based methods group similar pixels as belonging to a homogenous cluster that corresponds to an object or part of an object [30] and are widely applied because of their inherent simplicity and robustness [31, 32]. Thresholding and clustering algorithms are archetypes of the pixel based methods that have been applied for segmentation of skin lesion [9, 33]. Research has revealed that existing segmentation algorithms achieve good results when dermoscopic images exhibit good contrast and in the absence of undesirable factors. However, they often lack robustness for low contrast images and may not perform well on complex images that exhibit significant volume of undesirable artifacts [4, 7].

2.2. Saliency Based Segmentation

Saliency based methods have received a great deal of attention in cognitive science, computer vision, and image processing [34], and they have been applied to image segmentation [34–36]. However, the application of saliency methods for segmentation of skin lesion is relatively new [13, 17, 37]. Saliency segmentation computes the most informative region in an image based on human visual perception such that salient and nonsalient parts become the foreground region (skin lesion) and background region (healthy skin), respectively. It has been suggested that a good saliency segmentation model should satisfy three essential criteria: good segmentation, high resolution, and computational efficiency [38]. Good segmentation means that the probability of missing real salient regions and of falsely marking background regions as salient should be low. High resolution means that saliency maps should possess high resolution to accurately locate salient objects and retain original image information. Computational efficiency means that saliency based segmentation methods should rapidly detect salient regions with low complexity. This paper reports a less complicated saliency based image segmentation algorithm that achieves good performance and generates a high resolution saliency map containing many salient pixels.

Many saliency based segmentation algorithms reported in literature are based on color feature of an input image, but they significantly differ in their computational strategies. Color is one of the most important cues that people use extensively to identify real world objects. It is widely used in medical image analysis for screening dermoscopic images in order to discriminate between healthy skin and unhealthy skin regions [39]. The basic assumption in most cases of dermoscopic image analysis is that the lighter shade of color corresponds to healthy skin, while unhealthy skin possesses different color distribution that differs from the healthy skin [40]. Itti et al. [41] introduced a computational saliency segmentation model based on color, intensity, and texture features for rapid scene analysis. A method to segment salient regions in video sequences based on the application of luminance information has been discussed by Zhai and Shah [42]. The spectral residual approach based on Fourier transform has been reported [43] with several improvements to segment salient objects in images [34, 44]. However, many of the improved saliency segmentation algorithms still face difficulty when salient objects share similar color features with the background pixels. These algorithms often lack the ability to effectively handle complicated images with low contrast [18, 20, 37]. Complementing the methods of saliency computation with other useful analysis methods such as the morphological analysis can significantly improve image segmentation results. The hybrid segmentation of skin lesion in dermoscopic images using wavelet transform along with morphological analysis has been reported [1], while segmentation using saliency combined with Otsu threshold has been discussed [13].

2.3. Color Image Models

The importance of selecting a suitable color model for color image segmentation has been emphasized in the literature [43–45]. Since the appearance of skin in an image is illumination dependent, different color models are widely used for skin lesion analysis with the objective of finding a color model where the color of skin lesion is invariant to illumination conditions. Researchers have attempted to identify the most discriminating and effective color models for processing skin lesion in dermoscopic images. The decomposition of a color image into constituent components is a good analysis technique for medical diagnosis because essential information is conveyed in the color of an image [46].

The segmentation of skin lesion in dermoscopic images using wavelet networks takes the R, G, and B channels of the RGB color model as the network inputs and for network structure formation [47]. The segmentation of skin lesion in dermoscopic images based on wavelet transform along with morphological analysis found that a single color channel gave better performance than grayscale conversion [1]. A comparison of several color models, including normalized RGB and YIQ, reported good segmentation results for individual channels of YIQ and of one other of the compared models [48]. An experimental comparison of the HSI, CMY, YCbCr, and CIE color models found that one channel of HSI and one channel of the CIE model gave good results for segmentation of skin lesion [49].

This study applies the CIE Lab color model instead of the widely used RGB color model for segmentation of melanoma skin lesion. The CIE Lab color model is perceptually uniform; it separates luminance and chrominance information and comes with different intrinsic human visual perception based color difference formulae that are useful for saliency computation [11]. In contrast, the RGB color model is not perceptually uniform, and it does not separate luminance and chrominance information because of the highly correlated nature of its channels [50]. The information in all the channels of the color image is utilized in this study to ensure that no useful color information is discarded.

2.4. Perceptual Color Difference

Color analysis is an important topic in different studies such as prosthodontics, aesthetics, and dental materials science where color quantification is used to gain the understanding of scientific data [51]. The clinical relevance of these studies is highly dependent on how much color change is considered perceptible. The determination of color difference has been proposed in the literature to improve the correlation between color measurement and human visual perception. The measurement of color difference is considered an important problem for color analysis. The practical application of color difference is mostly found in clinical dentistry, where the ability to reproduce the exact shade of natural teeth using restorative dental material is considered a challenging problem [51–56]. The other useful applications of color difference include content-based retrieval [57], quality inspection of food [58, 59], and video compression [60], but it has not been well explored for saliency based segmentation of melanoma skin lesion in dermoscopic images.

There are diverse color difference formulae designed to quantify the correlation between the computed and perceived color differences. The most widely used of them include the CIELAB and CIELUV formulae recommended by the Commission Internationale de l’Eclairage (CIE). The CIEDE2000 color difference formula is applied in this study because it is a recent CIE recommendation with more consistent trends in lightness and hue angle dependencies [54]. It was designed to improve on the earlier color difference formulae and on the correlation between the computed and perceived color differences. It incorporates a term that accounts for the interaction between chroma and hue differences, a modification of the a* coordinate that affects colors with low chroma, and parameters that account for the influence of illumination and viewing conditions on color difference [54]. In addition, it reflects the color differences perceived by the human eye and is generally recommended for evaluating color difference thresholds in dental research and in vivo instrumental color analysis [56].

3. Material and Methods

The experimental images, perceptual color difference saliency, and algorithm implementation are discussed in this section.

3.1. Experimental Images

Dermoscopic images used for experimentation in this study are acquired from the International Symposium on Biomedical Imaging (ISBI 2016) challenge [61] and Pedro Hispano Hospital (PH2) corpora [62]. These corpora particularly inspired us because they contain numerous challenging dermoscopic images and support the development of automated algorithms for the analysis of skin lesion. A dermoscopic image is considered to be “challenging” if one or more undesirable factors are present in the image. These challenging images are usually excluded from test images in the previous research in order to ensure accurate border segmentation [12, 63].

3.2. Perceptual Color Difference Saliency

The essential stages of the methodology of perceptual color difference saliency are color image transformation, luminance image enhancement, salient pixel computation, and image artifact filtering.

3.2.1. Color Image Transformation

The input RGB color image is transformed into a CIE Lab color image to obtain a perceptual color image for saliency computation. The process of transforming an Adobe RGB color image to a CIE Lab color image is usually performed in two steps. The first step converts the Adobe RGB image into a CIE XYZ image according to equation (1) [64, 65], whose inputs are defined in equation (2) in terms of a constant gamma value. The parameters in (2) are added to correct the values obtained from digital cameras to obtain the best possible calibration of the transformation model [64–66].

In the second step of the transformation process, the CIE XYZ image is transformed to the CIE Lab image following the ITU-R BT.709 recommendation, using equations (3) and (4). The transformed image serves as input to the luminance image enhancement function. The D65 illuminant is used in this study, and the CIE tristimulus values of this standard light source serve as the reference values of the transformation [64, 67].
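The two-step transformation described above can be sketched for a single pixel as follows. This is a minimal illustration and not the paper's calibrated model: the Adobe RGB (1998) matrix, the gamma of 2.19921875, and the D65 white point are standard published constants assumed here, and the camera-correction parameters mentioned in the text are omitted because their values are not recoverable from it. The function names are hypothetical.

```python
# Adobe RGB (1998) -> CIE XYZ (D65) -> CIE Lab for one 8-bit pixel.
# Assumed constants: standard Adobe RGB matrix, gamma, and D65 white point.
# The paper's camera-correction parameters are NOT modelled here.
ADOBE_GAMMA = 2.19921875
ADOBE_TO_XYZ = (
    (0.5767309, 0.1855540, 0.1881852),
    (0.2973769, 0.6273491, 0.0752741),
    (0.0270343, 0.0706872, 0.9911085),
)
D65_WHITE = (0.95047, 1.00000, 1.08883)


def adobe_rgb_to_xyz(rgb):
    """Step 1: linearise the 8-bit channels and apply the 3x3 matrix."""
    linear = [(c / 255.0) ** ADOBE_GAMMA for c in rgb]
    return tuple(sum(m * v for m, v in zip(row, linear)) for row in ADOBE_TO_XYZ)


def _f(t):
    """Piecewise cube-root function of the CIE Lab definition."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def xyz_to_lab(xyz, white=D65_WHITE):
    """Step 2: normalise by the reference white and compute L, a, b."""
    fx, fy, fz = (_f(v / w) for v, w in zip(xyz, white))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def adobe_rgb_to_lab(rgb):
    return xyz_to_lab(adobe_rgb_to_xyz(rgb))
```

Under these assumptions a white pixel maps to a lightness close to 100 with near-zero chrominance, and any neutral gray keeps near-zero a and b values, which is a quick sanity check for the matrix and white point.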

3.2.2. Luminance Image Enhancement

The transformation of color image alone does not alleviate the adverse effect of illumination or low contrast. This is because an absolute separation between luminance and chrominance channels is not achievable due to high correlation between the image channels [68, 69]. It is therefore desirable to enhance luminance channel of the input image which does not change the original color of a pixel [69]. The adaptive gamma correction function has been recommended for this purpose because a fixed gamma correction function is not always desirable for all types of images. The following adaptive gamma correction function is applied in this study to enhance the luminance channel of the transformed input image [69]:

The adaptive gamma correction value γ controls the slope of the transformation function that maps the input luminance image to the output luminance image. The Heaviside function returns a value of 1 if its argument is greater than 0; otherwise it returns a value of 0. Rahman et al. [69] gave logarithmic and exponential adaptive gamma correction functions to enhance low contrast and high contrast images, respectively. The functions gave impressive segmentation results for a number of images. However, for some high contrast images, such as an image with a mean value of 0.7097, standard deviation of 0.1513, and gamma value of 1.0720, the image enhancement needs further improvement as shown in Figure 1(c). The segmentation result improves, as shown in Figure 1(d), when the gamma value is increased from 1.0720 to 2.9212 by using the product of the logarithmic and exponential functions introduced by Rahman et al. [69] as the gamma correction function:

\[ \gamma = -\log_2(\sigma) \cdot \exp\left[\frac{1 - (\mu + \sigma)}{2}\right] \]

where σ and μ are the global standard deviation and global mean of the luminance image, respectively. The enhanced luminance image together with the chrominance images serves as input to the salient pixel computation function.
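The gamma values quoted above can be checked numerically. The sketch below assumes that the logarithmic and exponential functions of Rahman et al. are -log2(σ) and exp[(1 - (μ + σ))/2]; with the quoted mean and standard deviation this assumption reproduces 1.0720 for the exponential function and roughly 2.92 for the product. The function names are hypothetical, and `apply_gamma` is a plain power law used only for illustration, not the paper's full transformation with the Heaviside term.

```python
import math


def gamma_log(sigma):
    """Logarithmic gamma (assumed form) for low contrast images."""
    return -math.log2(sigma)


def gamma_exp(mu, sigma):
    """Exponential gamma (assumed form) for high contrast images."""
    return math.exp((1.0 - (mu + sigma)) / 2.0)


def gamma_product(mu, sigma):
    """Product of the two functions, used as the adaptive gamma value."""
    return gamma_log(sigma) * gamma_exp(mu, sigma)


def apply_gamma(luminance, gamma):
    """Simplified power-law enhancement of luminance values in [0, 1]."""
    return [[value ** gamma for value in row] for row in luminance]


mu, sigma = 0.7097, 0.1513
print(round(gamma_exp(mu, sigma), 4))      # 1.072
print(round(gamma_product(mu, sigma), 2))  # 2.92
```

The small gap between the computed product and the quoted 2.9212 is consistent with the mean and standard deviation having been rounded to four decimals in the text.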

3.2.3. Salient Pixel Computation

Pixel saliency can be computed in terms of the difference of color feature with the global mean of this color feature [35, 38]. However, this method has difficulty in distinguishing similar color feature in background and object regions in an input image [70]. In this study, the mean of background color feature and mean of object color feature are computed instead of the global mean to correct this deficiency. The mean of background color feature can be estimated by the mean of pixel values on an ellipsoidal patch drawn close to image borders. Similarly, the mean of object color feature can be estimated by the mean of pixel values within a rectangular patch drawn close to the image center. This design principle follows the assumption of center prior [6, 10, 21, 33, 71]. However, this study applies a different computational strategy to cater for the identification of skin lesion pixels not necessarily framed near the image center.

The applied computational strategy is compactly described as follows. The background mean and background standard deviation are computed from values of pixels on an ellipsoidal patch traced by the midpoint ellipse algorithm to achieve computational efficiency [72, 73]. Moreover, the object mean is computed from values of pixels within a rectangular patch whenever the inequalities and are concomitantly satisfied, where are the image channels, is a given pixel value, , is the image dimension, and standard deviation is used in this study. In addition, other values of can be used, but the value of has been experimentally found to give good segmentation results in this study. Figure 2 shows the diagrammatic illustration of image patches used for the computation of mean values. The yellow ellipse represents a set of color pixels that is used for the computation of background mean. The red solid rectangle represents a set of pixels that is used for the computation of object mean. It is important to note the difference between Figures 2(b) and 2(d) from the rectangular shapes. The segmentation algorithm computes object mean for those pixels within the rectangular patch that differ from background pixels following the assumption of color prior [13, 40]. In Figure 2(b), not all pixels in the rectangular patch are object pixels, but in Figure 2(d) all pixels in the rectangular patch are object pixels, hence the principal reason for the observed difference in the rectangular shapes.
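The patch statistics described above can be sketched for one image channel as follows. Several points are assumptions rather than the paper's method: the ellipse is sampled parametrically instead of being traced with the midpoint ellipse algorithm, the rectangle is a centered patch covering a hypothetical fraction `frac` of each dimension, and the selection rule (a pixel counts as an object pixel when it deviates from the background mean by more than `k` background standard deviations) stands in for the inequalities, which are not recoverable from the text.

```python
import math


def ellipse_pixels(rows, cols, margin=2, samples=720):
    """Pixel coordinates on an ellipse drawn close to the image borders.
    Parametric sampling replaces the midpoint ellipse algorithm here."""
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    ry, rx = cy - margin, cx - margin
    points = set()
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        points.add((int(round(cy + ry * math.sin(t))),
                    int(round(cx + rx * math.cos(t)))))
    return sorted(points)


def mean_std(values):
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def background_stats(channel, margin=2):
    """Mean and standard deviation of the pixels on the border ellipse."""
    rows, cols = len(channel), len(channel[0])
    return mean_std([channel[r][c] for r, c in ellipse_pixels(rows, cols, margin)])


def object_mean(channel, bg_mean, bg_std, k=3.0, frac=0.5):
    """Mean of the pixels in a centered rectangle that differ from the
    background (assumed rule: more than k background standard deviations)."""
    rows, cols = len(channel), len(channel[0])
    r0, r1 = int(rows * (1 - frac) / 2), int(rows * (1 + frac) / 2)
    c0, c1 = int(cols * (1 - frac) / 2), int(cols * (1 + frac) / 2)
    picked = [channel[r][c]
              for r in range(r0, r1) for c in range(c0, c1)
              if abs(channel[r][c] - bg_mean) > k * bg_std]
    return sum(picked) / len(picked) if picked else None
```

Because only pixels that differ from the background enter the object mean, a rectangle that is partly filled with healthy skin still yields a mean dominated by lesion pixels, which matches the behavior the text describes for Figures 2(b) and 2(d).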

The color difference of background color feature with mean of this color feature, , and color difference of object color feature with mean of this color feature, , are computed for each pixel to preserve spatial information. These two measures are then aggregated to create a grayscale saliency map, , whose entry can be determined as follows:

The resolution of a salient pixel is determined by the degree to which every value of the salient pixel tends to the maximum grayscale value of 255. Salient pixels are those pixels of a dermoscopic image that contain useful information for diagnosis purpose. The binary saliency map, , is constructed to provide high resolution and good segmentation [38]. The value tends to 255 for a salient pixel and 0 for a nonsalient pixel according to the following simple decision rule: In fact, (7) and (8) can be combined into one equation such that nonsalient pixels are assigned the value of 0 to realize a high resolution grayscale saliency map as follows:

The saliency of a pixel as measured by (7)–(9) is controlled by the value of the color difference between the object color feature and the mean of this feature. A large value of this color difference corresponds to weak saliency and a low value corresponds to strong saliency. The two color differences can be computed using the accurate CIEDE2000 color difference formula. The color difference between a pixel color feature and a mean color feature in the CIE Lab color model is defined by this formula [54, 74, 75]. Its parametric weighting factors are correction terms for experimental conditions, and its differential color vector components represent the differences in lightness, chroma, and hue. The rotation function accounts for the interaction between chroma and hue differences in the blue region. The parametric weighting functions adjust the total color difference for variation in the location of the color difference pair in the coordinates of the color model. The symbols used in the rotation and parametric weighting functions are defined in terms of the hue angle for a pair of color samples. Angles obtained in radians in (11) and (16) are to be expressed in degrees, and angles given in degrees in (13) and (14) are to be expressed in radians.
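For reference, the CIEDE2000 color difference can be sketched in code as below. This follows the formula as it is commonly published, with the parametric factors defaulting to 1; the correspondence between individual lines and the paper's equations (10)–(17) is not recoverable from the text, so the function and variable names are hypothetical.

```python
import math


def ciede2000(lab1, lab2, k_l=1.0, k_c=1.0, k_h=1.0):
    """CIEDE2000 color difference between two (L, a, b) triples."""
    l1, a1, b1 = lab1
    l2, a2, b2 = lab2

    # Rescale the a coordinate, which mainly affects low-chroma colors.
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    g = 0.5 * (1.0 - math.sqrt(c_bar ** 7 / (c_bar ** 7 + 25.0 ** 7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)

    # Hue angles: atan2 returns radians, converted to degrees in [0, 360).
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0 if c1p else 0.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0 if c2p else 0.0

    # Differences in lightness, chroma, and hue.
    d_l, d_c = l2 - l1, c2p - c1p
    d_angle = 0.0 if c1p * c2p == 0.0 else h2p - h1p
    if d_angle > 180.0:
        d_angle -= 360.0
    elif d_angle < -180.0:
        d_angle += 360.0
    d_h = 2.0 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_angle / 2.0))

    # Mean lightness, chroma, and hue of the pair.
    l_bar, c_bar_p = (l1 + l2) / 2.0, (c1p + c2p) / 2.0
    if c1p * c2p == 0.0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        h_bar = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        h_bar = (h1p + h2p + 360.0) / 2.0
    else:
        h_bar = (h1p + h2p - 360.0) / 2.0

    # Weighting functions and the rotation term for the blue region.
    t = (1.0 - 0.17 * math.cos(math.radians(h_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * h_bar))
         + 0.32 * math.cos(math.radians(3.0 * h_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * h_bar - 63.0)))
    s_l = 1.0 + 0.015 * (l_bar - 50.0) ** 2 / math.sqrt(20.0 + (l_bar - 50.0) ** 2)
    s_c = 1.0 + 0.045 * c_bar_p
    s_h = 1.0 + 0.015 * c_bar_p * t
    d_theta = 30.0 * math.exp(-((h_bar - 275.0) / 25.0) ** 2)
    r_c = 2.0 * math.sqrt(c_bar_p ** 7 / (c_bar_p ** 7 + 25.0 ** 7))
    r_t = -math.sin(math.radians(2.0 * d_theta)) * r_c

    tl, tc, th = d_l / (k_l * s_l), d_c / (k_c * s_c), d_h / (k_h * s_h)
    return math.sqrt(tl * tl + tc * tc + th * th + r_t * tc * th)
```

In the saliency computation, such a function would be called once per pixel with the pixel's Lab value and the relevant mean Lab value (background or object) as the two arguments.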

3.2.4. Image Artifact Filtering

The computed binary saliency map is the input to the artifact filtering function, so any desirable algorithm can be used to filter the saliency map. The prime objective of the artifact filtering is to remove any extra element that might be remaining after segmentation and select a single connected region that is more likely to be the actual skin lesion. The two approaches for removing artifacts from images are preprocessing and postprocessing. This study implements the postprocessing approach to achieve computational efficiency because not all the three channels of the image are processed to remove artifacts.

This study applies morphological analysis as the artifact filtering tool to remove undesired elements in the binary map while maintaining the structural properties of the skin lesion. Morphological analysis is important in digital image processing because it can preserve structural properties of skin lesion and rigorously quantify many aspects of the geometrical structure of images in agreement with human perception [16, 25]. The relationship between the parts of an image can be identified when processing with morphological theory [25, 33]. In a morphological approach, the structure of an image is analyzed in terms of predetermined geometric shapes such as disks, diamonds, and squares, which are known as structuring elements [33]. The MATLAB median filter, clear border function, and morphological operations of opening and closing are used in this study. The median filter with a fixed-size structuring element is applied first to eliminate hairs and smooth out noise because of its capability to reduce bubble intensity and prevent fuzzy edges [16, 28]. It is widely used in digital image processing because it preserves edge information under certain conditions while removing oversegmentation. The filter considers each pixel in the input image in turn and looks at its nearby neighbors to decide whether or not it is representative of its surroundings. It is usually evaluated by sorting all pixel values from the surrounding neighborhood and replacing the pixel being considered with the median value [76].
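The sort-and-replace behavior of the median filter described above can be illustrated with a small sketch. The window size of 3 and the replication of border pixels are assumptions made for the example; the size used in the study is not recoverable from the text, and the study itself relies on the MATLAB median filter rather than on code like this.

```python
def median_filter(image, size=3):
    """Replace every pixel by the median of its size x size neighborhood.
    Border pixels are handled by replicating the nearest valid pixel."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect and sort the neighborhood, clamping indices at the border.
            window = sorted(
                image[min(max(rr, 0), rows - 1)][min(max(cc, 0), cols - 1)]
                for rr in range(r - half, r + half + 1)
                for cc in range(c - half, c + half + 1)
            )
            out[r][c] = window[len(window) // 2]
    return out
```

An isolated bright or dark pixel is never the middle value of its sorted neighborhood, so it is removed, while a straight edge between two large regions is left in place.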

The opening operation smooths object contours, breaks thin connections, removes thin protrusions, and eliminates objects smaller than the structuring element by applying morphological erosion followed by morphological dilation. A disk structuring element is created to preserve the circular nature of the lesion when performing the morphological opening operation. The radius of the structuring element is set to 11 pixels in this study so that large gaps can be filled adequately. The resulting binary image is then closed using the morphological closing operation, which performs dilation followed by erosion. The same disk structuring element created for the opening operation is used for the closing operation. The closing operation smooths object contours, joins narrow breaks, and fills long thin gulfs and holes smaller than the structuring element. The “clear border” function is finally used to remove vignette and disconnected objects touching the image borders. However, for objects that touch the image borders and are not disconnected from the lesion, we recommend the use of a more effective border processing algorithm to avoid the inherent limitation of the MATLAB “imclearborder” function.
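The opening and closing operations described above can be sketched on a binary mask stored as nested lists. This is an illustrative pure-Python version with hypothetical function names; pixels outside the image are treated as background, which may differ from the padding used by the MATLAB functions applied in the study.

```python
def disk(radius):
    """Offsets (dr, dc) of a disk structuring element of the given radius."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]


def _is_set(mask, r, c):
    # Pixels outside the image count as background (an assumption).
    return 0 <= r < len(mask) and 0 <= c < len(mask[0]) and mask[r][c] == 1


def erode(mask, offsets):
    """Keep a pixel only if the whole structuring element fits in the object."""
    return [[int(all(_is_set(mask, r + dr, c + dc) for dr, dc in offsets))
             for c in range(len(mask[0]))] for r in range(len(mask))]


def dilate(mask, offsets):
    """Set a pixel if the (symmetric) structuring element touches the object."""
    return [[int(any(_is_set(mask, r + dr, c + dc) for dr, dc in offsets))
             for c in range(len(mask[0]))] for r in range(len(mask))]


def opening(mask, offsets):
    """Erosion followed by dilation: removes small objects and thin links."""
    return dilate(erode(mask, offsets), offsets)


def closing(mask, offsets):
    """Dilation followed by erosion: fills small holes and narrow breaks."""
    return erode(dilate(mask, offsets), offsets)
```

With the radius reported in the text, the filtering step would correspond to `closing(opening(mask, disk(11)), disk(11))`; the nested loops make this sketch slow for full-size images, so it is meant for understanding rather than for production use.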

3.3. Algorithm Implementation

The algorithmic implementation of the method of perceptual color difference saliency (PCDS) is succinctly outlined based on mathematical equations (1)–(17). The asymptotic time complexity of the PCDS algorithm is for an input color image of dimensions . The PCDS algorithm is described step by step in Algorithm 1.

Input: RGB color image.
Output: grayscale saliency map, silhouette saliency map.
It is assumed that the standard color difference formula described by equations (10) to (17) has been implemented to be invoked in the computation of a saliency map in step (12) of this algorithm.
(1) for all image rows do
(2)   for all image columns do
(3)     transform the Adobe RGB image to CIE XYZ image using equations (1) and (2).
(4)     transform the CIE XYZ image to CIE Lab image using equations (3) and (4).
(5)   end for
(6) end for
(7) enhance the luminance channel of CIE Lab image using equations (5) and (6).
(8) compute mean of representative background pixels on an ellipsoidal patch.
(9) compute mean of representative object pixels within a rectangular patch.
(10) for all image rows do
(11)   for all image columns do
(12)     compute grayscale saliency map using equation (7) or equation (9).
(13)     compute binary saliency map using equation (8).
(14)   end for
(15) end for
(16) filter binary saliency map using morphological analysis or any desirable method.
(17) stop

4. Discussion of Experimental Results

The experimental results obtained by the PCDS algorithm are discussed in this section. The PCDS algorithm is qualitatively and quantitatively compared to the spatially weighted dissimilarity (SWD) [77], principal component analysis (PCA) [78], Markov chain (MC) [79], and saliency based skin lesion segmentation (SSLS) [17] algorithms, which are benchmark saliency segmentation algorithms. In addition, we establish comparison with the Otsu algorithm [71], K-means clustering [80], fuzzy C-means (FCM) clustering [81], and modified JSEG [12], which are benchmark nonsaliency segmentation algorithms. The source code for the SSLS algorithm with default parameter settings has been provided by the author, whereas the source codes for the SWD, PCA, and MC algorithms are readily available at the following website: https://github.com/MingMingCheng/SalBenchmark/tree/master/Code/matlab.

4.1. Qualitative Evaluation of Segmentation Results

The purpose of the qualitative evaluation is to test the performance of the PCDS algorithm through qualitative comparison with existing saliency and nonsaliency based benchmark algorithms.

4.1.1. Comparison with Saliency Algorithms on ISBI 2016 Images

The segmentation results obtained by the PCDS algorithm are qualitatively compared with the results obtained by the existing benchmark saliency segmentation algorithms using test images acquired from the ISBI 2016 challenge corpus. Figure 3 shows a few examples of the original and ground truth images under varying conditions such as the presence of air bubbles (Im1 and Im2), presence of thick hair (Im3), low contrast (Im4, Im5, and Im6), and thin hair (Im7). In Figure 3, it can be seen that most of the skin lesions are correctly and consistently highlighted by the PCDS algorithm across all test images. However, when dermoscopic images possess air bubbles and illumination variation as in Im1, a situation in which the skin lesion color distribution appears uneven, the other four benchmark saliency algorithms do not highlight the skin lesion as effectively and consistently as the PCDS algorithm does.

The PCDS algorithm generates an improved saliency map with more defined image boundaries when compared to the four other benchmark saliency algorithms. This is evident in the case of Im2, which exhibits air bubbles and also presents similar color intensity between skin lesion and background skin. For virtually all images shown in Figure 3, the SWD algorithm has the poorest performance because it generates saliency maps with low resolution and blurry, poorly defined borders. Moreover, it can be observed that all the saliency algorithms are able to achieve satisfactory results for dermoscopic images with high contrast as in Im3. However, for low contrast images such as Im4, Im5, and Im6, the PCA and SSLS algorithms do not uniformly highlight the salient objects. In fact, these algorithms highlight only certain parts of the lesions, because some parts share similar intensities with the background color, and the highlighted lesions are smaller than the ground truth lesions. Although the MC algorithm performs better than the PCA and SSLS algorithms, the saliency maps generated by the MC algorithm possess heterogeneous regions and fuzzy boundaries that are not uniformly highlighted. In contrast, the PCDS algorithm outperforms the others in completely and uniformly highlighting the lesion objects without varying colors. This indicates that the PCDS algorithm assigns uniform saliency values to the pixels within the salient objects.

Another interesting observation from Figure 3 concerns Im7, where the skin lesion possesses thin hair and low contrast: the other benchmark algorithms highlight only the clearly visible part of the skin lesion. Only the PCDS algorithm detects the tail end of the lesion as seen in the ground truth lesion; missing this part of the lesion can lead to diagnostic error. The impressive performance of the PCDS algorithm in segmenting all images considered can be attributed to the effective measurement of color difference between uneven lesion color distributions using the accurate CIEDE2000 formula.

4.1.2. Comparison with Saliency Algorithms on PH2 Images

The segmentation results obtained by the PCDS algorithm are qualitatively compared with the results of existing benchmark saliency algorithms using test images acquired from the PH2 corpus. Figure 4 shows some saliency maps produced by the PCDS algorithm along with those of the other algorithms. The PCDS algorithm achieves good segmentation relative to the other algorithms because it uniformly highlights the whole salient object with high resolution across all the dermoscopic images.

The SWD algorithm has the poorest performance on PH2 images, as seen in the segmentation results depicted in Figure 4. Its saliency maps are blurry and convey little useful information for identifying the skin lesion. Although the PCA algorithm can locate the skin lesion in the images, it usually highlights only certain parts of the salient lesion boundaries, as seen in Im6 and Im7, which can lead to diagnostic error. In addition, the PCA algorithm fails to detect the precise location of the skin lesion. For example, in Im1 and Im2, the PCA algorithm detects the skin lesion in such a way that it touches the image border, which is oversegmentation.

The MC algorithm detects the skin lesion and highlights its boundaries. However, the boundaries in its saliency maps are imprecise and fuzzy across the test images. This can result in the segmentation of healthy skin as skin lesion if fuzzy based thresholding algorithms such as that of Huang and Wang [82] are applied for binary segmentation of the saliency map. Furthermore, the SSLS algorithm is able to highlight skin lesion boundaries, but it cannot assign uniform saliency values to the pixels in the inner part of the lesion, as in Im3, Im4, and Im6. In addition, the skin lesion produced by the SSLS algorithm in Im7 is smaller than the ground truth skin lesion. In sharp contrast, the PCDS algorithm, to a greater extent, uniformly highlights the skin lesion, predicts the precise location of the skin lesion, and produces well defined skin lesion borders. This indicates that the PCDS algorithm delivers good performance and desirable saliency segmentation with reference to the ground truth images.

4.1.3. Comparison with Nonsaliency Algorithms on ISBI 2016 Images

The binary segmentation results obtained using the default thresholding method of the PCDS algorithm on ISBI 2016 images are presented in this section. The image artifact filtering method is not applied in this particular case in order to test the unaided performance of the PCDS default thresholding. Figure 5 shows some examples of binary segmentation results produced by the PCDS default thresholding alongside those of the nonsaliency benchmark algorithms. The lesion images for the qualitative comparison are the same ISBI 2016 images presented in Figure 3, but without artifact filtering. However, it is worth mentioning that the implementation of the modified JSEG algorithm has built-in preprocessing and postprocessing methods for dealing with artifacts, over which we have no control.

The results in Figure 5 show that, despite the absence of image artifact filtering, the binary segmentation results produced by the default PCDS thresholding improve on those of the benchmark algorithms. Specifically, the PCDS algorithm gives a better segmentation result for Im1. The Otsu, k-means, and FCM algorithms produced incomplete binary segmented lesions that are smaller than the ground truth lesions. This problem can be attributed to the illumination variation in the original dermoscopic image in Im1, which these algorithms cannot handle. In addition, there is a considerable amount of irregularity in the lesion borders of the binary segmented images produced by the modified JSEG algorithm. This is a conspicuous demerit because border irregularities caused by inaccurate segmentation can mislead the automatic diagnosis process.

Moreover, Im2 reveals that the Otsu, k-means, and FCM algorithms perform poorly when the input image has low contrast between the skin lesion and healthy skin. Apart from the presence of image artifacts in the binary segmented images produced by Otsu thresholding, k-means, and fuzzy c-means, some parts of the healthy skin share similar color intensities with the lesion. This is an indication that Im2 contains heterogeneous regions with different visual properties. The problem is most pronounced when the healthy skin intensity is similar to that of the lesion, in which case these three algorithms label healthy skin as skin lesion, as also seen in Im5. Still on Im2, the modified JSEG algorithm failed to produce a binary segmented image, which is indicated by the white block labeled FAILED. The unsuccessful cases of the modified JSEG algorithm recorded in this study are not the first of their kind. The original authors of the algorithm reported similar unsuccessful cases during experimentation [12]. Moreover, Norton et al. [83] reported that the modified JSEG algorithm failed to segment fourteen test images in the most challenging situations. On the other hand, despite the absence of image artifact filtering, the Im2 result produced by the PCDS algorithm does not contain the air bubbles seen in the original image. It is evident that the PCDS algorithm gives good binary segmentation results when compared to the other benchmark nonsaliency algorithms.

In Im3, aside from the presence of thick hair, all four nonsaliency algorithms produce binary segmented images similar to the ground truth image. This happens because there is good contrast between the lesion and healthy skin, so the lesion boundaries are well defined. However, the modified JSEG algorithm segmented a hair trace as part of the lesion, which can result in diagnostic error. In Im6, the binary segmented image produced by the PCDS default thresholding is almost comparable to the result of the modified JSEG algorithm. The PCDS default thresholding also produced a better connected and more precise lesion border than those of the Otsu, k-means, and FCM algorithms, as seen in Im6. Finally, Im7 shows that the PCDS default thresholding produced a fuller representation of the skin lesion than the four other nonsaliency algorithms.

4.1.4. Comparison with Nonsaliency Algorithms on PH2 Images

The binary segmentation results obtained using the default thresholding method of the PCDS algorithm on PH2 images are presented in this section. Figure 6 shows some examples of the binary segmentation results in the absence of artifact filtering. There is no apparent difference between the results produced by the algorithms. However, the PCDS algorithm shows a slight improvement over the Otsu, k-means, and FCM algorithms for low contrast images as in Im5. A slight improvement in border irregularities can be seen in the results produced by the modified JSEG algorithm for Im3, Im6, and Im7 when compared to the ground truth images. The difference is less apparent because the PH2 images were not acquired under imaging conditions as varied as those of the ISBI 2016 images. However, many of the acquired PH2 test images exhibit a vignette effect, which is mainly due to the use of a circular lens designed for a smaller sensor in the dermatoscope [1].

In summary, the PCDS algorithm performs favorably against the benchmark algorithms as shown in Figures 3-6. The algorithm produces more stable discriminating saliency maps with high resolution and it uniformly highlights salient objects across the test images. Moreover, the algorithm extracts lesion borders in challenging conditions and it handles the problems of illumination variation and low contrast more effectively. These results validate the performance of the PCDS algorithm in handling challenging images and they demonstrate that the implementation steps of the algorithm are relevant for its overall performance.

4.2. Quantitative Evaluation of Segmentation Results

The purpose of the quantitative evaluation is to test the performance of the PCDS algorithm through quantitative comparison of binary segmentation results with the existing benchmark saliency and nonsaliency algorithms on dermoscopic images acquired from the ISBI 2016 and PH2 corpora. The quantitative evaluation allows generalization to a large set of test images, which cannot easily be achieved by qualitative evaluation because of the few test samples. This study applies the precision (P), recall (R), accuracy (A), and dice (D) evaluation metrics to quantitatively score the binary segmentation results computed by the comparative algorithms. These evaluation metrics are widely used for judging the performance of binary segmentation algorithms [8, 13, 19, 20, 47, 48, 61, 83-85]. A binary segmentation algorithm with satisfactory performance has high precision, recall, accuracy, and dice values.

Precision is the ratio of the number of skin lesion pixels correctly identified to the total number of pixels identified as skin lesion. Recall is the ratio of the number of skin lesion pixels correctly identified to the total number of skin lesion pixels in the ground truth. Accuracy is the ratio of the total number of pixels correctly identified to the total number of pixels in the image. The dice coefficient measures agreement between the ground truth and the result of the automated segmentation method. The formal definitions of these evaluation metrics are based on the following parameters. True positive (TP) is the count of skin lesion pixels correctly identified as skin lesion pixels. False negative (FN) is the count of skin lesion pixels incorrectly identified as healthy skin pixels. False positive (FP) is the count of healthy skin pixels incorrectly identified as skin lesion pixels. True negative (TN) is the count of healthy skin pixels correctly identified as healthy skin pixels. These measures are mathematically defined as follows [13, 47]:

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad A = \frac{TP + TN}{TP + TN + FP + FN}, \qquad D = \frac{2\,TP}{2\,TP + FP + FN}.$$
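To make the pixel counts concrete, the following is a minimal Python sketch that computes the four metrics from a predicted binary mask and a ground truth mask using the standard TP, FP, FN, and TN definitions. The function name `segmentation_scores`, the toy 4x4 masks, and the convention of returning 0 for an empty denominator are assumptions made for illustration; they are not taken from the implementation used in this study.

```python
import numpy as np

def segmentation_scores(predicted, ground_truth):
    """Return precision, recall, accuracy, and dice for two binary masks."""
    pred = np.asarray(predicted, dtype=bool)
    truth = np.asarray(ground_truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    tp = int(np.count_nonzero(pred & truth))    # lesion pixels labelled lesion
    fp = int(np.count_nonzero(pred & ~truth))   # healthy pixels labelled lesion
    fn = int(np.count_nonzero(~pred & truth))   # lesion pixels labelled healthy
    tn = int(np.count_nonzero(~pred & ~truth))  # healthy pixels labelled healthy

    def ratio(num, den):
        # Assumption for this sketch: an empty denominator scores 0.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }

# Toy masks (made up for illustration, not data from the study).
truth = np.array([[0, 0, 0, 0],
                  [0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]])
pred = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 1],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0]])

scores = segmentation_scores(pred, truth)
print(scores)
```

In this toy case three of the four lesion pixels are recovered and one healthy pixel is mislabelled, so precision, recall, and dice all equal 0.75 while accuracy is 0.875 because of the many correctly labelled healthy pixels.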

The performance of binary segmentation using the PCDS algorithm is compared with that of the widely used Otsu thresholding algorithm [86] because thresholding algorithms are conventionally applied for binary segmentation of salient objects from grayscale maps [17, 87]. Table 1 lists the ten image segmentation algorithms compared in this study to establish the performance of binary segmentation using the PCDS algorithm.
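As an illustration of how a thresholding algorithm turns a grayscale saliency map into a binary mask, the following Python sketch implements the textbook Otsu criterion (maximizing between-class variance over a histogram) in NumPy. It is a generic sketch and not the implementation used in this study; the function name `otsu_threshold`, the number of bins, and the synthetic saliency map are assumptions made for illustration.

```python
import numpy as np

def otsu_threshold(gray_map, bins=256):
    """Return a threshold that maximises the between-class variance."""
    values = np.asarray(gray_map, dtype=float).ravel()
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    prob = hist / hist.sum()

    w0 = np.cumsum(prob)                  # weight of the class up to each bin
    w1 = 1.0 - w0                         # weight of the class above each bin
    cum_mean = np.cumsum(prob * centers)  # cumulative first moment
    total_mean = cum_mean[-1]

    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (
        (total_mean * w0[valid] - cum_mean[valid]) ** 2
        / (w0[valid] * w1[valid])
    )
    # Use the upper edge of the best bin so that the whole bin falls below it.
    return edges[np.argmax(between) + 1]

# Synthetic saliency map (made up): dark background with a bright square lesion.
rng = np.random.default_rng(0)
truth = np.zeros((64, 64), dtype=bool)
truth[20:44, 20:44] = True
saliency = np.where(truth,
                    rng.normal(0.8, 0.03, truth.shape),
                    rng.normal(0.2, 0.03, truth.shape))

threshold = otsu_threshold(saliency)
mask = saliency >= threshold
print(round(float(threshold), 3), int(mask.sum()))
```

On a clearly bimodal map such as this one the threshold falls in the gap between the two modes. On maps with fuzzy or heterogeneous regions the histogram is less clearly bimodal, and a single global threshold is more likely to mislabel healthy skin as lesion or the reverse.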

4.2.1. Precision Scores

Table 2 lists the average precision (AVEP) scores and corresponding standard deviation (STDP) scores for each set of test images. It can be seen in Table 2 that the PCDS algorithm consistently recorded the highest AVEP score of 0.8911 and lowest STDP score of 0.1166 on ISBI 2016 test images. However, the SSLSOtsu algorithm recorded the lowest AVEP score of 0.6439 (0.2154) on ISBI 2016 images. Since the STDP score of 0.2154 for the SSLSOtsu algorithm is lower than that of the modified JSEG algorithm (0.2363) and Otsu algorithm (0.2445), the SSLSOtsu algorithm has better precision than modified JSEG and Otsu algorithms on some of the ISBI 2016 test images.

The PCDSOtsu algorithm consistently recorded the highest AVEP score of 0.9617 and lowest STDP score of 0.0503 on PH2 test images. However, the Otsu algorithm recorded the lowest AVEP score of 0.5557 and highest STDP score of 0.3697 on PH2 test images. The Otsu algorithm with the highest STDP score did not give better precision than any of the other algorithms on the PH2 test images. These results generally indicate that the PCDS algorithm consistently recorded good precision on ISBI 2016 test images, while the PCDSOtsu algorithm consistently recorded excellent precision on PH2 test images.

4.2.2. Recall Scores

Table 3 lists the average recall (AVER) scores and corresponding standard deviation (STDR) scores for each set of test images. It can be seen in Table 3 that the SSLSOtsu algorithm consistently recorded the highest AVER score of 0.9998 and lowest STDR score of 0.0012 on ISBI 2016 test images. The SWDOtsu algorithm recorded the lowest AVER score of 0.8014 (0.1893) on ISBI 2016 images. Since the STDR score of 0.1893 for the SWDOtsu algorithm is lower than that of the modified JSEG algorithm (0.2330), the SWDOtsu algorithm recorded better recall than modified JSEG on some of the ISBI 2016 test images.

The MCOtsu algorithm recorded the highest AVER score of 0.9620 on PH2 test images. However, the STDR score of 0.1420 for the MCOtsu algorithm is higher than that of the PCDSOtsu algorithm (0.0766) and PCDS algorithm (0.0467). Hence, the PCDSOtsu and PCDS algorithms recorded better recall than the MCOtsu algorithm on some PH2 test images. The SWDOtsu algorithm recorded the lowest AVER score of 0.6569 on PH2 test images. However, the STDR score of 0.2144 for the SWDOtsu algorithm is lower than those of the nonsaliency based algorithms, which implies that the SWDOtsu algorithm recorded better recall than the nonsaliency based algorithms on some of the PH2 test images. These results generally indicate that the SSLSOtsu algorithm oversegments ISBI 2016 test images because it achieves imbalanced precision (0.6439) and recall (0.9998), while the PCDS algorithm consistently gave excellent recall on PH2 test images because it achieves balanced precision (0.9499) and recall (0.9586).
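The link between oversegmentation and imbalanced precision and recall can be seen on a toy example. The following Python sketch uses made-up 10x10 masks and a hypothetical helper `precision_recall`; it only illustrates the general behaviour of the two metrics and does not reproduce any score reported in the tables.

```python
import numpy as np

def precision_recall(predicted, ground_truth):
    """Precision and recall of a non-empty binary mask against ground truth."""
    pred = np.asarray(predicted, dtype=bool)
    truth = np.asarray(ground_truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)
    fp = np.count_nonzero(pred & ~truth)
    fn = np.count_nonzero(~pred & truth)
    return tp / (tp + fp), tp / (tp + fn)

truth = np.zeros((10, 10), dtype=bool)
truth[3:7, 3:7] = True   # 16 lesion pixels

over = np.zeros_like(truth)
over[2:8, 2:8] = True    # 36 pixels: the lesion plus a ring of healthy skin

under = np.zeros_like(truth)
under[4:6, 4:6] = True   # 4 pixels: only the core of the lesion

over_p, over_r = precision_recall(over, truth)
under_p, under_r = precision_recall(under, truth)
print(over_p, over_r)    # low precision, perfect recall
print(under_p, under_r)  # perfect precision, low recall
```

The oversegmented mask captures every lesion pixel, so its recall is 1, but more than half of the pixels it labels as lesion are healthy skin, so its precision drops to 16/36. The undersegmented mask shows the opposite imbalance.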

4.2.3. Accuracy Scores

Table 4 lists the average accuracy (AVEA) scores and corresponding standard deviation (STDA) scores for each set of test images. It can be seen in Table 4 that the PCDS algorithm consistently recorded the highest AVEA score of 0.9769 and lowest STDA score of 0.0303 on ISBI 2016 test images. The SSLSOtsu algorithm recorded the lowest AVEA score of 0.8868 (0.1086) on ISBI 2016 images. Since the STDA score of 0.1086 for the SSLSOtsu algorithm is lower than that of the PCDSOtsu algorithm (0.1194), modified JSEG algorithm (0.2297), and Otsu algorithm (0.1204), the SSLSOtsu algorithm recorded better accuracy than these algorithms on some of the ISBI 2016 test images.

The PCDSOtsu algorithm consistently recorded the highest AVEA score of 0.9888 and lowest STDA score of 0.0113 on PH2 test images. However, the SWDOtsu algorithm consistently recorded the lowest AVEA score of 0.9185 and highest STDA score of 0.1172. The SWDOtsu algorithm with the highest STDA score did not give better accuracy than any of the other algorithms on the PH2 test images. These results generally indicate that the PCDS algorithm consistently recorded excellent accuracy on ISBI 2016 test images, while the PCDSOtsu algorithm consistently recorded excellent accuracy on PH2 test images.

4.2.4. Dice Scores

Table 5 lists the average dice (AVED) scores and corresponding standard deviation (STDD) scores for each set of test images. The PCDS algorithm can be seen in Table 5 to consistently record the highest AVED score of 0.9342 and lowest STDD score of 0.0709 on ISBI 2016 test images. The SSLSOtsu algorithm recorded the lowest AVED score of 0.7601 (0.1820) on ISBI 2016 test images. Since the STDD score of 0.1820 for the SSLSOtsu algorithm is lower than that of the modified JSEG algorithm (0.2301) and Otsu algorithm (0.2312), the SSLSOtsu algorithm performed better than modified JSEG and Otsu algorithms on some of the ISBI 2016 test images.

The PCDS algorithm gave the highest AVED score of 0.9522 and lowest STDD score of 0.0287 on PH2 test images. However, the Otsu algorithm recorded the lowest AVED score of 0.62627 and highest STDD score of 0.3697 on PH2 test images. The Otsu algorithm with the highest STDD score did not compute segmentation outputs with better agreement with the ground truth than any of the other algorithms on the PH2 test images. These results generally indicate that the PCDS algorithm consistently computed segmentation outputs that have excellent agreement with the ground truth images across the ISBI 2016 and PH2 test images.

4.2.5. Performance Scores

The coefficient of variation (CV) statistic is ultimately used in this study to determine the algorithm that gives the best performance across different test images. The CV is a standardized dispersion measure of a probability distribution that represents the ratio of standard deviation to mean. The weighted mean of coefficients of variation (MCV) unifies the scores associated with a given evaluation criterion across different test images. An MCV value of 1 means low dispersion (excellent result) in the evaluation criterion and a value of 0 means high dispersion (inferior result) in the evaluation criterion. Given a set of distributions with mean values of $\mu_i$ and standard deviation values of $\sigma_i$, the MCV is determined with the largest weight given to the largest sample as follows:

$$\mathrm{MCV} = 1 - \sum_{i=1}^{n} w_i \frac{\sigma_i}{\mu_i}, \tag{19}$$

where $n$ is the total number of datasets and the weight functions $w_i$ sum up to unity:

$$\sum_{i=1}^{n} w_i = 1. \tag{20}$$

The main reason to use sample sizes as weight functions in the MCV calculation is that an algorithm that performs well on a large set of test data is preferable to one that performs well on a small set of test data. In this study, the sizes of the ISBI 2016 and PH2 test sets are, respectively, 70 and 50 images. In fact, we deliberately selected more test images from the ISBI 2016 corpus because it has more challenging images than the PH2 corpus. The PH2 corpus contains 200 melanocytic lesions whereas the ISBI 2016 corpus contains 900 dermoscopic images, with ground truths available for both sets of images [13]. Consequently, $n = 2$, $w_1 = 70/120 = 7/12$, and $w_2 = 50/120 = 5/12$. In the special case of $n = 2$, (19) reduces to the following equation:

$$\mathrm{MCV} = 1 - \left( w_1 \frac{\sigma_1}{\mu_1} + w_2 \frac{\sigma_2}{\mu_2} \right). \tag{21}$$
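As a worked illustration of the sample-size weighting, the following Python sketch assumes that the MCV takes the form of one minus the weighted sum of the per-dataset ratios of standard deviation to mean, which is consistent with the statement that a value of 1 corresponds to low dispersion. The function name `weighted_mcv` is hypothetical. The inputs are the dice scores quoted above for the PCDS algorithm (0.9342 with 0.0709 on ISBI 2016 and 0.9522 with 0.0287 on PH2) together with the test set sizes of 70 and 50; the printed value is only what this assumed form yields and is not claimed to reproduce Table 6.

```python
def weighted_mcv(means, stds, sample_sizes):
    """One minus the sample-size weighted mean of coefficients of variation.

    Assumed form for this sketch: MCV = 1 - sum_i w_i * (std_i / mean_i),
    with w_i = size_i / sum(sizes) so that the weights sum to one.
    """
    if not (len(means) == len(stds) == len(sample_sizes)):
        raise ValueError("inputs must have the same length")
    total = sum(sample_sizes)
    weights = [size / total for size in sample_sizes]
    cvs = [std / mean for std, mean in zip(stds, means)]
    return 1.0 - sum(w * cv for w, cv in zip(weights, cvs))

# Dice scores quoted in the text for the PCDS algorithm (ISBI 2016, PH2).
dice_mcv = weighted_mcv(
    means=[0.9342, 0.9522],
    stds=[0.0709, 0.0287],
    sample_sizes=[70, 50],
)
print(round(dice_mcv, 4))
```

Because the ISBI 2016 set carries the larger weight (7/12 against 5/12), its larger coefficient of variation contributes most of the deduction from 1.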

Table 6 shows the result of applying (21) to compute the MCV for precision (Precision_MCV), recall (Recall_MCV), accuracy (Accuracy_MCV), and dice (Dice_MCV). The overall performance score for each comparative algorithm is based on the utility function obtained by averaging the scores for all evaluation criteria. The result in Table 6 shows that, ranking in terms of the utility function, the PCDS algorithm recorded an excellent overall performance and is ranked in the first position, while the Otsu algorithm is ranked in the tenth position. The ranking of each algorithm in terms of the individual criteria is also given, with the PCDS algorithm leading. Surprisingly, the PCDSOtsu algorithm did not rank second after the PCDS algorithm, which means that the binary segmentation technique of the PCDS algorithm is effective. In the literature, the Otsu algorithm is acclaimed to be optimal for binary segmentation, but its performance is poor for segmentation of melanoma skin lesion in dermoscopic images as experienced in this study. Finally, it is important to note that the low performance score of the modified JSEG algorithm is mainly due to its inability to segment some of the test images.
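The utility-based ranking can be sketched in a few lines of Python. The algorithm names and MCV values below are placeholders invented for illustration and are not the entries of Table 6; the sketch only shows the averaging of the four criteria and the resulting ordering.

```python
# Placeholder MCV scores per criterion (invented values, not from Table 6).
mcv_scores = {
    "AlgorithmA": {"precision": 0.90, "recall": 0.92, "accuracy": 0.97, "dice": 0.93},
    "AlgorithmB": {"precision": 0.70, "recall": 0.95, "accuracy": 0.90, "dice": 0.80},
    "AlgorithmC": {"precision": 0.60, "recall": 0.70, "accuracy": 0.85, "dice": 0.65},
}

# Utility: plain average of the MCV scores over all evaluation criteria.
utility = {
    name: sum(criteria.values()) / len(criteria)
    for name, criteria in mcv_scores.items()
}

# Rank from the highest utility (position 1) to the lowest.
ranking = sorted(utility, key=utility.get, reverse=True)

for position, name in enumerate(ranking, start=1):
    print(position, name, round(utility[name], 4))
```

An unweighted average treats the four criteria as equally important; an algorithm with one very high score, such as the recall of the placeholder AlgorithmB, can still rank below an algorithm with balanced scores.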

5. Conclusion

This paper reports a new image segmentation algorithm based on perceptual color difference saliency (PCDS) that integrates both background and foreground information for segmentation of skin lesion in dermoscopic images. The PCDS algorithm has been tested on 120 challenging dermoscopic images acquired from the ISBI 2016 challenge and PH2 corpora. The algorithm has been quantitatively compared with a variety of saliency and nonsaliency benchmark algorithms using the well-known evaluation metrics of precision, recall, accuracy, and dice. The experimental results of this study show that the PCDS algorithm achieves excellent performance in segmenting skin lesion in dermoscopic images with different classes of challenges when compared to the benchmark algorithms investigated in this study. Moreover, the PCDS algorithm tends to be more robust to the presence of air bubbles, thick hair, and low contrast than the other comparative algorithms.

Future work will focus on the extraction of distinctive skin lesion features for the classification of melanoma skin lesion in dermoscopic images using the PCDS algorithm for segmentation. In addition, we plan to extend the PCDS method to other existing color models and color difference formulae for comparative purposes. It will also be prudent to look at other practical applications to test the performance of the PCDS algorithm on images with other challenges. One important aspect of the PCDS algorithm that needs further investigation is the estimation of the mean value of background color pixels and the mean value of object color pixels, because the effectiveness of the algorithm depends heavily on accurate estimation of these statistics. It is also essential to combine the color cue with other cues such as texture to further improve the performance of the PCDS algorithm.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This publication is supported by postgraduate research grants from the Durban University of Technology in South Africa. The authors wish to thank Professor M. Emre Celebi from the Department of Computer Science at the University of Central Arkansas, Conway, AR, USA, for providing the source codes for their previous work. The authors are also grateful to Mrs. Seena Joseph for running the OpenCV implementation of the k-means algorithm on the experimental data set.