Abstract

The objective of this paper is to explore an efficient image segmentation algorithm for medical images that eases the physicians’ interpretation of computed tomography (CT) scan images. Modern medical imaging modalities generate large images that are extremely tedious to analyze manually. The usefulness of a segmentation algorithm depends on its accuracy and convergence time. There is therefore a compelling need to explore and implement new evolutionary algorithms to solve the problems associated with medical image segmentation. Lung cancer is the most frequently diagnosed cancer among men across the world. Early detection of lung cancer leads to appropriate treatment that saves human lives. CT is one of the conventional medical imaging methods used to diagnose lung cancer. In the present study, five optimization algorithms, namely, k-means clustering, k-median clustering, particle swarm optimization, inertia-weighted particle swarm optimization, and guaranteed convergence particle swarm optimization (GCPSO), were implemented to extract the tumor from the lung image, and their performance was analyzed. The performance of median, adaptive median, and average filters in the preprocessing stage was compared, and the adaptive median filter was found to be the most suitable for medical CT images. Furthermore, the image contrast is enhanced by using adaptive histogram equalization. The preprocessed image with improved quality is then subjected to the five segmentation algorithms. The results were verified on 20 sample lung images using MATLAB, and GCPSO was observed to have the highest accuracy, 95.89%.

1. Introduction

Lung cancer, also known as lung carcinoma, is a malignant tumor characterized by uncontrolled cell growth in tissues of the lung. It must be treated to prevent it from spreading by metastasis to other parts of the body. Most cancers that start in the lung are carcinomas. The two main types are small-cell lung carcinoma and non-small-cell lung carcinoma [1]. Long-term tobacco smoking is the primary factor in 85% of lung cancers [2]. About 10–15% of cases occur in people who have never smoked and are attributed to air pollution, secondhand smoke, asbestos, and radon gas. Computed tomography (CT) and radiographs are the conventional methods to detect the presence of lung cancer. The diagnosis is confirmed by biopsy, which is usually performed by bronchoscopy or under CT guidance. Lung cancer is the leading cause of cancer-related death among men. Hence, it is essential to find a new robust method to diagnose lung cancer at an earlier stage [3]. For the present study, 20 lung image samples and five algorithms have been taken for analysis. The results show that the combination of adaptive median filter, adaptive histogram equalization, and the guaranteed convergence particle swarm optimization- (GCPSO-) based algorithm gives the most accurate results.

2. Methods

In medical image segmentation, accuracy is of foremost importance, as it deals with human lives. It is crucial to remove noise and to improve the image quality before an examination [4]. This part of the work is known as preprocessing, and its two primary steps are noise removal and contrast enhancement. In the present study, the performance of median, adaptive median, and average filters in removing speckle noise has been compared. The filters were implemented in MATLAB. Furthermore, the image quality and visual appearance are improved by adaptive histogram equalization. The second stage of the work is segmentation. This stage consists of applying five methods, namely, k-means, k-median, particle swarm optimization (PSO), inertia-weighted particle swarm optimization (IWPSO), and GCPSO. The tumor portion was extracted from the segmented results of these five methods and compared with manual extraction. The results show that the GCPSO-based segmentation is more accurate than the others. Figure 1 depicts the process flow of the present study.

2.1. Median and Adaptive Median Filters

The median filter removes noise while retaining the sharpness of the image. As the name implies, each pixel is replaced by the median value of its neighborhood pixels. A 3 × 3 window is used in this filter [5]. Among conventional filters, it is one of the best at removing speckle noise. The steps followed to construct the median filter are given in Algorithm 1.

(1) Assume the input matrix “A” which has M rows and N columns.
(2) Construct a matrix with M + 2 rows and N + 2 columns by appending zeros to the sides of the input matrix.
(3) Take a mask of size 3 × 3.
(4) Place the mask on the first element, i.e., the element on the first row and first column of matrix “A”.
(5) Select all the elements covered by the mask and sort them in ascending order.
(6) Take the median value (center element) from the sorted array and replace the element under the center of the mask (initially A(1, 1)) by the median value.
(7) Slide the mask to the next element.
(8) Repeat steps 5 to 7 until all the elements of matrix “A” are replaced by their corresponding median value.
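The steps of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB code used in the study: the function name `median_filter_3x3` and the tiny test image are assumptions, and the image is taken to be a list of rows of gray levels.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3 x 3 median filter with zero padding to a 2-D list of gray levels."""
    rows, cols = len(image), len(image[0])
    # Step 2: build an (M + 2) x (N + 2) matrix with a border of zeros.
    padded = [[0] * (cols + 2) for _ in range(rows + 2)]
    for r in range(rows):
        for c in range(cols):
            padded[r + 1][c + 1] = image[r][c]
    # Steps 4-8: slide the mask over every pixel. Results go to a separate
    # output so that filtered values never feed into later windows.
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [padded[r + dr][c + dc] for dr in range(3) for dc in range(3)]
            output[r][c] = median(window)  # median() sorts the nine values internally
    return output


# Hypothetical 4 x 4 image with a single impulse at row 1, column 1.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
filtered = median_filter_3x3(noisy)
```

In this example the impulse is removed, but the zero padding of step 2 also pulls the corner pixels down to 0, because five of the nine values in a corner window are padding zeros.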

The adaptive median filter performs spatial processing that preserves edge detail while smoothing nonimpulsive noise. Small structures in the image and edges are retained by the adaptive median filter. Unlike the median filter, its window size is not fixed but varies from pixel to pixel.
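The paper does not list the steps of the adaptive median filter, so the following is a sketch of the commonly described two-stage scheme rather than the authors' implementation. The function names, the maximum window size of 7, and the border clipping are all assumptions made for illustration.

```python
from statistics import median_low


def adaptive_median_pixel(image, r, c, max_size=7):
    """Return the filtered value of pixel (r, c); the window grows from 3 x 3 up to max_size."""
    rows, cols = len(image), len(image[0])
    size = 3
    while True:
        half = size // 2
        # Collect the window, clipped at the image border (an assumption of this sketch).
        window = [
            image[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))
        ]
        z_min, z_max, z_med = min(window), max(window), median_low(window)
        if z_min < z_med < z_max:
            # The median is not an extreme value: keep the pixel unless it is an extreme itself.
            z = image[r][c]
            return z if z_min < z < z_max else z_med
        size += 2  # enlarge the window and try again
        if size > max_size:
            return z_med  # window limit reached; fall back to the median


def adaptive_median_filter(image, max_size=7):
    """Apply adaptive_median_pixel to every pixel of a 2-D list of gray levels."""
    return [
        [adaptive_median_pixel(image, r, c, max_size) for c in range(len(image[0]))]
        for r in range(len(image))
    ]


# Hypothetical 3 x 3 patches: one with an impulse at the center, one without.
with_impulse = [[1, 2, 3], [4, 255, 6], [7, 8, 9]]
clean = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

A pixel that lies strictly between the local minimum and maximum is left untouched, which is how this kind of filter retains edges and small structures better than a fixed-window median.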

2.2. Average Filter

This is a simple filter which removes spatial noise from a digital image. Spatial noise arises mainly from the data acquisition process. Each pixel is replaced by the mean value of its neighborhood [5], and the whole image is processed by sliding the operator over the entire range of pixels. The steps followed for the average filter are given in Algorithm 2.

(1) Assume the input matrix “A” which has M rows and N columns.
(2) Construct a matrix with M + 2 rows and N + 2 columns by appending zeros to the sides of the input matrix.
(3) Take a mask of size 3 × 3.
(4) Place the mask on the first element, i.e., the element on the first row and first column of matrix “A”.
(5) Select all the elements covered by the mask and find their average.
(6) Replace the element under the center of the mask (initially A(1, 1)) by this mean value.
(7) Slide the mask to the next element.
(8) Repeat steps 5 to 7 until all the elements of matrix “A” are replaced by their corresponding mean value.
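Algorithm 2 differs from the median filter only in the statistic taken over the window. A minimal Python sketch follows; the function name `average_filter_3x3` and the sample image are assumptions made for illustration.

```python
def average_filter_3x3(image):
    """Apply a 3 x 3 mean filter with zero padding to a 2-D list of gray levels."""
    rows, cols = len(image), len(image[0])
    # Step 2: pad the image with a one-pixel border of zeros.
    padded = [[0] * (cols + 2) for _ in range(rows + 2)]
    for r in range(rows):
        for c in range(cols):
            padded[r + 1][c + 1] = image[r][c]
    # Steps 4-8: replace each pixel by the mean of the nine values under the mask.
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window_sum = sum(padded[r + dr][c + dc] for dr in range(3) for dc in range(3))
            output[r][c] = window_sum / 9
    return output


# Hypothetical uniform 3 x 3 image: only the border is affected by the padding.
smoothed = average_filter_3x3([[9, 9, 9], [9, 9, 9], [9, 9, 9]])
```

Unlike the median, the mean is pulled toward every outlier in the window, which is consistent with the weaker results reported for this filter later in the paper.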

2.3. Histogram Equalization

Image enhancement is the technique used to improve the image quality. For better understanding and analysis, the contrast of medical images must be enhanced. The conventional method used for this operation is histogram equalization, which adjusts the intensities of the image pixels. In the adaptive variant, each pixel is mapped to an intensity proportional to its rank among the surrounding pixels. The steps followed for histogram equalization are given in Algorithm 3 [6].

(1) Obtain the histogram for the input image and find the probability mass function.
(2) Find the cumulative distribution function (CDF) for each gray level.
(3) Find the new gray levels by using the following equation:
\[ \mathrm{CDF}_{\mathrm{new}} = \mathrm{CDF} \times (\text{number of gray levels} - 1). \]
(4) Map the new gray levels onto the pixels of the image and plot the modified histogram.
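The four steps of Algorithm 3 can be sketched as follows. The sketch covers plain (global) histogram equalization only, not the adaptive variant, and the function name `equalize_histogram`, the rounding to the nearest integer, and the sample image are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Globally equalize a 2-D list of integer gray levels in the range [0, levels - 1]."""
    pixels = [p for row in image for p in row]
    total = len(pixels)
    # Step 1: histogram and probability mass function.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    pmf = [h / total for h in hist]
    # Step 2: cumulative distribution function per gray level.
    cdf = []
    running = 0.0
    for prob in pmf:
        running += prob
        cdf.append(running)
    # Step 3: new gray level = CDF * (number of gray levels - 1), rounded.
    mapping = [round(value * (levels - 1)) for value in cdf]
    # Step 4: map every pixel to its new gray level.
    return [[mapping[p] for p in row] for row in image]


# Hypothetical low-contrast image using only gray levels 100 and 101.
stretched = equalize_histogram([[100, 100], [101, 101]])
```

In this example the two nearly identical gray levels are spread to 128 and 255, which illustrates how the mapping stretches a narrow histogram over the available range.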

2.4. k-Means Clustering Algorithm

The simplest and most conventional method in cluster analysis is the k-means clustering algorithm. This algorithm partitions the given dataset into two or more clusters [7]. The accuracy of this method depends entirely on the selection of the cluster centers, so optimum cluster centers must be selected to get a better result. The Euclidean distance is the general measure used to partition the dataset [8]: pixels are assigned to an individual cluster based on the Euclidean distance. The objective function used in this algorithm is

\[ J = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left( \lVert x_j - v_i \rVert \right)^2, \]

where $x_j$ are the pixels, $v_i$ are the cluster centers, $\lVert x_j - v_i \rVert$ is the Euclidean distance between $x_j$ and $v_i$, $c_i$ is the number of data points in the $i$th cluster, and $c$ is the number of cluster centers [9]. The steps followed for k-means clustering are given in Algorithm 4.

(1) Select the c cluster centers.
(2) Calculate the Euclidean distance between each pixel and each cluster center.
(3) Assign each pixel to the cluster whose center is at the minimum Euclidean distance from it.
(4) Once all the pixels have been assigned, recalculate each cluster center as the mean of its pixels using the following formula:
\[ v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_j, \]
where $c_i$ is the number of pixels in the $i$th cluster and $x_j$ are the pixels assigned to it.
(5) Repeat steps 2 to 4 for a fixed number of iterations or until a stopping condition is met.
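A minimal sketch of Algorithm 4 for gray-level pixels is given below. It assumes one-dimensional intensities, for which the Euclidean distance reduces to an absolute difference; the function name `kmeans_1d`, the initial centers, and the sample pixels are hypothetical.

```python
def kmeans_1d(pixels, centers, iterations=20):
    """Cluster scalar gray levels with k-means; returns (centers, labels)."""
    centers = list(centers)

    def nearest(p):
        # Steps 2-3: index of the closest center (absolute difference in 1-D).
        return min(range(len(centers)), key=lambda i: abs(p - centers[i]))

    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in pixels:
            clusters[nearest(p)].append(p)
        # Step 4: each new center is the mean of its cluster; an empty
        # cluster keeps its previous center.
        new_centers = [
            sum(members) / len(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # Step 5: stop once the centers no longer move
            break
        centers = new_centers
    labels = [nearest(p) for p in pixels]
    return centers, labels


# Hypothetical pixels from a dark background and a bright region.
centers, labels = kmeans_1d([10, 12, 11, 200, 205, 198], centers=[0, 255])
```

The result depends on the initial centers, which is the sensitivity to cluster-center selection noted in the text.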

2.5. k-Median Clustering Algorithm

This is also a clustering algorithm, slightly modified from the k-means algorithm. In the centroid calculation, the median value is used instead of the mean value. This reduces the influence of outlying values on the error, since the median criterion involves no squaring operation as in the squared Euclidean distance. The clusters formed by this method are more compact. Like k-means, this approach uses a Lloyd-style iteration that alternates between assigning pixels to clusters and updating the cluster centers. The steps followed for k-median clustering are given in Algorithm 5 [10].

(1) Select random cluster centers. Let the number of cluster centers be “C.”
(2) Calculate the Euclidean distance between each pixel and each cluster center.
(3) Assign each pixel to the cluster whose center is at the minimum Euclidean distance from it.
(4) Once all the pixels have been assigned, recalculate each cluster center as the median value of its pixels instead of the mean.
(5) Repeat steps 2 to 4 for a fixed number of iterations or until a stopping condition is met.
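The only change from k-means is the center update in step 4, as the following sketch shows. It again assumes scalar gray levels (so the distance is an absolute difference); the function name `kmedian_1d` and the sample data, which include one outlying value of 100, are hypothetical.

```python
from statistics import median


def kmedian_1d(pixels, centers, iterations=20):
    """Cluster scalar gray levels with k-median; returns (centers, labels)."""
    centers = list(centers)

    def nearest(p):
        # Steps 2-3: index of the closest center.
        return min(range(len(centers)), key=lambda i: abs(p - centers[i]))

    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in pixels:
            clusters[nearest(p)].append(p)
        # Step 4: each new center is the median of its cluster, not the mean.
        new_centers = [
            median(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:  # Step 5: stop once the centers are stable
            break
        centers = new_centers
    labels = [nearest(p) for p in pixels]
    return centers, labels


# The outlying value 100 joins the dark cluster but barely moves its center.
centers, labels = kmedian_1d([10, 11, 12, 100, 198, 200, 205], centers=[0, 255])
```

Here the dark cluster's center settles at 11.5, whereas the mean of the same four values would be 33.25, which shows why the median update is less affected by outliers.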

2.6. Particle Swarm Optimization

PSO is a metaheuristic algorithm used efficiently in medical image analysis [11]. It mimics the social behavior of birds searching for food [12]. The fundamental idea of PSO is the sharing and communication of information among the particles. In this approach, each particle has an initial position and velocity, which are updated based on the fitness value. The two equations used in PSO to update the velocity and position are as follows [11, 12]:

\[ v_i(t+1) = v_i(t) + c_1 r_1 \left( pbest_i - x_i(t) \right) + c_2 r_2 \left( gbest - x_i(t) \right), \qquad (1) \]

\[ x_i(t+1) = x_i(t) + v_i(t+1), \qquad (2) \]

where $v_i(t)$ and $x_i(t)$ are the velocity and position of particle $i$ at iteration $t$, $pbest_i$ is the best position found so far by particle $i$, $gbest$ is the best position found so far by the swarm, $r_1$ and $r_2$ are random numbers, and the acceleration coefficients $c_1$ and $c_2$ are two positive constants. The success of PSO relies on the fitness function; the one used for the present study is expressed in terms of n, the number of clusters. The steps followed for the particle swarm optimization are shown in Algorithm 6.

(1) Initialize the velocity and position of all the particles with random values.
(2) Define a fitness function.
(3) Find the fitness value for each particle.
(4) Compare the fitness value with the best fitness of the particle so far. If the current value is better, set it as the new pbest.
(5) Repeat steps 3 and 4 for each particle.
(6) Update the velocity using equation (1).
(7) Update the position.
(8) Update gbest.
(9) Repeat steps 3 to 8 until a stopping condition is met or for the predefined number of iterations.
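The loop of Algorithm 6 can be sketched as follows for the task of placing cluster centers on gray levels. The fitness function used here, the sum of squared distances from each pixel to its nearest center, is an assumption chosen for illustration and is not necessarily the one used in the study; the names `clustering_cost` and `pso_cluster`, the parameter values, and the sample pixels are likewise hypothetical.

```python
import random


def clustering_cost(centers, pixels):
    """Assumed fitness: sum of squared distances from each pixel to its nearest center."""
    return sum(min((p - c) * (p - c) for c in centers) for p in pixels)


def pso_cluster(pixels, n_clusters=2, n_particles=10, iterations=30,
                c1=1.5, c2=1.5, seed=0):
    """Basic PSO (no inertia weight); each particle is a list of cluster centers."""
    rng = random.Random(seed)
    lo, hi = min(pixels), max(pixels)
    # Step 1: random positions inside the gray-level range, zero velocities.
    positions = [[rng.uniform(lo, hi) for _ in range(n_clusters)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_clusters for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_cost = [clustering_cost(p, pixels) for p in positions]
    best = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[best][:], pbest_cost[best]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(n_clusters):
                r1, r2 = rng.random(), rng.random()
                # Velocity update, then position update.
                velocities[i][d] += (c1 * r1 * (pbest[i][d] - positions[i][d])
                                     + c2 * r2 * (gbest[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            # Evaluate the particle and update pbest and gbest.
            cost = clustering_cost(positions[i], pixels)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = positions[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = positions[i][:], cost
    return gbest, gbest_cost


sample_pixels = [10, 12, 11, 200, 205, 198]
best_centers, best_cost = pso_cluster(sample_pixels, seed=1)
```

Because gbest is replaced only when a strictly better position is found, the returned cost can never be worse than that of the best initial particle; how close it gets to the optimum depends on the random seed and the parameter values.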

2.7. Inertia-Weighted Particle Swarm Optimization

The balance between exploration and exploitation in PSO is governed by the inertia weight. The basic PSO, presented by Eberhart and Kennedy in 1995, has no inertia weight. In 1998, Shi and Eberhart introduced the concept by adding a constant inertia weight. They stated that a large inertia weight facilitates a global search, while a small inertia weight facilitates a local search [14]. An inertia weight less than 1, in general, improves the results: it enhances the convergence rate and reduces the number of iterations and the time taken.

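The resulting velocity update equation becomes

\[ v_i(t+1) = w \, v_i(t) + c_1 r_1 \left( pbest_i - x_i(t) \right) + c_2 r_2 \left( gbest - x_i(t) \right), \]

where $w$ is the inertia weight, $v_i$ and $x_i$ are the velocity and position of particle $i$, and $pbest_i$ and $gbest$ are the personal and global best positions. Two choices are considered: a constant inertia weight $w = 0.7$ and a random inertia weight $w = 0.5 + \mathrm{rand}()/2$.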
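The two inertia-weight choices and the weighted velocity update can be sketched as small helper functions. The function names and the numeric values in the example call are hypothetical; only the constants 0.7 and 0.5 + rand()/2 come from the text.

```python
import random


def constant_inertia():
    """Constant inertia weight stated in the text."""
    return 0.7


def random_inertia(rng):
    """Random inertia weight 0.5 + rand()/2, which lies in [0.5, 1.0)."""
    return 0.5 + rng.random() / 2


def weighted_velocity(v, x, pbest, gbest, w, c1, c2, r1, r2):
    """Inertia-weighted velocity update for one coordinate of one particle."""
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)


# Example with hypothetical values: the previous velocity is damped by w = 0.5.
new_velocity = weighted_velocity(v=1.0, x=0.0, pbest=2.0, gbest=4.0,
                                 w=0.5, c1=1.0, c2=1.0, r1=0.5, r2=0.5)
```

With w below 1 the contribution of the previous velocity shrinks at every step, so the particle settles around pbest and gbest sooner than in the basic update.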

2.8. Guaranteed Convergence Particle Swarm Optimization

The GCPSO introduces a new particle which searches the region around the current best position. This particle is treated as a member of the swarm, and the velocity update equation for this new particle is given as follows [15]:

\[ v_{\tau}(t+1) = -x_{\tau}(t) + gbest + w \, v_{\tau}(t) + \rho(t) \left( 1 - 2r \right), \]

where $\tau$ is the index of this particle and $w$ is the inertia weight. The search ability is increased by the social part, which improves the random search in the area around the gbest position. The random vector and the diameter of the search area are $r$ and $\rho(t)$, respectively. Each component of the random vector lies between 0 and 1. The diameter of the search area is updated using the following equation:

\[ \rho(t+1) = \begin{cases} 2\rho(t), & \text{if } \#successes > s_c, \\ 0.5\rho(t), & \text{if } \#failures > f_c, \\ \rho(t), & \text{otherwise}, \end{cases} \]

where the terms #successes and #failures are defined as the number of consecutive successes and failures, respectively. The threshold parameters sc and fc are determined empirically. Since it is hard to obtain a better value in only a few iterations in a high-dimensional search space, the recommended values are sc = 15 and fc = 5. On some benchmark tests, the GCPSO has shown excellent performance in locating the minimum of a unimodal search space with only a small number of particles. The steps to be followed for the GCPSO are shown in Algorithm 7.

Initialization
(1) Initialize the number of clusters and number of iterations.
(2) Initialize ρ, sc, fc, numSuccess = 0, and numFailures = 0.
(3) Define a fitness function.
Clustering
(4) Find the fitness value for each particle.
(5) Update the local best solution obtained so far.
(6) Repeat steps 4 and 5 for the predefined number of iterations.
(7) Update velocity and position of each particle for the current global best particle.
Selection step
(8) Execute the selection operator.
(9) If any local best position yi has changed, perform the clustering algorithm. Otherwise, end the algorithm.
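The parts of GCPSO that differ from ordinary PSO are the bookkeeping of consecutive successes and failures, the adaptation of the search diameter ρ, and the velocity of the best particle. The sketch below follows the standard GCPSO rules with the thresholds sc = 15 and fc = 5 given in the text; the function names and the example values are hypothetical, and the sketch treats a single coordinate.

```python
def update_counters(improved, successes, failures):
    """Count consecutive successes or failures of the global best particle."""
    if improved:
        return successes + 1, 0
    return 0, failures + 1


def update_rho(rho, successes, failures, sc=15, fc=5):
    """Grow the search diameter after repeated successes, shrink it after repeated failures."""
    if successes > sc:
        return 2.0 * rho
    if failures > fc:
        return 0.5 * rho
    return rho


def best_particle_velocity(x, v, gbest, w, rho, r):
    """Velocity of the best particle for one coordinate; r is a random number in [0, 1]."""
    return -x + gbest + w * v + rho * (1.0 - 2.0 * r)


# Hypothetical example: the best particle sits on gbest with zero velocity,
# so its next move is a random step of at most rho around gbest.
step = best_particle_velocity(x=3.0, v=0.0, gbest=3.0, w=0.7, rho=1.0, r=0.25)
```

Since the term ρ(1 − 2r) is nonzero for almost every r, the best particle keeps sampling around gbest even when its velocity would otherwise vanish.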

3. Performance Measures

Certain performance measures are used to evaluate the results obtained from medical image segmentation. The list of performance measures used to assess the filter operation is shown in Figure 2 [16]. Let If be the image after noise reduction and I0 be the noisy image.
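The two filter measures reported later in the paper, the speckle suppression index (SSI) and the speckle and mean preservation index (SMPI), can be sketched as follows. Figure 2 is not reproduced here, so the formulas below are the commonly used definitions and are an assumption about what the figure contains; the function names and sample values are hypothetical, and images are passed as flat lists of gray levels.

```python
from statistics import mean, pstdev


def speckle_suppression_index(filtered, noisy):
    """SSI: coefficient of variation of the filtered image relative to the noisy image."""
    return (pstdev(filtered) / mean(filtered)) * (mean(noisy) / pstdev(noisy))


def speckle_mean_preservation_index(filtered, noisy):
    """SMPI: ratio of standard deviations, weighted by the shift in the mean."""
    q = 1 + abs(mean(noisy) - mean(filtered))
    return q * pstdev(filtered) / pstdev(noisy)


# Hypothetical flattened images: filtering halves the spread and keeps the mean.
noisy_pixels = [8, 12, 8, 12]
filtered_pixels = [9, 11, 9, 11]
ssi = speckle_suppression_index(filtered_pixels, noisy_pixels)
smpi = speckle_mean_preservation_index(filtered_pixels, noisy_pixels)
```

For both measures, lower values indicate better noise suppression, which matches the way the filters are ranked in the conclusion.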

Performance measures used for the evaluation of the results of the segmentation algorithm are given in Figure 3 [17].

4. Results and Discussion

The methods described above were implemented in MATLAB, and the results were verified.

In the preprocessing stage, the performance of the median, adaptive median, and mean filters was compared. The speckle suppression index (SSI) and speckle and mean preservation index (SMPI) values are shown in Table 1 and Figures 4 and 5. From the results, it is evident that the adaptive median filter performs better than the mean and median filters for medical image segmentation.

The segmentation accuracy was measured using the true positive rate, true negative rate, false positive rate, and false negative rate by comparing the results from the algorithm with manual segmentation results. The practical results of the k-means clustering segmentation algorithm are shown in Table 2.
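The four rates and the overall accuracy can be computed from a pixel-wise comparison of the algorithm's binary tumor mask with the manual one. The sketch below uses the standard confusion-matrix definitions, which is an assumption about the exact formulas in Figure 3; the function name `confusion_rates` and the sample masks are hypothetical, and both masks are assumed to contain tumor and non-tumor pixels.

```python
def confusion_rates(predicted, reference):
    """Compare two flat binary masks (1 = tumor) and return the rates as a dict."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # tumor found by both
    tn = sum(1 for p, r in pairs if not p and not r)  # background in both
    fp = sum(1 for p, r in pairs if p and not r)      # tumor only in the algorithm's mask
    fn = sum(1 for p, r in pairs if not p and r)      # tumor missed by the algorithm
    return {
        "tpr": tp / (tp + fn),
        "tnr": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical 8-pixel masks: algorithm output versus manual segmentation.
rates = confusion_rates(
    predicted=[1, 1, 0, 0, 1, 0, 0, 0],
    reference=[1, 1, 1, 0, 0, 0, 0, 0],
)
```

The true positive and false negative rates sum to 1, as do the true negative and false positive rates, so two of the four values determine the others.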

The practical results of the k-median clustering segmentation algorithm are shown in Table 3.

The practical results of the PSO-based segmentation algorithm are shown in Table 4.

The practical results of the IWPSO segmentation algorithm are shown in Table 5.

The practical results of the GCPSO segmentation algorithm are shown in Table 6.

The graphical comparison of the true positive rate, true negative rate, false positive rate, and false negative rate for the algorithms used is shown in Figures 6–9. The results show that the true positive and true negative rates are high and the false positive and false negative rates are low for the GCPSO algorithm.

The comparative evaluation based on the accuracy of the segmentation is shown in Table 7 and Figure 10. The results indicate that the GCPSO-based technique has a higher average accuracy than the other methods.

The resultant images after preprocessing are shown in Figures 11(a) and 11(b).

The resultant images after segmentation using k-means clustering are shown in Figure 12.

The resultant images after segmentation using k-median clustering are shown in Figure 13.

The resultant images after segmentation using the PSO algorithm are shown in Figure 14.

The resultant images after segmentation using the IWPSO algorithm are shown in Figure 15.

The resultant images after segmentation using the GCPSO algorithm are shown in Figure 16.

In earlier research, lung cancer detection using PSO, genetic optimization, and an SVM algorithm with the Gabor filter produced an accuracy of 89.5% [18]. A method to detect lung cancer by means of K-NN classification using the genetic algorithm produced a maximum accuracy of 90% [19]. The comparative results with respect to these methods are shown in Table 8.

The graphical comparative analysis between the methods used here and the existing methods is shown in Figure 17.

5. Conclusion

In this study, various optimization algorithms have been evaluated to detect the tumor. Medical images often need preprocessing before being subjected to statistical analysis. The adaptive median filter gives better results than the median and mean filters because its speckle suppression index and speckle and mean preservation index values are lower. Among the five algorithms compared, GCPSO gives the most accurate tumor extraction, with the highest accuracy of 95.8079%, and it achieved a precision above 90% on all 20 images. It is more accurate than the previous method, which reached an accuracy of 90% on only 4 out of 10 datasets. In future studies, more optimization algorithms will be included to improve the accuracy.

Data Availability

The CT image data used to support the findings of this study have been deposited in the LungCT-Diagnosis repository (doi.org/10.7937/K9/TCIA.2015.A6V7JIWX).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.