Abstract

Medical image segmentation identifies an area to be analyzed in later processing stages, such as disease recognition and classification. Because the image search area is reduced, subsequent computation and analysis are faster. In this paper, we propose the use of the heuristic red fox optimization algorithm (RFOA) for medical image segmentation. The heuristic’s operation was adapted to the analysis of two-dimensional images, with a focus on equation modification and a novel fitness function. The proposed solution analyzes the image by converting the selected pixels to one of two colors, black or white, based on the threshold value used. The converted pixels are counted, which allows the chosen threshold to be assessed and, consequently, the segmentation threshold parameter to be selected automatically. Our contribution is a new fitness function and the adaptation of RFOA to image analysis. We used a publicly available database of lung X-ray images for evaluation; based on the results, an accuracy analysis was performed, and the benefits and drawbacks of the method are discussed.

1. Introduction

Medical examinations produce data that must then be analyzed. However, the results are often large or contain additional objects that are not relevant to the analysis. Large amounts of data mean high computational complexity [1]. Two-dimensional (2D) images are composed of pixels, so the larger the image, the more pixels need to be analyzed. In addition, machine learning algorithms require annotated results to train the model. Hence, image segmentation is important because it reduces the number of pixels to be processed and leaves only those image areas that matter for the implemented solution.

Recent years have brought enormous development in artificial intelligence (AI) methods, thanks to convolutional neural networks (CNNs). Based on CNNs, a neural architecture called U-Net [2] was created, which takes an image and returns its processed counterpart. One application of this architecture is segmentation, where the network receives an image and returns a segmentation mask. However, this solution requires a training database that contains both the original data and the expected segmentation results. Moreover, these are data-hungry algorithms: the more data used in the training process, the greater the probability of obtaining an accurate model. In [3], a U-shaped network was proposed in which the encoder and decoder are connected at different resolutions of the processed image. The solution was designed for the segmentation of volumetric images, which are processed by a linear projection of flattened patches and then used as input to the U-Net model. A similar approach was shown in [4] as a framework that includes several modules. The main tool is the U-Net encoder and decoder with an expertise-aware inferring module, the results of which are used in multirater agreement modeling. A review of the theory and different applications of U-Net and its modified versions was presented in [5]. That paper covered many aspects, including challenges of U-Net models such as their black-box nature, in which the operation of hidden layers is not well understood. Moreover, an analysis of different constructions of such networks is needed to fully understand which components are the most important for data processing.

Moreover, many segmentation methods run on a single thread, so many researchers study their parallelization potential. One such study was presented in [6], where a new parallel-in-branch model was developed. The proposal was described as a combination of transformers and a CNN. A classic version of this hybridization was also described in [7], where a transformer was placed between the encoder (after flattening) and the decoder (before reshaping). Similar approaches include a pyramid dilated module that uses many dilated convolutions in parallel [8] and the application of self-attention modules [9]. In addition to parallel approaches, recursive methods are also used, as shown in [10]. The described method was based on recurrent mask refinement and convolutional operations.

Another common approach is based on clustering methods and fuzzy logic. An example of such an idea was proposed in [11]. The authors used intuitionistic possibilistic fuzzy c-means clustering, and the results were evaluated by a fuzzy support vector machine. The idea was to apply the clustering method to the image and then use the result in the classification task. To analyze the proposed method, different tools such as decision trees or the Otsu thresholding technique were used for comparison. A similar task was performed using fuzzy c-means and a dual attention mechanism [12]. In [13], a crow search algorithm was used: the heuristic was applied to find the cluster centroids for the fuzzy c-means clustering technique. In many cases, thresholding requires finding a threshold value at which the pixel color is changed. For this purpose, the analysis of fuzzy 2-partition Kapur entropy can be applied [14]. The idea is to search for the combination of parameters in the fuzzy membership functions that maximizes the Kapur entropy value.

Recently, nature-inspired and heuristic optimization algorithms have been successfully adopted for various applications of image segmentation [15, 16]. For example, the ant lion algorithm, which simulates predators hunting ants, was used to scan X-ray images for possible degenerated tissues [17]. An artificial ecosystem-based optimization algorithm was used to select the most relevant features from chest X-ray images, while classification was performed using MobileNet [18]. A combination of fractional-order calculus and the swarm-based marine predator algorithm was used to extract images from X-ray images for classification with the Inception model [19]. A custom 10-layer CNN was used to segment color skin lesion images, while the ResNet101 and DenseNet201 models were used for deep feature extraction, and an improved moth flame optimization algorithm was used for the selection of discriminative features [20]. Trilevel thresholding based on the slime mould algorithm and Shannon’s entropy was used to enhance the breast tumor section in slices of breast magnetic resonance imaging (MRI) images, while segment extraction was done using watershed segmentation and further applied to breast cancer recognition [21]. The slime mould optimization algorithm was used to extract blood vessels from digital fundus images for the recognition of retinal disease [22]. The mayfly optimization algorithm was used to extract optimal features from optical coherence tomography eye images for further classification and recognition of choroidal neovascularization [23]. The firefly optimization algorithm was used for COVID-19 case recognition from chest CT images [24]. Sequential minimal optimization was used to classify diabetic retinopathy in retinal images [25]. However, in most cases, the nature-inspired algorithms are applied after image segmentation, to extract the most informative features, rather than before or during segmentation. Similarly, in [26–34], the authors propose different methods of data analysis and extraction.

Existing solutions often focus on the application of neural models. Based on our analysis of them, we propose a different approach built on a nature-inspired optimization algorithm. In this work, we describe a model that uses the heuristic red fox optimization algorithm (RFOA) [35] for the segmentation of medical images. This approach offers automatic image processing through the heuristic parameters, and it was recently applied successfully to the segmentation of skin images [36] and retinal images [37].

The main contributions of this paper are as follows:
(i) Modeling the operation of the heuristic algorithm for processing 2D images
(ii) Modeling an adaptation function depending on a parameter denoting a threshold value for modifying a segmented image
(iii) A new evaluation mechanism for the used fitness function and threshold parameter

2. Materials and Methods

2.1. Red Fox Optimization Algorithm for Image Analysis

An image I is a composition of pixels of size w × h (w is the width and h is the height). Therefore, we can use an image as the environment for the heuristic algorithm. Each individual (red fox) is represented as a pixel at position (x, y), where 0 ≤ x < w and 0 ≤ y < h. The selected heuristic is inspired by the behaviour of a herd of red foxes hunting for food. The first step is to select the number of foxes in the herd. Then, the initial population is created, with each fox placed at a random location within the environment. All foxes are then evaluated by the fitness function of equation (1), which depends on a threshold value α. After evaluation, the herd is sorted by fitness value, and the fox with the highest value is selected as the best one. Then, all individuals are moved according to global and local movement in search of a better position. In the global movement, given by equation (2), the sign is chosen depending on whether the move would go beyond the image range. The gamma value used in this movement is calculated according to equation (3).

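The distance metric is the Euclidean distance between two points $(x_1, y_1)$ and $(x_2, y_2)$:

$$d\big((x_1, y_1), (x_2, y_2)\big) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$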

Local movement depends on the decision whether a fox moves closer to potential prey nearby or stops moving and rests. This decision is modeled by a randomly selected parameter, as given in equation (5).

When the fox decides to move, its new position is determined by equation (6), whose coefficients are random values in a given range. One parameter of this equation is the observation angle, whose value is modified to give an individual a greater chance of changing its position in the image. The original algorithm assumes that the solution space is continuous, so a small change in value yields a new solution. However, in a discrete space (as here), the coordinate values must be integers. These operations are repeated up to a maximum number of iterations. A pseudocode for the proposed method is shown in Figure 1.
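The movement steps described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the step-size rule (a random value up to the distance to the best fox), the `rest_probability` cut-off, the use of a fixed `radius`, and all function names are hypothetical choices loosely based on the general RFOA description. Positions are rounded and clamped so that they remain valid pixel coordinates in the discrete image space.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two pixel positions given as (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def to_pixel(x, y, width, height):
    """Round to integer coordinates and clamp them to the image range."""
    col = max(0, min(width - 1, round(x)))
    row = max(0, min(height - 1, round(y)))
    return col, row


def global_move(fox, best, width, height, rng=random):
    """Step toward the best fox; the step-size rule is an assumption."""
    gamma = rng.uniform(0.0, euclidean(fox, best))
    sign_x = (best[0] > fox[0]) - (best[0] < fox[0])
    sign_y = (best[1] > fox[1]) - (best[1] < fox[1])
    return to_pixel(fox[0] + gamma * sign_x, fox[1] + gamma * sign_y,
                    width, height)


def local_move(fox, radius, width, height, rng=random, rest_probability=0.75):
    """Either rest or jump to a nearby pixel (both rules are assumptions)."""
    if rng.random() < rest_probability:
        return fox  # the fox rests and keeps its position
    angle = rng.uniform(0.0, 2.0 * math.pi)  # hypothetical observation angle
    return to_pixel(fox[0] + radius * math.cos(angle),
                    fox[1] + radius * math.sin(angle), width, height)
```

For instance, `global_move((2, 3), (40, 10), 50, 20)` returns a position that still lies inside a 50 × 20 image, whatever random step is drawn.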

2.2. Segmentation Process by Heuristic Algorithm

RFOA analyzes the image depending on the given parameters, such as the number of individuals or iterations. Each individual is evaluated by a specific fitness function. Equation (1) proposes a function depending on the parameter α, which is the threshold value. Our solution is based on a heuristic analysis of this parameter, from which the best value is selected. The idea is to introduce automatic pixel modification when evaluating a point with the fitness function. This modification is made based on the current value of the parameter (selected at random). As a result, for the selected value, some pixels are modified, and the number of pixels changed to each color (black or white) is counted, together with their intensities. Then, a ratio is calculated according to equation (7) from the numbers of counted white and black pixels. This operation is repeated a specified number of times. Next, the square root of the between-class variance is calculated according to equation (8) from the sums of the intensities of the pixels changed to white and black. This operation is also repeated a specified number of times. Then, from all the results, each saved as a pair, the best parameter can be selected using one of the two formulas given in equations (9) and (10).

A visualization of the proposed methodology is shown in Figure 2.
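The counting and variance steps of this section can be illustrated with a short Python sketch. It assumes the standard Otsu-style definition of between-class variance, which may differ in detail from equation (8), and it selects the candidate threshold with the largest value, which is only one plausible reading of the variance-based selection rule. The function names and the flat list of sampled pixel intensities are hypothetical.

```python
import math


def threshold_stats(pixels, alpha):
    """Count pixels and sum intensities on each side of the threshold alpha."""
    white = [p for p in pixels if p >= alpha]  # pixels that would turn white
    black = [p for p in pixels if p < alpha]   # pixels that would turn black
    return len(white), len(black), sum(white), sum(black)


def sqrt_between_class_variance(pixels, alpha):
    """Square root of the Otsu-style between-class variance (assumed form)."""
    n_white, n_black, s_white, s_black = threshold_stats(pixels, alpha)
    if n_white == 0 or n_black == 0:
        return 0.0  # one class is empty, so the classes are not separated
    total = n_white + n_black
    mean_white = s_white / n_white
    mean_black = s_black / n_black
    variance = (n_white / total) * (n_black / total) \
        * (mean_white - mean_black) ** 2
    return math.sqrt(variance)


def best_threshold(pixels, candidates):
    """Pick the candidate threshold with the largest class separation."""
    return max(candidates,
               key=lambda a: sqrt_between_class_variance(pixels, a))
```

In the proposed method the pixels would be those visited by the foxes and the candidates would be the randomly drawn threshold values; here both are simply passed in as lists.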

After creating a mask (the result of thresholding with the selected value), the final image must be processed to extract only the analyzed objects, such as the lungs. For this purpose, a flood fill is applied to all black areas on the edges of the image. Then, all black areas that cover less than 5% of the image size are also flooded with white. Such a mask is used to extract the data from the original image. In this way, the final segmentation can be created. However, the extracted area in the mask can contain some white pixels; therefore, a flood fill (with black) can also be applied to white areas that cover less than 5% of the image size. This process is shown in Figure 3.
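The mask clean-up described above can be sketched as follows. This is a simplified illustration rather than the published implementation: it assumes 4-connected regions, a mask stored as a list of rows with 0 for black and 255 for white, and hypothetical function names.

```python
from collections import deque

BLACK, WHITE = 0, 255


def component(mask, start, seen):
    """Collect the 4-connected region of equal colour that contains start."""
    height, width = len(mask), len(mask[0])
    colour = mask[start[0]][start[1]]
    queue, region = deque([start]), []
    seen.add(start)
    while queue:
        r, c = queue.popleft()
        region.append((r, c))
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < height and 0 <= nc < width
                    and (nr, nc) not in seen and mask[nr][nc] == colour):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return region


def clean_mask(mask, min_fraction=0.05):
    """Flood edge-touching and small black regions, then small white ones."""
    height, width = len(mask), len(mask[0])
    out = [row[:] for row in mask]  # work on a copy of the mask
    for colour, fill in ((BLACK, WHITE), (WHITE, BLACK)):
        seen = set()
        for r in range(height):
            for c in range(width):
                if (r, c) in seen or out[r][c] != colour:
                    continue
                region = component(out, (r, c), seen)
                on_edge = any(rr in (0, height - 1) or cc in (0, width - 1)
                              for rr, cc in region)
                small = len(region) < min_fraction * height * width
                if small or (colour == BLACK and on_edge):
                    for rr, cc in region:
                        out[rr][cc] = fill
    return out
```

The black regions that survive `clean_mask` mark the area to be extracted from the original image.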

3. Results and Discussion

All experiments were conducted on a computer with an AMD Ryzen 5 5600X 6-core processor (4.20 GHz, x64-based) and 32.0 GB of RAM. The data used in our experiments consist of chest X-ray images [38], accessible at Kaggle, at https://kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia.

The proposed method is evaluated by analyzing the impact of the found thresholding values and their final value (obtained using equations (9) and (10)) on X-ray images, with the aim of creating a mask that can be used for segmentation. An example of the obtained results is shown in Figure 4. Based on these examples, (9) gives interesting results, but many areas are simply missed. This is especially visible in the example where the outer area has been thresholded. Consequently, segmentation is impossible, as the lungs themselves cannot be extracted. Moreover, the other two samples indicate that too many elements were left, which increases the number of calculations needed to delete them using a flood fill. When the second function is used (see Figure 4(c)), these problems do not appear. The lung area is surrounded by white pixels, which allows it to be extracted.

We use three parameters that have an impact on the created images: the number of iterations, the size of the population, and the number of analyzed α values. The results are shown in Tables 1 and 2. Tests were performed for both methods of aggregating α values and for two different numbers of analyzed values. The first function used was described by (9), and the results indicate an accuracy of less than 80% for each variant of the parameters. With twice as many α values, higher results were obtained in each case, indicating the possibility of extracting the lungs on X-rays. However, when the number of individuals in a population is small, the performance is worse, since only a small number of points are considered in the image analysis. In a pessimistic situation, all points could be drawn in the same area, which would make the measurement used to determine the parameter unreliable. Increasing the population from 20 to 50 individuals resulted in a small increase in effectiveness. Only increasing it to 100 individuals with many iterations (100) made it possible to reach an effectiveness of 74% and 76% (for 50 and 100 rounds, respectively).

The second condition (see (10)) improves the extraction performance. In the best case (using the highest parameter values, i.e., 100 iterations and 100 individuals), the correctness reached over 93% and 94% (for 50 and 100 rounds, respectively). With only 20 individuals, the correctness for this function was also much higher than with the first condition (see Table 1) for 50 and 100 iterations. Based on the obtained results, the number of iterations and the population size are important and have a great impact on the correctness of finding the best parameter. With a small number of individuals, the number of pixels taken for analysis is small, which does not allow the parameter to be determined correctly. Likewise, a small number of iterations increases the risk that individuals remain stuck in the initially drawn areas. Many iterations allow for more global displacement, thus covering more pixels in the image. However, it should be noted that increasing these parameters increases the amount of computation. The use of a heuristic algorithm for such analysis allows for more controlled pixel analysis. This is especially visible in the case of RFOA, because the global movement depends on the distance from the best individual, i.e., it points toward the white area. Consequently, individuals initially analyze a random area and, in later iterations, an area composed of bright pixels. It is also worth noting that such an algorithm can be applied to a reduced version of the image, with the resulting value transferred to the original-size image (which reduces the number of calculations). Moreover, the proposed solution is based on the analysis of a single image, which makes it self-adapting to the pixels of a given X-ray image.

Table 3 shows a comparison with other existing methods based on the COVQU dataset [39]. For a detailed analysis of the obtained results, we calculated accuracy, the Jaccard index, and the Dice coefficient. The comparison was made with a machine learning solution based on the U-Net architecture and a modified heuristic algorithm based on ant colony optimization (ACO). In the heuristic comparison, RFOA achieved higher values for all coefficients. This is especially visible in accuracy, where the difference exceeded 3.7%. Compared with the machine learning solutions, the accuracy of RFOA was lower by 1.01% (U-Net) and 1.43% (modified U-Net). However, it should be noted that RFOA has a higher value for the Jaccard index and the Dice coefficient.
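For reference, the three measures named above can be computed from a predicted mask and a reference mask as in the following sketch. It uses the standard definitions based on true/false positives and negatives; the function name and the representation of masks as nested lists of 0/1 values are assumptions made for illustration.

```python
def segmentation_scores(predicted, reference):
    """Return (accuracy, Jaccard index, Dice coefficient) for binary masks."""
    tp = fp = fn = tn = 0
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            if p and r:
                tp += 1      # object pixel found correctly
            elif p:
                fp += 1      # background marked as object
            elif r:
                fn += 1      # object pixel missed
            else:
                tn += 1      # background left as background
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    overlap = tp + fp + fn
    # Two empty masks agree perfectly, so both overlap scores are set to 1.
    jaccard = tp / overlap if overlap else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if overlap else 1.0
    return accuracy, jaccard, dice
```

Accuracy also rewards correctly classified background, whereas the Jaccard index and Dice coefficient only measure overlap of the segmented object, which is why a method can rank differently on these measures.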

4. Conclusions

In this paper, we presented a solution for creating a segmentation mask that allows for easy extraction of important objects in medical images. The main idea was to adapt a heuristic algorithm (based on the red fox optimization algorithm) to analyze the image (i.e., a matrix composed of pixels served as the solution space). The individuals in the algorithm analyze the image and count the changed pixels. The measurements performed, depending on the selected heuristic parameters, made it possible to select the value of the thresholding parameter. As part of the research, two methods of selection were modeled, and measurements were made on them. Using the arithmetic mean determined by taking the minimum and maximum values resulted in low accuracy. When using maximum values, the extraction accuracy was over 70%. However, the second method of aggregating the obtained values, based on variance, achieved more than 90% correct lung extraction on chest X-ray images.

In future work, we plan to improve the proposed method by extending it with a supervisory mechanism that will allow for a more accurate analysis of the coefficients and the automation of parameter selection.

Data Availability

The data used in this study are accessible at Kaggle, at https://kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia. The implementation of the proposed segmentation method is available at https://github.com/AJaszcz/RFA_ImageSegmentation.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Silesian University of Technology under grant no. 09/010/RGJ22/0067.