Special Issue: Exploration of Human Cognition using Artificial Intelligence in Healthcare
Detection of Peripheral Malarial Parasites in Blood Smears Using Deep Learning Models
Malaria is caused by Plasmodium parasites, which infect red blood cells. Manually counting blood cells is extremely time consuming and tedious. To support automated detection and analysis of malaria, this study compares the performance of XG-Boost, SVM, and neural networks. Compared with these machine learning models, convolutional neural networks provide more reliable results when analyzing and recognizing the same datasets. To reduce discrepancies and improve robustness and generalization, we developed a model that analyzes blood samples to determine whether the cells are parasitized or not. Experiments were conducted on 13,750 parasitized and 13,750 uninfected samples. Support vector machines achieved 94% accuracy, XG-Boost models achieved 90% accuracy, and neural networks achieved 80% accuracy. Among these three models, the support vector machine was the most accurate at distinguishing parasitized cells from uninfected ones. The convolutional neural network achieved an accuracy of 97% in recognizing the samples. Because of its better accuracy, the deep learning model is useful for decision making.
1. Introduction
According to the World Health Organization (WHO), 3.4 billion inhabitants in 92 countries may be at risk of malaria infection, with 1.1 billion people at high risk. Epidemiological factors can affect the transmission of malaria, as shown by an ecological and epidemiological study performed in a confined, isolated location. As a result, the WHO supports the advancement of rapid and inexpensive diagnostic testing that aids in identifying the correct treatment method. In 2019, there were an estimated 229 million cases of malaria worldwide [2, 3], and malaria deaths totaled 409,000, as measured by an annual survey. Malaria is caused by parasites transmitted through the bites of infected female Anopheles mosquitoes [5, 6]. Microscopy testing is widely accepted and widely used for diagnosing patients with suspected malaria. Children under five are the most likely to be affected by malaria [7, 8]. According to the WHO report, the African region carries the highest share of the global malaria burden.
Because it would provide more accuracy, improve consistency, and be cost effective in rural regions, an automated malaria diagnosis system would make a big difference in addressing the shortage of diagnostic capacity [10, 11]. According to medical specialists from the World Health Organization, several species of malaria parasite may cause malaria infection in humans. These include Plasmodium falciparum, Plasmodium vivax, Plasmodium malariae, Plasmodium ovale, and Plasmodium knowlesi. Of these, the two most common are Plasmodium falciparum and Plasmodium vivax.
Figure 1 illustrates how malarial cells progress through their various stages. The first slide shows trophozoites and gametocytes of P. falciparum along with white blood cells, as shown in Figure 2, where the enlarged nucleus can be compared with the rest of the red blood cells. The second image shows Plasmodium falciparum schizonts. Researchers need to investigate innovations in malaria diagnosis in order to automate the process for society. The number of good research articles in this field has increased over several decades. Besides research articles, a variety of software and hardware tools for automatic malaria diagnosis have appeared.
2. Related Work
Makhija et al. implemented a V-value histogram method, which achieved 60% sensitivity. Rajaraman et al. designed a contrast enhancement and threshold-based segmentation approach and used it for a qualitative analysis of images from two hundred patients. Suriya et al. worked on color information-based pattern segmentation using image samples from 75 patients and achieved a detection rate of around 90%. Liang et al. reported detecting 90% of images with speckle noise using a median filter based on histogram and morphological operations. Vijayalakshmi et al. applied transfer learning with VGG 16 and an SVM classifier to microscopic images and reported a classification accuracy of 91%.
Hommelsheim et al. implemented DNA/RNA-binding domains with nucleotide sequence accuracy and redesigned transcription-activator-like signaling pathways to alter genomes with improved binding specificity. Hawkes et al. analyzed rapid diagnostic tests (RDTs) and raised awareness of their use with regard to equity, ease of use, and cost effectiveness. Ross et al. developed a method for automating the diagnosis of malaria from thin blood smears; it identifies infected erythrocytes with 75% sensitivity and an 85% positive predictive value (PPV). Das et al. worked on parasite characterization and classification using ML on light microscopic images of peripheral blood smears and achieved 89% correct classification using an SVM classifier.
Poostchi et al. discussed the development of image analysis and machine learning for the microscopic diagnosis of malaria as well as the emergence of smartphone technology for future diagnosis. LeCun et al. worked on finding complex structure in large datasets, using the backpropagation algorithm to indicate how a machine should update its internal parameters, which compute the representation in each layer from the representation in the previous layer. Yang et al. compared the detection of antimalarial efficacy by an image-based cytometer with that of a commercial flow cytometer.
Yunda et al. determined that applying principal component analysis to samples of thick-film blood images reduced the number of features. Kaewkamnerd et al. created a two-stage algorithm to detect plasmodia on thick blood films. Hanif et al. developed a dark stretching technique for the enhancement and segmentation of thick blood smear images of Plasmodium falciparum. They reduced the amount of noise and blurring while increasing the contrast range to make the images more visible. The existing models suffer from issues such as overfitting [13, 18, 19], vanishing gradients [20–22], and poor convergence speed [23–25].
3. Overview of Convolutional Neural Network (CNN)
A convolutional neural network (CNN) is built from convolution layers, in which weights and biases are learned. Each neuron accepts a set of inputs and combines them with its weights, and the weighted sum plus the bias is passed through an activation function. Within a convolution layer, the weights applied to the input pixels are shared by the layer's neurons [26–28]. Furthermore, the inputs to standard neural networks are one-dimensional vectors, whereas in a CNN they are represented as multichannel images. CNNs are trained with optimization algorithms such as stochastic gradient descent, Adam, and AdaGrad and are used for tasks such as object detection, image classification, and localization, as shown in Figure 3.
3.1. Pooling Layers
Convolution filters are applied to the input images to generate feature maps that summarize the presence of features, and pooling layers then downsample these maps. A limitation of a feature map is that it records the precise location of each feature, so a small shift in the input yields a different map; pooling reduces this sensitivity. If the feature map has dimensions $n_h \times n_w \times n_c$, the output obtained after a pooling layer with filter size $f$ and stride $s$ has dimensions $\left(\lfloor (n_h - f)/s \rfloor + 1\right) \times \left(\lfloor (n_w - f)/s \rfloor + 1\right) \times n_c$. There are three types of pooling layers: max, average, and global pooling. Max pooling takes the maximum value from each region of the feature map. Average pooling takes the average of the elements in each region. Global pooling reduces the entire feature map to a single value.
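The three pooling variants can be illustrated with a minimal sketch on a single-channel feature map. The function names, the 4 × 4 example values, and the choice of a 2 × 2 window with stride 2 are assumptions made for illustration, and global pooling is shown in its averaging form; none of this is taken from the model described in this paper.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Pool a 2D feature map (list of equal-length rows) without padding."""
    n_h, n_w = len(feature_map), len(feature_map[0])
    out_h = (n_h - size) // stride + 1  # floor((n_h - f) / s) + 1
    out_w = (n_w - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by the current window.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled


def global_average_pool(feature_map):
    """Collapse the whole feature map to one value (its mean)."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


# Illustrative 4 x 4 feature map (hypothetical values).
fmap = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
max_pooled = pool2d(fmap, mode="max")      # [[6, 4], [8, 9]]
avg_pooled = pool2d(fmap, mode="average")  # [[3.75, 2.25], [4.5, 4.0]]
global_avg = global_average_pool(fmap)     # 3.625
```

With a 2 × 2 window and stride 2, the 4 × 4 map shrinks to 2 × 2, which matches the output-size rule for pooling without padding.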
3.2. Frequently Used Activation Layer
Sigmoid: the sigmoid function takes a single number and applies a fixed mathematical operation to it, $\sigma(x) = 1/(1 + e^{-x})$. It takes a real-valued input and squashes it into the range between 0 and 1. Tanh: tanh is a nonlinearity that is often preferred over the sigmoid function. It is mathematically expressed as $\tanh(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$. Even though it shows better results than the sigmoid function, it does not solve the vanishing gradient problem. ReLU: the activation is simply a threshold at zero, $f(x) = \max(0, x)$.
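As a small illustration, the three activation functions can be written directly from their standard definitions. This is a scalar sketch for moderate input values, not the implementation used in the experiments.

```python
import math


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent, with outputs in (-1, 1) and centered at zero."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


def relu(x):
    """Rectified linear unit: a threshold at zero."""
    return max(0.0, x)


for value in (-2.0, 0.0, 2.0):
    print(value, sigmoid(value), tanh(value), relu(value))
```

Sigmoid and tanh both saturate for large positive or negative inputs, which is where their gradients shrink, whereas ReLU keeps a constant gradient for positive inputs.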
Several freely accessible datasets were examined for classification, augmentation, and preprocessing. As shown in Table 1, these datasets were provided by earlier authors, and we evaluated them with respect to collection, classification, augmentation, and preprocessing techniques.
Figure 2 illustrates the two classes of the dataset, parasitized and uninfected, each consisting of roughly 13,000 malarial blood samples.
4.2. Classification of Malaria Cells
The use of computer vision and machine learning algorithms to diagnose malaria has been evaluated with performance metrics in many recent studies. An automated system for detecting and processing red blood cells was recently proposed as part of a collaborative effort.
4.3. Image Smoothing
To study the impact of blurring, images were corrupted with Gaussian noise and salt-and-pepper noise, and the results of Gaussian, median, and bilateral filters were compared on both kinds of noisy image. 2D convolution filtering with various low-pass and high-pass filters achieved promising results: low-pass filters eliminate noise by blurring the image, while the high-pass filter recognized the edges in a cell image. For this cell image, a 2D averaging filter kernel was used, $K = \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$, as shown in Figures 4 and 5.
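A minimal sketch of low-pass smoothing with an averaging kernel whose entries all equal 1/9 (that is, a 3 × 3 kernel) is given here. Leaving border pixels unchanged and the toy 5 × 5 image with one bright noise pixel are assumptions of the sketch; a production pipeline would more likely use an image-processing library.

```python
def mean_filter_3x3(image):
    """Smooth a greyscale image (list of rows) with K = (1/9) * ones(3, 3).

    Border pixels are copied unchanged, which is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sum the 3 x 3 neighborhood and divide by 9 (the averaging kernel).
            total = sum(
                image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            out[y][x] = total / 9.0
    return out


# Hypothetical flat image with a single bright "salt" pixel in the middle.
noisy = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 100, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
smoothed = mean_filter_3x3(noisy)
print(smoothed[2][2])  # the outlier is spread over its neighborhood
```

The averaging kernel lowers the outlier from 100 to 20 but also raises its neighbors, which is the blurring trade-off that median and bilateral filters are meant to reduce.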
4.4. Gabor Filtration Technique
Because many samples contain no malaria-infected cells, identifying them early reduces the overall processing run time, which is why statistical analysis is applied to calculate the number of infected cell samples. Once infected regions were noticed, a threshold was applied using the Gabor filter method to color the infected area of the image. According to the results of this approach, distortion was detected both in the background and inside RBCs. Morphological operations were then used to fill the gaps and obtain distinct samples, as illustrated in Figure 6. A Gabor filter is sensitive to orientation; its kernels resemble 2D receptive field profiles and capture the essential aspects of spatial localization and orientation, and the kernels are defined as mentioned in (1).
The real and imaginary parts of the filter are shown in Figure 4 (infected) and Figure 5 (uninfected). We assume that I(ap, bq) is the grey value at (ap, bq). The convolution of the sample with the Gabor kernel at a given scale and orientation θ is specified as
Equation (3) has a real and an imaginary part. The response for each scale and orientation is specified as
Figures 6 and 7 illustrate images combined with subsampled values analyzed with the Gabor filter. The values considered include k-size = 20 × 20.
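For readers unfamiliar with Gabor filtering, the following sketch builds the real and imaginary parts of a Gabor kernel using the commonly used form of a Gaussian envelope multiplied by a sinusoidal carrier. The parameter names (ksize, sigma, theta, lambd, gamma, psi) follow the convention of common image-processing libraries, and the values in the example call are illustrative assumptions; they are not the kernel definition or the settings used in this paper.

```python
import math


def gabor_kernel(ksize, sigma, theta, lambd, gamma, psi=0.0):
    """Return (real, imaginary) Gabor kernels as ksize x ksize lists.

    sigma: width of the Gaussian envelope, theta: orientation in radians,
    lambd: wavelength of the carrier, gamma: spatial aspect ratio,
    psi: phase offset. All values passed below are illustrative.
    """
    half = (ksize - 1) / 2.0
    real, imag = [], []
    for row in range(ksize):
        y = row - half
        real_row, imag_row = [], []
        for col in range(ksize):
            x = col - half
            # Rotate the coordinates by the filter orientation.
            x_r = x * math.cos(theta) + y * math.sin(theta)
            y_r = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
            phase = 2.0 * math.pi * x_r / lambd + psi
            real_row.append(envelope * math.cos(phase))
            imag_row.append(envelope * math.sin(phase))
        real.append(real_row)
        imag.append(imag_row)
    return real, imag


# An odd kernel size is used here so that the kernel has a center pixel.
real_k, imag_k = gabor_kernel(ksize=21, sigma=4.0, theta=math.pi / 4, lambd=10.0, gamma=0.5)
```

Convolving an image with both kernels and taking the square root of the sum of the squared responses gives an orientation-selective magnitude, which is one common way to combine the real and imaginary parts.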
4.5. Data Preprocessing
The model's behavior depends on the data provided during supervised learning, so the data have a profound influence on decision making. A large dataset is obtained because the image smoothing algorithm is applied for feature extraction. As the first step toward accurate data identification, the feature vector was standardized over the range of 0 to 255. The chi-square feature selection method was then used to select the 80% most useful features for the final vectors.
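The two preprocessing steps can be sketched as follows. The sketch assumes that standardization means dividing 0 to 255 pixel values by 255, and it computes the chi-square score in the usual way for non-negative features (observed versus expected per-class feature totals); the function names and the toy data are hypothetical.

```python
def scale_to_unit(vector, max_value=255.0):
    """Map values in [0, max_value] to [0, 1] (assumed form of the scaling)."""
    return [v / max_value for v in vector]


def chi2_scores(samples, labels):
    """Chi-square statistic of each non-negative feature against the labels."""
    n, d = len(samples), len(samples[0])
    feature_totals = [sum(row[j] for row in samples) for j in range(d)]
    scores = [0.0] * d
    for c in sorted(set(labels)):
        rows = [row for row, y in zip(samples, labels) if y == c]
        class_prob = len(rows) / n
        for j in range(d):
            observed = sum(row[j] for row in rows)
            expected = class_prob * feature_totals[j]
            if expected > 0:
                scores[j] += (observed - expected) ** 2 / expected
    return scores


def select_top_fraction(samples, labels, fraction=0.8):
    """Keep the given fraction of features with the highest chi-square score."""
    scores = chi2_scores(samples, labels)
    keep = max(1, int(round(fraction * len(scores))))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])
    return kept, [[row[j] for j in kept] for row in samples]


# Toy data: 4 samples, 5 features; feature 2 is identical in both classes.
raw = [
    [200, 10, 50, 180, 30],
    [220, 20, 50, 160, 40],
    [20, 210, 50, 30, 170],
    [10, 190, 50, 20, 190],
]
labels = [1, 1, 0, 0]  # 1 = parasitized, 0 = uninfected
scaled = [scale_to_unit(row) for row in raw]
kept, reduced = select_top_fraction(scaled, labels, fraction=0.8)
print(kept)  # indices of the retained features
```

In this toy example the uninformative feature receives the lowest score and is the one dropped when 80% of the features are kept.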
5.1. Support Vector Machine
A support vector machine (SVM) learns the optimal location of a decision boundary that separates the classes. For multiclass classification problems, in which labels are assigned from a finite set of several elements, this study used the “one-to-one” (one-versus-one) strategy. Here there are two classes: parasitized samples and uninfected samples. The results of the “one-to-one” classifiers can be transformed into a decision function of shape (n_samples, n_classes) by using the decision function shape option. The test vectors are analyzed by applying each classifier to them, with each classifier casting a vote.
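The voting step of a one-versus-one scheme can be sketched independently of the SVM itself. In the sketch below, simple threshold rules stand in for trained pairwise classifiers, and the three class names and thresholds are hypothetical; with only two classes, as in this study, there is a single pairwise classifier and the vote is trivial.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(pairwise_classifiers, x):
    """Apply every pairwise classifier to x and return (winner, vote counts)."""
    votes = Counter(clf(x) for clf in pairwise_classifiers.values())
    winner, _ = votes.most_common(1)[0]
    return winner, votes


def threshold_classifier(low_label, high_label, threshold):
    """Stand-in for a trained binary classifier on a 1-D feature."""
    return lambda x: high_label if x >= threshold else low_label


classes = ["class_0", "class_1", "class_2"]  # hypothetical labels
thresholds = {
    ("class_0", "class_1"): 1.0,
    ("class_0", "class_2"): 2.0,
    ("class_1", "class_2"): 3.0,
}
# One classifier per unordered pair: n_classes * (n_classes - 1) / 2 in total.
classifiers = {
    pair: threshold_classifier(pair[0], pair[1], thresholds[pair])
    for pair in combinations(classes, 2)
}
label, votes = one_vs_one_predict(classifiers, 2.5)
print(label, dict(votes))
```

For the input 2.5, two of the three pairwise classifiers vote for class_1, so it wins the majority vote.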
5.2. XG-Boost Algorithm
Gradient boosting reduces the loss function by adding weak learners one at a time in an iterative fashion, instead of building independent models from random subsets of samples or features as the random forest does. The loss function is minimized using gradient descent. XG-Boost uses advanced regularization techniques to prevent overfitting and speed up computation, and with this algorithm an accuracy of about 90% was achieved, as shown in Table 2.
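The iterative idea can be illustrated with plain gradient boosting on a squared loss, where each weak learner is a one-split decision stump fitted to the current residuals (the negative gradient). This is a teaching sketch under those assumptions; it is not XG-Boost, which adds regularization and other refinements, and the toy data, learning rate, and number of rounds are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit the best single-threshold regressor (weak learner) on 1-D inputs."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = sum((r - left_mean) ** 2 for r in left)
        err += sum((r - right_mean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Add stumps one at a time, each fitted to the remaining residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Hypothetical 1-D feature with binary targets.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = gradient_boost(xs, ys)
print(round(model(2), 3), round(model(7), 3))
```

Each round shrinks the residuals by a constant factor in this example, so the ensemble prediction approaches the targets as more stumps are added.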
5.3. Proposed Convolutional Neural Network
The proposed network is a trainable, customizable network that combines convolution layers, pooling layers, and an average pooling step with four fully connected layers in order to obtain features, reduce computation, and update features. A total of 26,188 samples were divided into three sets: training, testing, and validation. Each image was then resized to 128 × 128 pixels. As shown in Figures 8 and 9, the study included 13,779 uninfected samples.
6. Evaluation Metrics
6.1. Optimizer Function
Compared with the RMSProp optimizer, the Adam optimizer more successfully overcomes the shortcomings of the AdaGrad optimizer. It is an advanced version of stochastic gradient descent in which the weights are continuously updated with the training data. The number of iterations must be specified before the algorithm starts. Step 1 computes the gradient. Step 2 computes the moving averages of the gradient and of its square. Step 3 corrects the bias of these estimators, where xhat and yhat are the bias-corrected estimates. In the last step, the weights of the network are adjusted using the z-expression.
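The steps of the Adam update can be sketched for a single scalar weight. The variable names m_hat and v_hat are assumed to correspond to the bias-corrected quantities called xhat and yhat in the text, the hyperparameter values are the commonly used defaults apart from the learning rate, and the quadratic test function is purely illustrative.

```python
import math


def adam_minimize(grad_fn, w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient grad_fn."""
    m, v = 0.0, 0.0  # moving averages of the gradient and squared gradient
    for t in range(1, steps + 1):
        g = grad_fn(w)                        # Step 1: compute the gradient
        m = beta1 * m + (1 - beta1) * g       # Step 2: update moving averages
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)          # Step 3: bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)  # Step 4: weight update
    return w


# Illustrative objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_opt = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
print(w_opt)
```

The bias correction matters mostly in the first iterations, when the moving averages are still close to their zero initialization.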
6.2. Loss Function
The optimizer updates the neural network's weights and biases to reduce the loss function. A loss function maps the network's predictions and the true labels to a single value that measures their disagreement. Here the loss is the categorical cross-entropy, computed from the predicted probability of each class as shown in (4), where “ca” is the number of categories and “in” is the number of observations.
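A minimal sketch of categorical cross-entropy for one-hot labels is shown below. Averaging over the observations and clipping probabilities with a small eps are assumptions of the sketch, since some formulations sum instead of average, and the example labels and probabilities are hypothetical.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean over observations of -sum over categories of y_true * log(y_pred).

    y_true: one-hot rows, y_pred: predicted class probabilities per row.
    eps guards against log(0).
    """
    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(true_row, pred_row))
    return total / len(y_true)


# Two observations, two categories (parasitized, uninfected); values are illustrative.
y_true = [[1, 0], [0, 1]]
y_pred = [[0.9, 0.1], [0.2, 0.8]]
loss = categorical_cross_entropy(y_true, y_pred)
print(loss)
```

Only the probability assigned to the true class contributes to each observation's term, so confident correct predictions drive the loss toward zero.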
6.3. Performance Analyses
6.3.1. Experimental Set-Up and Performance Metrics
The recognition system is installed on a server for online access. The machine is a Lenovo ThinkPad workstation with 32 GB of RAM running Windows 10.
The problem consists of sorting the images into two classes: (i) parasitized and (ii) uninfected. In the following definitions, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives. The F1-score is the harmonic mean of precision and recall, as given in (5).
$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5}$$
The true positive rate is another well-known measure. It is the proportion of actual positives that are correctly classified as positive, as given in (6).
$$\text{TPR} = \frac{TP}{TP + FN} \tag{6}$$
Accuracy is the proportion of all samples whose predicted class agrees with the true class, as given in (7).
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$
The positive predictive value is the proportion of subjects with a positive screening result who actually have the disease, as given in (8).
$$\text{PPV} = \frac{TP}{TP + FP} \tag{8}$$
The negative predictive value is the proportion of subjects with a negative screening result who actually do not have the disease, as given in (9).
$$\text{NPV} = \frac{TN}{TN + FN} \tag{9}$$
The false negative rate is calculated by dividing the number of positive events that were misclassified as negative by the total number of positive events, as given in (10).
$$\text{FNR} = \frac{FN}{FN + TP} \tag{10}$$
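The metrics in this subsection can all be derived from the four cells of a binary confusion matrix, as the following sketch shows. The function name and the counts in the example call are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp)  # positive predictive value
    recall = tp / (tp + fn)     # true positive rate
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ppv": precision,
        "npv": tn / (tn + fn),
        "tpr": recall,
        "fnr": fn / (fn + tp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for 100 parasitized and 100 uninfected cells.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the true positive rate and the false negative rate always sum to one, since both are fractions of the same set of actual positives.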
Experiments were conducted with several ML algorithms to determine the accuracy of malarial parasite detection. As shown in Tables 2 and 3, the SVM obtained a classification accuracy of around 94%, and XG-Boost obtained an estimated classification accuracy of 90%. As shown in Table 4, the convolutional neural network correctly classified an estimated 98.0% of the data. Figures 10 and 11 show the training and validation loss and the training and validation accuracy, respectively. Figures 12, 13, and 14 show the confusion matrix, the normalized confusion matrix, and the confusion matrix of true positives and true negatives, respectively.
7. Conclusion
Plasmodium parasites cause malaria by infecting red blood cells. Counting blood samples manually is a very time-consuming process that makes diagnosis tedious. For the detection and analysis of malaria, the XG-Boost classification algorithm, support vector machine, and neural network algorithms were compared using Gabor filters. Convolutional neural networks perform well when analyzing and recognizing the same datasets. A model was developed that analyzes blood samples to distinguish parasitized from uninfected cells, with the aim of reducing model discrepancies and improving robustness and generalization. We collected 13,750 parasitized samples and 13,750 nonparasitic samples for comparative analyses. Support vector machines achieved 94% accuracy, XG-Boost achieved 90% accuracy, and neural networks achieved 80% accuracy. The support vector machine classified both parasitized and uninfected cells more accurately than the other two models. The convolutional neural network recognized the samples with 97% accuracy. These models are helpful for decision making because of their improved accuracy.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R120), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
References
K. S. Makhija, S. Maloney, and R. Norton, “The utility of serial blood film testing for the diagnosis of malaria,” Pathology, vol. 47, no. 1, pp. 68–70, 2015.
S. Rajaraman, S. K. Antani, M. Poostchi et al., “Pre-trained convolutional neural networks as feature extractors toward improved malaria parasite detection in thin blood smear images,” PeerJ, vol. 6, Article ID e4568, 2018.
M. Suriya and M. G. Sumithra, “Efficient evolutionary techniques for wireless body area using cognitive radio networks,” Computational Intelligence and Sustainable Systems, Springer, Berlin, Germany, pp. 61–70, 2018.
Z. Liang, A. Powell, I. Ersoy et al., “CNN-based image analysis for malaria diagnosis,” in Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 493–496, Shenzhen, China, December 2017.
A. Vijayalakshmi and B. Rajesh Kanna, “Deep learning approach to detect malaria from microscopic images,” Multimedia Tools and Applications, vol. 79, pp. 15297–15317, 2019.
C. M. Hommelsheim, L. Frantzeskakis, M. Huang, and B. Ülker, “PCR amplification of repetitive DNA: a limitation to genome editing technologies and many other applications,” Scientific Reports, vol. 4, p. 5052, 2014.
M. Hawkes, J. Katsuva, and C. Masumbuko, “Use and limitations of malaria rapid diagnostic testing by community health workers in the war-torn Democratic Republic of Congo,” Malaria Journal, vol. 8, no. 1, p. 308, 2009.
N. E. Ross, C. J. Pritchard, D. M. Rubin, and A. G. Dusé, “Automated image processing method for the diagnosis and classification of malaria on thin blood smears,” Medical & Biological Engineering & Computing, vol. 44, pp. 427–436, 2006.
D. K. Das, M. Ghosh, M. Pal, A. K. Maiti, and C. Chakraborty, “Machine learning approach for automated screening of malaria parasite using light microscopic images,” Micron, vol. 45, pp. 97–106, 2013.
M. Poostchi, K. Silamut, R. J. Maude, S. Jaeger, and G. Thoma, “Image analysis and machine learning for detecting malaria,” Translational Research, vol. 194, pp. 36–55, 2018.
Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, pp. 436–444, 2015.
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, pp. 2278–2323, 1998.
D. Singh, V. Kumar, M. Kaur, M. Y. Jabarulla, and H. N. Lee, “Screening of COVID-19 suspected subjects using multi-crossover genetic algorithm based dense convolutional neural network,” IEEE Access, vol. 9, pp. 142566–142580, 2021.
A. Rahman, H. Zunair, M. S. Rahman et al., “Improving malaria parasite detection from red blood cell using deep convolutional neural networks,” 2019, https://arxiv.org/abs/1907.10418.
L. Yunda, A. A. Ramirez, and J. Millán, “Automated image analysis method for p.vivax malaria parasite detection in thick film blood images,” Sist. Telemática, vol. 10, pp. 9–25, 2012.
S. Kaewkamnerd, A. Intarapanich, M. Pannarat, S. Chaotheing, C. Uthaipibull, and S. Tongsima, “Detection and classification device for malaria parasites in thick blood films,” in Proceedings of the 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems, pp. 435–438, Prague, Czech Republic, November 2011.
N. S. M. M. Hanif, M. Y. Mashor, and Z. Mohamed, “Image enhancement and segmentation using the dark stretching technique for plasmodium falciparum for thick blood smear,” in Proceedings of the 2011 IEEE 7th International Colloquium on Signal Processing and its Applications, pp. 257–260, Malaysia, May 2011.
G. De Luca, “A survey of nisq era hybrid quantum-classical machine learning research,” Journal of Artificial Intelligence and Technology, vol. 2, no. 1, pp. 9–15, 2022.
D. Jie, G. Zheng, Y. Zhang, X. Ding, and L. Wang, “Spectral kurtosis based on evolutionary digital filter in the application of rolling element bearing fault diagnosis,” International Journal of Hydromechatronics, vol. 4, no. 1, pp. 27–42, 2021.
M. Kaur, D. Singh, V. Kumar, B. B. Gupta, and A. A. Abd El-Latif, “Secure and energy efficient-based E-health care framework for green internet of things,” IEEE Transactions on Green Communications and Networking, vol. 5, no. 3, pp. 1223–1231, 2021.
R. Amin and M. Mojarad, “Modeling the scheduling problem in cellular manufacturing systems using genetic algorithm as an efficient meta-heuristic approach,” Journal of Artificial Intelligence and Technology, vol. 1, no. 4, pp. 228–234, 2021.
T. V. Hahn and C. K. Mechefske, “Self-supervised learning for tool wear monitoring with a disentangled-variational-autoencoder,” International Journal of Hydromechatronics, vol. 4, no. 1, pp. 69–98, 2021.
H. Kaushik, D. Singh, M. Kaur, H. Alshazly, A. Zaguia, and H. Hamam, “Diabetic retinopathy diagnosis from fundus images using stacked generalization of deep models,” IEEE Access, vol. 9, pp. 108276–108292, 2021.
J. Liu, Z. Liu, C. Sun, and J. Zhuang, “A data transmission approach based on ant colony optimization and threshold proxy re-encryption in wsns,” Journal of Artificial Intelligence and Technology, vol. 2, no. 1, pp. 23–31, 2022.
A. Balakrishna and P. K. Mishra, “Modelling and analysis of static and modal responses of leaf spring used in automobiles,” International Journal of Hydromechatronics, vol. 4, no. 4, pp. 350–367, 2021.
Akshaya and C. V. Aravinda, “Predictive analysis of malignant disease in woman using machine learning techniques,” in Proceedings of the Advances in Artificial Intelligence and Data Engineering, pp. 431–438, Springer, Singapore, August 2020.
C. V. Aravinda, L. Meng, M. Atsumi, K. R. Udaya, and G. Amar Prabhu, “A complete methodology for kuzushiji historical character recognition using multiple features approach and deep learning model,” International Journal of Advanced Computer Science and Applications (IJACSA), vol. 1, no. 8, 2020.
C. V. Aravinda, Z. Meng, S. Kenshi, Y. Duan, U. Kazuaki, and P. G. Amar, “Apathy classification based on doppler radar image for the elderly person,” Frontiers in Bioengineering and Biotechnology, vol. 8, p. 1235, 2020.
V. Badrinarayanan, A. Handa, and R. S. Cipolla, “A deep convolutional encoder-decoder architecture for robust semantic pixel-wise labeling,” https://arxiv.org/abs/1505.07293.
T. Tran, O. Kwon, K. Kwon, S. Lee, and K. Kang, “Blood cell images segmentation using deep learning semantic segmentation,” in Proceedings of the 2018 IEEE International Conference on Electronics and Communication Engineering (ICECE), pp. 13–16, Xi’an, China, December 2018.
H. Li, W. S. Zheng, and J. Zhang, “Deep CNNs for HEp-2 Cells classification: a cross-specimen analysis,” https://arxiv.org/abs/1604.05816.
D. Yang, G. Subramanian, J. Duan et al., “A portable image-based cytometer for rapid malaria detection and quantification,” PLoS One, vol. 12, Article ID e0179161, 2017.
K. Chakraborty, “A combined algorithm for malaria detection from thick smear blood slides,” Journal of Health and Medical Informatics, vol. 6, no. 1, pp. 179–186, 2015.
M. Elter, E. Hasslmeyer, and T. Zerfass, “Detection of malaria parasites in thick blood films,” in Proceedings of the 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 5140–5144, Boston, MA, USA, December 2011.