The coronavirus disease 2019 (COVID-19) pandemic struck without warning, and existing medical screening and clinical management systems were unprepared, contributing to a high fatality rate. Because the virus continues to evolve, it may reemerge, and the weak preparedness seen earlier would not be acceptable in that situation. It is therefore vital to understand and rectify the flaws of past diagnostic work. RT-PCR and antigen tests, both widely used, have had problems: they were either too slow or produced too many false negatives, and test kits were in short supply. As a result, chest X-ray image-based disease classification has emerged as an alternative. However, manually reviewing a wide variety of chest X-ray images from COVID-19 and pneumonia patients is complicated and error-prone. A promising way to improve current diagnosis is to apply deep learning algorithms that learn from radiography images and predict COVID-19 infection. We constructed our own convolutional neural network (CNN) classifiers by applying transfer learning to the popular ResNet, VGG, and InceptionNet models. The work required a sizable dataset that accurately represents the patient population. Before being passed to the models, the images were enhanced to remove artifacts caused by noise, motion, or blurring that could impair the detection of infection. Preprocessing has a substantial impact on model accuracy. The results indicate that the VGG16 architecture, with a detection accuracy of 95.29%, performed best for COVID-19 identification from X-ray images. Furthermore, most of the trained models outperformed current state-of-the-art research in the same field.

1. Introduction

Coronavirus disease 2019 (COVID-19) is an infection caused by a new coronavirus originally called 2019-nCoV. It belongs to a family of pathogens that cause respiratory infections, including severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) [1]. The COVID-19 virus was first discovered in Wuhan, Hubei, China. The virus causes respiratory illness, with dry cough and shortness of breath as frequent symptoms [2]. No specific drug or vaccine is available, and therapies are continuously being investigated [3]. COVID-19 is a contagious illness spread mainly by droplets produced when an infected individual coughs, sneezes, or exhales. Before the outbreak, the infection was entirely unknown, and it is regarded as a major challenge because of the socioeconomic catastrophe it has produced. The pathogen, which affects the upper respiratory system, passes readily from person to person, which makes the disease hazardous. Early detection may therefore aid in the treatment, isolation, and hospitalization of infected individuals. Numerous testing techniques for this virus are available, including RT-PCR, RT-LAMP, and electrochemical and optical biosensors for RNA recognition [4].

Presently, two kinds of tests for detecting COVID-19 infection are available: diagnostic tests (current infection) and antibody tests (past infection). Rapid detection of COVID-19 is accomplished with diagnostic techniques such as reverse transcription-polymerase chain reaction (RT-PCR) and antigen assays. Because false positives (FPs) are more prevalent in antigen testing, RT-PCR is the gold standard for detecting the disease. However, RT-PCR tests need extensive laboratory work to produce results [5], and the cost of the test is a significant problem in several countries with privatized healthcare systems. While PCR and antigen testing may now offer a quick diagnosis, medical imaging of the lungs provides information on disease burden. Additionally, a faster and more accurate diagnosis of COVID-19 would help isolate infected individuals sooner, limiting the spread of the disease.

Apart from laboratory detection procedures, various alternative approaches for detecting COVID-19 are available. The usual medical imaging modalities for diagnosing lung disease are chest radiography (CXR) and computed tomography (CT) [6, 7]. While CT scans are often employed in diagnosing COVID-19 [8–10], cost [11] and radiation exposure are significant concerns. Additionally, chest CT has been found to have high sensitivity for diagnosis [12], and X-ray images reveal visual indicators linked to COVID-19 [13]. CXR images are favored over CT imaging because of their lower radiation dose and widespread availability.

Nowadays, healthcare professionals collect and generate vast amounts of data containing critical information and signals that can be analyzed to overcome the limitations of conventional analytical processes. However, this exponential growth in medical images demands substantial effort from medical experts, whose interpretation is highly subjective and prone to human error. An alternative is to automate the complex process of medical analysis using health data and contemporary machine learning algorithms. Automated identification techniques thus aid the diagnostic process and can provide accurate early detection [14].

Computer-aided analysis of chest X-rays is needed to identify COVID-19 cases from chest X-ray images. Deep learning approaches are effective in generating high-quality results while also providing extra benefits such as (1) maximizing the use of unstructured data, (2) eliminating unnecessary costs, (3) reducing feature engineering, and (4) eliminating explicit data labeling. As a result, deep learning algorithms are frequently used to extract essential features from images and classify them automatically. Moreover, deep learning algorithms have made significant contributions to medical image analysis, achieving high classification performance with less time-consuming simulated tasks [15].

In this work, we describe a deep learning technique for detecting COVID-19 infection from chest X-ray images. We propose a deep convolutional neural network (CNN) model to classify X-ray images as COVID-19 positive or COVID-19 negative. The proposed technique was developed with a transfer learning strategy using several pretrained deep convolutional neural network models, including VGG16 [16], VGG19 [16], ResNet50 [17], and InceptionResNet-V2 [18]. A model capable of detecting COVID-19 infection from chest radiography images should benefit physicians in the triage, quantification, and follow-up of positive patients. Although this approach does not entirely replace current testing methods, it may reduce the number of cases that need urgent testing or further review by specialists. The contributions of the work are as follows.

(1) The study used an extensive dataset for model training and validation, giving a realistic representation of the real-world patient population.

(2) Fine-tuned models were developed using state-of-the-art CNNs to efficiently distinguish COVID-19 positive chest X-rays from normal chest X-rays. Our work modified only the fully connected layers; the feature extraction kernels remained unchanged.

(3) Efficient preprocessing and enhancement techniques were proposed that improved the accuracy of the deep learning models.

(4) A comparison with previous works in the same domain suggests that the current study outperforms the vast majority of them.

The paper is organized as follows. Section 2 summarizes previous work in the domain. Section 3 describes the datasets used to develop the model. Section 4 describes the model architectures used in the work. Section 5 details the model evaluation metrics. Results are presented in Section 6 and discussed in Section 7. Section 8 compares the work with related studies. Section 9 concludes the paper.

2. Literature Review

Various CNN-based deep neural networks are frequently employed to classify medical images. Using a CNN as a feature extractor in medical image classification can avoid expensive and difficult feature extraction processes [19]. A CNN with a shallow convolutional layer was developed for diagnosing lung disease from image patches. The experiments used 16,220 patches from 92 HRCT images, and the authors obtained a precision of 94 percent with the proposed model.

Reference [20] demonstrated a CNN-based approach for analyzing large chest X-ray film datasets. The authors utilized the Stanford Normal Radiology Diagnostic Dataset. It comprises about 400,000 CXR with 108,948 frontal-view CXRs for experiments. It achieved an accuracy and a recall of 0.90 and 0.91, respectively.

The authors in [21] conducted a comparative study on classifying CXR images into normal, bacterial, and coronavirus classes (multiclass classification) using pretrained DCNN-based models, including VGG16, VGG19, InceptionResNet-V2, InceptionV3, ResNet50, DenseNet201, and MobileNetV2. The InceptionResNet-V2 model achieved an accuracy of 92.11 percent for coronavirus detection.

Reference [22] presented COVIDX-Net, a deep learning system built on seven DCNNs, including VGG19, Xception, ResNetV2, InceptionV3, InceptionResNet-V2, DenseNet201, and MobileNetV2, for diagnosing COVID-19 from X-ray images. The VGG19 and DenseNet201 models outperformed the other models with 90 percent accuracy and F1 scores of 0.89 for normal and 0.91 for COVID-19 cases.

Reference [23] also used DL to identify COVID-19 patients based on a limited number of chest X-ray pictures. They employed pretrained ResNet50 networks, which achieved an overall accuracy of 89.2%.

Reference [24] detected COVID-19 in chest X-ray images with a transfer learning InceptionV3 model, demonstrating that the transfer learning approach is stable and simple to extend for COVID-19 detection. Reference [25] correctly classified healthy, COVID-19, and bacterial pneumonia cases using an enhanced version of the pretrained ResNet50 network. Reference [26] classified COVID-19, bacterial pneumonia, viral pneumonia, and healthy cases with a precision of 80.6 percent using the pretrained GoogLeNet model. Reference [27] used multilevel thresholding in conjunction with a support vector machine (SVM) to accurately categorize X-ray images of COVID-19-infected individuals. Reference [28] classified COVID-19 X-ray images with high accuracy using machine learning algorithms such as SVM, CNN, and random forest (RF). Reference [29] fine-tuned seven CNNs, including InceptionV3, ResNet50V2, Xception, DenseNet121, MobileNetV2, EfficientNet-B0, and EfficientNetV2, for the detection of COVID-19. Additionally, [30] developed an optimized CNN model that can be deployed in a low-powered embedded system.

Reference [31], on the other hand, developed CovidGAN, a model based on the Auxiliary Classifier Generative Adversarial Network (ACGAN), to generate synthetic chest X-ray (CXR) images. They also showed that the synthetic images generated by CovidGAN can be used to improve the performance of CNNs for COVID-19 identification. Classification using the CNN alone achieved 85%, but with the addition of synthetic images generated by CovidGAN, the figure climbed to 95%. Similar work is available in [32].

The most significant limitation of the earlier research is the comparatively small test dataset used for classification. Additionally, no consideration was given to whether the data were balanced and accurately represented the patient population. Most works also trained their models on raw medical images without any preprocessing. However, medical images are frequently affected by artifacts caused by noise, motion, or blurring, all of which can impair disease detection. Preprocessing and enhancement of images are therefore critical steps before applying machine learning or deep learning models. The current work addresses these deficiencies and proposes a more effective solution.

3. Materials and Methods

3.1. Dataset

In response to the rapid outbreak of the COVID-19 pandemic and the need for efficient and early diagnosis, several public open-source datasets of chest X-rays and computed tomography (CT) images have become available. We used the COVIDx chest X-ray benchmark dataset, available online at [33], which is compiled from several data sources. These data sources are the COVID-19 X-ray images [35], COVID-19 chest X-ray dataset [36], Actualmed COVID-19 chest X-ray dataset [37], Kaggle COVID-19 radiography database-version 3 [38, 39], chest X-ray8 dataset [20] originally acquired from the National Institute of Health (NIH) [40], RSNA international COVID-19 open radiology database (RICORD) [41], BIMCV-COVID19+ dataset [42], and the Stony Brook University COVID-19 positive case dataset [43]. The databases used in this work are summarized in Table 1. Our dataset contains 30,882 chest X-ray images: 14,192 negative (non-COVID) and 16,690 positive COVID-19 cases. Figure 1 shows example images of each class. The data were acquired from 17,026 patients. The distribution of chest X-ray images in the dataset for positive and negative cases is shown in Figure 2.

3.2. Methodology

Transfer learning has been widely used in image classification problems. Instead of learning from scratch, we take deep models pretrained on other very large datasets and fine-tune them on our dataset. In our approach, we focused on some of the most popular transfer learning models available in Python’s Keras library. We applied the VGG16, VGG19, ResNet-50, and InceptionResNet-V2 models, pretrained on over a million images from the ImageNet database, to our dataset after preprocessing steps that improve the quality of the input images and enhance model performance. Figure 3 shows the pipeline of our approach.

3.3. Preprocessing

Medical images are often affected by artifacts due to noise, motion, or blurring that can impair disease detection. Hence, image preprocessing and enhancement are essential steps before applying any machine learning or deep learning model. Image preprocessing aims to enhance image quality by suppressing distortion and enhancing image features. We applied the following preprocessing steps.

(1) Noise removal

Salt-and-pepper, speckle, Gaussian, and Poisson noise are the most common noise types in medical images. Denoising algorithms such as median, Gaussian, and Wiener filters have proved effective against these types of noise. In our approach, we used Gaussian smoothing by applying the GaussianBlur method of the OpenCV library with a kernel of size .

(2) Morphology filter

Morphological operations are based on erosion and dilation. Erosion removes white noise at object boundaries, and dilation then restores the object area. Erosion followed by dilation is known as opening, and dilation followed by erosion is known as closing. We applied opening and closing operations with a kernel size of to remove any noise remaining in the image and to close small dots if they exist.

(3) Contrast enhancement

Contrast limited adaptive histogram equalization (CLAHE) is very effective at adjusting image contrast. It improves the visibility of foggy image regions, resulting in better image quality and enhanced details. We therefore applied the CLAHE filter to our image dataset using the OpenCV createCLAHE method with a clip limit of 4 and a tile grid size of .

The above preprocessing steps are applied consecutively to each image: the image is converted to grayscale, the filters are applied, and the result is converted back to RGB. This happens as the images flow through the ImageDataGenerator, which also rescales pixel values from 0–255 to 0–1 and resizes the images to the networks’ default size of . Figure 4 shows an example image before and after applying the preprocessing method.

The dataset was then split in a 60-20-20% ratio into training, test, and validation sets, respectively; the resulting distribution is summarized in Table 2.
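A class-stratified 60-20-20 split can be sketched as follows. The splitting procedure actually used is not described, so the function `stratified_split`, the random seed, and the rounding rule (test share rounded up, validation share rounded down, per class) are assumptions made for illustration. With the class counts reported in Section 3.1, this rounding rule happens to give 3,338 positive and 2,839 negative test images, the test-set sizes reported in Section 6.

```python
import random


def stratified_split(labels, test_pct=20, val_pct=20, seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n = len(indices)
        n_test = -(-n * test_pct // 100)  # integer ceiling of the test share
        n_val = n * val_pct // 100        # integer floor of the validation share
        test += indices[:n_test]
        val += indices[n_test:n_test + n_val]
        train += indices[n_test + n_val:]  # remainder goes to training
    return train, val, test


# Class counts from Section 3.1: 16,690 positive (1) and 14,192 negative (0).
labels = [1] * 16690 + [0] * 14192
train_idx, val_idx, test_idx = stratified_split(labels)
```

Splitting per class rather than over the pooled data keeps the positive-to-negative ratio nearly identical across the three subsets.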

4. Model Architecture

As stated previously, we trained four pretrained models on the dataset; they are discussed in detail in this section.

4.1. VGG16 and VGG19

VGG16 was first introduced in 2014 and was the winner of the ImageNet large-scale visual recognition challenge (ILSVRC) [16]. The architecture of VGG models generally consists of five blocks of small 3 × 3 convolutional filters with stride 1, each followed by a ReLU activation function and using same padding; max-pooling layers with a 2 × 2 window and stride 2; and three final fully connected layers. The VGG16 and VGG19 architectures are otherwise the same, except that VGG19 has three more convolutional layers than VGG16. We used the two architectures as implemented in Python’s Keras library, removing the three top fully connected layers through the corresponding constructor parameter and adding a global average pooling layer, a dropout layer with a rate of 0.2, and a softmax dense output classifier layer (see Figure 5). VGG16 was trained with the Adam optimizer, , and with all layers frozen except the last two, which were fine-tuned. The same setup was used for the VGG19 model but with a learning rate of 0.01.
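The classifier head described above can be sketched in Keras roughly as follows. Several details are assumptions rather than the authors' settings: the learning rate for VGG16 is not legible in the text, so `learning_rate=1e-4` is only a placeholder; the input shape and two-class softmax output are assumed; and "all layers except the last two" is read here as freezing everything but the final dropout and dense layers. `include_top=False` is the Keras argument that drops the original fully connected layers. The function name `build_vgg16_classifier` is hypothetical, and `weights=None` is used only so the sketch runs without downloading ImageNet weights; transfer learning would use `weights="imagenet"`.

```python
import tensorflow as tf


def build_vgg16_classifier(weights=None, input_shape=(224, 224, 3),
                           num_classes=2, learning_rate=1e-4):
    """VGG16 base with a global-average-pooling, dropout, and softmax head."""
    # include_top=False removes the three original fully connected layers.
    base = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=input_shape)

    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs=base.input, outputs=outputs)

    # Assumed reading of the text: freeze every layer except the last two.
    for layer in model.layers[:-2]:
        layer.trainable = False

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["accuracy"])
    return model


model = build_vgg16_classifier()
```

Under this reading only the dense output layer carries trainable weights, which is consistent with the statement in the Introduction that the feature extraction kernels were left unchanged.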

4.2. ResNet-50

The deep residual network (ResNet) won ILSVRC in 2015, introducing the skip connection concept [17], which adds direct shortcut connections between layers. This approach has been shown to help overcome the vanishing gradient problem that appears as the network grows deeper. We used the 50-layer ResNet version in our experiment, which consists of five convolution blocks. The first block consists of a convolutional layer with a kernel size of 7 × 7, followed by a max-pooling layer with a 3 × 3 window and stride 2. The second block contains three convolutional layers, the first and third with a kernel size of 1 × 1 and the second with 3 × 3. These three layers are repeated three times, giving nine convolutional layers. Next is the third block, with three convolutional layers repeated four times, giving twelve layers. The ResNet-50 architecture continues as depicted in Figure 6; we removed the top layers and added global average pooling, dropout, and dense layers. The model was trained using the RMSprop optimizer and a learning rate of 0.0001.

4.3. InceptionResNet-V2

The combination of the Inception architecture and the residual connections of ResNet resulted in the InceptionResNet-V2 network (shown in Figure 7), which was found to accelerate the training of Inception networks [18] and to achieve high performance at low computational cost. It consists of a stem block (Figure 8) that performs an initial convolution and pooling stage before the Inception modules. In addition, it has three Inception modules named A, B, and C and two reduction blocks that change the width and height of the grid. The detailed structure of each Inception module block is shown in Figure 9. In our experiment, we used the InceptionResNet-V2 architecture in Keras with a classifier layer setup similar to that of the previous three models, the RMSprop optimizer, and a learning rate of 0.0001.

5. Model Evaluation Metrics

Apart from the confusion matrices, we computed accuracy, precision, recall (sensitivity), F1, mean intersection over union (IoU), and Dice coefficient scores to evaluate the performance of the proposed models on unseen test data. The accuracy score in (1) is the number of correct predictions divided by the total number of predictions. The precision in (2) is the ratio of true positives to all positive predictions, and the recall in (3) is the ratio of true positives to all ground-truth positives. The F1 score in (4) combines precision and recall into a single metric. We also computed the IoU score (5), also known as the Jaccard index, which measures the overlap between the ground-truth and predicted labels, and the Dice coefficient (6), which is closely related to and positively correlated with the IoU. Both range from 0 (no overlap) to 1 (perfect overlap).
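For reference, the standard definitions of these six metrics in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are given below. The numbering follows the order in which the paragraph above cites equations (1) to (6). These are the usual textbook forms for a single class and are assumed to match the ones used in the study; the mean IoU would then be the average of (5) over the classes.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Precision} &= \frac{TP}{TP + FP} \tag{2}\\
\text{Recall}    &= \frac{TP}{TP + FN} \tag{3}\\
F_1              &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}\\
\text{IoU}       &= \frac{TP}{TP + FP + FN} \tag{5}\\
\text{Dice}      &= \frac{2\,TP}{2\,TP + FP + FN} \tag{6}
\end{align}
```

For a single class, (4) and (6) are algebraically identical, and the two overlap scores are related by Dice = 2 IoU / (1 + IoU), which is why they are positively correlated.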

6. Results and Discussion

The four model architectures were trained for up to 20 epochs with an early stopping callback using a patience of six epochs and a minimum delta of 0.01. Figure 10 shows the training and validation accuracy and loss curves for each model. The training curve indicates how well the model learns and performs on the training set over the epochs. The validation curve tracks model performance on unseen data and indicates how well the model generalizes. The VGG16 and VGG19 models stopped early after 14 and 10 epochs, respectively. Their learning curves show the loss decreasing steadily on both the training and validation sets while the accuracy increases; the loss curve of the VGG16 model in particular is very smooth, with no oscillations. The ResNet-50 model stopped early after eight epochs, and its learning curves show one or two spikes of small magnitude. The InceptionResNet-V2 model stopped after ten epochs; its learning curves are stable and smooth but converge slowly.
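The stopping rule can be illustrated with a small stand-alone sketch that mirrors the usual patience and minimum-delta logic of an early stopping callback. The monitored quantity is not stated in the text; validation loss is assumed here, and the function name `early_stop_epoch` and the loss values in the examples are made up for illustration.

```python
import math


def early_stop_epoch(val_losses, patience=6, min_delta=0.01):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = math.inf
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement larger than min_delta: record it and reset the counter.
            best = loss
            wait = 0
        else:
            # No sufficient improvement this epoch.
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical curve: improves for three epochs, then plateaus.
plateau = [1.0, 0.8, 0.7, 0.699, 0.698, 0.697, 0.696, 0.695, 0.694, 0.5]
# Hypothetical curve: improves by 0.04 every epoch for 20 epochs.
steady = [1.0 - 0.04 * i for i in range(20)]

stopped_at = early_stop_epoch(plateau)
never_stopped = early_stop_epoch(steady)
```

With the plateau curve, training stops at epoch 9, six epochs after the last improvement greater than 0.01, while the steadily improving curve runs through all 20 epochs.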

As shown in Table 2, a test set of 6177 images (3338 positive COVID-19 cases and 2839 negative cases) was held out to evaluate the generalization performance of the models. Figure 11 shows the confusion matrices of the four models, indicating the number of true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN) predictions. Table 3 compares our experimental results using the evaluation metrics stated previously. The VGG16 model achieved an accuracy of 95.3% and an F1 score of 95.26%, with a Dice similarity coefficient of 0.953. The VGG19 model followed closely with an accuracy of 94.5%. The ResNet-50 model achieved an F1 score of 91.97% and an accuracy of 92.02%, while the InceptionResNet-V2 model achieved an accuracy of 88.4%, which is expected given its very slow learning.

7. Discussion

As Table 3 shows, and as expected from the accuracy and loss curves, the VGG16 and VGG19 models performed best, with the highest evaluation metrics on the test set. Their confusion matrices also show the fewest false positives and false negatives. The VGG16 model has 140 false positives and 151 false negatives, whereas the VGG19 model has 223 false positives and 116 false negatives. The lower false-negative count gives the VGG19 model an advantage over the VGG16 model in our experiment: minimizing false negatives is critical in healthcare and medical applications, because a missed case may lead to delayed diagnosis and hence delayed treatment. In the next section, we compare our results with related work on COVID-19 detection from chest X-ray or CT scan images using different approaches.
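The error counts quoted above can be turned back into summary metrics as a consistency check. The sketch below assumes that false positives are negative test images predicted as positive, so that TP and TN follow from the test-set sizes in Section 6 (3338 positive, 2839 negative); the function name `binary_report` is hypothetical, and macro averaging of F1 is an assumption about how the reported score was aggregated.

```python
def binary_report(tp, tn, fp, fn):
    """Compute summary metrics from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    f1_pos = 2 * tp / (2 * tp + fp + fn)  # F1 (Dice) for the positive class
    f1_neg = 2 * tn / (2 * tn + fp + fn)  # F1 (Dice) for the negative class
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "macro_f1": (f1_pos + f1_neg) / 2,
        "iou_pos": tp / (tp + fp + fn),
    }


POSITIVES, NEGATIVES = 3338, 2839  # test-set sizes from Section 6

# TP and TN are derived from the reported FP/FN counts (an assumption).
vgg16 = binary_report(tp=POSITIVES - 151, tn=NEGATIVES - 140, fp=140, fn=151)
vgg19 = binary_report(tp=POSITIVES - 116, tn=NEGATIVES - 223, fp=223, fn=116)
```

Under these assumptions the VGG16 counts give an accuracy of about 0.9529 and a macro-averaged F1 of about 0.9526, in line with the reported 95.29% and 95.26%, and the VGG19 counts give an accuracy of about 0.945. VGG19 also has the higher recall of the two, which reflects its lower false-negative count.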

In Table 4, we compare the performance of our approach with that of similar studies. Wang et al. [44] introduced the same benchmark dataset used in our study and applied COVID-Net, a deep convolutional neural network that was the first open-source network implemented for COVID-19 detection from chest X-ray images. COVID-Net achieved 93.3% accuracy, while the VGG19 and ResNet-50 models they also evaluated achieved 83% and 90.6% accuracy, respectively. Horry et al. [32] used transfer learning to classify COVID-19 from the three most common medical imaging modalities: chest X-ray, ultrasound, and CT. Their study reported accuracies of 79%, 87%, 73%, and 75% for the VGG16, VGG19, InceptionResNet-V2, and ResNet-50 models, respectively. Our preprocessing and image-enhancement techniques thus showed a distinct advantage under the same transfer learning paradigm. In addition, Oyelade et al. [45] proposed a CNN framework called CovFrameNet to classify and detect COVID-19 from chest X-ray images. Although their model achieved an accuracy of 99.9%, it reached only 85% precision and recall and a 90% F1 score because of class imbalance in the dataset used. Ahmed et al. [46] introduced an Internet of things- (IoT-) based framework for early detection of COVID-19 using a faster region-based convolutional neural network (Faster R-CNN), resulting in 98% accuracy and recall of 98% and 97% for negative and positive images, respectively. Comparing the findings of these studies, we found that only Ahmed et al. [46], who employed Faster R-CNN, had a model that outperformed ours, which gives our technique an advantage over the other COVID-19 detection studies, especially those that used transfer learning models.

As a follow-up to this research, we aim to broaden our investigation to include real-world datasets comprising a variety of chest infections brought on by COVID-19 (multiclass). In addition, we may work on developing models that are both more lightweight and highly accurate so that they can be used in portable devices. Finally, because the study focuses on medical data, where mistakes are costly, understanding the errors associated with each model prediction would also be an asset.

9. Conclusions

Although two years have passed since the COVID-19 outbreak, early and accurate diagnosis is still needed. This work implemented and evaluated four pretrained models, VGG16, VGG19, ResNet-50, and InceptionResNet-V2, for COVID-19 detection from chest X-ray images, which are an inexpensive, fast, and widely available test that can potentially be used in COVID-19 diagnosis. We applied our proposed method to a large publicly available benchmark dataset collected from seven different open sources of chest X-ray images. Our approach examined the significance of image preprocessing and enhancement techniques such as smoothing, denoising, and contrast equalization for improving model performance, particularly with this complex dataset compiled from multiple sources. This could be useful for other researchers who wish to use this dataset for their own investigations. Our results demonstrated the power of transfer learning-based methodologies in addressing such problems with satisfactory performance. The VGG architecture proved effective throughout the experiments for classifying normal and COVID-19 chest X-ray images, with up to 95.3% accuracy, precision, and recall. Overall, the results of the four models were very promising and showed that pretrained transfer learning models perform very well at detecting disease from chest X-ray images.

Data Availability

The dataset used in this study is available at “COVID-Net Open Initiative. (Nov. 11, 2021). [Online]. Available: https://alexswong.github.io/COVID-Net/.”

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.


Acknowledgments

The authors extend their appreciation to the Researchers Supporting Project number (rsp-2021/384), King Saud University, Riyadh, Saudi Arabia.