Abstract

COVID-19, a deadly disease that originated in Wuhan, China, has resulted in a global outbreak. Patients infected with the causative virus, SARS-CoV-2, are placed in quarantine so that the virus does not spread. The medical community has not yet discovered a vaccine that can be used immediately on patients infected with SARS-CoV-2. The only protective measures identified so far are keeping a distance from other people, wearing masks and gloves, and regularly washing and sanitizing hands. Governments and law enforcement agencies restrict the movement of people in different cities to control the spread and to monitor whether people follow the CDC guidelines. However, the government cannot monitor all places, such as shopping malls, hospitals, government offices, and banks, and guide people to follow the safety guidelines. In this paper, a novel technique is developed that can guide people to protect themselves from someone who has high exposure to the virus or shows symptoms of COVID-19, such as fever and coughing. Different deep Convolutional Neural Network (CNN) models are implemented to test the proposed technique. The proposed intelligent monitoring system can be installed at different places as a complementary tool to automatically monitor whether people adopt the safety guidelines. With these precautionary measures, humans will be able to win this fight against COVID-19.

1. Introduction

In the current situation, COVID-19 is spreading exponentially across the whole world [1]. So far, no vaccine has been developed that can produce antibodies in COVID-19 patients. According to some speculations, it will take another six months to one year before a vaccine is available. The virus emerged in Wuhan, China [2], probably in December 2019, but the spread started in January [3], and it was first identified on January 22, 2020. Since then, it has been spreading rapidly from China to other countries in Europe, America, Australia, etc. On March 11, 2020, WHO declared COVID-19 a pandemic. As of this writing on May 05, 2020, at 05:03, according to Worldometer (https://www.worldometers.info/coronavirus/?isci=010702), 3,818,640 people have been infected with this deadly virus, 1,292,296 people have recovered, and 264,807 people were unable to develop immunity against the virus and have died. With these numbers, the death ratio is 6.93%.

Because of human-to-human transmission, the virus is spreading rapidly. The nonexistence of vaccines makes the situation even more challenging. The only method discovered so far to prevent this human-to-human transmission is to stop all types of travelling or commuting [4]. Different countries have closed their borders to stop the virus from entering from other countries. Countries have locked down their cities so that people stay at home and the spread remains under control. Those who are infected with COVID-19 are kept in quarantine, where their own immune system fights the virus [5]. The incubation period of COVID-19 is up to 14 days, i.e., any person exposed to this virus has 14 days to show symptoms of the virus or to recover. Those with mild symptoms are simply kept isolated so that the virus does not spread, but those who are critical must be put on ventilators. Keeping people in isolation or putting them on ventilators is costly, and there are not enough resources to isolate every infected person or to put every critical patient on a ventilator. Societies worldwide are working 24/7 to overcome this distress and protect precious lives. Different organizations such as CDC (https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/prevention.html), WHO (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/advice-for-public), UNICEF (https://www.unicef.org/pakistan/coronavirus-disease-covid-19-what-parents-should-know), and Healthline (https://www.healthline.com/health/coronavirus-prevention) are advising people to follow certain guidelines to control the swift spread of this deadly virus. These guidelines are summarized as follows:
(i) Keep a social distance of at least 1 meter. Avoid any social contact, such as hugging, handshaking, or touching
(ii) If you have a fever, please stay at home
(iii) Avoid coughing, spitting, and sneezing in open air
(iv) Wear a mask to protect your respiratory system from airborne infection, and use gloves to control the spread of the virus
(v) Regularly sanitize hands and disinfect commonly used items

Governments, doctors, law enforcement agencies, police, etc. are the front liners in this pandemic. They are working day and night to help people protect themselves from this virus. In some countries, such as Italy, the situation is worse, as the death ratio is 13.84%. In some places, people are cooperating: they have put themselves in self-quarantine and have tried to isolate themselves completely. In other places, people are not cooperating. Therefore, police or law enforcement agencies are involved to restrict people to their homes or to monitor whether they follow the guidelines to control the spread of the virus. However, it is not possible to keep everything locked down indefinitely. There are resource requirements, there is a need to perform daily life activities, and there is a need to run the affairs of the state. For instance, people need to go to banks to withdraw money or carry out other banking activities, visit hospitals for regular check-ups, go to shopping malls to buy groceries, and perform many other activities. It is not possible for healthcare professionals to check a person at the shopping mall and declare the person free of the virus, as an appropriate medical kit is required. Any person with symptoms of COVID-19 who enters a shopping mall may contaminate objects by touching them and may infect other people by coughing or sneezing in the air before it can be detected that the person is infected with COVID-19. It is simply not possible for the government to monitor each and every place and each and every person. There is a need for an intelligent system that can automatically monitor places and people, guide them to follow the CDC guidelines, and inform healthcare professionals if symptoms of COVID-19 are found in a person.

In this paper, an intelligent mechanism [6, 7] is proposed in which the government installs cameras at the entrances of shopping malls, banks, hospitals, etc., and computer vision techniques based on deep CNNs admit only those people who are following the guidelines to avoid COVID-19. People who are not following the guidelines or who have symptoms of COVID-19, such as an elevated temperature or coughing, are not admitted by the system. This will help the government and the healthcare industry to automatically monitor people at different places and to control the spread of the virus. The technology can also be deployed on a robot that regularly patrols malls and advises people to follow the general guidelines to avoid the virus or, if anyone with possible symptoms is found, informs healthcare professionals so that they can test the person or take the person into isolation.

The rest of the paper is organized as follows. Related works applying computer vision in domains similar to the one presented in this paper are given in Section 2. The methodology of the proposed method is given in Section 3. Different experiments are performed, and their results are analysed, in Section 4. Finally, the paper is concluded in Section 5.

2. Related Work

These are unprecedented times for humans on this planet because of COVID-19. People are discovering new ways to protect themselves from this virus. Researchers in different fields of computer science, such as Visualization, Data Science, Artificial Intelligence, and Computer Vision, are using different techniques that can help the human population to detect the presence of the virus, predict the pattern of its spread, understand different ways to protect humans from it, diagnose it, and possibly treat it. In this paper, an investigation is made using deep learning and computer vision techniques to guide people to protect themselves from the spread of the virus, based on the guidelines by the CDC and WHO. Related studies in different domains are discussed in this section.

In [8], patients infected with COVID-19 are identified by screening chest radiography images. The authors developed COVID-Net, a deep CNN that identifies the presence of the virus from chest radiographs. In another study [9], a deep learning system is developed that can assist in analysing a potentially large number of thoracic CT exams. In [10], chest CT is used in a deep learning system that distinguishes patients infected with COVID-19 from patients with pneumonia and other nonpneumonic lung diseases. In [11], radiology findings and a literature review about COVID-19 are explained in detail. In [12], a deep learning system is developed that distinguishes COVID-19 from Influenza-A viral pneumonia. In [13], the epidemiological and clinical characteristics of 99 cases of COVID-19 are explained in detail.

Machine learning and deep learning are used for different purposes in the healthcare industry. For example, in [14], the authors explain the use of image analysis, machine learning, and deep learning for detecting malaria in images of blood cells. In another study [15], deep neural ensemble models are used for malaria parasite detection in thin-blood smear images. In [16], a deep learning technique is used to detect malaria from microscopic images. In [17], classification of malaria-infected cells using deep CNNs is studied. In [18], malaria infection is identified in red blood cells using an optimized step-increase CNN model. In [19], the authors explain the automatic diagnosis of tuberculosis using CNN analysis of MODS digital images. In [20], pulmonary tuberculosis manifestations are detected in chest X-rays using different CNN models.

In [21], deep learning models are used to read chest radiographs for tuberculosis detection. In [22], artificial intelligence methods are used to automatically detect Mycobacterium tuberculosis. In [23], deep neural networks are used to detect tuberculosis. In another study [24], severity detection and infection level identification of tuberculosis are performed using deep learning. In [25], tuberculosis bacilli are automatically detected in microscopic sputum smear images using deep learning models. In [26], a deep CNN is used to detect diseases in images. In [27], deep learning interventions for healthcare challenges are studied in detail. In [28], the performance of deep learning is compared against that of healthcare professionals in detecting disease from medical imaging.

Machine learning and deep learning are also commonly used in other fields. For example, in [29], machine learning algorithms are used for the preliminary diagnosis of dementia. In [30], targets are recognized in SAR images based on multiresolution representations with 2D canonical correlation analysis. In [31], an image classification algorithm is developed that is based on a deep learning kernel function. In [32], internet addiction is measured using intelligent behaviour data analysis. In [33], stroke-based recognition is performed on online hand-drawn sketches of arrow-connected diagrams and digital logic circuit diagrams. In [34], AI-based classification techniques are used to process EEG data collected during visual short-term memory tasks. In [35], data analysis and accuracy evaluation are performed for a continuous glucose monitoring device. In [36], ECG-based subject identification is performed using statistical features and a random forest. In another study [37], near-infrared road markings are detected using a modified faster regional CNN. In [38], deep learning techniques are used to predict employee absenteeism at an early stage. In [39], deep learning and machine learning algorithms are used to predict future terrorist activities.

These studies demonstrate that deep learning solutions are commonly used in different computer vision fields, including healthcare-related problems. They take images and apply deep learning algorithms to diagnose diseases or symptoms of diseases. In this paper, a different approach is taken. No disease is investigated or diagnosed. Instead, people are monitored to check that they follow the CDC guidelines that protect humans from the spread of a disease. To the best of the authors' knowledge, no previous study has undertaken such a detailed analysis of protecting humans from COVID-19 using computer vision techniques.

3. Proposed Methodology

This section is divided into four subsections. The first subsection analyses the dataset developed specifically for this paper and explains the preprocessing steps required before the dataset can be processed by a deep CNN. The second subsection explains the different CNN models trained on the dataset. Different architectures are explained, and the one giving the best performance is selected. The third subsection explains the identification of body temperature using deep neural networks. Finally, in the fourth subsection, the aforementioned models are integrated to ensure the appropriate execution of the recommendations by WHO and CDC, both by controlling the spread of the virus and by avoiding direct contact.

3.1. Data Analysis

In this section, the dataset collected for this paper is explained along with the preprocessing stages. The dataset, named Controlled_COVID-19, consists of images collected in different situations. Each image is annotated by three different individuals to reach a consensus on the type of image, and the label chosen by the largest number of annotators is taken as the final label. The categories in the Controlled_COVID-19 dataset are given in Table 1, and a montage of images from the dataset is shown in Figure 1.
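The consensus labelling step described above can be sketched as a simple majority vote. This is a minimal sketch, not the authors' tooling: the function name, the example label strings, and the handling of ties (returning None so the image can be re-annotated) are all assumptions made for illustration.

```python
from collections import Counter


def consensus_label(annotations):
    """Return the label chosen by most annotators, or None if there is a tie."""
    counts = Counter(annotations).most_common()
    # A tie for first place means no consensus (assumed to need re-annotation).
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical labels from three annotators for one image.
votes = ["mask", "mask", "no_mask"]
print(consensus_label(votes))
```

With three annotators, a tie can only occur when all three disagree, so most images receive a label after a single annotation round.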

All images are preprocessed by resizing them to the sizes required by the CNN models. Colour images are represented by R, G, and B values, each in the range of 0-255. In this format, some values are very small, such as 0-9, and some are very high, such as 200-255, which makes it difficult for the learning algorithm to learn. Therefore, these values are normalized to bring them into the range of 0-1. For each pixel value, the average of all values is subtracted, and the result is divided by the standard deviation. The formula of standardization is expressed in Eq. (1), where $z$ is the standardized value, $x$ are the pixel values, $\bar{x}$ is the average of all values by column, and $s$ is the standard deviation.

$$z = \frac{x - \bar{x}}{s} \tag{1}$$
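The standardization step can be illustrated with a short sketch. It assumes a flat list of pixel values rather than per-column statistics, and it uses the population standard deviation; the paper does not state which estimator was used, so both choices are assumptions.

```python
from statistics import mean, pstdev


def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (an assumption)
    if s == 0:
        # A constant image has no spread; map every value to zero.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Hypothetical pixel intensities in the 0-255 range.
pixels = [0, 9, 128, 200, 255]
z = standardize(pixels)
print([round(v, 3) for v in z])
```

After this step the values have zero mean and unit standard deviation, so very small and very large intensities end up on a comparable scale.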

3.2. CNN Classification Models

In this subsection, CNN models that are commonly used in the literature for computer vision problems are explained. These models are LeNet, VGG-16, Inception, and ResNet. The LeNet architecture, also known as LeNet-5 [40], was the very first CNN architecture, developed by LeCun et al. It was used for the recognition of machine-printed digits from 0 to 9. The architecture uses two convolution layers along with average pooling, followed by two fully connected layers. The output layer uses the Softmax classifier to classify input images into the categories 0 to 9. A sample architecture of LeNet is shown in Figure 2. VGG-16 [41] was introduced in a study of the effect of convolutional network depth on accuracy on ImageNet. The network uses very small ($3 \times 3$) convolution filters and demonstrates a significant improvement over prior-art configurations by pushing the depth to 16-19 weight layers. A sample architecture of VGG-16 is given in Figure 3.

Another important milestone in the development of CNN classifiers is the Inception network [43, 45], also known as GoogLeNet. The Inception network uses different techniques to improve performance, i.e., speed and accuracy. Generally, the salient parts of an image can vary considerably in size. Hence, it is hard to choose the right kernel size for the convolution operation. Larger kernels are generally used to find salient features distributed globally, and smaller kernels are used for salient features distributed locally. The Inception network therefore uses kernels of multiple sizes at the same level. The network has 9 inception modules stacked linearly. It is 27 layers deep (pooling layers included) and uses average pooling. The vanishing gradient problem in Inception is handled with two auxiliary classifiers. A sample architecture of Inception is shown in Figure 4(a).

Residual Network (or simply ResNet) [44] is considered one of the major breakthroughs in the field of computer vision and deep learning in the last decade. With ResNet, it is possible to train hundreds of layers and still obtain better performance. Generally, training a deep neural network is very hard, mainly because of the vanishing gradient, i.e., when the gradient is propagated back to the layers at the beginning, repeated multiplication by small values makes the gradient vanishingly small. The core idea of ResNet is the introduction of an identity shortcut connection that skips one or more layers. A sample architecture of ResNet consisting of 34 layers is shown in Figure 4(b).
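A scalar toy example can make the effect of the identity shortcut concrete. The sketch below is an illustration under simplifying assumptions, not the ResNet implementation: each layer is reduced to a single local derivative, and all function names are hypothetical. In a plain stack the backpropagated gradient is the product of the local derivatives, while in a residual stack each block computes x + F(x), so each factor becomes 1 plus the local derivative.

```python
def plain_gradient(local_grads):
    """Gradient through a plain stack: product of the local derivatives."""
    g = 1.0
    for d in local_grads:
        g *= d
    return g


def residual_gradient(local_grads):
    """Gradient through residual blocks y = x + F(x): product of (1 + dF/dx)."""
    g = 1.0
    for d in local_grads:
        g *= 1.0 + d
    return g


def residual_block(x, f):
    """Apply an identity shortcut around the function f, elementwise."""
    return [xi + fi for xi, fi in zip(x, f(x))]


small = [0.1] * 20  # twenty layers, each with a small local derivative
print(plain_gradient(small))     # shrinks towards zero
print(residual_gradient(small))  # stays well away from zero
```

The shortcut guarantees that the identity term contributes to every factor, which is why the gradient reaching the early layers does not collapse even when the learned part of each block has a small derivative.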

3.3. Temperature Measurement System (TMS)

In this paper, the infrared (IR) technique is used to measure temperature, as it does not require physical contact with the skin while measuring the temperature of the body. Generally, two types of infrared systems are used: IR thermometers and thermal cameras. Here, a thermal camera is used, which visualizes the body temperature as an image and allows the temperature to be measured at multiple points over a certain region. The system integrates a face detector based on VGG-16. A face tracking algorithm is added to improve the Frames Per Second (FPS) of the system. Once a face is detected in the image, the Kernelized Correlation Filter (KCF) tracker [46] is used to track the face region across frames. This continuous body temperature measurement system is inspired by [47].
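Reading a temperature from the tracked face region can be sketched as follows. This is only an illustration: it assumes the thermal frame is already available as a 2D grid of temperatures in degrees Celsius and that the tracker supplies a bounding box as (top, left, height, width). Taking the maximum over the region and the 38.0 degree fever threshold are assumptions, not values from the paper.

```python
FEVER_THRESHOLD_C = 38.0  # assumed threshold, not taken from the paper


def face_temperature(thermal_frame, box):
    """Return the highest temperature inside the face bounding box.

    thermal_frame: 2D list of temperatures in degrees Celsius.
    box: (top, left, height, width) of the tracked face region.
    """
    top, left, height, width = box
    region = [
        thermal_frame[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    ]
    return max(region)


def has_fever(thermal_frame, box, threshold=FEVER_THRESHOLD_C):
    """Flag the face if its measured temperature reaches the threshold."""
    return face_temperature(thermal_frame, box) >= threshold


# Hypothetical 4x4 thermal frame with a warm 2x2 face region.
frame = [
    [22.0, 22.5, 22.1, 22.0],
    [22.3, 36.4, 36.9, 22.2],
    [22.1, 36.7, 38.2, 22.4],
    [22.0, 22.2, 22.3, 22.1],
]
print(face_temperature(frame, (1, 1, 2, 2)), has_fever(frame, (1, 1, 2, 2)))
```

Restricting the reading to the tracked face box keeps background objects, which may be warmer or cooler than skin, out of the measurement.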

3.4. COVID-19 Avoidance Framework

The CNN models that classify the Controlled_COVID-19 dataset are explained in Section 3.2, and the system that continuously measures body temperature is described in Section 3.3. In order to identify people suspected of having COVID-19, or to know when it is not safe to get closer to people with high exposure to the virus, both models are combined into a single framework. The framework of the proposed system is shown in Figure 5. The outputs of both models, i.e., the classification model trained on the Controlled_COVID-19 dataset and the TMS, are passed to a Softmax classifier that assigns one of three categories.

These categories are High Alert, Keep Distance, and Safe. High Alert means that keeping distance from the person is mandatory and that healthcare professionals are informed so that they can test the person for COVID-19. Keep Distance means that a distance of at least one meter from the person is required, as the person is exposed to the virus. Safe, the safest category, means that it is safe to get closer to the person, as the person has no symptoms of COVID-19 and no high exposure to the virus. This categorization is based on the guidelines by CDC (https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/prevention.html) that are used to control the spread of COVID-19 and are categorized in Table 2.
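The final Softmax step over the three categories can be sketched as below. Only the category names come from the text; the input scores are hypothetical, and how the framework derives them from the classification model and the TMS is not specified here, so this sketch covers only the conversion of three scores into probabilities and a decision.

```python
import math

CATEGORIES = ["High Alert", "Keep Distance", "Safe"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorize(scores):
    """Return the most probable category and the full distribution."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs


# Hypothetical combined scores for one person, one per category.
label, probs = categorize([2.0, 0.5, -1.0])
print(label, [round(p, 3) for p in probs])
```

Returning the full distribution alongside the label would let a deployment apply a stricter rule, for example escalating to High Alert whenever its probability exceeds a chosen margin.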

4. Results

4.1. Experimental Setup on Cluster

The experiments in this paper are performed on an NVIDIA GeForce GTX 1080 GPU, which has 2560 CUDA cores, a boost clock of 1733 MHz, and 8 GB of GDDR5X memory.

4.2. Architecture of CNN Models Trained on Controlled_COVID-19 Dataset

Four different CNN models are trained in this paper for 500 epochs using the Adam optimizer. These models are implemented in Keras and trained from scratch. The VGG-16 model used for face detection in the Temperature Measurement System is a model pretrained on ImageNet. The models are executed using the TensorFlow library and hence exploit the parallelism [48, 49] of the GPU cores.

4.3. Comparison of Accuracy, Precision, Recall, and F1-Score

The formulae to calculate Accuracy, Precision, Recall, and F1-Score are given in Eq. (2), where TP means True Positive, TN means True Negative, FP means False Positive, and FN means False Negative.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN}, \quad \text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{2}$$

The accuracy, precision, recall, and F1-score of LeNet, VGG-16, Inception, and ResNet50 are shown in Figure 6. The accuracy achieved by the four models is shown in Figure 6(a). From the figure, it can be observed that LeNet has an accuracy a little above 50%, which is close to a random guess. This means that LeNet is not able to capture the pattern in the dataset. The VGG-16 and Inception models achieve accuracies of around 70% and 75%, respectively. The best accuracy is achieved by ResNet50. Similarly, Figure 6(b) shows that ResNet50 achieves the best precision, recall, and F1-score compared with the other three models.
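As a worked illustration, the four metrics can be computed directly from the TP, TN, FP, and FN counts using their standard definitions. The counts below are hypothetical and are not results from the paper; returning zero when a denominator is empty is a convention chosen for this sketch.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from raw counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for one class.
metrics = classification_metrics(tp=8, tn=6, fp=2, fn=4)
print({name: round(value, 4) for name, value in metrics.items()})
```

For a multiclass dataset, these counts would typically be computed per class in a one-vs-rest fashion and then averaged.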

4.4. Confusion Matrix

The confusion matrix is a performance measurement technique for machine learning classification problems. In the case of binary classification, the table shows the true positives, true negatives, false positives, and false negatives. In the case of multiclass classification, the number of cells in the table equals the number of classes squared. The confusion matrices computed for LeNet, VGG-16, Inception, and ResNet50 when predicting the different classes in the Controlled_COVID-19 dataset are shown in Figure 7. The higher the numbers on the diagonal of the confusion matrix, the better the accuracy of the model. As observed in Figure 7, the numbers on the diagonals are higher for ResNet50 and Inception than for LeNet and VGG-16. The diagonal values are highest in the confusion matrix of ResNet50, demonstrating that it is the best of the four models.
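The construction of a multiclass confusion matrix, and the link between its diagonal and accuracy, can be sketched as follows. The class names and predictions are hypothetical; rows are taken to be true classes and columns predicted classes, which is a common convention but an assumption about how Figure 7 is laid out.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix whose rows are true classes and columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def accuracy_from_matrix(matrix):
    """Accuracy is the diagonal sum divided by the total number of samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical three-class example.
labels = ["a", "b", "c"]
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "b", "b", "b", "c", "a"]
cm = confusion_matrix(y_true, y_pred, labels)
for row in cm:
    print(row)
print(accuracy_from_matrix(cm))
```

Off-diagonal cells show which pairs of classes the model confuses, which a single accuracy figure cannot reveal.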

4.5. ROC Curve

The ROC (Receiver Operating Characteristic) curve shows the performance of a classification model across all classification thresholds. The curve plots two parameters: the True Positive Rate (TPR) and the False Positive Rate (FPR). These parameters are defined in Eq. (3).

$$\text{TPR} = \frac{TP}{TP + FN}, \quad \text{FPR} = \frac{FP}{FP + TN} \tag{3}$$

The ROC curves computed for LeNet, VGG-16, Inception, and ResNet50 are shown in Figure 8. The further the curve lies above and to the left of the diagonal, the better the performance of the model. As observed in Figure 8, the curves of the different classes lie further to the left of the diagonal for the ResNet and Inception models, suggesting that, on this dataset, deeper architectures achieve better accuracy. The best accuracy is achieved by ResNet50.
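The way an ROC curve is traced out can be sketched for a single binary (one-vs-rest) problem. The labels and scores below are hypothetical, and the sketch assumes that both classes are present so that neither rate divides by zero. Each distinct score is used as a threshold, and the area under the curve is estimated with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold.

    y_true: list of 0/1 labels; scores: higher means more likely positive.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores that separate the two classes perfectly.
points = roc_points([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
print(points)
print(auc(points))
```

A curve that hugs the top-left corner gives an area close to 1, while a model that guesses at random follows the diagonal and gives an area close to 0.5.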

4.6. Experimental Setup for Temperature Measurement

A FLIR thermal camera was used to capture sequential images at a speed of 26 FPS. The VGG-16 model pretrained on ImageNet achieved 98% accuracy for face detection.

5. Conclusion

Humans are going through unprecedented times. Cities are closed, people are ordered to stay at home, the stock market is going down, doctors and nurses are busy helping patients infected with COVID-19, and soldiers and police are busy keeping people locked down in their own homes. This situation has arisen because the virus is spreading exponentially from human to human. No vaccine has been discovered that can be used immediately, and a possible vaccine might take 6-12 months to become available to everyone. The only method available so far is to protect humans from this virus and control its spread. Medical professionals, WHO, CDC, etc. have given precautionary guidelines to protect humans from this virus. However, it is not possible for the government to enforce these guidelines on every individual at every place. People need to go shopping and to visit offices, hospitals, banks, etc. for their daily life activities. In this paper, a novel technique is developed that can monitor whether people follow the guidelines that protect humans from the spread of the virus. The system can be installed at the entrances of different places, or even inside them, to regularly monitor people. If there are people who have high exposure to the virus or who show symptoms of infection, the system can guide other people to keep their distance from them and can inform healthcare professionals so that they can test those people and isolate them from the general public.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.