Abstract
Novel coronavirus 2019, first reported in December 2019, has created a pandemic. It has had severe consequences for people's daily lives, healthcare, and the world's economy. According to the World Health Organization's most recent statistics, COVID-19 has become a worldwide pandemic, and the numbers of infected persons and fatalities are growing at an alarming rate. An effective system for early detection of COVID-19 patients is urgently needed to curb further spread of the virus from affected persons. Therefore, to identify positive cases early and to support radiologists in the automatic diagnosis of COVID-19 from X-ray images, a novel method, PCA-IELM, is proposed based on principal component analysis (PCA) and the incremental extreme learning machine (IELM). The key contribution of the suggested method is that it combines the benefits of PCA and the incremental extreme learning machine. Further, PCA-IELM reduces the input dimension by extracting the most important information from an image. Consequently, the technique can effectively increase COVID-19 patient prediction performance. In addition, PCA-IELM has a faster training speed than a multilayer neural network. The proposed approach was tested on a COVID-19 patient chest X-ray image dataset. The experimental results indicate that the proposed PCA-IELM approach outperforms PCA-SVM and PCA-ELM in terms of accuracy (98.11%), precision (96.11%), recall (97.50%), F1-score (98.50%), etc., and training speed.
1. Introduction
The World Health Organization (WHO) declared COVID-19 (caused by the virus SARS-CoV-2) a worldwide pandemic in March 2020. This triggered unprecedented countermeasures, such as the closure of cities and districts and restrictions on foreign travel. Coronaviruses (CoV) are deadly viruses that may cause severe acute respiratory syndrome (SARS-CoV). Various researchers and institutions have attempted effective solutions from different possible directions in countering the COVID-19 pandemic. As civilization enters the information era, multimedia data (audio, image, video, etc.) are growing at a massive rate. Image classification has become more essential as the need for real-world vision systems grows [1] and has recently attracted much attention from many researchers. It has evolved into one of the most essential operations, serving as a prerequisite for other image processing operations. Image classification using learning algorithms is an open issue in image processing that has sparked much interest due to its promising applications. In general, an image categorization system has two primary processes. The first stage is to create an effective image representation that holds enough information about the image to allow for further classification. The second step is to use a good classifier to classify the new image. Thus, there are two major challenges to consider when improving image classification performance: dimensionality reduction and the classifier. Apart from general computer vision operations, one of the most important stages in image classification is feature extraction, which determines the invariant characteristics of images when computer devices are used to assess and process image data.
In practical scenarios, feature extraction has been applied in many fields such as historic structures, medical image processing, and remote image sensing. An image's essential lower-level qualities include color, texture, and shape. The color feature is global and may be retrieved using tools such as the color histogram, color set, and color moment. It can simply describe the proportions of different colors across the image. Color is a useful characteristic for identifying images that are difficult to distinguish automatically, provided spatial variation can be ignored. However, it cannot describe the image's local distribution or the spatial positions of the distinct colors. Image classification with feature extraction using incremental extreme learning machines is proposed in this paper. First, on the COVID-19 dataset of chest X-ray images, features were extracted using PCA. The SVM, ELM, and IELM were then applied to image classification [2] once the dimension had been reduced by the PCA method. Different metrics were employed to achieve a robust evaluation: classification accuracy, recall, precision, F-score, true-negative rate (TNR), true-positive rate (TPR), AUC, G-mean, precision-recall curve, and receiver operating characteristic (ROC) curve.
The paper is arranged in the following sequence: several related approaches are discussed in Section 2. The suggested technique is described in Section 3. Section 4 contains a description of PCA and feature extraction techniques; Subsections 4.1–4.6 contain the different algorithmic approaches that are compared with the proposed method. In Section 5, the proposed method and algorithm are discussed. Section 6 describes the different evaluation criteria used. Section 7 discusses the experimental setup. Section 8 describes the dataset. Finally, Section 9 discusses the experimental results and concludes the research.
2. Related Works
The content of image features comprises color, texture, and other visual elements. The content extracted from visual features is the main component for analyzing an image. In this section, some earlier work based on PCA and other feature extraction techniques, along with different classification techniques, is discussed.
Sun et al. [3] suggested an image classification system based on multi-view depth characteristics and principal component analysis. In this method, features are extracted independently from the RGB and depth channels of the image, and PCA is applied to reduce the dimension. The Scene-15, Caltech-256, and MIT Indoor datasets are used in the evaluation process. Finally, an SVM [4] is used to classify the images. The method's performance is demonstrated by the experimental results.
Mustaqeem and Saqib [5] suggested a hybrid method based on PCA and SVM. PROMISE (KC1: 2109 observations, CM1: 344 observations) data from NASA's directory were used for the experiment. The dataset was divided into two parts: training (KC1: 1476 observations, CM1: 240 observations) and testing (KC1: 633 observations, CM1: 104 observations). Principal components of the features are extracted by PCA, which helps in dimensionality reduction and in minimizing time complexity.
In addition to this, SVM is used for further classification, and GridSearchCV is used for hyperparameter tuning. With this approach, precision, recall, F-measure, and accuracy for the KC1 dataset are 86.8%, 99.6%, 92.8%, and 86.6%, respectively, and for the CM1 dataset they are 96.1%, 99.0%, 97.5%, and 95.2%, respectively. Similarly, Castaño et al. [6] provide a deterministic approach for initializing ELM training. The hidden node parameters are determined from principal component analysis of the training data, whereas the output node parameters are computed with the Moore–Penrose generalized inverse. Experimental validation with fifteen well-known datasets was used to validate the algorithm, and the Bonferroni–Dunn, Nemenyi, and Friedman tests were used to compare the results obtained. In comparison with later ELM advancements, this technique significantly reduces computing costs and outperforms them.
Mateen et al. [7] suggested a VGG-19 deep neural network-based diabetic retinopathy (DR) model with better performance than AlexNet and the spatial invariant feature transform (SIFT) in terms of classification accuracy and processing time. Using SVD and PCA feature selection with fully connected layers, the classification accuracies for FC7-SVD, FC7-PCA, FC8-SVD, and FC8-PCA are 98.34%, 92.2%, 98.13%, and 97.96%, respectively.
Zhao et al. [8] suggested a class-incremental extreme learning machine in which supervised samples are used for model building without iteration. The algorithm is shown to be stable and has accuracy almost equivalent to that of batch learning. Similarly, Huang and Chen [9] proposed a convex incremental extreme learning machine, which analytically recalculates the output weights of existing hidden nodes after randomly generating and adding computational nodes to the hidden layer. Using convex optimization, the existing hidden node output weights are calculated again. This method converges faster while maintaining efficiency and simplicity.
Zhu et al. [10] proposed a principal component analysis (PCA)-based categorization system with a kernel-based extreme learning machine (KELM). Based on the resulting output, this model achieves better accuracy than SVM and other traditional classification methods. For the classification of HSIs, Kang et al. [11] developed the PCA-EPF extraction approach, combining PCA with standard edge-preserving filtering (EPF)-based feature extraction. The proposed method achieves better classification accuracy with limited training samples. Similarly, Perales-González et al. [12] introduced a new ELM architecture based on the negative correlation learning framework, dubbed negative correlation hidden layer ELM (NCHL-ELM). This model shows better accuracy than other classifiers by integrating a parameter into each node of the original ELM hidden layer.
Based on fractal dimension technology, Li et al. [13] suggested an enhanced ELM algorithm (FELM). By reducing the dimension of the hidden layer, the model improves in training speed. From the experimental results, it can be concluded that as compared to the standard ELM technique, the suggested algorithm significantly reduces computing time while also improving inversion accuracy and algorithm stability.
Because of the complexity of the data models, deep learning is expensive to train. Furthermore, deep learning necessitates the use of high-priced GPUs and many machines. There is no simple rule for choosing the best deep learning tools, since doing so necessitates an understanding of topology, training technique, and other characteristics, whereas the simple ELM is a one-shot computation with a rapid learning pace. The biggest advantage of IELM is the ability to add random hidden nodes incrementally and fix the output weights analytically. The output error of the IELM rapidly diminishes as the number of hidden neurons increases.
In our method, SVM, ELM, and IELM based on the PCA technique are employed for image classification [14] for COVID-19 patient detection using the COVID-19 chest X-ray dataset. A summary of the most recent and related research works is given in Table 1 [3, 5–13].
3. Proposed Methodology
The backpropagation (BP) approach is commonly used to train the multilayer perceptron (MLP). Various algorithms can be used to train this typical architecture; gradient-based methods and heuristics are the two commonly used types. These algorithms have a few things in common: they have a hard time dealing with enormous amounts of data, and they have a slow convergence rate in such situations. Huang et al. [15] introduced the extreme learning machine as a solution to this problem.
This algorithm reduces the computing time typically required to train an SLFN using gradient-based techniques. The ELM, on the other hand, has several flaws. The randomly generated input weights and biases of the ELM [16] result in some network instability. If there are outliers in the training data, the hidden layer's output matrix will be ill-conditioned, resulting in low generalization performance and lower forecasting accuracy. There are two types of ELM: fixed ELM and IELM [17]. In comparison with the ELM, the output error of the IELM rapidly diminishes and tends toward zero as the number of hidden neurons grows [15]. This approach is prominent in online continuous-learning regression and classification problems [18, 19].
After the classifiers are trained with a sufficient amount of image data, new images can be fed into the trained classifier for observation and analysis.
4. Feature Extraction
A single feature cannot describe image content and quality properly, and image classification will not yield acceptable results unless distinguishing features are described. Each RGB color image comprises three component images, one per color channel. Our method uses PCA to extract the image's important information and minimize the input dimension [20–23].
4.1. Classification of Images and PCA Feature Extraction
Extracting useful features from an image is a prominent task in image classification, and principal component analysis (PCA) is used for this purpose. PCA uses an orthogonal transformation to convert the variables into fewer independent components than the original variables. With this approach, the output data do not lose important features, and the PCA loadings can be used to identify important data. PCA is a multivariate statistical analysis approach that performs a linear transformation of numerous variables to pick a few key ones. PCA transforms data using eigenvectors from N dimensions to M dimensions, where M < N. The new features are linear combinations of the old ones, allowing them to capture the data's intrinsic variability with little information loss. Figure 1 shows the steps of the proposed model.
Suppose that the research object has p indexes; these indexes are regarded as p random variables and represented as X_{1}, X_{2}, ..., X_{p}. From these, new indexes F_{1}, F_{2}, ..., F_{p} are created by combining the p random variables, which can mirror the data from the original indexes [24]. The independent replacement indexes reflect the original indexes' essential information.
The following are the PCA stages in detail:
(1) Data standardization: the matrix X is standardized by
y_{ij} = (x_{ij} − x̄_{j})/s_{j},
where X = {x_{ij}}, Y = {y_{ij}}, i = 1, 2, ..., n, j = 1, 2, ..., p, and x̄_{j} and s_{j} are the mean and standard deviation of the jth index.
(2) The correlation coefficient matrix R is computed as
R = (1/(n − 1)) Y^{T}Y.
(3) The eigenvalues and eigenvectors of the coefficient matrix are obtained by solving |R − λI| = 0. The calculated eigenvectors are a_{i} = (a_{i1}, a_{i2}, ..., a_{ip}) with corresponding eigenvalues λ_{i} (i = 1, 2, ..., p). To obtain a collection of principal components F_{i}, the eigenvalues are sorted in descending order: λ_{1} ≥ λ_{2} ≥ ... ≥ λ_{p} ≥ 0.
(4) The contribution rate of the kth principal component is expressed as
α_{k} = λ_{k} / (λ_{1} + λ_{2} + ... + λ_{p}).
The cumulative contribution rate of the first k principal components is expressed as
α(k) = (λ_{1} + λ_{2} + ... + λ_{k}) / (λ_{1} + λ_{2} + ... + λ_{p}).
The first principal component, F_{1}, is the one with the highest variance out of all the linear combinations of Y_{1}, Y_{2}, ..., Y_{p}; the second principal component, F_{2}, is the one with the highest variance among all the combinations of Y_{1}, Y_{2}, ..., Y_{p} that are uncorrelated with F_{1}.
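The four stages above can be condensed into a short NumPy sketch. This is an illustrative implementation under our own naming, not the authors' code; the helper `pca_fit` and its arguments are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Sketch of the PCA stages above: standardize, build the
    correlation matrix, eigendecompose, and project onto the
    leading components."""
    # Stage 1: standardize each index (zero mean, unit variance)
    Y = (X - X.mean(axis=0)) / X.std(axis=0)
    # Stage 2: correlation coefficient matrix R
    R = (Y.T @ Y) / (Y.shape[0] - 1)
    # Stage 3: eigenvalues/eigenvectors, sorted in descending order
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Stage 4: cumulative contribution rate of the first k components
    contrib = np.cumsum(eigvals) / eigvals.sum()
    # Principal components: projection onto the top eigenvectors
    F = Y @ eigvecs[:, :n_components]
    return F, contrib
```

The cumulative contribution rate `contrib` can then be inspected to decide how many components to retain.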
4.2. SVM
Several algorithms have been implemented and suggested in machine learning to solve the classification problem. Among them, the support vector machine (SVM) is a supervised machine learning algorithm with [5, 25] the following advantages:(i)It employs L2 regularization to overcome overfitting problems.(ii)It provides suitable results even with minimal data.(iii)Different kernel functions can match complicated functions of and interactions among the features.(iv)It handles data nonlinearity.(v)The model is stable thanks to the hyperplane splitting rule.(vi)It analyzes data with a high degree of dimensionality.
Instead of focusing on decreasing prediction error, SVM focuses on optimizing the classification decision boundary, which is why a hyperplane is used to separate the classes. If the data dimension is n, the hyperplane is an (n − 1)-dimensional surface that can be represented mathematically as
w^{T}x + b = 0,
where x denotes the input feature vector, w is the weight vector, and b is the bias. By adjusting w and b, several hyperplanes can be created, but the hyperplane with the best margin is chosen. The ideal margin is defined as the largest feasible perpendicular distance between each class and the hyperplane. The cost (objective) function is minimized to obtain the best margin and may be written as
J(w) = (1/2)||w||^{2} + C Σ_{i} max(0, 1 − y_{i}(w^{T}x_{i} + b)).
Even if the predictions are correct and the data are properly categorized by the hypothesis, SVM penalizes any points that lie close to the boundary (0 < y_{i}(w^{T}x_{i} + b) < 1). The main goal is to find the w that minimizes J(w); differentiating the cost function with respect to w gives its (sub)gradient:
∇_{w}J = w − C Σ_{i} y_{i}x_{i} over the instances with y_{i}(w^{T}x_{i} + b) < 1, and ∇_{w}J = w otherwise.
Once the gradient is computed, the weights w can be updated as
w ← w − η ∇_{w}J,
where η is the learning rate. The procedure is repeated until the smallest J(w) is found. Because data are rarely linearly separable, a decision boundary between the classes must often be drawn rather than a flat hyperplane. To deal with the dataset's nonlinearity, the decision function is written in terms of a kernel:
f(x) = Σ_{i} α_{i}y_{i}K(x_{i}, x) + b,
where K(x_{i}, x) is the kernel function. Various kernel functions may be used to create an SVM, such as linear, polynomial, and exponential, but the radial basis function (RBF) is used in this model:
K(x_{i}, x_{j}) = exp(−γ||x_{i} − x_{j}||^{2}),
where ||x_{i} − x_{j}||^{2} is the squared Euclidean distance between two observations and the parameter γ controls the smoothness of the boundary.
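The subgradient update and the RBF kernel described above can be sketched as follows. This is our own illustrative NumPy code, not the authors' implementation; the function names and the hyperparameters `C`, `lr`, and `gamma` are assumptions.

```python
import numpy as np

def rbf_kernel(x, z, gamma=0.5):
    # RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2)
    return np.exp(-gamma * np.sum((x - z) ** 2))

def hinge_gradient_step(w, b, X, y, C=1.0, lr=0.01):
    """One subgradient step on the soft-margin objective
    (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w.x_i + b))."""
    margins = y * (X @ w + b)
    mask = margins < 1                      # only margin violators contribute
    grad_w = w - C * (y[mask, None] * X[mask]).sum(axis=0)
    grad_b = -C * y[mask].sum()
    return w - lr * grad_w, b - lr * grad_b
```

Iterating `hinge_gradient_step` on linearly separable data drives the weight vector toward a maximum-margin separator; the kernel form is what an SVM solver uses instead when the data are not linearly separable.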

4.3. PCASVM
The motive of the support vector machine (SVM) [3] is to find the best possible hyperplane that separates the two classes on the training set; the coefficients of this hyperplane are what must be learned. It uses structural risk minimization theory to build the best segmenting hyperplane in the feature space and achieve global optimization.
Assume the training data {(x_{i}, y_{i})}, i = 1, 2, ..., N, with labels y_{i} ∈ {−1, +1}. The data could be separated by a hyperplane:
w·x + b = 0.
For the normalization,
y_{i}(w·x_{i} + b) ≥ 1, i = 1, 2, ..., N.
The classification margin is equal to 2/||w||, so maximizing the margin is equivalent to minimizing ||w||^{2}/2.
Before classifying the data with SVM, the necessary features must be extracted from the image data, converting the high-dimensional data to low-dimensional data. For this, the PCA method is used as a feature extractor through covariance matrix and eigenvalue proportion calculation. PCA-based SVM is the method used for classification and regression: SVM classifies the low-dimensional data. Figure 2 depicts the working flow of PCA-SVM. Once parameter optimization is done, the model is ready to predict the categorization.
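Assuming scikit-learn is available, the PCA-SVM workflow of Figure 2 can be sketched as a pipeline. The synthetic two-class data below merely stand in for flattened X-ray feature vectors; sizes and parameters are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative stand-in for flattened X-ray feature vectors (two classes).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 64)),
               rng.normal(1.0, 1.0, (100, 64))])
y = np.array([0] * 100 + [1] * 100)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Scale, reduce the dimension with PCA, then classify with an RBF-kernel SVM.
pca_svm = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
pca_svm.fit(X_tr, y_tr)
acc = pca_svm.score(X_te, y_te)
```

In a real setting, `GridSearchCV` over the pipeline's `C`, `gamma`, and `n_components` would perform the parameter optimization mentioned above.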

4.4. Extreme Learning Machine (ELM)
An extreme learning machine is a single hidden layer feedforward network that can be used for both classification and regression. In ELM [26], the weights between the input layer and the hidden layer, as well as the hidden biases, are randomly generated. The output weights are calculated using the generalized Moore–Penrose pseudoinverse. ELM trains faster than other feedforward networks [27] and outperforms other iterative methods. Figure 3 shows the basic network architecture of ELM.
Suppose {(x_{i}, t_{i})} denotes N training samples, where i ∈ {1, 2, ..., N}, x_{i} = [x_{i1}, x_{i2}, ..., x_{im}]^{T} ∈ R^{m} denotes the i^{th} training instance, and t_{i} = [t_{i1}, t_{i2}, ..., t_{iC}]^{T} ∈ R^{C} is its desired output.
Let the number of input neurons equal the number of input features, represented by m; let L be the number of hidden neurons; and let the number of output neurons equal the number of classes, denoted by C. Figure 4 [24] shows the flowchart of the principal component analysis [28]. The input weight matrix is represented by U = [u_{1}, u_{2}, ..., u_{j}, ..., u_{L}]^{T} ∈ R^{L×m}, and the hidden neuron biases by b = [b_{1}, b_{2}, ..., b_{j}, ..., b_{L}]^{T} ∈ R^{L}. Here, u_{j} = [u_{j1}, u_{j2}, ..., u_{jm}] are the connecting weights between the j^{th} hidden neuron and the input neurons, and b_{j} is the bias of the j^{th} hidden neuron. The j^{th} hidden neuron's output for the i^{th} instance is represented by
h_{ij} = g(u_{j}·x_{i} + b_{j}).
Here, the activation function is represented by g(·). For all the training instances, the hidden layer output is represented by the matrix
H = {h_{ij}} ∈ R^{N×L}.
Between the hidden layer and the output layer, the output weight matrix β can be computed as
β = H^{†}T,
where H^{†} is the Moore–Penrose generalized inverse of H and T = [t_{1}, t_{2}, ..., t_{N}]^{T} is the target matrix. A linear activation function is used by the output layer in this computation.
The vector β_{j} = [β_{j1}, ..., β_{jk}, ..., β_{jC}]^{T}, where j = 1, 2, ..., L, contains the connecting weights between the j^{th} hidden neuron and the output neurons, with β_{jk} the weight to the k^{th} output neuron. The predicted outcome of all the output neurons for all training instances is represented as
f(X) = Hβ.
Here, the output function is f(x) = [f_{1}(x), ..., f_{C}(x)], and the class label of x is predicted as the index of the largest output, label(x) = argmax_{k} f_{k}(x).
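The ELM computation above — random (U, b), sigmoid hidden layer H, and output weights solved with the Moore–Penrose pseudoinverse — can be condensed into a short NumPy sketch. This is our own illustrative code under assumed naming, not the paper's implementation.

```python
import numpy as np

class ELM:
    """Minimal ELM sketch: random input weights and biases, a sigmoid
    hidden layer, and output weights beta = pinv(H) @ T."""
    def __init__(self, n_hidden, seed=0):
        self.L = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the hidden layer: H = g(X U^T + b)
        return 1.0 / (1.0 + np.exp(-(X @ self.U.T + self.b)))

    def fit(self, X, T):
        m = X.shape[1]
        self.U = self.rng.uniform(-1, 1, (self.L, m))   # random input weights
        self.b = self.rng.uniform(-1, 1, self.L)        # random hidden biases
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ T               # Moore-Penrose solve
        return self

    def predict(self, X):
        # Class label = index of the largest output neuron
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Training is a single pseudoinverse solve, which is why ELM avoids the iterative weight tuning of backpropagation.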

4.5. PCA-ELM: Classification Method Based on PCA-ELM
In the PCA technique [6], the variables are first scaled. The steps of PCA that have been applied in PCA-ELM are(1)Scaling of the training data.(2)Covariance matrix evaluation.(3)Computing the eigenvalues and eigenvectors of the covariance matrix.(4)Evaluating the principal components.
The output from PCA is given as input to ELM [29]. The process of PCA-ELM [30] is shown in Figure 5.

4.6. IELM
Compared to other neural networks, the ELM learns faster, as there is no need to adjust the hidden nodes, and it provides better generalization capability. But the ELM has various flaws. The randomly generated biases and input weights in the ELM network [31] result in some network instability, and outliers in the training data make the hidden layer's output matrix ill-conditioned, resulting in poor generalization performance. In comparison with the ELM, the output error of the IELM rapidly diminishes, and IELM resolves the issues of very small output weights and the validity of hidden layer neurons. It is appropriate for online continuous-learning regression and classification tasks.
The IELM [32] network model structure is shown in Figure 6. Suppose the sizes of the input, hidden layer, and output are m, L, and n, respectively. W is the input weight matrix of the current hidden layer neurons, with dimension L × m, whose entries are random numbers uniformly distributed in [−1, 1]. The bias b_{i} of the i^{th} hidden node is a random number uniformly distributed in [−1, 1], the activation function of the hidden layer neurons is the sigmoid function given by (24), and the output weight matrix β has dimension L × n.
The hidden node activation function (sigmoid) is given by
g(x) = 1/(1 + e^{−x}),
where x is the input.
A matrix X of dimension m × N represents the N input samples, and Y is an n × N matrix that represents the outputs, giving the training set {(X, Y)}. The training steps of the IELM algorithm are described as follows:
Step 1. In the initialization phase, set the current number of hidden nodes l = 0 and let L be the maximum number of hidden nodes. The residual E (the difference between the target and the actual output) is initialized to the output Y, i.e., E = Y, and ε is the expected training accuracy.
Step 2. Training phase: while l < L and ||E|| > ε,(1)the number of hidden nodes is increased by 1, i.e., l = l + 1;(2)the input weights w_{l} and bias b_{l} of the new hidden layer neuron are generated randomly;(3)the output of the activation function is calculated for the new node over all inputs;(4)the hidden layer neuron output vector is calculated as H_{l} = [g(w_{l}·x_{1} + b_{l}), ..., g(w_{l}·x_{N} + b_{l})];(5)the output weight for the new node is evaluated as β_{l} = (E·H_{l})/(H_{l}·H_{l});(6)after adding the new hidden node, the residual error is updated as E = E − β_{l}H_{l}.The network error is reduced by the output weight β_{l}. These steps iterate, each time determining new random input weights w_{l} and bias b_{l}, until the residual error becomes smaller than ε or the maximum number of hidden nodes is reached. Whether the trained network has fulfilled the desired result can be determined from the test set.
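The two steps above can be sketched for a single-output target in NumPy. This is our own illustrative code, not the paper's implementation; the function names and default values of `L_max` and `eps` are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ielm_fit(X, y, L_max=50, eps=1e-3, seed=0):
    """IELM loop: add one random hidden node at a time, fix its output
    weight analytically, and stop when the residual error falls below
    eps or L_max nodes have been added."""
    rng = np.random.default_rng(seed)
    E = y.astype(float).copy()               # residual starts at the target
    nodes = []
    while len(nodes) < L_max and np.linalg.norm(E) > eps:
        w = rng.uniform(-1, 1, X.shape[1])   # random input weights
        b = rng.uniform(-1, 1)               # random bias
        H = sigmoid(X @ w + b)               # new node's output vector
        beta = (E @ H) / (H @ H)             # analytic output weight
        E = E - beta * H                     # updated residual error
        nodes.append((w, b, beta))
    return nodes

def ielm_predict(nodes, X):
    return sum(beta * sigmoid(X @ w + b) for w, b, beta in nodes)
```

Because each β_{l} minimizes the residual along the new node's output direction, the residual norm never increases as nodes are added.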

5. Proposed PCA-Based Incremental ELM (PCA-IELM)
An orthogonal transformation is used to extract meaningful characteristics from data in PCA [33]. PCA may also be used to reduce the dimensions of a large data collection. Principal components of the COVID-19 X-ray images are extracted using PCA and given as input to IELM, which gradually adds randomly generated hidden nodes. A conventional SLFN function with n hidden nodes can be expressed as
f_{n}(x) = Σ_{i=1}^{n} β_{i}G_{i}(x),
where G_{i}(x) denotes the output of the i^{th} hidden node: G_{i}(x) = g(w_{i}·x + b_{i}) for additive nodes, or G_{i}(x) = g(b_{i}||x − w_{i}||) for RBF nodes.
The i^{th} hidden node is linked to the output node with output weight β_{i}. In IELM, hidden nodes are randomly added to the existing network one at a time: the hidden node parameters w_{n} and b_{n} are randomly generated, and the output weight β_{n} is then fixed analytically. Suppose the residual error function for the current network with n hidden nodes is defined as
e_{n} = f − f_{n},
where f is the target function. IELM is mathematically represented as
f_{n}(x) = f_{n−1}(x) + β_{n}G_{n}(x), with β_{n} = ⟨e_{n−1}, G_{n}⟩/⟨G_{n}, G_{n}⟩.

6. Evaluation Criteria for Effective Measure of Model
For the evaluation of the different models, a confusion matrix is generally prepared. Table 2 gives a simple representation of the confusion matrix [34, 35], which relates predicted and actual values. From the confusion matrix, we can derive different performance metrics, e.g., accuracy, precision, recall, sensitivity, and F-score. To assess the model, nine different metrics are calculated by the formulas given in Table 3 [36].
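For reference, the main confusion-matrix-derived metrics can be computed as follows. This is a generic sketch using the standard formulas, which we assume match those of Table 3.

```python
import numpy as np

def metrics_from_confusion(tp, fp, fn, tn):
    """Derive the usual scores from confusion-matrix counts."""
    accuracy  = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)        # also TPR / sensitivity
    tnr       = tn / (tn + fp)        # also specificity
    f1        = 2 * precision * recall / (precision + recall)
    g_mean    = np.sqrt(recall * tnr) # balances both classes
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "TNR": tnr, "F1": f1, "G-mean": g_mean}
```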
7. Experimental Setup
The whole experiment was performed on a system with a 10th Generation Intel(R) Core(TM) i7-10750H CPU @ 2.60 GHz, 8 GB RAM, and NVIDIA GTX 1650 Ti graphics. The code is written in Python 3.10.0 using Jupyter Notebook as the development environment, which can be installed from https://jupyter.org/install.
8. Dataset Description
The COVID-19 chest X-ray image dataset [37], downloaded from Kaggle, encompasses a total of 13,808 images: 3,616 COVID-19-positive cases (26.2%) and 10,192 normal cases (73.8%). COVID-19 and normal patient chest X-ray images are kept in separate folders. The dataset was divided randomly into training and testing images, with the condition that testing images are not repeated among the training images. During the experiment, 80% of the images were used for training and 20% for testing. All images have the same dimensions (299 × 299 pixels) in PNG format. Figure 7 shows X-ray images of normal and COVID-19 cases.
The histogram of an image gives a global description of the image's appearance, representing the relative frequency of occurrence of the various intensity values in the image. In the histogram of the COVID-19 image, the intensity value is highest between bins 14–15, whereas the normal image's histogram has its highest intensity value at bins 16–17. This difference in color intensity assists in distinguishing between COVID-19 and normal images. Figure 8 shows the histogram plots of normal and COVID-19 images. Figure 9 shows the training X-ray images for COVID-19 and normal cases.
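A normalized intensity histogram of the kind compared in Figure 8 can be computed with NumPy. The bin count below (32) is illustrative, not necessarily the paper's setting.

```python
import numpy as np

def intensity_histogram(image, bins=32):
    """Global intensity histogram of an 8-bit grayscale image,
    normalized to relative frequencies."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()
```

Comparing which bins carry the most mass for COVID-19 versus normal images is the distinction described above.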
Because PCA uses orthogonal transformation to convert all features into a few independent features, all features are considered during the feature selection process. The data to be processed are reduced to a set of features called a “reduced representation set.”
9. Results and Discussion
In this section, we present the outcomes and analysis of the experiments on COVID-19 patient prediction using the chest X-ray dataset. The experimental results show that the proposed method performs better in terms of accuracy, precision, recall, F1-score, AUC, G-mean, and other parameters. For each model, PCA-SVM, PCA-ELM, and PCA-IELM, a separate confusion matrix is formed, and all the performance metric values are derived from the confusion matrices (Tables 4–6). The classification accuracy gained by the proposed PCA-IELM method is 98.11% on the chest X-ray dataset, better than the other two models, PCA-based SVM (91.8%) and PCA-based ELM (93.80%). Accuracy alone may sometimes be misleading and can hide misclassified instances, so other metrics are also taken into consideration to confirm the claim made by the classifier. PCA-IELM has the highest precision value, 96.11%, meaning PCA-IELM is 96.11% reliable in its positive decisions, whereas the PCA-SVM and PCA-ELM models record lower precision, 84.3% and 88.3%, respectively. Similarly, for the proposed PCA-IELM method, the other metrics (refer to Figure 10), recall, F1-score, TPR, TNR, and G-mean, are considerably higher than for PCA-SVM and PCA-ELM.
The geometric mean (G-mean) is a statistic that analyzes classification performance across majority and minority classes. Even if negative examples are correctly labelled as such, a poor G-mean suggests weak performance in identifying positive occurrences. This statistic is essential for preventing overfitting of the negative class while underfitting the positive class, since the COVID-19 dataset under study is also class imbalanced (IR = 2.81). Even so, the PCA-IELM model indicates good performance by attaining the highest G-mean value of 98%, whereas PCA-SVM and PCA-ELM achieve 88% and 90.5%, respectively.
Table 7 demonstrates the performance variation (sensitivity, specificity, precision, F1-score, accuracy) for different counts of hidden nodes in the range 10–150, at intervals of 10 hidden nodes. The training and testing accuracies of PCA-IELM demonstrated almost the same behavior on the COVID-19 dataset (refer to Figure 11). There is moderate variation in the accuracy of PCA-IELM with respect to the number of hidden nodes: the accuracy with 10 hidden nodes was 97.73%, and 98.11% was achieved with 140 hidden nodes and beyond in the PCA-IELM model (refer to Table 7).
Precision-recall curves should be drawn when there is a moderate to large class imbalance; here, the COVID-19 dataset is imbalanced with an imbalance ratio (IR) of 2.81. It is worth noticing that precision is also called the positive predictive value (PPV), and recall is also known as sensitivity, hit rate, or true-positive rate (TPR); both concern positive cases, not negative ones. Most machine learning algorithms involve a trade-off between recall and precision, and a good PR curve has a greater AUC (area under the curve). Figures 12(b), 13(b), and 14(b) depict the PR curves. Figure 14(b) shows the greatest AUC, which is an indication of the better performance of PCA-IELM compared with the other two models. In addition, the ROC of Figure 14(a) also captures more AUC than those of Figures 12(a) and 13(a). Therefore, PCA-IELM claims better performance than PCA-SVM and PCA-ELM. The proposed PCA-IELM model also outperforms other previously developed models for the identification of COVID-19 patients from chest X-ray images (refer to Table 8 [38–47]). As far as the training and testing time taken by the proposed PCA-IELM model is concerned, it was higher (refer to Table 9) because the model executes incrementally and not in one go.
10. Conclusions
In this paper, an effective classification model for the COVID-19 chest X-ray image dataset is proposed using principal component analysis (PCA) and the incremental extreme learning machine (IELM). This study established the valuable application of the ELM model to classify COVID-19 patients from X-ray images by developing the PCA-IELM model. In the proposed PCA-based IELM algorithm, the hidden node parameters are derived from the information returned by PCA on the training dataset, and the output node parameters are determined using the Moore–Penrose generalized inverse. PCA-IELM utilizes the best feature of IELM, which is to increase hidden nodes incrementally and determine the output weights analytically, whereas ELM requires setting the appropriate number of hidden nodes manually, similar to trial and error. In comparison with the ELM, the output error of the IELM rapidly reduces and approaches zero as the number of hidden neurons increases. It was observed that as the number of hidden nodes increased, the performance of PCA-IELM increased, becoming stable at 150 hidden nodes. PCA-IELM outperforms PCA-SVM and PCA-ELM in terms of accuracy (98.11%), precision (96.11%), recall (97.50%), F1-score (98.50%), G-mean (98%), etc. The suggested research contributes to the prospect of a low-cost, quick, and automated diagnosis of COVID-19 patients and may be used in clinical scenarios. Such an effective system can provide early detection of COVID-19 patients and is thus helpful in controlling the further spread of the virus from an affected person, serving as intelligent assistance for radiologists to accurately diagnose COVID-19 in X-ray images.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.