Abstract

Currently, individual artificial intelligence (AI) algorithms face significant challenges in effectively diagnosing and predicting early stage emerging serious diseases. Our investigation indicates that these challenges primarily arise from insufficient clinical treatment data, leading to inadequate model training and substantial disparities among algorithm outcomes. Therefore, this study introduces an adaptive framework aimed at increasing prediction accuracy and mitigating instability by integrating various AI algorithms. In analyzing two cohorts of early cases of the coronavirus disease 2019 (COVID-19) in Wuhan, China, we demonstrate the reliability and precision of the adaptive combined learning algorithm. Employing an adaptive combination with three feature importance methods (Random Forest (RF), Scalable end-to-end Tree Boosting System (XGBoost), and Sparsity Oriented Importance Learning (SOIL)) for two cohorts, we identified 23 clinical features with significant impacts on COVID-19 outcomes. Subsequently, the adaptive combined prediction leveraged and enhanced the advantages of individual methods based on three forecasting algorithms (RF, XGBoost, and Logistic regression). The average accuracy for both cohorts exceeded 0.95, with the area under the receiver operating characteristics curve (AUC) values of 0.983 and 0.988, respectively. We established a severity grading system for COVID-19 based on the combined probability of death. Compared to the original classification, there was a significant decrease in the number of patients in the severe and critical levels, while the levels of mild and moderate showed a substantial increase. This severity grading system provides a more rational grading in clinical treatment. Clinicians can utilize this system for effective and reliable preliminary assessments and examinations of patients with emerging diseases, enabling timely and targeted treatment.

1. Introduction

Emerging and reemerging infectious diseases continuously pose a serious threat to human health [1, 2]. Particularly, the outbreak and spread of the coronavirus disease 2019 (COVID-19) not only brought about profound loss of lives but also triggered a severe socioeconomic crisis [3]. If an efficient and precise healthcare identification, diagnostic, and treatment system can be established early in the development of a disease, it has the potential to minimize the scope of the disease outbreak, reduce individual health damage, and optimize resource utilization [4, 5].

Using COVID-19 in the early stage as a case study, data from the Chinese Center for Disease Control and Prevention’s epidemiological investigation revealed that out of 44,415 confirmed cases, 81% were categorized as mild, 14% as severe, and 5% as critical. COVID-19 patients of different severity levels exhibited significant differences in prognosis upon hospital admission. Most mild cases or suspected cases were allocated to Fangcang shelter hospitals or other public facilities for centralized isolation, where they received primary medical care and showed better prognosis [6, 7]. Severe or critical patients, especially the elderly or those with preexisting comorbidities, were prone to develop severe pneumonia, acute respiratory distress syndrome (ARDS), and multiple organ failure, thereby facing a higher risk of mortality [8].

Artificial intelligence (AI) has made significant advancements in guiding disease diagnosis and prognosis management, particularly in combating the COVID-19 pandemic [9]. The identification, diagnosis, and prognosis prediction of COVID-19 hospitalized patients using AI algorithms have been proposed. Existing approaches involve machine learning and deep learning algorithms to analyze multisource data, including clinical examinations and imaging scans [10]. However, existing AI methods have not yet been able to accurately analyze disease features and predictions in the early stages of emerging diseases. Below, we will first review the typical applications of current AI algorithms in common diseases, with a specific focus on AI in the diagnosis of emerging diseases, especially COVID-19.

2. Literature Review

2.1. AI for Diagnosis and Prognosis of Common Disease

Significant progress has been made in AI-based disease diagnosis and prediction of disease risks, progression, and treatment response. In particular, image recognition and natural language processing, which are based on big data, have been widely applied in recent years. To classify skin lesions, a convolutional neural network model was trained using a dataset of 129,450 clinical images [11, 12], and a specialized neural network, tailored for image classification, was trained on a retrospective development dataset consisting of 128,175 retinal images. The focus of the training was on detecting diabetic retinopathy, specifically in primary care offices [13, 14]. In addition, for the development of an artificial intelligence algorithm for the diagnosis and Gleason grading of prostate cancer, a retrospective collection of 12,625 whole-slide images (WSIs) from six different sites was undertaken. These WSIs comprised prostate biopsies and were utilized to train and refine the algorithm [15], as well as other applications such as colorectal cancer stratification and atrial fibrillation identification [16, 17]. Another application utilized automated natural language processing systems and deep learning techniques to analyze electronic health records from 1,362,559 pediatric patients and guide the classification diagnosis of common childhood diseases [18]. Machine learning algorithms were applied to explore the key features influencing the treatment of infertility and to grade the outcomes in 78,826 treatment cycles [19, 20]. Recently, the Human Lung Cell Atlas (HLCA) has been developed, which integrates large-scale, cross-dataset organ maps within the Human Cell Atlas [21]. Furthermore, new research from a preventive perspective, such as utilizing machine learning models to train on data involving 22 common cancers and predicting the origins of cancer and treatment responses in 36,445 cases, is noteworthy. This research assists doctors in formulating personalized treatment strategies [22].

2.2. AI for Prediction of Diagnosis and Prognosis of COVID-19

The wide-ranging applications of AI techniques have encompassed epidemiology, therapeutics, clinical research, and social studies to combat the COVID-19 pandemic [23, 24]. First, in terms of diagnosis, deep learning methods greatly assist the rapid and accurate detection of COVID-19 through chest X-ray and Computed Tomography (CT) images [25–29]. In addition, some studies have trained a series of deep learning algorithms on cohorts consisting of thousands of patients to localize the pleural/parenchymal walls and classify COVID-19 pneumonia [30, 31]. Second, some studies have focused on the prediction of prognosis in COVID-19. One study employed machine learning tools to identify three biomarkers from blood samples of COVID-19 patients, achieving an accuracy of over 90% in predicting patient mortality ten days in advance [32]. A high-resolution COVID-19 mortality prediction model has been developed to identify future mortality risk two weeks prior to clinical outcomes [33]. Another method utilized Shift3D and random weighted loss for multitask learning in COVID-19 diagnosis and severity assessment [34]. Some interesting studies have taken into consideration the issue of multiple sources. An open-source deep learning approach has been proposed for diagnosing COVID-19 using chest CT images [35], and an approach combining regularized cost-sensitive capsule network was proposed for early detection of COVID-19 using imbalanced or limited data [36]. In addition, the integration of deep learning CT scan models with biological and clinical variables was proposed to predict the severity of COVID-19 in patients [37], and an integrated CT image and resource library for COVID-19 with deep learning algorithms was developed [38]. Recently, some studies have focused on selecting key features that influence the outcome of COVID-19 for prediction [39]. For example, one study used feature selection methods to reduce the clinical features to 13 key features and predicted COVID-19 severity based on personalized diagnostic models [40]. The significance of known risk factors for the in-hospital mortality rate of COVID-19 was evaluated, and the predictive utility and grading diagnosis of radiological texture features were investigated using various machine learning methods [41].

2.3. Motivations and Contributions

Through a review of research, it can be observed that current AI methods heavily rely on vast amounts of data for the diagnosis of various diseases, including COVID-19. However, there is almost no AI application research for early stage emerging diseases. For newly emerging diseases like COVID-19, it is crucial in the early stages for timely and accurate identification, diagnosis, and treatment. This often becomes a race against time and a matter of life and death, and waiting until a large number of cases accumulate for analysis can prove to be too late [42]. In addition, an examination of 37,421 COVID-19-related studies in the British Medical Journal revealed that nearly 87% of the studies exhibited bias, primarily due to inadequate sample sizes [43]. Moreover, the widespread use of a single model introduces uncertainty in model selection, potentially leading to biased model estimates and increased unreliability of results [44].

This study proposes an adaptive combined learning framework for the diagnosis and outcome prediction of newly emerging major diseases in small sample data. Taking advantage of combined computation, we can alleviate the underfitting issues arising from insufficient training data and reduce biases associated with the selection of a single AI method. The weight of the combination is placed on the method that can better fit the real data, which reflects the adaptability and scalability for different data. The proposed framework is applied in two early COVID-19 cohorts, demonstrating the adaptability and reliability of this approach. The main contributions of this study include the following:

(1) To provide targeted guidance to doctors for a rapid and accurate understanding of the clinical characteristics and examination of newly emerging diseases, we propose the Adaptive Combination Importance (ACIM) measure with binary responses. This method combines the importance of various AI algorithms regarding the impact of clinical features on disease outcomes. This provides a basis for the swift formulation of public health emergency policies in response to emerging diseases with limited sample data.

(2) To provide a precise prediction of newly emerging disease outcomes based on a key clinical understanding, we design an Adaptive Combination Prediction Algorithm (ACPA) with binary responses. This method combines the serious disease outcome predictions from different AI algorithms. This serves as a reliable algorithmic foundation to assist doctors in faster and more accurate assessments of disease occurrence, progression, and outcomes within limited time and medical data information. It also supports medical decision making and resource allocation with a flexible AI framework.

(3) To provide grading treatment with a focus on predicting outcomes and assigning corresponding therapies upon patients’ admission, we propose a disease severity grading system based on adaptive prediction in terms of probability of death for patients. This offers a meticulously designed treatment approach that aligns with key features and early diagnosis, potentially improving actual treatment outcomes. This will support the optimization of medical interventions in the event of severe disease outbreaks and minimize the wastage of medical resources.

The outline of the rest of this paper is as follows. Section 3 introduces the adaptive combined feature screening and combined prediction algorithms for emerging diseases. Section 4 presents two cohorts of COVID-19 in Wuhan, along with relevant data analysis. Section 5 elaborates on the results for both cohorts. Section 6 encompasses the discussion, and Section 7 will address future work.

3. The Combined Feature Screening and Prediction for an Emerging Disease

In this section, we propose a comprehensive framework for screening features and predicting outcomes in the context of an emerging disease. Initially, leveraging existing clinical data, we designed an algorithm that integrates multiple feature screening methods to mitigate instability across different approaches. Through weighted calculations, our aim is to align the combined feature assessment with the inherent patterns in the data. Subsequently, the combined prediction based on variables with feature screening is used to forecast disease outcomes. We anticipate that this integrated approach will deliver more stable and accurate predictions.

We use the following notation to represent the dataset and model parameters. Let the dataset contain $n$ samples, each sample consisting of $p$ features, represented by $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^{p}$ represents the feature vector of the $i$-th sample, and $y_i \in \{0, 1\}$ represents the binary response variable of the $i$-th sample. Let $\pi(x_i) = P(y_i = 1 \mid x_i)$ denote the probability of $y_i = 1$ occurring given the feature vector $x_i$, where the sample size $n$ could be smaller than the number of features $p$.

3.1. Adaptive Combined Importance Measure for Binary Response

Feature importance is the study of the contribution of each feature to the outcome and the selection of features considered significant. Random Forest (RF) and XGBoost algorithms, as representatives of model-free methods, are widely used in importance learning [45, 46]. In recent years, there have been other combined methods proposed for feature importance learning based on parametric models, such as Sparsity Oriented Importance Learning (SOIL), which presents feature importance as a weighted linear model [47]. A combined feature learning method based on these three feature algorithms has been proposed to comprehensively and objectively evaluate the importance of features influencing the continuity of health [48]. However, there has been no research based on binary response.

We introduce the general form of combined feature screening in binary disease data. The calculation has three steps based on $K$ screening methods. First, calculate the feature importance sequence for each screening method under the binary scenario and normalize the values, denoted as $I_1, I_2, \ldots, I_K$, where $I_k = (I_{k1}, \ldots, I_{kp})$ represents the importance values of the $k$-th method for the $p$ features. Second, weights are calculated based on the features recommended by each algorithm, denoted as $w_1, w_2, \ldots, w_K$. Finally, the adaptive combined importance measure (ACIM) for binary response is

$$\mathrm{ACIM} = \sum_{k=1}^{K} w_k I_k, \qquad (1)$$

where the computation of the weights relies on data-splitting; the weight procedure, given in Algorithm 1, is analogous to the calculations in [49].

Algorithm 1
Input: data $D = \{(x_i, y_i)\}_{i=1}^{n}$; $N$ (repeat times);
Output: weight $w_k$ of each screening method, $k = 1, \ldots, K$.
(i) Randomly split $D$ into a training set $D_1$ and a test set $D_2$, with sample sizes $n_1$ and $n_2$, respectively.
(ii) For each method, fit an estimator $\hat{\pi}_k$ using the training set $D_1$, where $k$ represents the screening method that needs to be computed.
(iii) For each method, compute the predictions $\hat{\pi}_k(x_i)$ on the test set $D_2$ using the model trained on $D_1$.
(iv) For the observations in $D_2$, compute the weight $w_k^{(s)}$ for each method at the $s$-th repetition, following [49].
Repeat the above steps $N$ times to get $w_k^{(1)}, \ldots, w_k^{(N)}$ and then obtain the weight $w_k$.

A larger weight for a method indicates a better model fit. Because the data are split at random many times, the weights can distinguish how well the different methods perform across different subsets of the data.
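The data-splitting procedure of Algorithm 1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes that the weight of a method in a single split is proportional to the Bernoulli likelihood of the held-out labels and that the final weight is the average over splits; the source defers the exact formula to [49]. The two toy estimators (`fit_base_rate`, `fit_threshold`), the 80/20 split inside the weighting step, and the synthetic data are hypothetical stand-ins for the real screening methods and cohorts.

```python
import math
import random

def split_weights(X, y, fitters, n_repeats=100, train_frac=0.8, seed=0):
    """Average likelihood-based weights over repeated random splits.

    ASSUMPTION: per-split weights are proportional to the Bernoulli
    likelihood of the test labels; the source does not state the formula.
    """
    rng = random.Random(seed)
    n = len(y)
    n_train = int(train_frac * n)
    totals = [0.0] * len(fitters)
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        log_liks = []
        for fit in fitters:
            predict = fit([X[i] for i in train], [y[i] for i in train])
            ll = 0.0
            for i in test:
                p = min(max(predict(X[i]), 1e-6), 1 - 1e-6)  # avoid log(0)
                ll += math.log(p) if y[i] == 1 else math.log(1 - p)
            log_liks.append(ll)
        top = max(log_liks)  # subtract the max for numerical stability
        raw = [math.exp(ll - top) for ll in log_liks]
        total = sum(raw)
        for k, r in enumerate(raw):
            totals[k] += r / total
    return [t / n_repeats for t in totals]

def fit_base_rate(X_train, y_train):
    """Toy method: always predict the training prevalence."""
    rate = sum(y_train) / len(y_train)
    return lambda x: rate

def fit_threshold(X_train, y_train):
    """Toy method: split on feature 0 at its training mean."""
    cut = sum(x[0] for x in X_train) / len(X_train)
    above = [t for x, t in zip(X_train, y_train) if x[0] > cut]
    below = [t for x, t in zip(X_train, y_train) if x[0] <= cut]
    p_above = (sum(above) + 1) / (len(above) + 2)  # Laplace smoothing
    p_below = (sum(below) + 1) / (len(below) + 2)
    return lambda x: p_above if x[0] > cut else p_below

# Synthetic data: the outcome is 1 exactly when the single feature is >= 20.
X = [[float(i)] for i in range(40)]
y = [1 if i >= 20 else 0 for i in range(40)]
weights = split_weights(X, y, [fit_base_rate, fit_threshold], n_repeats=50)
print(weights)
```

On this toy data the threshold rule fits the held-out labels far better than the base-rate predictor, so it receives most of the weight.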

Combined feature screening brings several advantages. Firstly, it serves as a consolidation of information from various feature screening methods, achieving an effect akin to the majority’s choice. Secondly, with adaptive weight calculations, it can reflect which method fits the real data better based on the magnitude of weights. This grants more influence to the method with superior fitting, enhancing its say in the majority of selections. Consequently, this makes the features identified by the combined screening more closely approximate the true ranking of feature importance.
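To make the combination step concrete, the sketch below computes a weighted sum of normalized importance vectors and applies the 0.2 threshold used later in the paper. It is an illustration under stated assumptions: the source says the importance values are normalized but not how, so scaling by the maximum is assumed here. The feature names `f1` to `f3`, the importance values, and the weights are invented for the example.

```python
def normalize(scores):
    """Scale non-negative importance scores so the largest equals 1.

    ASSUMPTION: max-scaling; the source does not specify the normalization.
    """
    top = max(scores.values())
    return {f: (v / top if top > 0 else 0.0) for f, v in scores.items()}

def acim(importances, weights):
    """Weighted sum of normalized importance vectors, one per method."""
    normed = [normalize(imp) for imp in importances]
    features = list(importances[0].keys())
    return {
        f: sum(w * imp[f] for w, imp in zip(weights, normed))
        for f in features
    }

# Hypothetical raw importances from three screening methods.
rf_imp = {"f1": 0.40, "f2": 0.30, "f3": 0.02}
xgb_imp = {"f1": 0.50, "f2": 0.10, "f3": 0.05}
soil_imp = {"f1": 1.00, "f2": 0.80, "f3": 0.00}
method_weights = [0.5, 0.3, 0.2]  # hypothetical output of Algorithm 1

scores = acim([rf_imp, xgb_imp, soil_imp], method_weights)
selected = [f for f, v in scores.items() if v > 0.2]
print(scores)
print(selected)
```

Here `f3` is ranked low by all three methods and falls below the threshold, while `f2` is retained even though one method rates it weakly.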

3.2. Adaptive Combined Prediction Algorithm for Binary Response

We provide a computational framework for the adaptive combined prediction algorithm (ACPA) for binary response. Assuming there are $M$ methods available to provide probability predictions for the binary disease outcome being 1, denoted as $\hat{\pi}_1(x), \ldots, \hat{\pi}_M(x)$, the combined estimate of the probability is

$$\hat{\pi}(x) = \sum_{m=1}^{M} w_m \hat{\pi}_m(x). \qquad (2)$$

The detailed weighting calculation in (2) is similar to that in Algorithm 1.

Combined prediction has several advantages. First, a single prediction method may exhibit bias in predicting disease outcomes, while a combination of multiple prediction methods brings more stable results, especially in scenarios where highly accurate disease prediction is needed. Second, weight calculations can assess the performance of different prediction methods, allowing the best-performing method to receive greater weight in the combined prediction. When one method significantly outperforms the others, its weight in ACPA is likely to approach 1, so that the ACPA results closely align with the performance of the best method. Theoretical support for this behavior is available in the literature [50]. Conversely, if all methods perform similarly and none is particularly effective, ACPA’s results may surpass those of the individual methods.
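The combined prediction can be illustrated with a short sketch, assuming the weights are non-negative and sum to 1. The probabilities and weights below are invented, and the 0.5 decision threshold is the one reported in Section 4.4.

```python
def acpa_probability(probs, weights):
    """Weighted average of per-method probabilities of outcome 1."""
    return sum(w * p for w, p in zip(weights, probs))

def acpa_label(probs, weights, threshold=0.5):
    """Return 1 (death) if the combined probability exceeds the threshold."""
    return 1 if acpa_probability(probs, weights) > threshold else 0

# Hypothetical predictions for one patient from RF, XGBoost, and Logistic.
patient_probs = [0.70, 0.60, 0.30]

balanced = acpa_probability(patient_probs, [0.5, 0.4, 0.1])
dominant = acpa_probability(patient_probs, [0.98, 0.01, 0.01])
label = acpa_label(patient_probs, [0.5, 0.4, 0.1])
print(balanced, dominant, label)
```

With the second set of weights, the combined probability stays close to the first method's prediction of 0.70, which mirrors the case where one method dominates.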

4. COVID-19 Cohorts and Calculation Procedures

4.1. Data Source

In this retrospective study, we collected clinical records from two groups of COVID-19 patients admitted early and with prolonged hospital stays. Both datasets comprise extensive clinical examinations conducted upon patient admission, categorizing patients into four different disease severity levels for treatment (Mild (T4), Moderate (T3), Severe (T2), and Critical (T1)) based on the Diagnosis and Treatment Protocol for COVID-19 issued by the National Health Commission of China (Trial Version 5), and the composite endpoint was discharge from the hospital or death (cured or deceased).

4.1.1. Cohort 1

These data related to early stage COVID-19 are presented on a public website (https://ngdc.cncb.ac.cn/ictcf/HUST-19.php). It enrolled 1,126 patients from Union Hospital (HUST-UH) and 395 patients from Liyuan Hospital (HUST-LH) in Wuhan, Hubei Province, China, during the period from January 2020 to February 2020.

These data encompass rich clinical features of early COVID-19 confirmed cases. Among these patients, 130 clinical tests spanning nine categories were conducted, including basic information, routine blood tests, inflammation tests, blood coagulation tests, biochemical tests, immune cell typing, cytokine profile tests, autoimmune tests, and routine urine tests.

After applying the inclusion criteria, this cohort comprised 711 confirmed COVID-19 patients, of whom 654 were cured and 57 died. Among the 130 clinical examination features, a considerable proportion exhibited significant missing data. We opted to include 62 features with a missing proportion below 40% for further investigation. These selected features encompassed all the aforementioned diagnostic procedures, and detailed feature information is available in Table S1 of supplementary material. Basic information, such as mortality outcomes, SARS-CoV-2 RNA testing, age, gender, body temperature (°C), and the presence of underlying diseases, was derived from patients’ medical records, and none of these variables had missing values.

4.1.2. Cohort 2

The study involves a substantial analysis of COVID-19 and serves as a focal point for the COVID-19 pandemic [32, 51–54]. We collected and compiled data from early consecutive COVID-19 patients admitted to Tongji Hospital in Wuhan, Hubei province, China, from January 2020 to April 2020. A total of 3,286 medical records were extracted from electronic health records. Of the initial 3,286 medical records, 63 records had missing data or did not meet the composite endpoint, and 3,223 patients were included in this study, as detailed in Table S2 of supplementary material.

After applying the inclusion criteria, this cohort includes 3,223 confirmed COVID-19 patients, of whom 2,920 were cured and 303 died. Medical records were reviewed and extracted from electronic health records using a standardized data collection form by experienced clinicians and independently reviewed by two researchers. This cohort provides 32 clinical examination features. This study was approved by the Medical Ethical Committee of Tongji Hospital, Tongji Medical College of Huazhong University of Science and Technology. Written informed consent was waived in light of the use of deidentified retrospective data.

All methods were performed in accordance with the relevant guidelines and regulations. The study followed the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) reporting guideline.

The processing and analysis of the two datasets are conducted in parallel, involving data preparation, feature selection, and outcome prediction. The detailed analysis process is shown in Figure 1.

4.2. Data Processing

Data preparation primarily involves data imputation and data balancing [55, 56]. For cohort 1, except for 4 basic items of information (gender, underlying diseases, age, and body temperature at admission), the other 126 clinical examination features all have missing data. We selected 58 features with missing proportions less than 40% from cohort 1 and used Multiple Imputation by Chained Equations (MICE) to impute missing data, computed with the R package ‘mice’. To ensure data balance, the Synthetic Minority Oversampling Technique for Nominal and Continuous features (SMOTENC) was implemented in both cohorts, using Python 3.10 with Imbalanced-learn version 0.10.0.
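The missing-data filter described above can be sketched in a few lines. This illustration covers only the selection of features with a missing proportion below 40%; it does not reproduce the imputation or SMOTENC steps. The record layout (a list of dictionaries with `None` marking a missing value) and the toy values are assumptions made for the example.

```python
def keep_features(rows, max_missing=0.4):
    """Return feature names whose share of missing values is below max_missing."""
    n = len(rows)
    kept = []
    for name in rows[0].keys():
        missing = sum(1 for r in rows if r[name] is None)
        if missing / n < max_missing:
            kept.append(name)
    return kept

# Toy records; None marks a missing measurement.
records = [
    {"Age": 61, "DD": 0.8, "IL6": None},
    {"Age": 45, "DD": None, "IL6": None},
    {"Age": 70, "DD": 2.1, "IL6": 12.0},
    {"Age": 38, "DD": 0.4, "IL6": None},
    {"Age": 52, "DD": 1.3, "IL6": 7.5},
]

kept = keep_features(records)
print(kept)
```

In this toy table, `IL6` is missing in 60% of the records and is dropped, while `DD` (20% missing) is kept for imputation.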

4.3. ACIM Calculation for Feature Screening

We selected RF, XGBoost, and SOIL for ACIM, and the calculations corresponding to these three methods use the R packages “randomForest”, “xgboost”, and “SOIL”. The parameter settings for these three methods were primarily based on their default settings, with the exception of setting ntree = 200 to control the model complexity in RF. For XGBoost, we utilized the “trainControl” function from the “caret” package in R to optimize parameters, setting max_depth = 3 to control model complexities for both cohorts.

Weight calculations were derived from Algorithm 1, where we chose features with value of ACIM greater than 0.2 in each method for model evaluation. The specific results of ACIM calculations for the two cohorts are illustrated in Figure 2.

4.4. ACPA Calculation for Disease Outcomes

We utilized the combined prediction of RF, XGBoost, and Logistic regression based on important features recommended by ACIM, specifically those with ACIM greater than 0.2. The parameter settings for RF and XGBoost are similar to ACIM. Weight calculations are also similar to ACIM.

We randomly split data for both cohorts into training and testing sets using an 8/2 split ratio. This ensures homogeneity in both the training and testing data, reducing the impact of human selection. Additionally, to obtain stable prediction results and evaluate model performance, we repeated the data-splitting process 100 times. If the estimated probability of death calculated by ACPA exceeds 0.5, it is considered as death; otherwise, it is considered as cured, and the results are presented in Table 1.

4.5. Model Evaluation

We utilize common machine learning classification metrics to elucidate the effectiveness of the method. These metrics include Accuracy, Precision, Recall, and F1 score. Furthermore, the receiver operating characteristics curve (ROC) and the area under its curve (AUC) value will also demonstrate the model’s predictive prowess. In addition, we performed calibration evaluation for our model [57], and the R package ‘PresenceAbsence’ provided specific calculations.
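For reference, the metrics listed above can be computed directly from their standard definitions, as in the following sketch. The AUC is obtained from the rank (Mann-Whitney) formulation, counting ties as one half. The labels and scores are invented for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def auc(y_true, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Invented example: 3 deaths (1) and 5 recoveries (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s > 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```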

5. Results

5.1. Rankings of Feature Importance

We measure the importance of clinical features on treatment outcomes (Cured and Deceased) in two COVID-19 cohorts. Two importance feature rankings are shown in Figure 2. Three popular feature importance algorithms are adaptively combined to measure the impact of features on the death of a patient, with the top-ranked having a more significant impact and requiring more attention. There are differences in the number of clinical features between the two cohorts, but the two rankings share similarities in important features. We take the threshold according to the attenuation degree of the feature importance curve and select the features with importance greater than 0.2 as important features. Figure 2(a) shows the ranking of the importance of 62 clinical features for whether a patient died in cohort 1. We find 12 important features to focus on: DD, Age, LDH, NEP, Gen, NE, LYP, ALP, URIC, BAP, INR, and MOP. They are in routine blood test, blood coagulation test, and biochemical test. Figure 2(b) shows the importance of 32 clinical features for whether a patient died in cohort 2, of which 16 features were marked as having a significant impact. The importance of IL6, DD, EO, BA, CRP, Age, NE, IL8, LY, IL2, LDH, CA, INR, RDWCV, TT, and PT is greater than 0.2. The features considered important in both cohorts are DD, Age, LDH, NE, and INR. Although more clinical tests were recorded and performed in cohort 1, only a small number of test results could have a significant impact on COVID-19 outcomes. In cohort 2, fewer clinical tests were performed, but a larger proportion of the features had critical impacts on COVID-19 outcomes.

5.2. Outcome Prediction and Evaluation

We selected key features with ACIM values greater than 0.2 for outcome prediction. Cohort 1 includes 12 key features, and cohort 2 includes 16 key features. We compare the performance of three individual methods (Logistic, RF, and XGBoost) and ACPA for both cohorts, employing four evaluation metrics to provide a comprehensive assessment. The average training and testing performances are presented for 100 repeats of random data-splitting (with 80% as train and 20% as test in each calculation). The detailed results are provided in Table 1.

In the results for cohort 1, XGBoost exhibits superior training performance, but its testing performance is inferior to RF, and Logistic shows the poorest results. However, overall, ACPA demonstrates the best comprehensive performance. It closely rivals XGBoost in training and RF in testing, surpassing RF in F1 score and Accuracy. In the results for cohort 2, RF maintains the best performance in testing, and XGBoost continues to excel in training. ACPA leverages the strengths of both, demonstrating stable advantages.

For both cohorts, the results highlight the advantages of ACPA combination. In practical terms, as the true best method or the one most suitable for uncovering the inherent nature of the data is often unknown, if ACPA achieves results comparable to or even slightly surpassing the best method after computation, it indicates the method’s versatility, ensuring the quality of computed results in most scenarios.

Furthermore, we conducted an evaluation of the predictive performance of the models, including model calibration and discrimination, as illustrated in Figure 3. The results of model calibration at a confidence level of 0.05 are shown in Figures 3(a) and 3(b), with five probability bins displayed. It can be seen that both the observed and predicted probability bins are close to the diagonal line, indicating that the models for both cohorts are well calibrated. In addition, we calculated the discriminant performances of the two models for two cohorts. In the case of different threshold selection, ROC curves show excellent performance, and the AUC values are 0.983 and 0.988, respectively.
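The idea behind the binned calibration check can be sketched as follows: predictions are grouped into five probability bins, and the mean predicted probability of each bin is compared with the observed event frequency. This is an illustration only. Equal-width bins are assumed, confidence intervals are omitted, and the code does not reproduce the R package used in the study. The data are invented.

```python
def calibration_bins(y_true, probs, n_bins=5):
    """Return (bin index, count, mean predicted, observed frequency) per bin.

    ASSUMPTION: equal-width bins on [0, 1]; empty bins are skipped.
    """
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, probs):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        bins[k].append((t, p))
    table = []
    for k, members in enumerate(bins):
        if not members:
            continue
        mean_pred = sum(p for _, p in members) / len(members)
        observed = sum(t for t, _ in members) / len(members)
        table.append((k, len(members), mean_pred, observed))
    return table

# Invented predictions and outcomes for ten patients.
probs = [0.05, 0.1, 0.3, 0.35, 0.5, 0.55, 0.7, 0.9, 0.95, 1.0]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

table = calibration_bins(outcomes, probs)
for row in table:
    print(row)
```

A well-calibrated model yields rows in which the mean predicted probability and the observed frequency are close, which corresponds to points near the diagonal in a calibration plot.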

5.3. Severity Grading of COVID-19

According to the COVID-19 grading system [58], we divided the degree of severity into four groups by the probability of death (POD), namely, Mild (T4): POD < 0.25, Moderate (T3): 0.25 ≤ POD < 0.5, Severe (T2): 0.5 ≤ POD < 0.75, and Critical (T1): POD ≥ 0.75. From Figures 4(a) and 4(b), it can be seen that the green and yellow parts account for the largest proportion; that is, most patients have a probability of death lower than 0.25 or 0.5 and are mild or moderate cases, which do not require high treatment costs. In contrast, the proportion of the red part, where the probability of death is at least 0.75, is small. These patients have a high probability of death without timely assistance and need focused attention. In the orange part, the probability of death is between 0.5 and 0.75, which requires doctors’ attention and more resources. In the absence of timely diagnosis and treatment, these patients will be at great risk, whereas with timely diagnosis and treatment, these patients are likely to recover.
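The grading rule amounts to a simple threshold mapping, sketched below. The function name `severity_grade` is hypothetical, and the input is assumed to be a combined probability of death in the interval [0, 1], such as one produced by ACPA.

```python
def severity_grade(pod):
    """Map a probability of death (POD) to one of the four severity levels."""
    if not 0.0 <= pod <= 1.0:
        raise ValueError("POD must lie in [0, 1]")
    if pod < 0.25:
        return "Mild (T4)"
    if pod < 0.5:
        return "Moderate (T3)"
    if pod < 0.75:
        return "Severe (T2)"
    return "Critical (T1)"

# Boundary values fall into the higher-severity group, as in the definition.
grades = [severity_grade(p) for p in (0.10, 0.25, 0.60, 0.75)]
print(grades)
```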

Table 2 shows the patient classification under the original grading and the new grading with the classification threshold equal to 0.5. Under the original classification, there were significantly fewer mild (T4) and moderate (T3) cases than severe (T2) and critical (T1) cases, which is obviously lacking in rationality. Compared with the original grading structure, the new grading system for COVID-19 patients established in this paper is more reasonable and can effectively and accurately distinguish mild patients from severe patients. Accurate classification is conducive to the effective utilization of medical resources and helps establish the most appropriate treatment path for patients in the early stage of admission.

6. Discussion

Though there are studies that assess the risk of progression, prognosis, and mortality of patients with COVID-19, few studies focus on developing a disease grading system based on various characteristics through combined AI methods [59–61]. In this study, we develop and validate a disease grading system for patients with COVID-19 based on the prediction of the probability of death under a combination algorithm, which can be used to identify and predict the prognosis among hospitalized patients on admission. The reliable and feasible early identification of patients is essential for timely triaging in clinical practice, especially under a heavy burden on medical resources. The application of combined AI methods to the diagnosis of COVID-19 can improve diagnostic efficiency and optimize the allocation of medical resources, which is of great significance to curb the pandemic.

The combined framework we offer includes calculations for three feature screening methods and three prediction methods. Within this framework, additional methods, including some deep learning algorithms, can be integrated to enhance the overall effectiveness. Furthermore, calculations for feature screening and disease prediction can be conducted independently, based on the specific requirements of the task. However, if a substantial number of features is available for prediction, it is recommended to perform combined feature selection before prediction. Importantly, our framework does not mandate extensive parameter tuning for each combinable method to seek optimality, although such tuning may be beneficial. We recommend initially attempting the combination of potential methods to see if the desired effects are achieved; otherwise, one can incorporate better methods or optimize existing methods based on the task requirements.

In terms of the risk factors of COVID-19, a total of 23 indicators were chosen as prediction markers, including demographic characteristics (age and gender), routine blood indicators (Lymphocyte, Neutrophil, Eosinophil, Basophil, and Monocyte), coagulation function (PT, Thrombin, and D-Dimer), LDH, cytokine profiles (IL2, IL6, IL8, and IL10), and CRP. These features can be used as elements of clinical tests or early warning systems to optimize the treatment process of COVID-19. In particular, these characteristics have been verified by previous studies. Regarding the severity grading of COVID-19, the current official disease grading relies on symptom observations, which depend on historical and subjective judgment and involve a certain lag. In contrast, our grading system is based on the predicted outcome, which provides an early warning and can significantly reduce irreparable outcomes caused by bias in historical judgment. Doctors can decide the treatment sequence of patients according to the predicted outcome; at the same time, all patients are managed hierarchically.

7. Future Work

The current study has several limitations, mainly related to the quality of the data. First, the samples for the disease grading system are entirely from Wuhan, China; data from other areas of the world may be required to increase generalizability and applicability. Second, the hospitals contributing to our current research cohort tend to admit severe and critical COVID-19 patients. Therefore, this subset of patients may be disproportionately represented in the study, potentially leading to some bias in the grading system. Third, clinical experiments to validate the practicality of the algorithmic procedures are still pending.

Future research can focus on addressing data issues in more depth. For example, when a large number of features require selection, penalties on the combination weights could be designed to reflect the impact of model complexity. Exploring how to combine methods on imbalanced data and how to mitigate potential effects of the SMOTE algorithm could be another area of investigation. Additionally, retrospective studies can contribute to the establishment of a comprehensive compendium for COVID-19, providing more guidance for uncertain future pandemics.

Data Availability

The datasets analyzed during the current study are not publicly available but are available from the corresponding authors on reasonable request.

Disclosure

Patients or the public were not involved in the design, data collection, analyses, or interpretation of this research. The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

W.P., S.J.L., Y.D.S., W.Z., and R.J.W. contributed to study design. L.W., R.J.W., W.X.K., W.L.P., and C.H. collected and analyzed the data. L.W. and R.J.W. interpreted results. L.W., W.P., and S.J.L. wrote the manuscript. All authors were responsible for revision of the manuscript and the final approval of the version to be published.

Acknowledgments

We thank Tongji Hospital, Union Hospital, and Liyuan Hospital in Wuhan, Hubei Province, China, for their support. We also thank the Union Hospital (HUST-UH) and Liyuan Hospital (HUST-LH) in Wuhan, Hubei Province, China, for sharing open data. This work was supported by the National Natural Science Foundation of China (grant nos. 82372354, 82341120, 71871169, and U1933120), the Chinese Medical Association of Clinical Medicine Special Funds for Scientific Research Projects (17020400709), and the Hubei Provincial Natural Science Foundation of China (2019CFA062).

Supplementary Materials

The supplementary materials contain descriptions of the clinical features included in the two COVID-19 cohorts, along with their abbreviations as used in the research. Table S1: description of the features of cohort 1. Table S2: description of the features of cohort 2.