Machine Learning Models to Predict In-Hospital Mortality among Inpatients with COVID-19: Underestimation and Overestimation Bias Analysis in Subgroup Populations
Table 1
Different feature sets.
| Feature set | Method | Number of features | Features |
|---|---|---|---|
| 1 | Feature selection node (default setting) | 17 | Age, contact with COVID-19 patients, cough, diabetes, diagnosis only by abnormal CT, diagnosis only by positive PCR, diagnosis by positive PCR and abnormal CT, gender, heart diseases, HTN, ICU admission, intubation, muscle ache, number of comorbidities, oxygen therapy, blood oxygen saturation level, and respiratory distress. |
| 2 | Univariate analysis (P value <0.05) | 32 | Age, cancer, chronic kidney disease, chronic liver disease, contact (with a probable or confirmed case in the 14 days before the onset of symptoms), convulsion, cough, diabetes, diagnosis only by abnormal CT, diagnosis only by positive PCR, diagnosis by positive PCR and abnormal CT, dialysis, diarrhea, dizziness, drug abuse, gender, headache, heart diseases, HIV/AIDS, HTN, ICU admission, immune diseases, intubation, nervous system diseases, number of comorbidities, other chronic lung diseases, oxygen therapy, paralysis, blood oxygen saturation level, pregnancy, respiratory distress, and unconsciousness. |
| 3 | Univariate analysis (P value <0.2) | 40 | Feature set 2 plus asthma, chronic hematology diseases, mental disorders, muscle ache, other diseases (comorbidities), drowsiness, gustatory dysfunction, and weakness. |
| 4 | All features | 60 | Feature set 3 plus abdominal pain, autoimmune disease, chest pain, chills, constipation, ocular manifestations, fever, GI bleeding, hemoptysis, nausea, anorexia, other GI signs, paresis, runny nose, skin manifestations, sore throat, olfactory dysfunction, smoking, sweating, and vomiting. |
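
Feature sets 2 and 3 in Table 1 are defined by thresholds on a univariate test, so the looser threshold necessarily yields a superset of the stricter one. The sketch below illustrates that screening step under stated assumptions: the table does not say which univariate test was used, so a Pearson chi-square test on binary features (1 degree of freedom, no continuity correction) is assumed here. The function names `chi2_p_value` and `select_features` and the 20-patient toy data are hypothetical and are not taken from the study.

```python
import math


def chi2_p_value(feature, outcome):
    """P value of a Pearson chi-square test for two binary (0/1) lists.

    Assumption: 1 degree of freedom, no continuity correction.
    """
    n = len(feature)
    pairs = list(zip(feature, outcome))
    # Cells of the 2x2 contingency table (feature present/absent x died/survived).
    a = sum(1 for f, y in pairs if f == 1 and y == 1)
    b = sum(1 for f, y in pairs if f == 1 and y == 0)
    c = sum(1 for f, y in pairs if f == 0 and y == 1)
    d = sum(1 for f, y in pairs if f == 0 and y == 0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a constant column cannot be associated with the outcome
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def select_features(data, outcome, threshold):
    """Names of features whose univariate P value is below the threshold."""
    return sorted(
        name for name, values in data.items()
        if chi2_p_value(values, outcome) < threshold
    )


# Hypothetical toy data: the first 10 patients died (1), the last 10 survived (0).
outcome = [1] * 10 + [0] * 10
data = {
    "intubation": [1] * 9 + [0] * 1 + [1] * 1 + [0] * 9,   # strong association
    "asthma": [1] * 7 + [0] * 3 + [1] * 4 + [0] * 6,       # weak association
    "sore_throat": [1] * 5 + [0] * 5 + [1] * 5 + [0] * 5,  # no association
}

strict_set = select_features(data, outcome, threshold=0.05)
loose_set = select_features(data, outcome, threshold=0.2)
print(strict_set)
print(loose_set)
```

With these invented values, only `intubation` passes the 0.05 threshold, `asthma` is added at 0.2, and `sore_throat` passes neither, which mirrors how feature set 3 extends feature set 2 and feature set 4 adds the remaining features.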