Assessing the length of hospital stay (LOS) in patients with coronavirus disease 2019 (COVID-19) pneumonia helps to optimize the use of hospital beds and medical resources and to relieve medical resource shortages. This retrospective cohort study of 97 patients was conducted at Beijing You’An Hospital between January 21, 2020, and March 21, 2020. A multivariate Cox proportional hazards regression based on the smallest Akaike information criterion value was used to select demographic and clinical variables to construct a nomogram. Discrimination (C-index), area under the receiver operating characteristic curve (AUC), calibration, and Kaplan–Meier curves with the log-rank test were used to assess the nomogram model. The median LOS was 13 days (interquartile range [IQR]: 10–18). Age, alanine aminotransferase, pneumonia, platelet count, and PF ratio (PaO2/FiO2) were included in the final model. The C-index of the nomogram was 0.76, and the AUC was 0.88. After internal validation with 1000 bootstrap resamples, the adjusted C-index was 0.75 and the adjusted AUC was 0.86. A Brier score of 0.11 and an adjusted Brier score of 0.13 for the calibration curve showed good agreement. The AUC values for the nomogram at LOS of 10, 20, and 30 days were 0.79, 0.89, and 0.96, respectively, and a high nomogram fit score indicated a high probability of a long hospital stay. These results confirmed that the nomogram model accurately predicted the LOS of patients with COVID-19. We developed and internally validated a nomogram that incorporated five independent predictors of LOS. If validated in a future large cohort study, the model may help to optimize discharge strategies and, thus, shorten LOS in patients with COVID-19.

1. Introduction

Coronaviruses (CoVs) are a large family of single-stranded RNA viruses, and beta-CoVs have caused international outbreaks of emerging respiratory diseases, including severe acute respiratory syndrome coronavirus (SARS-CoV) in 2003 [1, 2] and Middle East respiratory syndrome-CoV (MERS-CoV) in 2012 [3]. In December 2019, a novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in Wuhan led to coronavirus disease 2019 (COVID-19), with more than 290,000 confirmed cases in 174 countries and approximately 12,000 deaths (as of March 21, 2020) [4, 5]. The outbreak led to a substantial increase in the demand for hospital beds, a shortage of medical equipment, and possible nosocomial infection among medical staff. Based on the clinical condition of patients, physicians can estimate the length of hospital stay (LOS), which is helpful in relieving medical resource shortages. A recent study reported a model including five variables, namely, procalcitonin, heart rate, Wuhan traveling history, lymphocyte count, and cough, to predict prolonged LOS (>14 days) [6]. However, that model could only predict whether the LOS was >14 days. Moreover, the “Wuhan traveling history” variable limited the extrapolative application of the model because the COVID-19 epidemic had since been eliminated in Wuhan city.

We conducted a retrospective cohort study on the clinical characteristics of cured and discharged patients with confirmed COVID-19 infection between January 21, 2020, and March 21, 2020, in Beijing. We applied Cox proportional hazards regression to analyze the time-to-event data (time: LOS; event: discharge), which allowed individualized predictions of the estimated time to the event of interest. This study aimed to describe the clinical characteristics of patients with COVID-19 and to develop and internally validate a predictive nomogram for estimating their LOS.

2. Materials and Methods

2.1. Cohort Construction

This was a single-center, retrospective cohort study enrolling consecutive COVID-19 pneumonia patients aged over 18 years who underwent treatment at Beijing You’An Hospital between January 21, 2020, and March 21, 2020. All patients with COVID-19 pneumonia were diagnosed and classified according to the new coronavirus pneumonia diagnosis and treatment plan (trial version 6, in Chinese) developed by the National Health Commission of the People’s Republic of China (http://www.nhc.gov.cn/). This study was approved by the Ethics Committee of Beijing You’An Hospital, and informed consent was obtained from all the patients.

2.2. Outcomes and Selection of Covariates

The primary outcome was LOS, defined as the time in days from hospital admission to discharge; discharge was coded as “event = 1” in the Cox analysis. Readmission within two weeks was considered a prolonged LOS, and the LOS was counted from the first hospitalization day. Death before discharge was also considered a prolonged LOS; it was assigned an LOS of 800 days (longer than the longest LOS) and censored with “event = 0” in the Cox analysis. Patients who died within 24 h of admission to the hospital were excluded from the Cox analysis. All patients were followed up for at least 6 months after discharge.
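The coding rules above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Stay` record, its field names, and the `code_outcome` function are hypothetical, and readmission is assumed to be handled by passing the first admission date together with the final discharge date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

# Placeholder LOS for in-hospital death, as stated in the text
DEATH_PLACEHOLDER_DAYS = 800


@dataclass
class Stay:
    """Hypothetical per-patient record (field names are assumptions)."""
    admitted: date                      # first admission date
    discharged: Optional[date] = None   # final discharge date, if any
    died_in_hospital: bool = False


def code_outcome(stay: Stay) -> Tuple[int, int]:
    """Return (time_in_days, event), where event = 1 means discharge."""
    if stay.died_in_hospital:
        # Death before discharge: long placeholder time, censored
        return DEATH_PLACEHOLDER_DAYS, 0
    if stay.discharged is None:
        raise ValueError("stay has neither a discharge date nor a death flag")
    return (stay.discharged - stay.admitted).days, 1
```

For example, a patient admitted on January 21, 2020, and discharged on February 3, 2020, would be coded as a 13-day stay with event = 1.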

We collected baseline data, including demographic characteristics (age, sex, and comorbid diseases), epidemiological history, laboratory tests (biochemical indicators, routine blood testing, C-reactive protein, and chest radiograph or computed tomography [CT] scan), treatment, and outcome data. The data were extracted from the electronic medical record system, laboratory information system, and picture archiving and communication system.

2.3. Statistical Analysis

Continuous and categorical variables are presented as medians with interquartile ranges (IQRs) and n (%), respectively. We used Fisher’s exact test or the chi-square test and the Mann–Whitney test for between-group comparisons of the subjects in the three groups. A backward stepwise method based on the smallest Akaike information criterion (AIC) value was applied to select the covariates to be included in the Cox proportional hazards models.
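A backward stepwise search by AIC can be sketched generically as shown here. The sketch assumes a caller-supplied function `fit_aic` (hypothetical) that fits a model on a given set of covariates, including the empty set, and returns its AIC; the Cox model fitting itself is not shown, and this is not necessarily the exact procedure the authors ran.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = 2k - 2 log L (for a Cox model, L is the partial likelihood)."""
    return 2 * n_params - 2 * log_likelihood


def backward_select(
    candidates: Iterable[str],
    fit_aic: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Drop one covariate at a time while doing so lowers the AIC."""
    current = frozenset(candidates)
    best = fit_aic(current)
    while current:
        # Score every model obtained by removing a single covariate
        trials = [(fit_aic(current - {v}), v) for v in sorted(current)]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break  # no single removal improves the AIC
        current, best = current - {drop}, trial_aic
    return current, best
```

The search stops at the first model from which no single removal lowers the AIC, so it returns a local rather than a guaranteed global minimum.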

The nomogram was developed using the “rms” package. The area under the time-dependent receiver operating characteristic (ROC) curve was obtained using the “survivalROC” package. Harrell’s C-index (concordance statistic, or C-statistic) was used to assess the predictive capacity of the nomogram. Bias-corrected calibration using the bootstrapping method with 1000 resamples was used for internal validation of the nomogram. Based on the scores of each variable, the total scores for each patient could be calculated using the “pec” package in R. The fit score of the five-covariate combination was used to stratify patients for Kaplan–Meier curve analysis, with the log-rank test used to compare the probability of hospital stay among the different groups; the “survminer” package was applied for this purpose. Statistical analyses were performed using R version 3.6.2. Extension packages, including “ggplot2,” “foreign,” and “export,” were also employed.
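To make the concordance statistic concrete, a simplified version of Harrell’s C-index is sketched here. It assumes that `risk` is a model score for which larger values mean an earlier event (here, earlier discharge), and it ignores pairs with tied times, which full implementations such as those in R handle more carefully. The function name and arguments are illustrative.

```python
from typing import Sequence


def harrell_c(
    times: Sequence[float],
    events: Sequence[int],
    risk: Sequence[float],
) -> float:
    """Fraction of comparable pairs ordered correctly by the risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had an observed event
            # strictly before j's (event or censoring) time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.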

3. Results

3.1. Patient Population

A total of 102 patients were diagnosed with COVID-19 between January 21, 2020, and March 21, 2020, and treated at Beijing You’An Hospital. One patient who died within 24 h and four who were under 18 years of age were excluded from the analysis. Therefore, a total of 97 patients, including 84 (86.6%) discharged and 13 undischarged patients (including four deceased and four readmitted patients), were included in this study (Figure 1(a)). No deaths occurred during at least 6 months of follow-up after discharge. The baseline demographic characteristics of the study cohort are presented in Table 1. The median age of the study patients was 51.51 years (IQR: 38–64), and 42.3% were men. The primary outcome was LOS, and the median LOS was 13 days (IQR: 10–18). The LOS distribution of the discharged COVID-19 pneumonia patients is shown in Figure 1(b).

The LOS increased with age, and there was a significant difference among the three groups. The percentage of neutrophils, percentage of lymphocytes, platelet-to-lymphocyte ratio (PLR), and neutrophil-to-lymphocyte ratio (NLR) were significantly different among the three groups. The number of subjects with normal ALT and AST levels in the third group was significantly lower than that in the other groups. Myoglobin and lactate levels in the third group were significantly higher than those in the other groups.

3.2. Independent Predictors of LOS in Univariate and Multivariate Analysis

We assessed the LOS using Cox proportional hazards regression. Older age (≥50 years), high levels of ALT and AST, critical and severe pneumonia, and high levels of myoglobin (≥100 μg/L) were significantly associated with a longer LOS. In contrast, female sex, high platelet count, high lymphocyte count, high PF ratio (≥300 mmHg), and a gradual increase in the glomerular filtration rate were significantly associated with a shorter LOS. The other independent risk factors in the univariate analysis are shown in Table 2.

After backward elimination and model selection based on AIC, age, pneumonia, ALT, PF ratio, and platelet count were included in the final model (smallest AIC value = 600.81) for the development of the nomogram (Table 2).

3.3. Development and Internal Validation of LOS-Predicting Nomogram

Five independently associated risk factors were used to form an LOS risk-estimating nomogram (Figure 2(a)). The nomogram demonstrated favorable accuracy in estimating the probability of hospital stay, with a C-index of 0.76 and an AUC of 0.88 (Figure 2(b)). The overfitting of the model was estimated by applying the bootstrap internal validation method. The adjusted C-index was 0.75 and the adjusted AUC was 0.86 after 1000 bootstrap resamples (Figure 2(c)); these values represent bias-corrected estimates of future model performance and demonstrated favorable predictive accuracy for the nomogram. A Brier score of 0.11 and an adjusted Brier score of 0.13 for the calibration curve demonstrated favorable agreement between the probability predicted by the nomogram and the actual state of hospitalization (Figure 2(c)).
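One common way to obtain such bias-corrected estimates is the optimism bootstrap: the model is refitted on each resample, and the average gap between its performance on the resample and on the original data is subtracted from the apparent performance. The sketch below illustrates that idea under the assumption that this is the kind of correction applied; `fit` and `score` are hypothetical caller-supplied functions, and higher scores are assumed to be better.

```python
import random
from statistics import mean
from typing import Any, Callable, Sequence


def optimism_corrected(
    data: Sequence[Any],
    fit: Callable[[Sequence[Any]], Any],
    score: Callable[[Any, Sequence[Any]], float],
    n_boot: int = 1000,
    seed: int = 0,
) -> float:
    """Apparent performance minus the mean bootstrap optimism."""
    rng = random.Random(seed)
    apparent = score(fit(data), data)
    optimism = []
    for _ in range(n_boot):
        # Resample the patients with replacement, keeping the sample size
        sample = [rng.choice(data) for _ in data]
        model = fit(sample)
        # Optimism: performance on the resample minus on the original data
        optimism.append(score(model, sample) - score(model, data))
    return apparent - mean(optimism)
```

With `score` set to a C-index or AUC function and `n_boot=1000`, the returned value plays the role of the adjusted statistic; for a lower-is-better measure such as the Brier score, the same subtraction raises the estimate, which is the direction of the adjusted Brier score reported above.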

Finally, the area under the time-dependent ROC curve was used to validate the ability of the nomogram to discriminate patients who were discharged within 10, 20, and 30 days of hospital stay. The AUC values for the nomogram at 10, 20, and 30 days were 0.79, 0.89, and 0.96, respectively (Figure 3(a)). The Brier scores of the calibration curve for the nomogram at 10, 20, and 30 days were 0.16, 0.10, and 0.06, respectively (Figure 3(b)). The Kaplan–Meier curves together with the log-rank test also demonstrated that a high nomogram fit score indicated a high probability of a long hospital stay in the training group (Figure 3(c), log-rank test). These results confirmed that the nomogram model accurately predicted the LOS of patients with COVID-19.
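For readers unfamiliar with the Kaplan–Meier curves used here, the estimator can be sketched in a few lines. Because the event is discharge, the estimated “survival” probability is the probability of still being hospitalized at a given day. The function name is illustrative, and the log-rank comparison between score groups is not shown.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float],
    events: Sequence[int],
) -> List[Tuple[float, float]]:
    """Return (t, S(t)) at each distinct time with at least one event."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        # Events (discharges) and total removals (events + censored) at t
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        removed = sum(1 for ti in times if ti == t)
        if d > 0:
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve
```

Applying the function separately to the patients in each fit-score stratum gives one curve per group, which is the kind of display described for Figure 3(c).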

4. Discussion

COVID-19 has emerged as a worldwide pandemic; at present, the number of infected people continues to increase substantially every day in most countries of the world. Based on patients’ clinical data, physicians can estimate their length of stay. Such estimates help to optimize the use of hospital beds and medical resources and to relieve medical resource shortages.

In this retrospective cohort study, we found that the median LOS was 13 days (IQR: 10–18). Age, ALT, PF ratio, pneumonia, and platelet count were independently associated with LOS in patients with COVID-19, and they were included in the final nomogram. The prognostic model demonstrated favorable predictive accuracy and discriminative ability for the prediction of 10-, 20-, and 30-day LOS in patients with COVID-19. Further, the nomogram retained favorable discrimination in internal validation. A high nomogram fit score indicated a high probability of a long hospital stay. These results confirmed that the nomogram model accurately predicted the LOS of patients with COVID-19.

Older age is an important independent predictor of mortality [7]. Similar results were obtained for SARS [1, 8] and MERS [9]. Both cell-mediated immunity and humoral immune function evidently declined in elderly patients. Concomitantly, cytokine and chemokine signaling networks in elderly patients changed; type 2 cytokine response tended to be more sensitive than type 1 [10], and the proportion of T cells producing IL-4, IL-8, and IL-10 increased with age [11]. In these cases, viral replication and longer-lasting proinflammatory responses were not controlled. In SARS-CoV and MERS-CoV infection, uncontrolled induction of proinflammatory cytokines resulted in pathogenesis and disease severity [12]. Several days after COVID-19 infection, patients presented symptoms such as fever, coughing, sputum, vomiting, and diarrhea, and they were diagnosed and treated in the hospital. Fever (≥37.3°C) was an initial important event integral to immune response [13]; however, it was not significantly associated with LOS in univariate analysis.

Platelets are part of the first line of defense against lung-specific entry of SARS-CoV-2 [14], and among patients who had the lowest platelet counts, mortality decreased with an increase in platelet count [15]. An improvement in platelet count might therefore indicate clinical improvement. Monitoring platelet counts is particularly useful to clinicians in resource-limited settings, where access to laboratory examinations may be limited but a complete blood count is relatively easy to obtain [15, 16].

Acute respiratory distress syndrome (ARDS), characterized by hypoxemia with a low PF ratio, is the primary cause of death due to COVID-19. ARDS is a heterogeneous clinical syndrome that is driven mechanistically by uncontrolled viral replication and a host cytokine storm. COVID-19-associated ARDS has characteristic features on medical imaging, which have been used as variables in several diagnostic studies. Artificial intelligence (AI) can serve as a diagnostic tool that combines multiple imaging modalities, including lung CT, chest radiography, and lung ultrasound [17]. Accordingly, AI may help to comprehensively interpret the clinical and multiomics data of ARDS patients, and it is potentially advantageous for the future management of ARDS patients with individualized treatment plans [18].

There are certain limitations to our study. First, this was a single-center, retrospective cohort study involving approximately a quarter of the COVID-19 patients in Beijing on March 21, 2020. This was not representative of the overall COVID-19 treatment or LOS in this area. Second, owing to low mortality (5/102), this study could not analyze the risk factors for survival. Third, due to the retrospective cohort design, laboratory tests were not performed for all cytokines. For example, interferon-inducible protein-10 and IL-6 are predictive factors for SARS [19] and COVID-19 [7] outcomes, respectively; yet, they were excluded.

5. Conclusions

We successfully developed and internally validated a nomogram that incorporated five independent predictors of LOS. If validated in a future large cohort study, the model may be useful in optimizing discharge strategies and, hence, in shortening LOS in patients with COVID-19.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.


This research was supported by grants from the National Key Research and Development Project [grant numbers 2020YFE0202400 and 2020YFC0841700], China Primary Health Care Foundation–You’An Foundation of Liver Disease and AIDS-Scientific Research Project of You’An Hospital [grant numbers CCMU-2020 and BJYAYY-2020YC-01], Capital’s Funds of Health Improvement and Research [grant number CFH2020-1-2182], and Beijing Key Laboratory [grant number BZ0373].