Background. The recent COVID-19 pandemic has emerged as a threat to global health. Though current evidence on the epidemiology of the disease is emerging, very little is known about the predictors of recovery. Objectives. To describe the epidemiology of confirmed COVID-19 patients in the Republic of Korea and identify predictors of recovery. Materials and Methods. Using publicly available data for confirmed COVID-19 cases from the Korea Centers for Disease Control and Prevention from January 20, 2020, to April 30, 2020, we undertook descriptive analyses of cases stratified by sex, age group, place of exposure, date of confirmation, and province. Correlation was tested among all predictors (sex, age group, place of exposure, and province) with Pearson’s correlation coefficient. Associations between recovery from COVID-19 and predictors were estimated using a multivariable logistic regression model. Results. Majority of the confirmed cases were females (56%), 20-29 age group (24.3%), and primarily from three provinces—Gyeongsangbuk-do (36.9%), Gyeonggi-do (20.5%), and Seoul (17.1%). The case fatality ratio was 2.1%, and 41.6% cases recovered. Older patients, patients from provinces such as Daegu, Gyeonggi-do, Gyeongsangbuk-do, Jeju-do, Jeollabuk-do, and Jeollanam-do, and those contracting the disease from healthcare settings had lower recovery. Conclusions. Our study adds to the very limited evidence base on potential predictors of recovery among confirmed COVID-19 cases. We call additional research to explore the predictors of recovery and support development of policies to protect the vulnerable patient groups.

1. Introduction

For the first time, a novel coronavirus disease 2019 (COVID-19) originating from Wuhan in China was reported to the World Health Organization in December of 2019 [1]. This novel coronavirus has taken the form of a major pandemic and has affected almost all major nations in the world. There have been more than 3.6 million confirmed cases and about 252,000 deaths as of May 05, 2020 [2]. The very first COVID-19 case was diagnosed in the Republic of Korea (South Korea) on January 20, 2020 [3]. During the first two months of this global epidemic, South Korea had the second highest cases globally following China. According to the Korea Centers for Disease Control and Prevention (KCDC), there have been 10,804 confirmed cases and 254 deaths due to COVID-19 as of May 05, 2020 [4].

We present the epidemiology of COVID-19 in the Republic of Korea using data from the Korea Centers for Disease Control and Prevention and identify the predictors of recovery from the disease.

2. Materials and Methods

2.1. Data Source

The data were obtained from the Korea Centers for Disease Control and Prevention’s publicly shared sources. The dataset contains information about 3,388 confirmed COVID-19 cases in the Republic of Korea from January 20, 2020, through April 30, 2020. After excluding cases with missing values, 3,299 cases were included in the analysis.

2.2. Variables

A confirmed case was defined as a person with laboratory-confirmed positive test. The data contained the following patient details: age (in groups), sex, province, date of diagnosis, mode of exposure, and outcome. There were 11 age groups: below10 years, 10-19 years, 20-29 years, 30-39 years, 40-49 years, 50-59 years, 60-69 years, 70-79 years, 80-89 years, 90-99 years, and above 100 years. We combined the last two age groups to create 90 years and above and thus recategorized age to 10 groups. All seventeen provinces of the Republic of Korea were represented: Busan, Chungcheongbuk-do, Chungcheongnam-do, Daegu, Daejeon, Gangwon-do, Gwangju, Gyeonggi-do, Gyeongsangbuk-do, Gyeongsangnam-do, Incheon, Jeju-do, Jeollabuk-do, Jeollanam-do, Sejong, Seoul, and Ulsan. We categorized the dates of diagnosis by weeks, and they were as follows: 20-26 Jan 2020, 27 Jan-02 Feb 2020, 03-09 Feb 2020, 10-16 Feb 2020, 17-23 Feb 2020, 24 Feb-01 Mar 2020, 02-08 Mar 2020, 09-15 Mar 2020, 16-22 Mar 2020, 23-29 Mar 2020, 30 Mar-05 Apr 2020, 06-12 Apr 2020, 13-19 Apr 2020, 20-26 Apr 2020, and 27-30 Apr 2020. Patients were exposed to potential COVID-19 sources in multiple settings. The settings were nursing home, hospital, religious gathering, call center, community center, shelter and apartment, gym facility, overseas inflow, contact with patients, and others. There were three outcomes: death, recovery, and isolation. The confirmed patients after spending some days in isolation were retested. They were considered recovered only after receiving a negative COVID-19 test.

2.3. Statistical Methods

We undertook descriptive analyses for the patient characteristics and presented the results stratified by subgroups for each characteristic. Correlation was tested among all patient characteristics with Pearson’s correlation coefficient. Associations between recovery from COVID-19 and predictors (age group, sex, province, and exposure) were estimated using a multivariable logistic regression model. We considered associations statistically significant if the value was below 0.05. The statistical analyses were performed using Python programming language Version 3.7 (Python Software Foundation, Wilmington, DE, USA) and Stata Version 15 (StataCorp LLC, College Station, TX).

3. Results

3.1. Pattern of the Epidemic

As shown in Figure 1, the first case of COVID-19 was confirmed on January 20, 2020. There were a few daily cases of new infections for about a month. After a month, the curve suddenly rose starting February 19, 2020, to reach the peak around end of February and early March. It reached its peak on the 29th of February with 813 confirmed cases. Though the curve descended after this date, still there were on an average 200 daily new confirmed cases until March 11, 2020. The curve continued its downward trend, however, adding at least 100 new daily cases through April 05, 2020. Towards the end of April, daily new confirmed cases were below 10.

3.2. Patient Profile

Table 1 shows the profile of the patients. Out of 3,299 confirmed patients, slightly more than half were females (56%). While there were cases from all age groups, the maximum patients were 20-29 years (24.3%), followed by 50-59 years (18.1%), 40-49 years (13.8%), 30-39 years (13.3%), and 60-69 years (12.2%). Three provinces—Gyeongsangbuk-do (36.9%), Gyeonggi-do (20.5%), and Seoul (17.1%)—together accounted for the maximum patients. With respect to the exposure, it was unknown for the most (44%) followed by direct contact with patients (29%), from overseas (16.8%), and religious gathering (4.9%). According to this available data source, 85% percent of the patients were confirmed of their diagnosis between 24 February and 05 April of 2020. There were 61 deaths accounting for 2.1 percent (case fatality rate) of the patients. More than half were isolated (56.3%), and 41.6% recovered.

3.3. Predictors of Recovery

As shown in Figure 2, there were no correlations between the predictors. Both males and females had similar recovery rates, and their difference was not statistically significant (Table 2). Compared to younger age groups, older patients had lower recovery: 70-79 years (adjusted odds ratio 0.31; value 0.01), 80-89 years (aOR 0.22; value 0.001), and 90 years and above (aOR 0.13; value < 0.001). Provinces such as Daegu, Gyeonggi-do, Gyeongsangbuk-do, Jeju-do, Jeollabuk-do, and Jeollanam-do had statistically significant lower recovery rates than Busan. When compared with exposure from nursing homes, patients who were exposed to COVID-19 infection from religious gatherings, community dwellings, and others had higher recovery rates.

4. Discussion

Due to multipronged approaches (proactive surveillance, higher testing, isolation, quarantine, use of technology, masks, and social distancing campaigns) by the government, incidence of new cases came down sharply in South Korea by mid-March and further to less than 10 new cases by mid-April [5].

Our study shows that females constituted the majority of confirmed cases, whereas males accounted for most of the confirmed cases in China and Italy [69]. Around a fourth of the cases were from the 20-29 years age group unlike in most other countries where the infected were older [6, 7, 10]. As already pointed out by researchers from South Korea, the possible reason for higher representation of younger population in our sample could also be specific exposure to cluster of cases through participation in religious activities or workplaces [5, 11]. As shown in a study undertaken in Europe, population density might have played a role for the number of higher cases in certain provinces [12]. The case fatality rate was much lower (2.1%) compared to other countries such as Italy (13.3%) and China (4%). Similar to findings from several other countries, we found the elderly to be more vulnerable with lower probabilities of recovery [6, 8, 13]. It is quite possible that the presence of preexisting medical conditions in the elderly predispose them to delayed recovery. We also found that cases contracting the infection in nonhealthcare setting had higher recovery. While there is no such evidence currently, there could be a possibility that the exposure outside nonhealthcare setting might have involved relatively younger and healthier cases. Considering our study findings, we suggest additional measures to protect the vulnerable cases who are less likely to recover from the infection. Thus, the elderly and cases contracting infection from healthcare settings should be given special attention.

Our study has two potential limitations. First, we used publicly available data of only a third of confirmed cases in the country. Thus, we are unable to ascertain the representativeness of the data for all confirmed cases in South Korea. So, the findings will have to be interpreted with caution. Second, the data lacks information of patients’ symptoms and clinical features. Inclusion of these potential predictors would have enhanced the relevance of this study further. Despite these limitations, our study adds to the very limited evidence base on potential predictors of recovery among confirmed COVID-19 cases [14]. However, we believe the evidence base can be strengthened with further relevant research as authorities make more data publicly available or through primary hospital-based studies.

5. Conclusions

The COVID-19 pandemic has emerged as a great threat to global health challenging health systems across the world to efficiently deal with this situation. Emerging evidence on vulnerability to COVID-19 and predictors of recovery will inform providers and policy makers to effectively triage and prioritize limited resources. Therefore, we call for additional research to explore the predictors of recovery and support development of policies to protect the vulnerable patient groups.

Data Availability

The data used to support the findings of this study are available publicly through the Korea Centers for Disease Control and Prevention.


The views expressed in the paper are that of the authors and do not reflect that of their affiliations.

Conflicts of Interest

The authors declare that there is no conflict of interest. This particular work was conducted outside of the authors’ organizational affiliations.


We are grateful to Korea Centers for Disease Control and Prevention for making this data publicly available.