Abstract

Background. To date, research on the relationship between fine dust and respiratory disease patients visiting emergency departments (EDs) has been limited. This study aimed to analyze the effects of two groups of variables, weather and air pollution, on respiratory disease patients who visited EDs. Methods. This study utilized the National Emergency Department Information System (NEDIS) database. The meteorological data were obtained from the National Climate Data Service. Each weather factor reflected the accumulated data of 4 days: a patient’s visit day and the 3 days before the visit day. We utilized the RandomForestRegressor of scikit-learn for data analysis. Result. The study included 525,579 participants. Multiple weather and air pollution variables influenced the respiratory diseases of patients who visited EDs. Most of the respiratory disease patients had acute upper respiratory infections [J00–J06], influenza [J09–J11], or pneumonia [J12–J18], for which PM10, following temperature and steam pressure, was the most influential variable. The top three leading causes of admission to the ED, pneumonia [J12–J18], acute upper respiratory infections [J00–J06], and chronic lower respiratory diseases [J40–J47], were highly influenced by PM10. Conclusion. Most of the respiratory patients visiting EDs were diagnosed with acute upper respiratory infections, influenza, or pneumonia. After temperature and steam pressure, PM10 had an influential relation with these diseases. The number of respiratory disease patients visiting EDs is expected to increase by day 3 when the steam pressure and temperature values are low and the air pollution values are high.

1. Introduction

Because air pollution has worsened, interest in the health effects of fine dust has increased. Fine dust is well known as a group 1 carcinogen. In addition, there have been reports of fine dust-related deaths, paralysis, neuropathy, high blood pressure, and cardiovascular and respiratory diseases [1–5]. According to recent studies, it causes depression and anxiety [1], neurodegenerative diseases including dementia or Parkinson’s disease, and skin diseases and increases the risk of childhood disorders, such as autism spectrum disorder, developmental disorders [6], asthma, respiratory tract infections, and atopic dermatitis [2, 7–9]. Nevertheless, research investigating fine dust in relation to patients visiting emergency departments (EDs) is lacking.

Overcrowding in EDs is a global problem and has been addressed as a national crisis in some countries [10]. The medical resources needed in the ED vary according to the severity, type of visit, and the patient’s disease. Forecasting emergency medical demand can be a good way to efficiently allocate limited resources [11]. A variety of studies have evaluated the factors influencing the demands for emergency medical service [12]. In particular, previous studies have reported the characteristics of patients visiting EDs and the number of patients according to seasons and weather conditions [13].

Some diseases are sensitive to climate change. Studies have been conducted on the characteristics and number of patients visiting the ED depending on the season and climate. In addition, numerous studies have revealed that weather and air pollution are closely correlated with the development of cardiovascular and respiratory diseases. However, existing studies rarely consider multiple factors together. In particular, studies on the combined impact of weather and air pollution on the demand for respiratory emergency medical resources remain insufficient.

Therefore, data on respiratory disease patients who visited EDs over 3 years were extracted from the national ED database and analyzed with a machine learning technique to assess the combined effect of air pollution, weather, and patient characteristics on respiratory disease visits to the ED. Based on the analyzed general characteristics (age, gender, diagnosis), the day of ED use and the use of hospital resources were examined. This study will help provide fundamental data for a prediction model of emergency respiratory patient visits related to weather, including air pollution, for patient treatment and the efficient management of limited medical resources.

2. Materials and Method

This study utilized the National Emergency Department Information System (NEDIS) database; a secondary data analysis was conducted using random forest (RF), a machine learning technique. NEDIS, an ED information network operated by the Ministry of Health and Welfare, is managed by the National Emergency Medical Center [14]. Since the system was launched in 2003, it has collected clinical and administrative data of all patients who visited EDs nationwide. Korea provides national medical insurance, which covers 98% of the Korean population [15]. Therefore, the data collected are highly representative. Emergency medical centers in the country undergo evaluation once a year in order to be approved as official organizations and, in principle, automatically transmit all the digitalized data for the items requested by the NEDIS. Therefore, the data utilized in this study included all the data from the EDs in Seoul, Korea.

3. Study Design and Statistical Analysis

Each weather factor reflected the accumulated data of 4 days: a patient’s visit day and the 3 days before the visit day. The number of explanatory variables corresponding to the response variable Y is therefore 48 (4 days × 12 variables). Using weather and air pollution variables (X) such as temperature, the amount of precipitation, and PM2·5, the number (Y) of ED patients who had a particular disease code was estimated. An RF regression model, which can identify important variables, was applied. The importance of each explanatory variable for the dependent variable was extracted by calculating impurity-based feature importance. We used the RandomForestRegressor implementation in the scikit-learn package. The Pandas package (version 1.0.0; NumFOCUS, Austin, TX, USA) and the Dask package were used mainly for data preprocessing.
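The construction of the 48 explanatory variables can be sketched as follows. This is a minimal illustration rather than the study's actual preprocessing code: the column names, the helper add_lag_features, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical column names for the 6 weather and 6 air pollution variables.
VARIABLES = [
    "temperature", "precipitation", "humidity", "steam_pressure",
    "wind_speed", "wind_direction",
    "CO", "NO2", "O3", "PM10", "PM2_5", "SO2",
]


def add_lag_features(daily: pd.DataFrame, max_lag: int = 3) -> pd.DataFrame:
    """Build one column per (variable, lag) pair, e.g. 'PM10_2' is PM10 two days earlier."""
    lagged = {
        f"{name}_{lag}": daily[name].shift(lag)
        for name in VARIABLES
        for lag in range(max_lag + 1)
    }
    # The first max_lag days have no complete history, so they are dropped.
    return pd.DataFrame(lagged, index=daily.index).iloc[max_lag:]


# Synthetic daily values standing in for the real measurements.
rng = np.random.default_rng(0)
dates = pd.date_range("2014-12-27", periods=10, freq="D")
daily = pd.DataFrame(
    rng.normal(size=(len(dates), len(VARIABLES))), index=dates, columns=VARIABLES
)
features = add_lag_features(daily)  # 4 lags x 12 variables = 48 columns
```

Each row of features then lines up with one visit day, so it can be joined to the daily patient counts by date.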

RF, a supervised machine learning technique, combines multiple decision trees. In a conventional decision tree, if the number of explanatory variables is large, the number of branches in the tree is also large. As a result, overfitting (in which the model fits only the training data well) occurs. To prevent such overfitting, RF randomly samples a subset of the explanatory variables when a decision tree is generated and creates multiple decision trees from samples drawn with replacement. The predictions of the individual trees are then combined into the final prediction: in classification, the most frequently predicted value is chosen, whereas in regression, as in this study, the predictions are averaged. In this study, the number of explanatory variables is large, and multicollinearity exists (Figure 1). For this reason, RF was applied rather than a conventional decision tree. To evaluate the performance of RF, the out-of-bag (OOB) estimate, which evaluates performance on the roughly one-third of the data not drawn during sampling with replacement, was used. The importance of each explanatory variable for the dependent variable was extracted. The most predictive features are reported for regression models with R² over 0.5.
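A minimal sketch of this workflow with scikit-learn is given below. The data are synthetic and the hyperparameters (n_estimators, max_features, random_state) are assumptions chosen for illustration; the settings used in the study are not reported. Note that scikit-learn samples candidate variables at each split rather than once per tree.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for 48 lagged weather and air pollution variables.
rng = np.random.default_rng(42)
n_days, n_features = 500, 48
X = rng.normal(size=(n_days, n_features))
# The synthetic daily count depends strongly on feature 0 and weakly on feature 1.
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_days)

model = RandomForestRegressor(
    n_estimators=200,     # number of trees (assumed value)
    max_features=1 / 3,   # fraction of variables considered at each split (assumed value)
    bootstrap=True,       # each tree is trained on a sample drawn with replacement
    oob_score=True,       # score each sample using only trees that did not see it
    random_state=0,
)
model.fit(X, y)

oob_r2 = model.oob_score_                   # out-of-bag R^2
importances = model.feature_importances_    # impurity-based importances, summing to 1
top10 = np.argsort(importances)[::-1][:10]  # indices of the ten most important variables

# Mirror the reporting rule in the text: keep the ranking only if R^2 exceeds 0.5.
report = top10 if oob_r2 > 0.5 else None
```

Impurity-based importances can be split between correlated variables, so rankings from collinear inputs are best read as indicative.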

4. ER Visit Data

Among the patients who had visited emergency medical centers in Seoul within the 36-month period from January 1, 2015, to December 31, 2017, those whose disease classification code (J code; J00–J99) at the time they left the ED was related to respiratory diseases according to the Korean Standard Classification of Diseases (KCD) (based on ICD-10) were selected. The analysis was performed using the first primary diagnosis in the emergency centers. Local emergency medical centers that failed to transmit Korean Triage and Acuity Scale (KTAS) data were excluded from the analysis. Patients whose visit date and time were not recorded were excluded as well. The age, gender, disease name, and date and time of visit of the study patients were utilized. The names of diseases are provided in the Appendix.
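The selection rules above and the resulting daily counts (the response variable Y) might be implemented along the following lines. This is only a sketch under assumptions: the column names visit_time and kcd_code, the helper functions, and the example rows are hypothetical, and the block boundaries follow the code ranges cited in this article.

```python
import pandas as pd

# Respiratory code blocks as cited in this article (inclusive 3-character bounds).
BLOCKS = [
    ("J00", "J06"), ("J09", "J11"), ("J12", "J18"), ("J20", "J22"),
    ("J30", "J39"), ("J40", "J47"), ("J60", "J70"), ("J80", "J84"),
    ("J85", "J86"), ("J90", "J94"), ("J95", "J99"),
]


def to_block(code):
    """Map a code such as 'J18.9' to its block label, or None if it is not included."""
    if not isinstance(code, str):
        return None
    prefix = code[:3].upper()
    for low, high in BLOCKS:
        if low <= prefix <= high:
            return f"{low}-{high}"
    return None


def daily_counts(visits: pd.DataFrame) -> pd.DataFrame:
    """Count visits per calendar day and disease block."""
    kept = visits.dropna(subset=["visit_time"]).copy()  # drop visits with no recorded time
    kept["block"] = kept["kcd_code"].map(to_block)
    kept = kept.dropna(subset=["block"])                # drop non-respiratory diagnoses
    kept["date"] = kept["visit_time"].dt.normalize()
    return kept.groupby(["date", "block"]).size().unstack(fill_value=0)


# Hypothetical visit records.
visits = pd.DataFrame({
    "visit_time": pd.to_datetime([
        "2015-01-01 10:00", "2015-01-01 23:30", "2015-01-02 08:15",
        None, "2015-01-02 09:00",
    ]),
    "kcd_code": ["J18.9", "J00", "I10", "J45.0", "J10.1"],
})
counts = daily_counts(visits)
```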

5. Air Pollution and Weather Data

Fine dust contains many kinds of air pollutants, including heavy metals, ions, organic carbons, and black carbons. According to particle size, particulate matter whose diameter is 10 µm or less is known as PM10, and particulate matter whose diameter is 2·5 µm or less is known as PM2·5 or ultra-fine particulate [16]. In this study, carbon monoxide (CO), nitrogen dioxide (NO2), ozone (O3), PM10, PM2·5, and sulfur dioxide (SO2) were used as variables.

The corresponding meteorological data were obtained from the National Climate Data Service System as weather variables. The Automated Synoptic Observing System (ASOS) data provided by the “meteorological data open portal” of the Korea Meteorological Administration and the fine dust measurement data provided by Air Korea were combined based on region [17], date, and time. The weather data of Seoul City were used as reference data, and the maximum number of days over which weather could influence disease occurrence was assumed to be 3. Data on the average temperature, amount of precipitation, relative humidity, steam pressure (i.e., water vapor pressure), wind speed, and wind direction provided by the Korea Meteorological Administration were set as weather factors.

The distance between each regional emergency medical center in Seoul and each observatory was calculated, and the five nearest observatories were selected. The mean of the values measured at these five observatories was calculated every hour. The mean of all the observatories in the region was also calculated. In this way, the mean value in the region was defined. Missing values were not imputed and were left empty. The weather data from December 27, 2014, to December 31, 2017, were obtained. Seasons were classified as spring (March, April, and May); summer (June, July, and August); fall (September, October, and November); and winter (December, January, and February).
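The nearest-observatory averaging can be illustrated with the sketch below. The great-circle (haversine) distance, the function names, the coordinates, and the readings are assumptions for the example; the article does not state which distance measure was used. The mean skips missing readings, so an hour stays empty only when none of the selected observatories reported a value.

```python
import numpy as np
import pandas as pd


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))


def nearest_station_mean(hospital, stations, hourly, k=5):
    """Hourly mean over the k observatories closest to a hospital.

    hospital: (lat, lon); stations: DataFrame with 'lat' and 'lon' indexed by station id;
    hourly: DataFrame indexed by hour with one column per station id.
    """
    dist = haversine_km(hospital[0], hospital[1],
                        stations["lat"].to_numpy(), stations["lon"].to_numpy())
    nearest = stations.index[np.argsort(dist)[:k]]
    # No imputation: hours without any reading remain NaN.
    return hourly[nearest].mean(axis=1)


# Hypothetical observatories: five in Seoul and one far away.
stations = pd.DataFrame(
    {"lat": [37.50, 37.51, 37.52, 37.53, 37.54, 35.10],
     "lon": [127.00, 127.01, 127.02, 127.03, 127.04, 129.00]},
    index=list("ABCDEF"),
)
hours = pd.date_range("2015-01-01 00:00", periods=2, freq="h")
hourly = pd.DataFrame(
    [[10.0, 20.0, 30.0, 40.0, 50.0, 1000.0],
     [np.nan, np.nan, np.nan, np.nan, np.nan, 1000.0]],
    index=hours, columns=list("ABCDEF"),
)
pm10_near_hospital = nearest_station_mean((37.52, 127.02), stations, hourly)
```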

6. Result

6.1. Characteristics of Study Participants (Table 1)

A total of 18,619,252 patients visited EDs nationwide and 4,784,458 visited EDs in Seoul during the study period (Table 1). Among them, 525,579 patients were diagnosed with respiratory diseases (J code) according to the KCD. Respiratory disease patients accounted for 11.0% of the total ED patients. Among 525,579 patients who had visited EDs because of respiratory diseases within the 3-year period, 169,538 (32·3%) were reported in 2015, 202,114 (38·5%) in 2016, and 153,927 (29·3%) in 2017. The largest number of patients was reported in 2016. The average age was 28·1 ± 27·5 years. Specifically, 45% of these patients were aged 0–15 years, 37·6% were aged 16–60 years, and 17·4% were aged 61 years and older. Of the patients who visited EDs, the number of men (276,142, 52·5%) was higher than that of women. Approximately 52·2% of the patients visited EDs because of acute upper respiratory infections, which accounts for the highest number of patients in this subgroup. Pneumonia patients accounted for 15% of the total respiratory disease patients, and 43·8% of the hospitalized patients.
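As a quick consistency check, the yearly shares can be recomputed from the counts reported above. The snippet below is only an arithmetic illustration using the figures in this paragraph; the variable names are arbitrary.

```python
# Counts reported in the text.
seoul_ed_total = 4_784_458
respiratory_total = 525_579
by_year = {2015: 169_538, 2016: 202_114, 2017: 153_927}

# The yearly counts should add up to the respiratory total.
yearly_sum = sum(by_year.values())

# Percentage shares rounded to one decimal place.
yearly_share = {
    year: round(100 * n / respiratory_total, 1) for year, n in by_year.items()
}
respiratory_share = round(100 * respiratory_total / seoul_ed_total, 1)
```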

6.2. Analysis on the Number of Patients by Year, Month, Season, and Day (Table 2)

In the monthly analysis, of the 525,579 patients who visited EDs because of respiratory diseases during the 3-year period, 71,122 (13·5%, the highest) visited in December, 65,121 (12·4%, the second highest) in February, and 31,007 (5·9%, the lowest) in July (Table 2). In the seasonal analysis, 181,905 patients (34·6%, the highest) visited in winter and 96,967 (18·4%, the lowest) in summer. In the day-of-week analysis, 127,316 patients (24·2%, the highest) visited EDs on Sunday, while 60,077 (11·4%, the lowest) visited EDs on Thursday.

6.3. Characteristics of Weather Factors by Year (Table 3)

Of the weather factors, wind speed and wind direction showed a difference by year. Of the air pollution variables, nitrogen dioxide, O3, PM2·5, and SO2 showed a difference by year (Table 3).

6.4. Correlations between Weather Factors (Figure 1)

The correlations between the six air pollution variables and six weather factors were analyzed to examine whether multicollinearity existed. Blue indicates a negative correlation, while red indicates a positive correlation. A darker color denotes a stronger correlation between variables. The air pollution variables were positively correlated with one another, except for O3, which was negatively correlated with the others. The air pollution variables (except for O3) were negatively correlated with the weather factors.

Figure 1: Correlations between the six air pollution variables and six meteorological factors. Blue indicates a negative correlation and red a positive correlation; a darker color denotes a stronger correlation.
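A correlation screen of this kind can be sketched with pandas as shown below. The data are synthetic and built so that two pollutants move together and O3 moves in the opposite direction; the Pearson coefficient, the 0.8 threshold, and the helper correlated_pairs are assumptions for illustration, not values taken from the study.

```python
import numpy as np
import pandas as pd

# Synthetic variables: PM10 and PM2_5 share a common driver, O3 opposes it.
rng = np.random.default_rng(1)
n = 1000
base = rng.normal(size=n)
data = pd.DataFrame({
    "PM10": base + rng.normal(scale=0.3, size=n),
    "PM2_5": base + rng.normal(scale=0.3, size=n),
    "O3": -base + rng.normal(scale=0.3, size=n),
    "wind_speed": rng.normal(size=n),
})

corr = data.corr(method="pearson")  # symmetric matrix with ones on the diagonal


def correlated_pairs(corr: pd.DataFrame, threshold: float = 0.8):
    """List variable pairs whose absolute correlation reaches the threshold."""
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            r = float(corr.loc[a, b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 2)))
    return pairs


pairs = correlated_pairs(corr)
```

Pairs flagged in this way indicate multicollinearity, which is one reason a tree ensemble was preferred over a single tree or a linear model.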

6.5. Results of Random Forest-Based Analysis (Figures 2–6 and Table 4)

Figures 2–5 show, for each disease group, the 20 weather and air pollution variables most strongly related to patients’ ED visits. Table 4 presents the top 10 variables. The number from 0 to 3 after each variable denotes the lag between the variable’s measurement date and the patient’s visit date: “0” indicates the value measured on the day of the visit, “1” the value on the day before the visit, “2” the value 2 days before the visit, and “3” the value 3 days before the visit. “Mean” denotes the mean value, while “std” denotes the standard deviation, which represents the variation of a variable on a given day.

Figure 2 illustrates the weather and air pollution variables on the day of a visit that have high correlations with ED visits according to the patient’s disease. Influenza, pneumonia, and other acute lower respiratory infections [J09–J22] were highly related to temperature and steam pressure (4B–D). Lung diseases due to external agents [J60–J70] were highly related to CO and NO2 among the air pollution variables and to the amount of precipitation (4G). Figure 3 shows the correlations between ED visits and the weather and air pollution variables on the day of a visit and the day before the visit. Figure 4 presents the correlations for the variables on the day of a visit and 2 days before the visit. Figure 5 illustrates the correlations between ED visits and the weather and air pollution variables on the day of a visit, 2 days before the visit, and 3 days before the visit.

A. Acute upper respiratory infection [J00–J06] was mainly related to NO2 on the day of a visit and to PM10 on the day of a visit and the day before the visit.
B. Influenza [J09–J11] was related to the temperature and steam pressure 3 days before a visit and was slightly influenced by PM10 3 days before a visit.
C. Pneumonia [J12–J18] was influenced by temperature and steam pressure 2–3 days before a visit, rather than on the day of the visit, and was influenced by PM10 as well.

Figure 6 shows the result for all respiratory diseases in this study. Steam pressure and SO2 were the most influential factors for ED visits due to respiratory diseases.

For PM10, days 0 and 1 had high correlations with ED visits for acute upper respiratory infections [J00–J06]. In cases of influenza [J09–J11], pneumonia [J12–J18], other acute lower respiratory infections [J20–J22], and other diseases of the upper respiratory tract [J30–J39], day 0 was influential. In the case of chronic lower respiratory diseases [J40–J47], days 0, 1, 2, and 3 had high correlations with a patient’s ED visit. In the case of suppurative and necrotic conditions of the lower respiratory tract [J85–J86], day 0 was influential (Table 4).

Among the weather factors, steam pressure had an effect on days 0, 1, 2, and 3, and among the air pollution variables, NO2 had the most influence. Among the diseases with the most frequent ED visits, the first, acute upper respiratory infections [J00–J06], was affected by NO2; the second, pneumonia [J12–J18], by steam pressure; and the third, influenza [J09–J11], was greatly affected by temperature. Among the diseases that required the most hospitalization, the second, chronic lower respiratory diseases [J40–J47], appeared to be greatly affected by temperature, and the fourth, other diseases of pleura [J90–J94], by NO2 (the first, pneumonia [J12–J18], and the third, acute upper respiratory infections [J00–J06], are described above). PM10 affected suppurative and necrotic conditions of the lower respiratory tract [J85–J86]. With regard to PM2·5, in the case of other respiratory diseases principally affecting the interstitium [J80–J84], days 2 and 3 had high correlations with a patient’s ED visit. In the case of suppurative and necrotic conditions of the lower respiratory tract [J85–J86], days 2 and 0 were influential (Table 4).

7. Discussion

Based on the consistently registered and systematized data registry of national emergency medical centers, this study analyzed the correlations between weather and air pollution variables and respiratory disease patients visiting EDs by applying a machine learning approach. Previous studies have focused on the simple relationship between a single disease and a single air-related factor. The present study considered all respiratory diseases and a variety of air pollution and weather variables. Unlike previous studies, it examined the effects of weather and air pollution variables up to 3 days before a visit. For air pollution, data from the five observatories nearest to each ED were used. Unlike previous studies that used the daily average data of air pollution variables [18, 19], this study utilized the data of the 3 days before a visit, the daily temperature difference, and other data to characterize weather conditions in detail and identify their level of influence.

As a result, ED visits due to respiratory diseases were correlated with weather and air pollution variables on the day of the visit and 1–3 days before the visit. Of the air pollution variables, PM10 and PM2·5, which have recently drawn a lot of attention, influenced patients’ ED visits.

In this study, not only the effects of weather and air pollution variables on each disease but also their levels of influence were analyzed. Many air pollution variables had high correlations with acute upper respiratory infections [J00–J06], chronic lower respiratory diseases [J40–J47], and suppurative and necrotic conditions of the lower respiratory tract [J85–J86]. For diseases that were highly influenced by air pollutants, steam pressure was not influential, which is consistent with the negative correlation between steam pressure and air pollution variables. Acute upper respiratory infections [J00–J06] in particular were highly correlated with air pollution variables. Influenza and pneumonia were influenced by weather factors such as steam pressure; overall, lower respiratory infections were influenced mainly by weather factors and upper respiratory diseases by air pollution variables.

For several diseases, PM10 had a greater influence than PM2·5 on patients’ visits to the ED. This does not mean that PM2·5 has little influence on the incidence of respiratory diseases. Rather, it is reasonable to conclude that PM10 (larger particle size) is more influential on acute diseases that trigger a patient’s visit to the ED within a short-term period (from 3 days before the visit to the day of the visit). More studies should be conducted to determine the long-term effects of PM2·5 [20], which is known to persist and affect the human body.

In the case of influenza, the temperature and steam pressure on the day of a visit were most influential. Pneumonia, which accounted for the largest share of hospitalized respiratory disease patients, was influenced mainly by steam pressure and temperature. The group of diseases including asthma [J40–J47] was influenced by steam pressure, followed by PM10. Acute upper respiratory infections were mostly influenced by air pollution variables, especially NO2 and PM10.

Notably, acute upper respiratory infections [J00–J06], influenza [J09–J11], and pneumonia [J12–J18], which account for a majority of the respiratory diseases of patients visiting EDs, were highly influenced by PM10 following temperature and steam pressure. PM10 was also highly influential in the top three diseases prompting visits to the ED: pneumonia [J12–J18], acute upper respiratory infections [J00–J06], and chronic lower respiratory diseases [J40–J47]. Therefore, of the air pollution variables, PM10 most influenced respiratory disease patients’ visits to EDs.

Donaldson et al. [21] reported that asthma symptoms were worsened by the influence of PM10. This finding is consistent with the results of the present study. PM exposure can trigger an asthmatic response through multiple paths. Presumably, it is related to airway inflammation, increased smooth muscle constriction, direct stimulation of lipid mediators, additional oxidative stress, and proinflammatory burden [21, 22]. Other studies have also reported that an increase in PM10 is related to an increase in the use of asthma drugs [23, 24]. According to a recent study conducted by Sohn et al. [25] in Korea, daily temperature changes influenced pneumonia patients’ visits to EDs in Seoul. Choi et al. [26] reported that maximum temperature, rainfall, relative humidity, and PM10 had correlations with community-acquired pneumonia. This study also revealed that pneumonia patients’ visits to EDs were influenced by weather and air pollution variables, such as steam pressure, temperature, CO, PM10, and O3 (Figure 2(c)).

Arbex et al. (Brazil) [27] reported the correlations between acute upper respiratory infections [J00–J06] and air pollution variables. According to their report, the diseases were related to lag 0 of NO2, SO2, O3, and PM10. In this study, acute upper respiratory infections were also influenced at lag 0, in the order of NO2, PM10, and SO2 (Figure 2(a)). Patients with acute upper respiratory infections accounted for 52·2% of the total respiratory disease patients visiting EDs and 12·8% of hospitalized patients. As such, the high number of patients with these diseases visiting the EDs was directly influenced by air pollution variables.

According to the research by Wanka et al. in Germany [28], weather and air pollution variables influenced respiratory diseases in a complex way. This study also revealed that a variety of variables were related to each other and influenced diverse disease groups in complex ways.

Zhang et al. [29] reported that a low concentration of PM2·5 was related to acute respiratory infections 3 days before a visit, while a high concentration of PM2·5 was related to the infections on the day before a visit. In this study, PM2·5 influenced acute respiratory infections at lag 0 and lag 2. Respiratory diseases were more directly influenced by weather and air pollution variables than other disease groups. A similar result was found for all the disease groups [30].

The number of respiratory disease patients is expected to increase by day 3 when the values of steam pressure and temperature are low and the values of air pollution variables are high. A weather-related health index for predicting respiratory disease patients visiting EDs is yet to be developed. If a prediction model is developed based on the study results, it could provide basic data for preventing respiratory diseases related to weather changes and help medical institutions use their facilities and manpower efficiently to manage patients with respiratory infections.

This study has the following limitations. First, the analysis was conducted with data that had already been codified and collected; therefore, it was impossible to determine the clinical characteristics, prognosis, sources of infection, and underlying diseases of each patient. The primary outcome of this study was the assessment of trends using large data. Therefore, it is necessary to analyze the clinical data of individual disease groups. Second, the study covered only 3 years. As described in this article, both chronic and acute disease groups were included in the analysis. The influence of air pollution variables in particular needs to be analyzed over the long term. However, the ED patient data provided covered only 3 years. Therefore, it is necessary to analyze the long-term influence of the study variables. Third, this study set the time lag to 3 days. If a general incubation period is taken into account, a lag of 14 days could be set. However, given the large number of variables, the time lag was limited to a short-term period. Finally, the data from the observatory near the hospital were used, not the data from the observatory near the patient’s home. Patients’ addresses are personal information and could not be obtained, so it was assumed that each patient visited a nearby hospital.

In this study, the effects of weather and air pollution variables on respiratory disease patients’ visits to EDs were analyzed. Most of the respiratory patients visiting EDs were diagnosed with acute upper respiratory infections [J00–J06], influenza [J09–J11], or pneumonia [J12–J18]. PM10, following temperature and steam pressure, had an influential relation with these diseases. In patients with pneumonia [J12–J18], acute upper respiratory infections [J00–J06], and chronic lower respiratory diseases [J40–J47], the top three diseases managed in EDs, PM10 was highly influential. Thus, among the air pollution variables, PM10 was found to influence respiratory disease patients’ visits to EDs. The number of respiratory disease patients visiting EDs is expected to increase by day 3 when the values of steam pressure and temperature are low and the values of air pollution variables are high. Additionally, a respiratory disease prediction index should be established using a prediction model.

Appendix

Inclusion ICD-10 codes
(i) [J00-J06] Acute upper respiratory infections.
(ii) [J09-J18] Influenza and pneumonia.
(iii) [J20-J22] Other acute lower respiratory infections.
(iv) [J30-J39] Other diseases of upper respiratory tract.
(v) [J40-J47] Chronic lower respiratory diseases.
(vi) [J60-J70] Lung diseases due to external agents.
(vii) [J80-J84] Other respiratory diseases principally affecting the interstitium.
(viii) [J85-J86] Suppurative and necrotic conditions of lower respiratory tract.
(ix) [J90-J94] Other diseases of pleura.
(x) [J95-J99] Other diseases of the respiratory system.

Data Availability

Data sharing is not applicable to this article because the data that support the findings of this study are from NEDIS Korea. Restrictions apply to the availability of these data, which were used under license for this study.

Ethical Approval

This study was approved by the institutional review board of the Korea University Guro Hospital (NO. 2019GR0197). The requirement for informed consent from the participants was waived by the board.

Disclosure

The funding source had no role in the design of this study; data collection, analysis, and interpretation; and decision to publish or preparation of the manuscript.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Conceptualization was done by JY Kim and ES Lee. Methodology was developed by YH Yoon and SB Kim. Result interpretation was done by HG Kahng and JH Park. Computation was done by JH Kim, HE Hwang, and MJ Lee. ES Lee wrote the article. All authors reviewed and edited the article.

Acknowledgments

This research was supported by Korea University Guro Hospital ‘Korea RESEARCH-DRIVEN HOSPITALS’ Grant (O1905501). This study was conducted with the use of NEDIS data (N20191920711). This article has been preprinted at the following link [31]: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3705299. This research was supported by grant of the Establish R&D Platform Project through the Korea University Medical Center and Korea University Guro Hospital, funded by the Korea University Guro Hospital (Grant number: O1905201). The funding source had no role in the design of this study; data collection, analysis, and interpretation; and decision to publish or preparation of the manuscript. The corresponding author had full access to all data in the study and had final responsibility for the decision to submit for publication.